Understanding Database Terminology

Upload: dilendra-bhatt

Post on 16-Jul-2015

5/13/2018 Understanding Database Terminology - slidepdf.com

http://slidepdf.com/reader/full/understanding-database-terminology 1/10

A computer cannot process data unless it is organized in special ways: into characters, fields, records, files, and databases.

After reading this lesson, you should be able to:

• Define the key terms needed to understand what a database is and how it is used.
• Identify the purpose and role of characters in data processing.
• Identify the purpose and role of fields in data processing.
• Identify the purpose and role of records in data processing.
• Identify the purpose and role of database files in data processing.
• Identify the purpose and role of databases in data processing.
• Identify the purpose and role of data management systems in data processing.
• Identify the purpose and role of keys in data processing.

Character

A character is the most basic element of data that can be observed and manipulated. Behind it are the invisible data elements we call bits and bytes, referring to physical storage elements used by the computer hardware. A character is a single symbol such as a digit, letter, or other special character (e.g., $, #, and ?).
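To make the link between a visible character and the bits and bytes behind it concrete, here is a minimal Python sketch. It assumes the character is stored in the UTF-8 encoding, which the lesson does not specify, and the helper name describe is made up for illustration.

```python
# Show the storage elements (bytes and bits) behind a single character.
# Assumption: the character is stored using the UTF-8 encoding.
def describe(ch: str) -> dict:
    raw = ch.encode("utf-8")                      # bytes used to store the character
    bits = " ".join(f"{b:08b}" for b in raw)      # each byte written as 8 bits
    return {"char": ch, "code_point": ord(ch), "bytes": len(raw), "bits": bits}

info = describe("$")
print(info)
```

Under that assumption, a digit, letter, or symbol such as $ occupies a single byte, while characters outside the basic Latin range may occupy several.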

Field

A field contains an item of data; that is, a character, or group of characters that are related. For instance, a grouping of related text characters such as "John Smith" makes up a name in the name field. Let's look at another example. Suppose a political action group advocating gun control in Pennsylvania is compiling the names and addresses of potential supporters for their new mailing list. For each person, they must identify the name, address, city, state, zip code and telephone number. A field would be established for each type of information in the list. The name field would contain all of the letters of the first and last name. The zip code field would hold all of the digits of a person's zip code, and so on. In summary, a field may contain an attribute (e.g., employee salary) or the name of an entity (e.g., person, place, or event).

Record

A record is composed of a group of related fields. Put another way, a record contains a collection of attributes related to an entity such as a person or product. Looking at the list of potential gun control supporters, the name, address, zip code and telephone number of a single individual would constitute a record. A payroll record would contain the name, address, social security number, and title of a single employee.

Database File

As we move up the ladder, a database file is defined as a collection of related records. A database file is sometimes called a table. A file may be composed of a complete list of individuals on a mailing list, including their addresses and telephone numbers. Files are frequently categorized by the purpose or application for which they are intended. Some common examples include mailing lists, quality control files, inventory files, or document files. Files may also be classified by the degree of permanence they have. Transaction files are only temporary, while master files are much more long-lived.
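The ladder from fields to records to a database file can be sketched in a few lines of Python. This is only an illustration: the class name Supporter, the field names, and the sample entries are invented, loosely following the mailing list example.

```python
from dataclasses import dataclass, fields

@dataclass
class Supporter:
    """One record: a group of related fields describing a single person."""
    name: str
    address: str
    city: str
    state: str
    zip_code: str
    telephone: str

# A database file (table) is a collection of related records.
# The entries below are made-up sample data.
mailing_list = [
    Supporter("John Smith", "12 Oak St", "Harrisburg", "PA", "17101", "555-0100"),
    Supporter("Mary Jones", "98 Elm Ave", "Pittsburgh", "PA", "15213", "555-0101"),
]

field_names = [f.name for f in fields(Supporter)]
print(field_names)
print(len(mailing_list), "records in the file")
```

Each attribute of Supporter plays the role of a field, each instance is a record, and the list stands in for the file.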

Database

Organizations and individuals use databases to bring independent sources of data together and store them electronically. Thus, a database is composed of related files that are consolidated, organized and stored together. One collection of related files might pertain to employee information. Another collection of related files might contain sports statistics.

Organizations and individuals may have and use many different databases, depending on the nature of the work involved. For example, a library database might consist of several related, but separate, databases including book titles and author names, book description, books on order, books checked out, and similar sets of information. Most organizations have product information databases, customer databases, and human resource databases that contain information about employees, salaries, home address, stock purchase plans, and tax deduction information. In each case, the data stored in a database is independent from the application programs which use and process the data.

Data Management System

Data management systems are used to access and manipulate data in a database. A database management system is a software package that enables users to edit, link, and update files as needs dictate. Database management systems will be discussed in greater detail in another lesson.

Key

In order to track and analyze data effectively, each record requires a unique identifier or what is called a key. The key must be completely unique to a particular record just as each individual has a unique social security number assigned to them. In fact, social security numbers are often used as keys in large databases. You might think that the name field would be a good choice for a key in a mailing list. However, this would not be a good choice because some people might have the same name. A key must be identified or assigned to each record for computerized information processing to function correctly. An existing field may be used if the entries are entirely unique, such as a social security or telephone number. In most cases, a new field will be developed to hold a key, such as a customer number or product number.
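The role of a key can be tried out with Python's built-in sqlite3 module. The sketch below assumes a hypothetical customer table in which a customer number serves as the key; the table name, column names, and rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_number INTEGER PRIMARY KEY,"   # the key field: must be unique
    " name TEXT NOT NULL)"                    # names may repeat
)
conn.execute("INSERT INTO customer VALUES (1, 'John Smith')")
conn.execute("INSERT INTO customer VALUES (2, 'John Smith')")  # same name, different key

try:
    # Reusing key 1 violates uniqueness, so the database rejects the record.
    conn.execute("INSERT INTO customer VALUES (1, 'Mary Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute(
    "SELECT customer_number, name FROM customer ORDER BY customer_number"
).fetchall()
print(duplicate_rejected, rows)
```

Two customers named John Smith coexist because their keys differ, which is why the name field alone would make a poor key.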

Database Management System (DBMS)

DBMSs are the technology tools that directly support managing organizational data. With a DBMS you can create a database, including its logical structure and constraints; you can manipulate the data and information it contains; or you can directly create a simple database application or reporting tool. Human administrators, through a user interface, perform certain tasks with the tool, such as creating a database, converting an existing database, or archiving a large and growing database. Business applications, which perform the higher-level tasks of managing business processes, interact with end users and other applications. To store and manage data, they rely on and directly operate their own underlying database through a standard programming interface such as ODBC.

The following diagram illustrates the five components of a DBMS.

[Diagram not reproduced in this transcript.]

Database Engine:

The Database Engine is the core service for storing, processing, and securing data. The Database Engine provides controlled access and rapid transaction processing to meet the requirements of the most demanding data consuming applications within your enterprise. Use the Database Engine to create relational databases for online transaction processing or online analytical processing data. This includes creating tables for storing data, and database objects such as indexes, views, and stored procedures for viewing, managing, and securing data. You can use SQL Server Management Studio to manage the database objects, and SQL Server Profiler for capturing server events.

Data dictionary:

A data dictionary is a reserved space within a database that is used to store information about the database itself. It is a set of tables and views that users can read but never alter directly; the DBMS itself maintains it. Most data dictionaries contain different kinds of information about the data used in the enterprise. In terms of the database representation of the data, the data dictionary defines all schema objects, including views, tables, clusters, indexes, sequences, synonyms, procedures, packages, functions, triggers, and many more. This ensures that all of these objects follow one standard defined in the dictionary. The data dictionary also records how much space has been allocated for, and is currently used by, each schema object. A data dictionary is consulted when finding information about users, objects, schemas, and storage structures. Every time a data definition language (DDL) statement is issued, the DBMS updates the data dictionary.

A data dictionary may contain information such as:

• Database design information
• Stored SQL procedures
• User permissions
• User statistics
• Database process information
• Database growth statistics
• Database performance statistics
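The way DDL statements feed the data dictionary can be observed in SQLite, whose schema table sqlite_master plays a comparable (though much smaller) role. This is a sketch using SQLite as a stand-in, not the product the lesson describes; the book table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def catalog(connection):
    """Return (type, name) pairs recorded in SQLite's schema table."""
    return connection.execute(
        "SELECT type, name FROM sqlite_master ORDER BY name"
    ).fetchall()

before = catalog(conn)                      # nothing defined yet
# Two DDL statements; the database updates its own catalog as a side effect.
conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE INDEX idx_book_title ON book (title)")
after = catalog(conn)

print(before)
print(after)
```

The program never writes to sqlite_master itself; the entries appear because the CREATE statements were issued.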

Query Processor:

A relational database consists of many parts, but at its heart are two major components: the storage engine and the query processor. The storage engine writes data to and reads data from the disk. It manages records, controls concurrency, and maintains log files. The query processor accepts SQL statements, selects a plan for executing each statement, and then executes the chosen plan.

The user or program interacts with the query processor, and the query processor in turn interacts with the storage engine. The query processor isolates the user from the details of execution: the user specifies the result, and the query processor determines how this result is obtained. The query processor components include:

• DDL interpreter
• DML compiler
• Query evaluation engine
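The idea that the user states the result while the query processor chooses how to obtain it can be seen with SQLite's EXPLAIN QUERY PLAN statement. The sketch below uses SQLite only as a convenient stand-in; the customer table, the index name, and the helper plan_for are assumptions made for the example, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_number INTEGER PRIMARY KEY, name TEXT, zip_code TEXT)"
)

# The query only states WHAT is wanted, not HOW to find it.
QUERY = "SELECT name FROM customer WHERE zip_code = '17101'"

def plan_for(connection, sql):
    """Return the plan text the query processor reports for `sql`."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(str(row[-1]) for row in rows)   # last column holds the description

plan_before = plan_for(conn, QUERY)                   # no index available yet
conn.execute("CREATE INDEX idx_customer_zip ON customer (zip_code)")
plan_after = plan_for(conn, QUERY)                    # same query, new plan

print("before:", plan_before)
print("after: ", plan_after)
```

The SQL text is identical in both calls; only the plan selected by the query processor changes once an index becomes available.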

Report writer:

A report writer, also called a report generator, is a program, usually part of a database management system, that extracts information from one or more files and presents the information in a specified format. Most report writers allow you to select records that meet certain conditions and to display selected fields in rows and columns. You can also format data into pie charts, bar charts, and other diagrams. Once you have created a format for a report, you can save the format specifications in a file and continue reusing it for new data.
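A toy version of what a report writer does, selecting records that meet a condition and laying out chosen fields in rows and columns, might look like the following. The function build_report, the report_format dictionary, and the sample records are all hypothetical; real report writers offer far more.

```python
# Made-up sample records standing in for a database file.
records = [
    {"name": "John Smith", "city": "Harrisburg", "state": "PA", "zip_code": "17101"},
    {"name": "Mary Jones", "city": "Pittsburgh", "state": "PA", "zip_code": "15213"},
    {"name": "Ann Lee", "city": "Trenton", "state": "NJ", "zip_code": "08608"},
]

def build_report(rows, columns, condition):
    """Keep rows that satisfy `condition` and lay out `columns` as aligned text."""
    selected = [r for r in rows if condition(r)]
    widths = {c: max([len(c)] + [len(r[c]) for r in selected]) for c in columns}
    lines = ["  ".join(c.ljust(widths[c]) for c in columns).rstrip()]
    for r in selected:
        lines.append("  ".join(r[c].ljust(widths[c]) for c in columns).rstrip())
    return "\n".join(lines)

# A saved report format: which fields to show and which records to include.
report_format = {"columns": ["name", "city"], "condition": lambda r: r["state"] == "PA"}

report = build_report(records, **report_format)
print(report)
```

Because the format is kept separately from the data, the same report_format can be applied again when the list of records changes.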

Lesson 5: Types of Database Management Systems

DBMSs come in many shapes and sizes. For a few hundred dollars, you can purchase a DBMS for your desktop computer. For larger computer systems, much more expensive DBMSs are required. Many mainframe-based DBMSs are leased by organizations. DBMSs of this scale are highly sophisticated and would be extremely expensive to develop from scratch. Therefore, it is cheaper for an organization to lease such a DBMS program than to develop it. Since there are a variety of DBMSs available, you should know some of the basic features, as well as strengths and weaknesses, of the major types.

After reading this lesson, you should be able to:

• Compare and contrast the structure of different database management systems.
• Define hierarchical databases.
• Define network databases.
• Define relational databases.
• Define object-oriented databases.

Types of DBMS: Hierarchical Databases

There are four structural types of database management systems: hierarchical, network, relational, and object-oriented.

Hierarchical databases, commonly used on mainframe computers, have been around for a long time. The hierarchical model is one of the oldest methods of organizing and storing data, and it is still used by some organizations for making travel reservations. A hierarchical database is organized in pyramid fashion, like the branches of a tree extending downwards. Related fields or records are grouped together so that there are higher-level records and lower-level records, just like the parents in a family tree sit above the subordinated children.

Based on this analogy, the parent record at the top of the pyramid is called the root record. A child record always has only one parent record to which it is linked, just like in a normal family tree. In contrast, a parent record may have more than one child record linked to it. Hierarchical databases work by moving from the top down. A record search is conducted by starting at the top of the pyramid and working down through the tree from parent to child until the appropriate child record is found. Furthermore, each child can also be a parent with children underneath it.

The advantage of hierarchical databases is that they can be accessed and updated rapidly because the tree-like structure and the relationships between records are defined in advance. However, this feature is a two-edged sword. The disadvantage of this type of database structure is that each child in the tree may have only one parent, and relationships or linkages between children are not permitted, even if they make sense from a logical standpoint. Hierarchical databases are so rigid in their design that adding a new field or record requires that the entire database be redefined.
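The parent and child rules of the hierarchical model can be sketched as a small tree in Python. The Node class, the find function, and the travel reservation records are invented for illustration and do not describe any particular hierarchical DBMS.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Node:
    """A record in the hierarchy: one parent at most, any number of children."""
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, name: str) -> "Node":
        child = Node(name, parent=self)     # a child is linked to exactly one parent
        self.children.append(child)
        return child

def find(node, name, path=()):
    """Search from the top down; return the path from the root to the record."""
    path = path + (node.name,)
    if node.name == name:
        return path
    for child in node.children:
        found = find(child, name, path)
        if found:
            return found
    return None

root = Node("Airline")                      # the root record
flight = root.add_child("Flight 101")       # a child that is also a parent
flight.add_child("Seat 12A")
root.add_child("Flight 202")

print(find(root, "Seat 12A"))
```

The search can only reach a seat by passing through its flight, which mirrors the top-down access path described above.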

Types of DBMS: Network Databases

Network databases are similar to hierarchical databases in that they also have a hierarchical structure. There are a few key differences, however. Instead of looking like an upside-down tree, a network database looks more like a cobweb or interconnected network of records. In network databases, children are called members and parents are called owners. The most important difference is that each child or member can have more than one parent (or owner).

Like hierarchical databases, network databases are principally used on mainframe computers. Since more connections can be made between different types of data, network databases are considered more flexible. However, two limitations must be considered when using this kind of database. Similar to hierarchical databases, network databases must be defined in advance. There is also a limit to the number of connections that can be made between records.
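The key difference from the hierarchical model, a member with more than one owner, can be sketched with two dictionaries of sets. The names link, owners_of, and members_of, as well as the supplier and part records, are hypothetical.

```python
from collections import defaultdict

owners_of = defaultdict(set)     # member -> the owners it is linked to
members_of = defaultdict(set)    # owner  -> the members it is linked to

def link(owner: str, member: str) -> None:
    """Record a connection between an owner record and a member record."""
    owners_of[member].add(owner)
    members_of[owner].add(member)

link("Supplier A", "Part 7")
link("Warehouse 3", "Part 7")    # the same member gains a second owner
link("Supplier A", "Part 9")

print(sorted(owners_of["Part 7"]))
print(sorted(members_of["Supplier A"]))
```

In a strict tree, the second link to "Part 7" would be impossible, because a child may have only one parent.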

Types of DBMS: Relational Databases

In relational databases, the relationship between data files is relational, not hierarchical. Hierarchical and network databases require the user to pass down through a hierarchy in order to access needed data. Relational databases connect data in different files by using common data elements or a key field. Data in relational databases is stored in different tables, each having a key field that uniquely identifies each row. Relational databases are more flexible than either the hierarchical or network database structures. In relational databases, tables or files filled with data are called relations, a tuple designates a row or record, and columns are referred to as attributes or fields.

Relational databases work on the principle that each table has a key field that uniquely identifies each row, and that these key fields can be used to connect one table of data to another. Thus, one table might have a row consisting of a customer account number as the key field along with address and telephone number. The customer account number in this table could be linked to another table of data that also includes customer account number (a key field), but in this case, contains information about product returns, including an item number (another key field). This key field can be linked to another table that contains item numbers and other product information such as production location, color, quality control person, and other data. Therefore, using this database, customer information can be linked to specific product information.
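The three linked tables in this example can be sketched with Python's sqlite3 module. The table and column names, the return_id column, and the sample rows are assumptions made for the illustration; only the linking through account number and item number comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    account_number INTEGER PRIMARY KEY,     -- key field
    address TEXT,
    telephone TEXT);
CREATE TABLE item (
    item_number INTEGER PRIMARY KEY,        -- key field
    color TEXT,
    production_location TEXT);
CREATE TABLE product_return (
    return_id INTEGER PRIMARY KEY,          -- hypothetical key for each return
    account_number INTEGER REFERENCES customer(account_number),
    item_number INTEGER REFERENCES item(item_number));
INSERT INTO customer VALUES (1001, '12 Oak St', '555-0100');
INSERT INTO item VALUES (77, 'red', 'Plant A');
INSERT INTO product_return VALUES (1, 1001, 77);
""")

# Follow the shared key fields from customer to return to item.
linked = conn.execute("""
    SELECT c.account_number, c.telephone, i.item_number, i.color, i.production_location
    FROM customer AS c
    JOIN product_return AS r ON r.account_number = c.account_number
    JOIN item AS i ON i.item_number = r.item_number
""").fetchall()
print(linked)
```

No path through a hierarchy is needed: matching values in the key fields are enough to connect customer information to product information.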

The relational database has become quite popular for two major reasons. First, relational databases can be used with little or no training. Second, database entries can be modified without redefining the entire structure. The downside of using a relational database is that searching for data can take more time than if other methods are used.

Lesson 8: Data Mining, Data Warehousing, and Data Marts

Over the years, many large organizations have accumulated massive amounts of data about their customers, suppliers, products, and services. Even many new Web-based companies have amassed large databases about people and products as they have grown. The WWW is itself a large distributed data repository with untold potential. With the growing realization that these vast data resources can be tapped for significant commercial gain, interest in data mining, data warehousing, and data marts has virtually exploded.

After reading this lesson, you should be able to:

• Compare data mining, data warehousing, and data marts.
• Describe the purpose and value of data mining.
• Describe the purpose and value of data warehousing.
• Describe the purpose and value of data marts.

Data Mining (DM)

Data mining, also known as "knowledge discovery," refers to computer-assisted tools and techniques for sifting through and analyzing these vast data stores in order to find trends, patterns, and correlations that can guide decision making and increase understanding. Data mining covers a wide variety of uses, from analyzing customer purchases to discovering galaxies. In essence, data mining is the equivalent of finding gold nuggets in a mountain of data. The monumental task of finding hidden gold depends heavily upon the power of computers.

Applications of Data Mining

Data mining includes a variety of interesting applications. A few examples are listed below:

• By recording the activity of shoppers in an online store, such as Amazon.com, over time, retailers can identify buying patterns and use them to improve the placement of items in the layout of a mail-order catalog page or Web page.
• Telephone companies mine customer billing data to identify customers who spend considerably more than average on their monthly phone bill. The company can then target these customers to sell additional services.
• Marketers can effectively target the wants and needs of specific consumer groups by analyzing data about customer preferences and buying patterns.
• Hospitals use data mining to identify groups of people whose healthcare costs are likely to increase in the near future so that preventative steps can be taken.
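The telephone company example can be reduced to a very small sketch: flag customers whose monthly bill is well above the average. Real data mining uses far richer techniques; the billing figures, the function high_spenders, and the factor of 1.5 are invented for illustration.

```python
from statistics import mean

# Made-up monthly bills, keyed by a hypothetical customer identifier.
monthly_bills = {
    "cust-01": 42.0,
    "cust-02": 38.5,
    "cust-03": 120.0,
    "cust-04": 45.5,
    "cust-05": 154.0,
}

def high_spenders(bills, factor=1.5):
    """Return customers whose bill exceeds `factor` times the average bill."""
    threshold = factor * mean(bills.values())
    return sorted(c for c, amount in bills.items() if amount > threshold)

targets = high_spenders(monthly_bills)
print(targets)
```

What counts as "considerably more than average" is a business decision, which is why the cutoff is left as a parameter here.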

Data Mining Summarized

In summary, the purpose of DM is to analyze and understand past trends and predict future trends. By predicting future trends, business organizations can better position their products and services for financial gain. Nonprofit organizations have also achieved significant benefits from data mining, such as in the area of scientific progress.

The concept of data mining is simple yet powerful. The simplicity of the concept is deceiving, however. Traditional methods of analyzing data, involving query-and-report approaches, cannot handle tasks of such magnitude and complexity.

The Need for Data Warehousing and Data Marts

The majority of databases are designed to hold the current data needed by an organization to perform its business activities. In a business organization, current data might include information concerning bills due, inventory levels, and product orders, and would most likely be contained in a billing/inventory/order database. In most cases, the minute that data become outdated, they are deleted from the database. For example, once a bill is paid, data about the bill is removed.

Fortunately, many organizations have realized the value of being able to analyze historical data in order to discover patterns of behavior and predict future trends. For example, analyzing historical data can tell a retailer what items were ordered, in what quantities, and by which customers.

One of the keys to understanding the value of databases is to understand how one database, whether it is current or historical, can be related to another. If you think about it, it makes good business sense to relate customer data to inventory data (because customers place orders that affect inventory), and inventory data to supplier data (because suppliers provide inventory items). We could name many more examples like this. The problem with most databases is that they are not designed to be accessed simultaneously in this fashion.

Data Warehousing and Data Marts

Many organizations now use data warehouses to bring multiple databases together and make them available for data mining and other forms of analysis. A data warehouse is a collection of data, usually current and historical, from multiple databases that the organization can use for analysis and decision making. The purpose, of course, is to bring key sets of data about or used by the organization into one place.

Bringing together so much data into a data warehouse can make analysis very difficult. To address this problem, organizations use what are called data marts. Data marts are related sets of data that are grouped together and separated out from the main body of data in the data warehouse. Data marts are designed to be made available to specific sets of users. For example, data about manufacturing can be put into a data mart and be made available to the production department. Human resource data can be put into another data mart and be provided to the human resources employees. This approach makes it easier for each group or constituency in the organization to access the data they need.
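One simple way to picture a data mart is as a named subset of the warehouse that exposes only the rows one group needs. The sketch below models each mart as an SQLite view over a single warehouse table. This is an assumption made to keep the example short, since real data marts are often separate stores; the table, view, and column names and the figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The warehouse: current and historical data on several subjects (sample rows).
CREATE TABLE warehouse_fact (subject TEXT, recorded_on TEXT, measure TEXT, value REAL);
INSERT INTO warehouse_fact VALUES
    ('manufacturing',   '2015-06-01', 'units_built', 1200),
    ('manufacturing',   '2015-07-01', 'units_built', 1350),
    ('human_resources', '2015-07-01', 'headcount',     85);

-- Each data mart separates out the data for one set of users.
CREATE VIEW manufacturing_mart AS
    SELECT recorded_on, measure, value FROM warehouse_fact
    WHERE subject = 'manufacturing';
CREATE VIEW hr_mart AS
    SELECT recorded_on, measure, value FROM warehouse_fact
    WHERE subject = 'human_resources';
""")

manufacturing_rows = conn.execute(
    "SELECT * FROM manufacturing_mart ORDER BY recorded_on"
).fetchall()
hr_rows = conn.execute("SELECT * FROM hr_mart").fetchall()
print(manufacturing_rows)
print(hr_rows)
```

The production department would query manufacturing_mart and never see the human resources rows, and the reverse holds for hr_mart.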