fy06 sql intro

242
Community inTouch | SQL Server 2005 Community inTouch | SQL Server 2005 Developing World Class Database Developing World Class Database Applications Applications Risman Adnan Risman Adnan Developer and Platform Developer and Platform Evangelist Evangelist Microsoft Indonesia Microsoft Indonesia [email protected] [email protected]

Upload: blob232

Post on 25-Jan-2016

32 views

Category:

Documents


2 download

DESCRIPTION

sql

TRANSCRIPT

Page 1: Fy06 SQL Intro

Community inTouch | SQL Server 2005Community inTouch | SQL Server 2005Developing World Class Database Developing World Class Database

ApplicationsApplications

Risman AdnanRisman AdnanDeveloper and Platform EvangelistDeveloper and Platform EvangelistMicrosoft IndonesiaMicrosoft [email protected]@microsoft.com

Page 2: Fy06 SQL Intro

AgendaAgenda

InstallationInstallation

Optimal DatabaseOptimal Database

Database DesignDatabase Design

Developing on SQL Server 2005Developing on SQL Server 2005

CachingCaching

Tuning, Monitoring, and Diagnosing Tuning, Monitoring, and Diagnosing System ProblemsSystem Problems

Page 3: Fy06 SQL Intro

InstallationInstallation

Which version has the features you Which version has the features you need?need?

Server InstallationServer Installation

Database InstallationDatabase Installation

Page 4: Fy06 SQL Intro

XML supportXML support SELECT … FOR XMLSELECT … FOR XML OpenXML OpenXML XML ViewsXML Views XML UpdategramsXML Updategrams XML View MapperXML View Mapper XML Bulk LoadXML Bulk Load

URL and HTTP db accessURL and HTTP db access HTTP access to cubesHTTP access to cubes Multi-instance supportMulti-instance support Integrated Data MiningIntegrated Data Mining Full-Text Search in formatted Full-Text Search in formatted

docsdocs English Query for the WebEnglish Query for the Web C2 security rating (NSA)C2 security rating (NSA) Installation disk imagingInstallation disk imaging Active Directory integrationActive Directory integration Self-management and tuningSelf-management and tuning Distributed Partitioned ViewsDistributed Partitioned Views

Log ShippingLog Shipping Parallel CREATE INDEXParallel CREATE INDEX Parallel scanParallel scan Parallel DBCCParallel DBCC Failover clusteringFailover clustering Failover cluster managementFailover cluster management 32 CPU SMP system support32 CPU SMP system support 64 GB RAM support64 GB RAM support VIA SAN supportVIA SAN support Indexed viewsIndexed views ROLAP dimension storage ROLAP dimension storage Distributed Partitioned CubesDistributed Partitioned Cubes Online index reorganizationOnline index reorganization Differential backupDifferential backup User-defined functionsUser-defined functions Server-less snapshot backup Server-less snapshot backup  SQL Query Analyzer debuggerSQL Query Analyzer debugger New data typesNew data types

Column-level Column-level collationscollations

Virtual Cube EditorVirtual Cube Editor Linked cubesLinked cubes MDX BuilderMDX Builder DimensionsDimensions Security in Analysis Security in Analysis

ServicesServices OLAP ActionsOLAP Actions Custom rollupsCustom rollups Cascading Cascading

referential referential integrity and actionsintegrity and actions

INSTEAD OF INSTEAD OF triggerstriggers

Indexes on Indexes on computed columnscomputed columns

Queued replicationQueued replication DTS enhancementsDTS enhancements Copy Database Copy Database

WizardWizard

SQL Server 2000SQL Server 2000A Major ReleaseA Major Release

Page 5: Fy06 SQL Intro

SQL Server 2005SQL Server 2005.NET Framework.NET Framework

Common Language Runtime IntegrationCommon Language Runtime IntegrationUser-defined AggregatesUser-defined AggregatesUser-defined Data TypesUser-defined Data TypesUser-defined FunctionsUser-defined FunctionsSQL Server .NET Data ProviderSQL Server .NET Data ProviderExtended TriggersExtended Triggers

Data TypesData TypesNew XML DatatypeNew XML Datatypen/varchar(max)n/varchar(max)

SQL Server EngineSQL Server EngineNew Message Service BrokerNew Message Service BrokerHTTP Support (Native HTTP)HTTP Support (Native HTTP)Database Tuning Advisor Database Tuning Advisor Enhanced Read ahead & scanEnhanced Read ahead & scanOnline Index RebuildOnline Index RebuildMultiple Active Result Sets Multiple Active Result Sets Persisted Computed ColumnsPersisted Computed ColumnsQueuing SupportQueuing SupportSnapshot Isolation LevelSnapshot Isolation LevelNon-blocking Read Committed Non-blocking Read Committed Table and Index PartitioningTable and Index PartitioningNUMA supportNUMA support

Database Availability and RedundancyDatabase Availability and RedundancyFail-over Clustering (up to 8 node)Fail-over Clustering (up to 8 node)Enhanced Multi-instance SupportEnhanced Multi-instance SupportDatabase MirroringDatabase MirroringDatabase SnapshotsDatabase Snapshots

XMLXMLXQUERY Support (Server & Mid Tier)XQUERY Support (Server & Mid Tier)XML Data Manipulation Language XML Data Manipulation Language FOR XML EnhancementsFOR XML EnhancementsXML Schema (XSD) Support XML Schema (XSD) Support MSXML 6.0 (Native)MSXML 6.0 (Native) .Net XML Framework Support for .Net XML Framework Support for

XML and CLR integrationXML and CLR integrationNotification ServicesNotification Services

Management ToolsManagement ToolsSSMSSSMSMDX Query EditorMDX Query Editor

Version Control SupportVersion Control SupportSQLCMD Command Line ToolSQLCMD Command Line ToolSQL Service ManagerSQL Service Manager

Database MaintenanceDatabase MaintenanceBackup and Restore EnhancementsBackup and Restore EnhancementsChecksum Integrity ChecksChecksum Integrity ChecksDedicated Administrator ConnectionDedicated Administrator ConnectionDynamic AWEDynamic AWEFast RecoveryFast RecoveryHighly-available UpgradeHighly-available UpgradeOnline Index OperationsOnline Index OperationsOnline RestoreOnline RestoreParallel DBCC\Index OperationsParallel DBCC\Index Operations

Performance TuningPerformance Tuning Profiler EnhancementsProfiler EnhancementsProfiler/Perfmon IntegrationProfiler/Perfmon IntegrationProfiling Analysis ServicesProfiling Analysis ServicesExportable XML Showplan Exportable XML Showplan Exportable Deadlock TracesExportable Deadlock Traces

Full-text SearchFull-text Search Indexing of XML DatatypeIndexing of XML Datatype Index files integrated with SQL backupIndex files integrated with SQL backup

MDAC MDAC Side by Side installationSide by Side installationMicrosoft Installer base setupMicrosoft Installer base setupSupport for Active Directory DeploymentSupport for Active Directory Deployment

SQL Client .NET Data ProviderSQL Client .NET Data ProviderServer Cursor SupportServer Cursor SupportBulk load supportBulk load support

SecuritySecurityNo default metadata accessNo default metadata accessFine Grained Administration RightsFine Grained Administration RightsSeparation of Users and SchemaSeparation of Users and SchemaSSL Encryption at installationSSL Encryption at installationData EncryptionData EncryptionCode-signingCode-signing

ReplicationReplicationAuto-tuning Replication AgentsAuto-tuning Replication AgentsOracle PublicationOracle Publication Improved Blob Change TrackingImproved Blob Change Tracking

OLAP and Data MiningOLAP and Data MiningAnalysis Management Objects Analysis Management Objects Windows Integrated Backup and RestoreWindows Integrated Backup and RestoreUDM, multi-fact table supportUDM, multi-fact table supportDTS and DM IntegrationDTS and DM IntegrationEight new DM algorithmsEight new DM algorithmsAuto Packaging and DeploymentAuto Packaging and Deployment

SQL Integration Services (SQLIS)SQL Integration Services (SQLIS)New Architecture (DTR + DTP)New Architecture (DTR + DTP)Complex Control FlowsComplex Control FlowsControl Flow DebuggingControl Flow DebuggingFor Each EnumerationsFor Each EnumerationsProperty MappingsProperty MappingsFull Data Flow DesignerFull Data Flow DesignerFull DTS Control Flow DesignerFull DTS Control Flow DesignerGraphical Package ExecutionGraphical Package Execution Immediate Mode and Project ModeImmediate Mode and Project ModePackage (Advanced) Deployment ToolsPackage (Advanced) Deployment ToolsCustom Tasks and TransformationsCustom Tasks and Transformations

Reporting ServicesReporting ServicesMultiple Output Formats Multiple Output Formats Parameters (Static, Dynamic, Hierarchical)Parameters (Static, Dynamic, Hierarchical)Bulk Delivery of Personalized ContentBulk Delivery of Personalized ContentSupport Multiple Data Sources Support Multiple Data Sources STS (Web Parts, Doc Libraries)STS (Web Parts, Doc Libraries)Visual Design ToolVisual Design ToolCharting, Sorting, Filtering, Drill-ThroughCharting, Sorting, Filtering, Drill-ThroughScheduling, CachingScheduling, CachingComplete Scripting EngineComplete Scripting EngineScale Out architectureScale Out architectureOpen XML Report DefinitionOpen XML Report Definition

Dynamic Management ViewsDynamic Management Views

Page 6: Fy06 SQL Intro

Developer and DBA ConvergenceDeveloper and DBA Convergence

Developers and DBA’s have traditionally Developers and DBA’s have traditionally worked to an interface, defined by:worked to an interface, defined by:

Table or View definitionTable or View definitionStored Procedure & result set definitionStored Procedure & result set definition

UIUIDevelopmentDevelopment

Web ServiceWeb ServiceDevelopmentDevelopment

DatabaseDatabaseDevelopmentDevelopment

Client/ServerClient/ServerDevelopmentDevelopment

DatabaseDatabaseDevelopmentDevelopment

CLR Development- Procedures- Functions- Triggers- Types- Aggregates

XML Development

InterfaceInterface

Page 7: Fy06 SQL Intro

Communication BarriersCommunication Barriers

Pockets of information within disciplinesPockets of information within disciplines

Unclear delineation of responsibilitiesUnclear delineation of responsibilities

Conflicting best practices and architecturesConflicting best practices and architectures

Conflicting strategic goals and objectivesConflicting strategic goals and objectives

DatabaseDatabaseArchitectArchitectSolutionSolution

ArchitectArchitect

Database Administrationdoesn’t understand our development priorities

Development doesn’tunderstand good querydesign

Reduce complexityReduce complexitythrough clear releasethrough clear releasemanagement processesmanagement processes

Increase communication Increase communication and collaborationand collaboration

via product integrationvia product integration

Page 8: Fy06 SQL Intro

Developer and DBA Developer and DBA ConvergenceConvergenceDevelopers:Developers:

Get productive with Get productive with the new toolsthe new tools

Closer integration of Closer integration of DBAs into the DBAs into the development processdevelopment process

Design application to Design application to be:be:

SupportableSupportable

Resilient to changeResilient to change

Ensure secure, robust Ensure secure, robust implementationimplementation

DBAs:DBAs:Maintain role as Maintain role as data stewarddata steward

Understand & Understand & advise on advise on appropriate appropriate technology choicestechnology choices

Factor XML & CLR Factor XML & CLR into schema designinto schema design

Ensure secure, Ensure secure, robust robust implementationimplementation

Bridge the gap!Bridge the gap!

Page 9: Fy06 SQL Intro

OverviewOverview

Installing SQL Server 2005Installing SQL Server 2005Clean installClean install

ComponentsComponents

Security Settings during InstallSecurity Settings during InstallSetting the startup Service Account Setting the startup Service Account

Choosing AuthenticationChoosing Authentication

Verifying Initial ConfigurationVerifying Initial ConfigurationSQL Server Surface Area ConfigurationSQL Server Surface Area Configuration

Services Services

FeaturesFeatures

SQL Server Configuration ManagerSQL Server Configuration ManagerService Account ChangesService Account Changes

EncryptionEncryption

Page 10: Fy06 SQL Intro

Installing SQL Server 2005Installing SQL Server 2005

Getting a current buildGetting a current buildInstallation through Windows InstallerInstallation through Windows InstallerPre-installation checklistPre-installation checklist

Operating System ChoicesOperating System ChoicesVisual Studio – Beta?Visual Studio – Beta?

Versions from which to chooseVersions from which to choose32-bit v 64-bit (Itanium and x64)32-bit v 64-bit (Itanium and x64)Developer, Enterprise, Express…Developer, Enterprise, Express…

Licensing and featuresLicensing and featuresSCC (System Configuration Checker)SCC (System Configuration Checker)InstallationInstallation

Page 11: Fy06 SQL Intro

Build Naming ConventionsBuild Naming Conventions

Stability ImprovesStability ImprovesInterim builds/Internal ReleasesInterim builds/Internal ReleasesIDW – Internal Developer WorkstationIDW – Internal Developer WorkstationCTP – Community Tech PreviewCTP – Community Tech PreviewBeta – Wider ReleaseBeta – Wider ReleaseRC – Release CandidateRC – Release CandidateRTM – Release to Manufacturing RTM – Release to Manufacturing (“golden”)(“golden”)RTM + Hotfix – Only to specific RTM + Hotfix – Only to specific customerscustomersRTM + SP Beta – Service Pack BetaRTM + SP Beta – Service Pack BetaRTM + SP – Service Pack ReleaseRTM + SP – Service Pack Release

Page 12: Fy06 SQL Intro

Choosing a Product to InstallChoosing a Product to Install

SQL Server 2005 Developer EditionSQL Server 2005 Developer Edition32-bit and 64-bit32-bit and 64-bit

64-bit (Itanium and x64 for AMD Athlon and 64-bit (Itanium and x64 for AMD Athlon and Intel EMT64)Intel EMT64)

SQL Server 2005 Standard EditionSQL Server 2005 Standard Edition32-bit and 64-bit32-bit and 64-bit

SQL Server 2005 Workgroup EditionSQL Server 2005 Workgroup Edition32-bit only32-bit only

SQL Server 2005 Express EditionSQL Server 2005 Express Edition32-bit only32-bit onlyTools in separate MSI (XM Download)Tools in separate MSI (XM Download)

Can run multiple versions side-by-Can run multiple versions side-by-sideside

Page 13: Fy06 SQL Intro
Page 14: Fy06 SQL Intro

Microsoft Express Product Microsoft Express Product RangeRange

Brian’s article: Brian’s article: Get a Lean, Mean Dev Machine with the Express Editions of Visual Basic and SQL Server 2005Get a Lean, Mean Dev Machine with the Express Editions of Visual Basic and SQL Server 2005 http://msdn.microsoft.com/msdnmag/issues/04/09/ExpressEditions/toc.asp?frame=true

Page 15: Fy06 SQL Intro

Installation as an “Application”Installation as an “Application”

Not using Install ShieldNot using Install ShieldThird party productThird party productHard to “control”Hard to “control”Only stopped installation before file copy: Only stopped installation before file copy: “Installation has enough information to start “Installation has enough information to start copying files…”copying files…”Limited rollback capabilities, uninstalling partial Limited rollback capabilities, uninstalling partial and failed install was messyand failed install was messy

Uses Windows InstallerUses Windows InstallerMicrosoft productMicrosoft productCan download/review Windows Installer SDKCan download/review Windows Installer SDK

MSDN, Win32 and COM Development, MSDN, Win32 and COM Development, Administration and Management, Setup, Windows Administration and Management, Setup, Windows InstallerInstaller

Components have an MSI fileComponents have an MSI fileInstallation can be rolled back until the very Installation can be rolled back until the very endend

Page 16: Fy06 SQL Intro

Options for InstallationOptions for Installation

Local Installation – using GUILocal Installation – using GUI

Local “unattended” and/or “silent” Local “unattended” and/or “silent” installationinstallation

Even for clusters!Even for clusters!

Remote installationRemote installation

Page 17: Fy06 SQL Intro

Component UpdateComponent Update

.NET .NET FrameworFramework 2.0k 2.0

SQL SQL Native Native Client Client (SNAC)(SNAC)

Support Support FilesFiles

Page 18: Fy06 SQL Intro

.NET Framework 2.0.NET Framework 2.0

Required for:Required for:SQL Server Management StudioSQL Server Management StudioExpress Manager (used for SQL Express)Express Manager (used for SQL Express)Engine (used by SQLCLR)Engine (used by SQLCLR)

How to Identify .NET Framework How to Identify .NET Framework Version on your MachineVersion on your Machine

%WINDIR%\Microsoft.NET\Framework\%WINDIR%\Microsoft.NET\Framework\version Right-click mscorlib.dll, click version Right-click mscorlib.dll, click Properties, and then click Version.Properties, and then click Version.Administrative Tools, .NET Framework 2.0 Administrative Tools, .NET Framework 2.0 Configuration. At the top of the right Configuration. At the top of the right pane, .NET Framework version displays.pane, .NET Framework version displays.

Page 19: Fy06 SQL Intro

SQL Native ClientSQL Native Client

Why SNAC?Why SNAC?Separates “infrastructure” from “drivers”Separates “infrastructure” from “drivers”SNAC is owned by SQL and only updated SNAC is owned by SQL and only updated by SQL (during install/service packs)by SQL (during install/service packs)SNAC is on the “drivers” sideSNAC is on the “drivers” sideInfrastructure owned by OS and only Infrastructure owned by OS and only updated by OSupdated by OSEnables side-by-side driver installation Enables side-by-side driver installation with minimal downtimewith minimal downtimeMDAC doesn’t allow side-by-side MDAC doesn’t allow side-by-side therefore requires stopping the apptherefore requires stopping the appMDAC to be removed on 64-bit as wellMDAC to be removed on 64-bit as well

Page 20: Fy06 SQL Intro

System System Consistency Consistency

CheckerChecker

Directly Directly impacts the impacts the likelihood of likelihood of successful successful installinstall

Doesn’t Doesn’t mean you’ll mean you’ll have have success success with with upgrade…upgrade…

Page 21: Fy06 SQL Intro

Hardware RecommendationsHardware Recommendations

File System—NTFS, no compression for installation File System—NTFS, no compression for installation drive or primary database file or log filedrive or primary database file or log file

More smaller/faster disks over fewer/larger disksMore smaller/faster disks over fewer/larger disks

Be sure to update all drivers and firmware!Be sure to update all drivers and firmware!

OS should be on redundant disksOS should be on redundant disks

SQL installation should be on redundant diskSQL installation should be on redundant disk

Transaction logs should be on fast, redundant and Transaction logs should be on fast, redundant and preferably isolated diskspreferably isolated disks

TempDB should be isolated and if on multi-proc TempDB should be isolated and if on multi-proc use multiple files (Q328551)use multiple files (Q328551)

Databases should be “moved” to support better Databases should be “moved” to support better performance (Q224071)performance (Q224071)

KB Articles KB Articles Q224071 and Q328551 both on SQL Q224071 and Q328551 both on SQL Server 2000; best practices still applyServer 2000; best practices still apply

Page 22: Fy06 SQL Intro

Components to InstallComponents to Install

Separate Separate “package“packages” (.MSI s” (.MSI files)files)High High Level is Level is for for “typical” “typical” set of set of featuresfeaturesClick Click Advanced Advanced for for feature feature treetree

Page 23: Fy06 SQL Intro

Components to InstallComponents to Install

SQL Server Database SQL Server Database ServicesServices

Database Engine, Replication, Full-TextDatabase Engine, Replication, Full-Text

Analysis ServicesAnalysis Services Analysis Services server components Analysis Services server components for Business Intelligencefor Business Intelligence

Reporting Services Reporting Services Reporting Services and Report ManagerReporting Services and Report Manager

Notification Services Notification Services Notification Services (Engine and Notification Services (Engine and Client), Bulk Event for XMLClient), Bulk Event for XML

Data Transformation Data Transformation Services (Integration Services (Integration Services)Services)

SQL IS replaces SQL DTSSQL IS replaces SQL DTS

Workstation components, Workstation components, Books Online and Books Online and development toolsdevelopment tools

Management Tools, Command Prompt Management Tools, Command Prompt Tools, Reporting Services Tools, Tools, Reporting Services Tools, Connectivity Components, Connectivity Components, Programming Models, Management Programming Models, Management Studio, Computer Manager, Profiler, Studio, Computer Manager, Profiler, and Replication Monitorand Replication Monitor

Business Intelligence Development Business Intelligence Development Studio Studio

Documentation and samples - Books Documentation and samples - Books Online, code samplesOnline, code samples

Page 24: Fy06 SQL Intro

Default and Named InstancesDefault and Named Instances

Default InstanceDefault InstanceAccessed by “computername”Accessed by “computername”

SQL Server Service = mssqlserverSQL Server Service = mssqlserver

SQL Server Agent Service = sqlserveragentSQL Server Agent Service = sqlserveragent

Full-Text = MSFTESQLFull-Text = MSFTESQL

Analysis Server = MSSQLServerOLAPService Analysis Server = MSSQLServerOLAPService

Named InstanceNamed InstanceAccessed by computername\instancenameAccessed by computername\instancename

SQL Server Service = mssql$instancenameSQL Server Service = mssql$instancename

SQL Server Agent Service = SQL Server Agent Service = sqlagent$instancenamesqlagent$instancename

Full-Text = msftesql$instancenameFull-Text = msftesql$instancename

Analysis Server = MSOLAP$instancenameAnalysis Server = MSOLAP$instancename

Page 25: Fy06 SQL Intro

Directory StructureDirectory Structure

No longer dependent on instance No longer dependent on instance namename

All directories MSSQL.n even for other All directories MSSQL.n even for other components such as Analysis components such as Analysis ServicesServices

How can you detect?How can you detect?Registry: HKEY_LOCAL_MACHINE\Registry: HKEY_LOCAL_MACHINE\SOFTWARESOFTWARE\Microsoft\Microsoft SQL Server\Microsoft\Microsoft SQL Server\Instance Names\SQL\Instance Names\SQL

Name = Instance name (without MSSQL)Name = Instance name (without MSSQL)

Data = Install Directory (mssql.n)Data = Install Directory (mssql.n)

Page 26: Fy06 SQL Intro

Installation Directories Installation Directories

2000

2005

Page 27: Fy06 SQL Intro

Programmatically Detecting DirProgrammatically Detecting DirDECLARE @InstanceName sql_variant,

@InstanceDir sql_variant,@SQLDataRoot nvarchar(512),@ExecStr nvarchar(max)

SELECT @InstanceName = ISNULL(SERVERPROPERTY('InstanceName'), 'MSSQLServer')

EXECUTE master.dbo.xp_regread 'HKEY_LOCAL_MACHINE', 'SOFTWARE\Microsoft\Microsoft SQL Server\Instance Names\

SQL', @InstanceName, @InstanceDir OUTPUT

SELECT @ExecStr = 'EXECUTE master.dbo.xp_regread '+ '''HKEY_LOCAL_MACHINE'', ' + '''SOFTWARE\Microsoft\Microsoft SQL Server\' + convert(varchar, @InstanceDir) + '\Setup'', ''SQLDataRoot'', @SQLDataRoot OUTPUT'

EXEC master.dbo.sp_executesql @ExecStr, N'@SQLDataRoot nvarchar(512) OUTPUT', @SQLDataRoot OUTPUT

DECLARE @InstanceName sql_variant,@InstanceDir sql_variant,@SQLDataRoot nvarchar(512),@ExecStr nvarchar(max)

SELECT @InstanceName = ISNULL(SERVERPROPERTY('InstanceName'), 'MSSQLServer')

EXECUTE master.dbo.xp_regread 'HKEY_LOCAL_MACHINE', 'SOFTWARE\Microsoft\Microsoft SQL Server\Instance Names\

SQL', @InstanceName, @InstanceDir OUTPUT

SELECT @ExecStr = 'EXECUTE master.dbo.xp_regread '+ '''HKEY_LOCAL_MACHINE'', ' + '''SOFTWARE\Microsoft\Microsoft SQL Server\' + convert(varchar, @InstanceDir) + '\Setup'', ''SQLDataRoot'', @SQLDataRoot OUTPUT'

EXEC master.dbo.sp_executesql @ExecStr, N'@SQLDataRoot nvarchar(512) OUTPUT', @SQLDataRoot OUTPUT

Page 28: Fy06 SQL Intro

Startup Service Startup Service

CustomizCustomize allows e allows you to you to set each set each individualindividually and to ly and to different different accountsaccounts

Local Local System System oror

Domain Domain AccountAccount……

Page 29: Fy06 SQL Intro

Service Account ChoicesService Account Choices

Starting with most secure/controlledStarting with most secure/controlledDedicated Local account* (not local Dedicated Local account* (not local administrator)administrator)Local service (do other services use Local service (do other services use this?)this?)Dedicated Domain account (not local Dedicated Domain account (not local administrator)administrator)Network service (do other services use Network service (do other services use this?)this?)Local account* which is a local Local account* which is a local administratoradministratorDomain account which is a local Domain account which is a local administratoradministrator

NEVER use:NEVER use:Local System Local System Domain AdministratorDomain Administrator

* Local account information is stored in the local SAM. Domain accounts are likely to be far more secure froma physical perspective.

Page 30: Fy06 SQL Intro

AuthenticationAuthentication

WindowsWindowsMixedMixedPassword Password only only specified specified for Mixed for Mixed (change (change as of as of IDW13) IDW13) due to due to unattendeunattended d installatioinstallation n requiremerequirementnt

Page 31: Fy06 SQL Intro

Windows AuthenticationWindows Authentication

More SecureMore SecureStrong Password requirementsStrong Password requirementsAccount Lockout policiesAccount Lockout policiesPassword expirationPassword expiration

Can be changed laterCan be changed laterAfter installation be sure to manually After installation be sure to manually set the password of the sa account set the password of the sa account otherwise internally set to a otherwise internally set to a randomly generated password—if you randomly generated password—if you change to mixed mode you will change to mixed mode you will “enable” your sa account but not “enable” your sa account but not know the password. However, you know the password. However, you can easily change it.can easily change it.

Page 32: Fy06 SQL Intro

Mixed Mode AuthenticationMixed Mode Authentication

Windows Authentication plusWindows Authentication plusSQL Server AuthenticationSQL Server Authentication

Uses Windows 2003 Server password Uses Windows 2003 Server password policies, if installed on Windows 2003 policies, if installed on Windows 2003 ServerServerRequires sa password to be set during Requires sa password to be set during installation if Mixed mode is choseninstallation if Mixed mode is chosenStronger password requirements in 2005Stronger password requirements in 2005

All passwords must be at least 6 characters All passwords must be at least 6 characters long and satisfy at least three of the four long and satisfy at least three of the four criteria:criteria:

It must contain uppercase letters (A-Z)It must contain uppercase letters (A-Z)It must contain lowercase letters (a-z)It must contain lowercase letters (a-z)It must contain numbers (0-9)It must contain numbers (0-9)It must contain non-alphanumeric characters (!, #, It must contain non-alphanumeric characters (!, #, %, or $)%, or $)

See BOL for complete list of requirementsSee BOL for complete list of requirements

Page 33: Fy06 SQL Intro

Collation SettingCollation Setting

Affects Affects searches, searches, lookups and lookups and ORDER BYORDER BYCharacter Character data is data is affected:affected:

ASCiiASCiicharcharvarcharvarchartexttext

UnicodeUnicodeNcharNcharNvarcharNvarcharNtextNtext

Page 34: Fy06 SQL Intro

Collation ConceptsCollation Concepts

May be changed at any point in chainMay be changed at any point in chain

Different options offer different gains – most in Different options offer different gains – most in language sorting, some in performancelanguage sorting, some in performance

Can assign on the fly for just a single query – or Can assign on the fly for just a single query – or within a view for subsequent usewithin a view for subsequent use

Computed Columns and Indexed Views can aid in Computed Columns and Indexed Views can aid in performance for lookups with changed collations performance for lookups with changed collations (there are likely to be version specific restrictions)(there are likely to be version specific restrictions)

Windows Install

SQL Server Install

Database Creation

Table Creation

Query Usage

I n h e r i t e d

} Use COLLATE option to change

Page 35: Fy06 SQL Intro

Collation OptionsCollation OptionsTo see the list of collationsTo see the list of collationsSELECT * FROM ::fn_helpcollations()SELECT * FROM ::fn_helpcollations()

To see the server's settingTo see the server's settingsp_helpsortsp_helpsort

To see the database's settingTo see the database's settingsp_helpdb dbnamesp_helpdb dbname

To see the table's setting (for each To see the table's setting (for each column)column)sp_help tnamesp_help tname

For more information, check out these For more information, check out these BOL topics:BOL topics:

COLLATECOLLATE

"Specifying Collations""Specifying Collations"

Page 36: Fy06 SQL Intro

Initial Configuration Initial Configuration

Off by defaultOff by default

SSL Encrypted CommunicationsSSL Encrypted Communications

Using SQL Server Configuration Using SQL Server Configuration ManagerManager

Running SQL Server 2000 and SQL Running SQL Server 2000 and SQL Server 2005 side-by-sideServer 2005 side-by-side

Special Feature: ImagingSpecial Feature: ImagingRenaming a Windows computername – Renaming a Windows computername – without ever having started the SQL without ever having started the SQL Server instance, special caseServer instance, special case

Database Servers/InstancesDatabase Servers/Instances

Analysis ServicesAnalysis Services

Page 37: Fy06 SQL Intro

Off by DefaultOff by Default

Principle of least privilegePrinciple of least privilege

Minimal Configuration to run – many Minimal Configuration to run – many installed services (even when chosen installed services (even when chosen during installation) are “off by default”during installation) are “off by default”

Secure by DesignSecure by Design

Secure by DefaultSecure by Default

Secure by DeploymentSecure by Deployment

Where to enable, where to view?Where to enable, where to view?sp_configure, Computer sp_configure, Computer

Management, catalog views, no single Management, catalog views, no single central waycentral way

Enter SQL Surface Area ConfigurationEnter SQL Surface Area Configuration

Page 38: Fy06 SQL Intro

Surface Area ConfigurationSurface Area Configuration

SQL Server Program Group, SQL Server Program Group, Configuration ToolsConfiguration Tools

Review SQL Server Setup Help from Review SQL Server Setup Help from link on main dialoglink on main dialog

Configure Services and ProtocolsConfigure Services and ProtocolsIn this first release more options in In this first release more options in Configuration ManagerConfiguration Manager

Surface Area Configuration for Surface Area Configuration for FeaturesFeatures

““Lockdown” most common/vulnerable Lockdown” most common/vulnerable featuresfeatures

sp_configure settingssp_configure settings

Catalog View queries to view (endpoints)Catalog View queries to view (endpoints)

Page 39: Fy06 SQL Intro

Services – Local Only?Services – Local Only?

Page 40: Fy06 SQL Intro

SQLSAC – HTTP EndpointsSQLSAC – HTTP Endpoints

Page 41: Fy06 SQL Intro

Encrypted CommunicationsEncrypted Communications

SSL is used even if not present at SSL is used even if not present at installationinstallation

Checked at installation, if present usedChecked at installation, if present used

If not present, SQL Server generates a If not present, SQL Server generates a 1024 bit certificate1024 bit certificate

Logins/passwords encrypted alwaysLogins/passwords encrypted always

Uses certificate created at installation Uses certificate created at installation if SSL not presentif SSL not present

Use “Certificate Picker” in SQL Server Use “Certificate Picker” in SQL Server Configuration Manager to explicitly Configuration Manager to explicitly choose SSL certificatechoose SSL certificate

Page 42: Fy06 SQL Intro

SSL Encryption/Certificate SSL Encryption/Certificate PickerPickerProtocol Protocol PropertiPropertieses

Login/Login/pwd pwd encrypteencryptedd

ALL ALL traffic traffic encrypteencrypted = YESd = YES

CertificaCertificate Picker te Picker tabtab

Page 43: Fy06 SQL Intro

Configuration ManagerConfiguration ManagerStart and Stop ServicesStart and Stop ServicesChange auto start and service Change auto start and service propertiespropertiesDoes not require your services Does not require your services to be started to change to be started to change propertiespropertiesService Account password Service Account password changes don’t require restart of changes don’t require restart of serviceserviceControl Supported Network Control Supported Network libraries and SSL encryptionlibraries and SSL encryptionCreate aliasesCreate aliasesLaunch from:Launch from:

SSMS, Right click Configure SSMS, Right click Configure ServicesServicesSQL Configuration ManagerSQL Configuration ManagerComputer ManagementComputer Management

ServiceServiceControl ManagerControl Manager

Server NetworkServer NetworkUtilityUtility

SQL ServerSQL Server“Properties”“Properties”

SQL ConfigurationSQL ConfigurationManagerManager

Page 44: Fy06 SQL Intro

Computer Manager – Computer Manager – 2000/20052000/2005

Page 45: Fy06 SQL Intro

Running 2000/2005 “Side-by-Running 2000/2005 “Side-by-side”side”

Possible combination:Possible combination:SQL Server 2000 default/named SQL Server 2000 default/named instance(s)instance(s)SQL Server 2005 default/named SQL Server 2005 default/named instance(s)instance(s)

SQL Server 2005 can support SQL Server 2005 can support multiple instances of multiple versions multiple instances of multiple versions (Developer and Express)(Developer and Express)

SQL Server 2000SQL Server 2000Directory structure was related to Directory structure was related to instance type and nameinstance type and name

SQL Server 2005SQL Server 2005Directory structure NOT related to Directory structure NOT related to instance type or nameinstance type or name

Re-register COM Components?Re-register COM Components?

Page 46: Fy06 SQL Intro

Renaming A Server After Renaming A Server After StartupStartup

Server name is NOT detected by SQL ServerServer name is NOT detected by SQL Server

Review the list of servernamesReview the list of servernamessp_helpserversp_helpserver

Drop the old servernameDrop the old servernamesp_dropserver 'oldservername'sp_dropserver 'oldservername'

Add the new servername – make sure you Add the new servername – make sure you define it as the “local” instance of SQL define it as the “local” instance of SQL ServerServer

sp_addserver 'newservername', localsp_addserver 'newservername', local

May have problems with servername related May have problems with servername related objects/jobs – i.e. msdb.dbo.sysjobsobjects/jobs – i.e. msdb.dbo.sysjobs

update sysjobs update sysjobs set originating_server = ' set originating_server = 'newnewservername'servername'where originating_server = 'where originating_server = 'oldoldservername'servername'

Page 47: Fy06 SQL Intro

Optimal Database StructuresOptimal Database Structures

Anatomy of Data ModificationsAnatomy of Data Modifications

Optimizing Data FilesOptimizing Data Files

Logging and RecoveryLogging and Recovery

TransactionsTransactions

Page 48: Fy06 SQL Intro

Database StructureDatabase Structure

Up to 32,767 Databases per Instance

Up to 32,767 Files per Database (Total is for both Data and Log files)

Minimum of 2 files – one for data, one for log

MDF = Primary (MAIN) Data File

NDF = Non-Primary Data File

LDF = Log Data File

Should be only 1 log file

Data FileData File*.mdf (1)*.mdf (1)

*.ndf (0-*.ndf (0-nn))

Log FileLog File*.ldf (1-*.ldf (1-nn))

DatabaseDatabase

Page 49: Fy06 SQL Intro

1.1. User sends UPDATEUser sends UPDATE Update is highly selective (only 5 rows)Update is highly selective (only 5 rows) Indexes exist to aid in finding these rows Indexes exist to aid in finding these rows

efficientlyefficiently The update is a SINGLE statement batch The update is a SINGLE statement batch

NOT enclosed in BEGIN TRAN…COMMIT NOT enclosed in BEGIN TRAN…COMMIT TRAN block therefore this is IMPLICIT TRAN block therefore this is IMPLICIT transactiontransaction

The Anatomy of a Data The Anatomy of a Data ModificationModification

Page 50: Fy06 SQL Intro

2.2. Server receives the request and Server receives the request and locates the data in cache OR reads locates the data in cache OR reads the data from disk into cachethe data from disk into cache

Since this is highly selective only the Since this is highly selective only the necessary pages are read into cache necessary pages are read into cache (maybe a few extra but that’s not (maybe a few extra but that’s not important here)important here)

Let’s use an example where the 5 rows Let’s use an example where the 5 rows being modified are located on 3 different being modified are located on 3 different data pagesdata pages

The Anatomy of a Data The Anatomy of a Data ModificationModificationThe Anatomy of a Data The Anatomy of a Data ModificationModification

Page 51: Fy06 SQL Intro

What it looks like:What it looks like:DataData

UPDATE…

Data

Log

Server…

Cache

Page 52: Fy06 SQL Intro

3.3. SQL Server proceeds to lock the SQL Server proceeds to lock the necessary datanecessary data

Locks are necessary to give us a Locks are necessary to give us a consistent point FOR ALL rows from which consistent point FOR ALL rows from which to startto start

If any other transaction(s) have ANY of If any other transaction(s) have ANY of these rows locked we will wait until ALL these rows locked we will wait until ALL locks have been acquired before we can locks have been acquired before we can proceed.proceed.

In the case of this update In the case of this update (because it’s (because it’s highly selective and because indexes highly selective and because indexes exist to make this possible) exist to make this possible) SQL Server SQL Server will use row level locking.will use row level locking.

The Anatomy of a Data The Anatomy of a Data ModificationModificationThe Anatomy of a Data The Anatomy of a Data ModificationModification

Page 53: Fy06 SQL Intro

What it looks like: What it looks like: LocksLocks

Cache

Page

Page

Page

Row

Row

Row

Row

Row

Update Lock

Update Lock

Update Lock

Update Lock

Update Lock

Page 54: Fy06 SQL Intro

4.4. SQL Server can now begin to make the SQL Server can now begin to make the modifications – for EVERY row the process modifications – for EVERY row the process will include:will include:1.1. Change to a stricter lock (eXclusive lock) Change to a stricter lock (eXclusive lock)

An update lock helps to allow better concurrency An update lock helps to allow better concurrency by being compatible with other shared locks by being compatible with other shared locks (readers). Readers can read the pre-modified data (readers). Readers can read the pre-modified data as it is transactionally consistentas it is transactionally consistent

The eXclusive lock is required to make the change The eXclusive lock is required to make the change because once modified no other reads should be because once modified no other reads should be able to see this un-committed changeable to see this un-committed change

2.2. Make the modification (in cache)Make the modification (in cache)3.3. Log the modification to the transaction log pages Log the modification to the transaction log pages

(also in cache)(also in cache)

The Anatomy of a Data The Anatomy of a Data ModificationModificationThe Anatomy of a Data The Anatomy of a Data ModificationModification

Page 55: Fy06 SQL Intro

What it looks like:What it looks like:ModificationsModifications

Cache

Page

Page

Page

Row

Row

Row

Row

Row

Update LockExclusive Lockxx

Update LockExclusive Lockxx

Update LockExclusive Lockxx

Update LockExclusive Lockxx

Update LockExclusive Lockx

x

L

Page 56: Fy06 SQL Intro

5.5. Finally, the transaction is complete – this is Finally, the transaction is complete – this is the MOST critical stepthe MOST critical step

All rows have been modifiedAll rows have been modified There are no other statements in this transaction There are no other statements in this transaction

– i.e. Implicit transaction– i.e. Implicit transaction Steps are:Steps are:

1.1. Write all log pages to transaction log ON Write all log pages to transaction log ON DISKDISK

2.2. Release the locksRelease the locks

3.3. Send a message to the user:Send a message to the user:

(5 Rows Affected)(5 Rows Affected)

The Anatomy of a Data The Anatomy of a Data ModificationModificationThe Anatomy of a Data The Anatomy of a Data ModificationModification

Page 57: Fy06 SQL Intro

What it looks like:What it looks like:Write-Ahead LoggingWrite-Ahead Logging

LData

Log

Server…

Cache~~~~~~~~~~~~~~~~~~~~~~~~~~~

Sequential writesChangeChangeChangeChange…

Log

5 Rows Affected

After the log entries are made and the locks are

released…

Page 58: Fy06 SQL Intro

So now what?So now what?

The transaction log ON DISK – is up to The transaction log ON DISK – is up to datedate

The data in CACHE – is up to dateThe data in CACHE – is up to date

But when does the data get written But when does the data get written from cache to disk?from cache to disk?

CheckpointCheckpoint

Page 59: Fy06 SQL Intro

CheckpointCheckpoint

It’s important to realize that a checkpoint It’s important to realize that a checkpoint does NOT just write committed pages… does NOT just write committed pages… Instead a checkpoint writes Instead a checkpoint writes ALLALL pages pages which have changed since they were which have changed since they were brought into cache – brought into cache – regardlessregardless of the state of the state of the transaction which changed them!of the transaction which changed them!

Why?Why?To reduce roll-forward recovery time To reduce roll-forward recovery time during restart recoveryduring restart recovery

To batch I/Os to disk and reduce disk To batch I/Os to disk and reduce disk thrashing for data writesthrashing for data writes

Page 60: Fy06 SQL Intro

Transaction Recovery and Transaction Recovery and Checkpoints Checkpoints

Transactions… Action Requiredif restart recovery

None

Checkpoint System Failure

11

22

33

44

55

Roll forward

Roll back

Roll forward

Roll back

L D

L/D

L/D

L

L

Time

Page 61: Fy06 SQL Intro

Restart RecoveryRestart Recovery

Automatically on system restartAutomatically on system restartSQL Server 2000SQL Server 2000

RedoRedoUndoUndoUsers allowed access to databaseUsers allowed access to database

SQL Server 2005 (Enterprise Edition Only)SQL Server 2005 (Enterprise Edition Only)RedoRedoUsers allowed access to databaseUsers allowed access to databaseUndoUndo

Faster access to database – including faster Faster access to database – including faster failover on Clusterfailover on ClusterRecords are protected (i.e. Locked)Records are protected (i.e. Locked)

Page 62: Fy06 SQL Intro

Key PointsKey Points

Data Portion mostly random reads – Data Portion mostly random reads – except at checkpointexcept at checkpoint

Log Portion mostly sequential writesLog Portion mostly sequential writes

Separate physical disks minimizes Separate physical disks minimizes contention at the drive level – first contention at the drive level – first choice in tuningchoice in tuning

Log is critical in recoveryLog is critical in recovery

Protect the logProtect the log

Minimize impact to logMinimize impact to log

Be aware of environmental changesBe aware of environmental changes

Page 63: Fy06 SQL Intro

Optimizing Data FilesOptimizing Data Files

Defrag the physical disksDefrag the physical disks

Effective RAID array configurationEffective RAID array configuration

Pre-allocate to a reasonable initial sizePre-allocate to a reasonable initial size

Don’t let auto-growth get out of controlDon’t let auto-growth get out of controlConsider allowing Fast File Initialization Consider allowing Fast File Initialization (Enterprise Edition Only—Perform Volume (Enterprise Edition Only—Perform Volume Maintenance Tasks to SQL Server service account)Maintenance Tasks to SQL Server service account)

While a lot of these will help for the log While a lot of these will help for the log (and I’ll explain why in a moment) there (and I’ll explain why in a moment) there are more important things to be aware of are more important things to be aware of IN the data portion – tables and indexes IN the data portion – tables and indexes (rebuilding)(rebuilding)

Page 64: Fy06 SQL Intro

Growing Files AutomaticallyGrowing Files Automatically

Enable Data Files and Log File to Grow Enable Data Files and Log File to Grow AutomaticallyAutomatically

Allocate Reasonable Initial Sizes Allocate Reasonable Initial Sizes

Set Maximum Size for Data Files so you Set Maximum Size for Data Files so you don’t run out of space when you have no don’t run out of space when you have no disk space!disk space!

Especially important for server consolidation Especially important for server consolidation and/or shared disks re: multiple databasesand/or shared disks re: multiple databases

Set File Growth Increment to Reasonable Set File Growth Increment to Reasonable SizeSize

Preference – fixed increment less than or equal Preference – fixed increment less than or equal to 1GBto 1GB

Helped by fast file initialization (SQL Server Helped by fast file initialization (SQL Server 2005 only)2005 only)

Page 65: Fy06 SQL Intro

Optimizing Log Files Optimizing Log Files

Isolate Isolate thethe Transaction Log (only one!) Transaction Log (only one!)Defrag the physical disksDefrag the physical disksEffective RAID array configurationEffective RAID array configurationPre-allocate to a reasonable initial sizePre-allocate to a reasonable initial sizeDon’t let auto-growth get out of controlDon’t let auto-growth get out of controlCheck and fix your internal Check and fix your internal fragmentationfragmentationDon’t be caught up in nothing but Don’t be caught up in nothing but speed!speed!See Kim’s blog entry: See Kim’s blog entry: 8 Steps to Better Transaction Log Throu8 Steps to Better Transaction Log Throughputghput, June 25, 2005, June 25, 2005

Page 66: Fy06 SQL Intro

Log filesLog files

Limited read activityLimited read activityReplicationReplication

RollbackRollback

Optimize for write activityOptimize for write activityRAID 0+1 w/hot spareRAID 0+1 w/hot spare

RAID 1RAID 1

RAID 1+0 better redundancyRAID 1+0 better redundancy

preferably NOT RAID 5 preferably NOT RAID 5

Optimize for SQL Server “logging” activityOptimize for SQL Server “logging” activityCapacity Planning (initial and reasonable size)Capacity Planning (initial and reasonable size)

Minimize excessive autogrowths (clean up Minimize excessive autogrowths (clean up excessive VLFs)excessive VLFs)

Page 67: Fy06 SQL Intro

How the Transaction Log WorksHow the Transaction Log Works

On commit, activity is written to the log – On commit, activity is written to the log – active and likely sequentialactive and likely sequentialActivity moves through log sequentially, Activity moves through log sequentially, fills and then goes to a second file or fills and then goes to a second file or autogrowsautogrowsExcessive autogrowth causes:Excessive autogrowth causes:

Slight pause on autogrowSlight pause on autogrowWindows call to extend a file (may cause file Windows call to extend a file (may cause file fragmentation)fragmentation)Adds “VLFs” to the file on each autogrowthAdds “VLFs” to the file on each autogrowth

Virtual Log File Virtual Log File 44

Virtual Log File Virtual Log File 55

Virtual Log File Virtual Log File 11

Virtual Log File Virtual Log File 22

Inactive Virtual Log File

Min LSN Min LSN (first open(first open

tran)tran)

Active Virtual Log File

Inactive/UnusedLog File

StartStartLogical LogLogical Log

EndEndLogical LogLogical Log

Virtual Log File Virtual Log File 33

Inactive Virtual Log File

Inactive Virtual Log File

Page 68: Fy06 SQL Intro

Transaction Log OptimizationTransaction Log Optimization

Pre-allocate the log to appropriate Pre-allocate the log to appropriate sizesizeMinimize autogrowthsMinimize autogrowthsLog might NOT show 0 Percent Log Log might NOT show 0 Percent Log Used after backup for example - the Used after backup for example - the % Log Used will be ~6%% Log Used will be ~6%Disk Bound Systems might Disk Bound Systems might experience some performance experience some performance degradation at log backupdegradation at log backupConsider RAID 1+0 instead of RAID 1, Consider RAID 1+0 instead of RAID 1, RAID 5 generally is not recommendedRAID 5 generally is not recommendedOptimize existing file – minimize VLFsOptimize existing file – minimize VLFs

Page 69: Fy06 SQL Intro

The Effects of LoggingThe Effects of Logging

Log is written AHEAD of the data Log is written AHEAD of the data portionportionLog is the ONLY place where Log is the ONLY place where transactional consistency is known transactional consistency is known (i.e. guaranteed)(i.e. guaranteed)Once a checkpoint occurs SQL Server Once a checkpoint occurs SQL Server doesn’t need the information in the doesn’t need the information in the log – for committed (or inactive) log – for committed (or inactive) transactions (the log could even be transactions (the log could even be cleared however…)cleared however…)Without the transaction log a Without the transaction log a database cannot function (i.e. database cannot function (i.e. marked suspect)marked suspect)Need to make sure this is redundant Need to make sure this is redundant AND optimal…AND optimal…What can effect the logging?What can effect the logging?

Page 70: Fy06 SQL Intro

Minimize impact of loggingMinimize impact of logging

Cannot clear the log unless the Cannot clear the log unless the transactional information is inactive – if a transactional information is inactive – if a transaction is pending then it stays in the transaction is pending then it stays in the log until it’s completed…log until it’s completed…Restart recovery will take longer if there Restart recovery will take longer if there were a lot of long running transactions were a lot of long running transactions pending at the time of failurepending at the time of failureAlways try to avoid:Always try to avoid:

Long running transactionsLong running transactionsTransactions which span more than one batchTransactions which span more than one batchTransactions that require user interactionTransactions that require user interactionNested Transactions (complex and usually longer Nested Transactions (complex and usually longer running)running)

Consider changing recovery modelConsider changing recovery model

Page 71: Fy06 SQL Intro

What can impact your ability to What can impact your ability to recover?recover?Recovery ModelsRecovery Models

Full Full Everything is Fully LoggedEverything is Fully LoggedAll operations allowed and FULLY loggedAll operations allowed and FULLY loggedOperations like creating or rebuilding an Operations like creating or rebuilding an INDEX takes as large a log as the size of INDEX takes as large a log as the size of the operationthe operation

Bulk_Logged Bulk_Logged Minimal Logging for Minimal Logging for SOMESOME Operations (not NON-LOGGED)Operations (not NON-LOGGED)

Operations whose performance is Operations whose performance is affected, see BOL Topic affected, see BOL Topic Minimally Logged OperationsMinimally Logged Operations ALL other operations (i.e. updates, ALL other operations (i.e. updates, inserts, etc. take the same log space and inserts, etc. take the same log space and time as the FULL recovery modeltime as the FULL recovery model

Simple Simple Log Truncation on Log Truncation on CheckpointCheckpoint

Page 72: Fy06 SQL Intro

Logging, Recovery Models, Logging, Recovery Models, Recovery and PerformanceRecovery and Performance

Acceptable to Switch?Acceptable to Switch?

How, Why and Why Not?How, Why and Why Not?

Batch Processing ScenarioBatch Processing Scenario

Batch Processing ScriptBatch Processing Script

Switching Recovery Models – Switching Recovery Models – SummarySummary

Page 73: Fy06 SQL Intro

Acceptable to Switch?Acceptable to Switch?

YesYesIf users are not allowed in database If users are not allowed in database during batch processingduring batch processingNo modifications are made during batch No modifications are made during batch process that are not recoverable through process that are not recoverable through some other means (some other means (i.e. re-running the i.e. re-running the batch processbatch process))Data loss during the batch process is OKData loss during the batch process is OK

NoNoModifications are made by Modifications are made by users/processes that are not recoverable users/processes that are not recoverable OLTP System requiring 24x7x365OLTP System requiring 24x7x365Data loss is NOT acceptableData loss is NOT acceptableDatabase Mirroring Database Mirroring

Page 74: Fy06 SQL Intro

BATCH OPERATIONBATCH OPERATION

BULK_LOGGEDBULK_LOGGED

Advantages:Advantages:Minimal loggingMinimal loggingFaster Bulk OperationFaster Bulk Operation

Disadvantages:Disadvantages:No No STOPATSTOPATData loss possible if media Data loss possible if media failure…failure…If database becomes If database becomes suspect, cannot back up suspect, cannot back up tail of log.tail of log.

Advantages:Advantages:Up-to-the-Up-to-the-minute minute recoveryrecovery

Point-in-time Point-in-time recoveryrecovery

Access to Access to “tail” of the “tail” of the loglog

LOG BACKUP

FULLFULL

ALTER DATABASE <name>SET RECOVERY BULK_LOGGED

ALTER DATABASE <name>SET RECOVERY FULL

LOG BACKUP11

22

33

44

Switching Recovery ModelsSwitching Recovery Models

FULLFULL

Page 75: Fy06 SQL Intro

-- Check the status before changing it!IF DATABASEPROPERTYEX('dbname', 'Updateability') = 'READ_ONLY'

ALTER DATABASE dbnameSET RESTRICTED_USER,READ_WRITE WITH ROLLBACK AFTER 60

go-- Perform Batch Operations…goALTER DATABASE dbname

SET READ_ONLY, MULTI_USERWITH ROLLBACK AFTER 30

go

-- Check the status before changing it!IF DATABASEPROPERTYEX('dbname', 'Updateability') = 'READ_ONLY'

ALTER DATABASE dbnameSET RESTRICTED_USER,READ_WRITE WITH ROLLBACK AFTER 60

go-- Perform Batch Operations…goALTER DATABASE dbname

SET READ_ONLY, MULTI_USERWITH ROLLBACK AFTER 30

go

Make sure to review the

sample script with tons of comments!

Switching Recovery ModelsSwitching Recovery ModelsUsing ALTER DATABASEUsing ALTER DATABASE

Page 76: Fy06 SQL Intro

Availability, What About Availability, What About Locking?Locking?

ACID Transaction Design ACID Transaction Design RequirementsRequirements

Atomicity Consistency Atomicity Consistency IsolationIsolation DurabilityDurability

Isolation LevelsIsolation LevelsLevel 0 – Read UncommittedLevel 0 – Read Uncommitted

Level 1 – Read CommittedLevel 1 – Read Committed

Level 2 – Repeatable ReadsLevel 2 – Repeatable Reads

Level 3 – SerializableLevel 3 – Serializable

DefaultDefault Isolation Level in BOTH Isolation Level in BOTH 2000/2005 is ANSI/ISO Level 1, Read 2000/2005 is ANSI/ISO Level 1, Read CommittedCommitted

In implementation this default level In implementation this default level uses lockinguses locking

Page 77: Fy06 SQL Intro

ACID PropertiesACID Properties

AtomicityAtomicityA transaction must be an atomic unit of work; A transaction must be an atomic unit of work; either all of its modifications are performed, or either all of its modifications are performed, or nonenone

ConsistencyConsistencyWhen completed, a transaction must leave all When completed, a transaction must leave all data and all related structures in a consistent data and all related structures in a consistent state state

IsolationIsolationA transaction either sees data in the state it A transaction either sees data in the state it was in before another concurrent transaction was in before another concurrent transaction modified it, or it sees the data after the second modified it, or it sees the data after the second transaction has completed, but it does not see transaction has completed, but it does not see an intermediate statean intermediate state

DurabilityDurabilityTransaction should persist despite system Transaction should persist despite system failurefailure

Page 78: Fy06 SQL Intro

Isolation Level 0 – Read Isolation Level 0 – Read UncommittedUncommittedPhenomenon: Dirty readsPhenomenon: Dirty readsA read transaction can read another A read transaction can read another

transaction’s uncommitted (or in-flight) transaction’s uncommitted (or in-flight) changes – resulting in “dirty reads”changes – resulting in “dirty reads”

DML statements always use exclusive DML statements always use exclusive lockinglocking

In Implementation:In Implementation: row locks are not used row locks are not used (SCH_S locks are used) for the dirty read (SCH_S locks are used) for the dirty read transaction and locks against data being transaction and locks against data being accessed are not honoredaccessed are not honored

Resulting Phenomenon:Resulting Phenomenon: statements statements execute with the possibility of inaccurate execute with the possibility of inaccurate data since the “in-flight” data read may data since the “in-flight” data read may continue to change or even be invalidated continue to change or even be invalidated (rolled back)(rolled back)

Page 79: Fy06 SQL Intro

Isolation Level 1 – Read Isolation Level 1 – Read CommittedCommittedPhenomenon: Inconsistent analysisPhenomenon: Inconsistent analysisThe default behavior in ALL releasesThe default behavior in ALL releases

In-flight transaction’s data cannot be read In-flight transaction’s data cannot be read by a read committed transaction – only by a read committed transaction – only committed changes are visiblecommitted changes are visible

DML statements always use exclusive DML statements always use exclusive lockinglocking

In ImplementationIn Implementation: locks are released (for : locks are released (for readers – not modifiers) as resources are readers – not modifiers) as resources are read, a row may be read more than once in read, a row may be read more than once in some scenariossome scenarios

Resulting PhenomenonResulting Phenomenon: Reads are not : Reads are not repeatable through the life of a transaction repeatable through the life of a transaction – as a result a row may not be read – as a result a row may not be read consistently during the life of a transactionconsistently during the life of a transaction

Page 80: Fy06 SQL Intro

Isolation Level 2 – Repeatable ReadIsolation Level 2 – Repeatable ReadPhenomenon: PhantomsPhenomenon: Phantoms

In-flight transaction’s data cannot be read In-flight transaction’s data cannot be read and data modified is accessible only to the and data modified is accessible only to the repeatable read transactionrepeatable read transaction

Data read, but not modified is accessible Data read, but not modified is accessible to other transactions for reads, but not to other transactions for reads, but not DMLDML

In Implementation:In Implementation: locks are held for the locks are held for the life of a transaction, rows which are read life of a transaction, rows which are read are locked and can be repeatably read are locked and can be repeatably read during the life of a transactionduring the life of a transaction

Resulting Phenomena:Resulting Phenomena: rows which were rows which were not present at the beginning of the not present at the beginning of the transaction can appear – in the resulttransaction can appear – in the result

Page 81: Fy06 SQL Intro

Isolation Level 3 – SerializableIsolation Level 3 – SerializablePhenomenon: NonePhenomenon: None

In-flight transaction’s data cannot be read In-flight transaction’s data cannot be read and data modified is accessible only to the and data modified is accessible only to the serializable transactionserializable transactionData read, but not modified is accessible to Data read, but not modified is accessible to other transactions for reads, but not DMLother transactions for reads, but not DML In Implementation:In Implementation: locks are held for the life locks are held for the life of a transaction and held at higher levels of a transaction and held at higher levels within indexes to prevent rows from entering within indexes to prevent rows from entering the “set”the “set”Implementation side effect:Implementation side effect: to prevent rows to prevent rows from entering the “set” of data, the “set” of from entering the “set” of data, the “set” of data needs to be locked. If appropriate data needs to be locked. If appropriate indexes do not exist then higher levels of indexes do not exist then higher levels of locking might be necessary – i.e., table-level locking might be necessary – i.e., table-level locking.locking.

Key range Key range and index and index reminderreminder

Page 82: Fy06 SQL Intro

Isolation in ImplementationIsolation in Implementation

All versions of SQL Server – including All versions of SQL Server – including SQL Server 2005 – use a locking-SQL Server 2005 – use a locking-based implementation based implementation by defaultby default

In the default environment and to In the default environment and to reduce the possibility of various reduce the possibility of various anomalies, transactional duration and anomalies, transactional duration and possibly higher levels of locking are possibly higher levels of locking are necessary – resulting in blockingnecessary – resulting in blocking

There are benefits of locking, if true There are benefits of locking, if true “isolation” is desired“isolation” is desired

Page 83: Fy06 SQL Intro

Isolation in Implementation Isolation in Implementation (cont’d)(cont’d)

SQL Server 2005 combines an SQL Server 2005 combines an optional row versioning-based optional row versioning-based implementation with this locking-implementation with this locking-based implementation based implementation

Two end results (and methods) for Two end results (and methods) for implementing row-level versioningimplementing row-level versioning

Read Committed using Statement-level Read Committed using Statement-level SnapshotSnapshotAlso known as: Also known as:

Statement-level Read ConsistencyStatement-level Read Consistency

Transaction-level Read ConsistencyTransaction-level Read ConsistencySnapshot IsolationSnapshot Isolation

Page 84: Fy06 SQL Intro

SQL Server 2005 Isolation SQL Server 2005 Isolation OptionsOptions

DefaultDefault – Uses Locking, no options set – Uses Locking, no options setRead Committed with Statement-level SnapshotRead Committed with Statement-level Snapshot

DB Option: DB Option: READ_COMMITTED_SNAPSHOTREAD_COMMITTED_SNAPSHOT enabled enabledUses locking for writes, versioning for reads for statement-Uses locking for writes, versioning for reads for statement-level consistencylevel consistency

Transaction-level Snapshot IsolationTransaction-level Snapshot IsolationDB Option: DB Option: ALLOW_SNAPSHOT_ISOLATIONALLOW_SNAPSHOT_ISOLATION enabled enabledUses default locking for everything, ALLOWS snapshot for Uses default locking for everything, ALLOWS snapshot for transactions when requested through transactions when requested through SET TRANSACTION ISOLATION LEVEL SNAPSHOTSET TRANSACTION ISOLATION LEVEL SNAPSHOTIf requested, transaction-level consistency through row If requested, transaction-level consistency through row versioningversioning

Both Options ConfiguredBoth Options ConfiguredStatement-level consistency through versioning Statement-level consistency through versioning automaticallyautomaticallyTransaction-level consistency where requestedTransaction-level consistency where requested

Page 85: Fy06 SQL Intro

Setting Isolation BehaviorSetting Isolation Behavior

Default BehaviorDefault Behavior

Read Committed using Statement-Level Read Committed using Statement-Level SnapshotSnapshot

Snapshot IsolationSnapshot Isolation

Both RCSI and Snapshot IsolationBoth RCSI and Snapshot Isolation

ALTER DATABASE <database_name> SET READ_COMMITTED_SNAPSHOT ONWITH ROLLBACK AFTER 5

ALTER DATABASE <database_name>SET ALLOW_SNAPSHOT_ISOLATION ON

ALTER DATABASE <database_name> SET RCSLS...ALTER DATABASE <database_name> SET Snapshot...

Page 86: Fy06 SQL Intro

Read CommittedRead CommittedDefault BehaviorDefault Behavior

All phenomena – except dirty reads All phenomena – except dirty reads are possible, even in the bounds of a are possible, even in the bounds of a single select querysingle select query

In volatile databases, a long running In volatile databases, a long running query may produce inconsistent query may produce inconsistent resultsresults

Can increase isolation to remove Can increase isolation to remove phenomenaphenomena

Increasing isolation requires locks to be Increasing isolation requires locks to be held longerheld longer

This can create blockingThis can create blocking

Page 87: Fy06 SQL Intro

Tips to Minimize BlockingTips to Minimize Blocking

Write efficient transactions – keep Write efficient transactions – keep them short and in one batchthem short and in one batchDo not allow interaction in the midst Do not allow interaction in the midst of the batchof the batchUse indexes to help SQL Server find – Use indexes to help SQL Server find – and lock – only the necessary dataand lock – only the necessary dataConsider estimates for long running Consider estimates for long running queries and/or migrating data to a queries and/or migrating data to a secondary analysis serversecondary analysis serverProblems with locking becoming Problems with locking becoming blocking are often when long running blocking are often when long running (and conflicting) transactions execute(and conflicting) transactions execute

Page 88: Fy06 SQL Intro

Locking Prevents ConflictsLocking Prevents ConflictsAt the expense of blockingAt the expense of blocking

No other transactions can invalidate No other transactions can invalidate the data set through modificationsthe data set through modificationsIs this always what you want or Is this always what you want or need?need?

Queuing applications typically want Queuing applications typically want locks, if someone is modifying that row – locks, if someone is modifying that row – we want to go to another not see last we want to go to another not see last transactional statetransactional stateVery volatile prices – don’t want to give Very volatile prices – don’t want to give them the last price if it’s currently being them the last price if it’s currently being updated, so wait… (updated, so wait… (but the update’s but the update’s fastfast))

What about long running transactions What about long running transactions or cases where you don’t need or cases where you don’t need absolute current…absolute current…

Page 89: Fy06 SQL Intro

Read Committed Snapshot IsolationRead Committed Snapshot IsolationDatabase Changed to READ_COMMITTED_SNAPSHOTDatabase Changed to READ_COMMITTED_SNAPSHOT

No phenomena are possible in the bounds No phenomena are possible in the bounds of a single statementof a single statement

In volatile databases, a multi-statement In volatile databases, a multi-statement transaction may yield different results for transaction may yield different results for different statements which access the different statements which access the same datasame data

Each statement is consistent but only for Each statement is consistent but only for the execution of that statement, not for the execution of that statement, not for the life of the transaction (the life of the transaction (if the if the transaction has multiple statementstransaction has multiple statements))

Each time data is read by a new statement Each time data is read by a new statement the latest version is usedthe latest version is used

Page 90: Fy06 SQL Intro

Snapshot IsolationSnapshot IsolationDatabase Changed to ALLOW_SNAPSHOT_ISOLATIONDatabase Changed to ALLOW_SNAPSHOT_ISOLATION

Setting ALLOWS users to ask for Snapshot Setting ALLOWS users to ask for Snapshot Isolation – NOT on by defaultIsolation – NOT on by defaultALL phenomena and ALL default locking ALL phenomena and ALL default locking behavior is EXACTLY the same unless you behavior is EXACTLY the same unless you explicitly ask for Snapshot Isolationexplicitly ask for Snapshot IsolationOnce requests, no phenomena are possible Once requests, no phenomena are possible in the bounds of a transaction running in the bounds of a transaction running under snapshot isolationunder snapshot isolationIn volatile databases, a multi-statement In volatile databases, a multi-statement transaction will always see the transaction will always see the transactionally accurate version which transactionally accurate version which existed when the transaction startedexisted when the transaction startedVersions must stick around longer, multi-Versions must stick around longer, multi-statement transactions may have conflictsstatement transactions may have conflicts

Page 91: Fy06 SQL Intro

Isolation Levels: In SummaryIsolation Levels: In Summary

READ UNCOMMITTED (Level 0)READ UNCOMMITTED (Level 0)““Dirty Reads” – An option ONLY for readersDirty Reads” – An option ONLY for readersAny data (even that which is Any data (even that which is in-flight/locked) can be viewedin-flight/locked) can be viewed

READ COMMITTED (Level 1 – Default)READ COMMITTED (Level 1 – Default)Only committed changes are visibleOnly committed changes are visibleData in an intermediate state cannot be Data in an intermediate state cannot be accessedaccessed

Read Committed w/statement-level Read Committed w/statement-level SnapshotSnapshot

Statement-level read consistencyStatement-level read consistencyNew non-blocking, non-locking (ex. New non-blocking, non-locking (ex. SCH_S), SCH_S), version-based Level 1version-based Level 1

Page 92: Fy06 SQL Intro

Isolation Levels: In Summary Isolation Levels: In Summary (cont’d)(cont’d)

REPEATABLE READS (Level 2)REPEATABLE READS (Level 2)All reads are consistent for the life of a All reads are consistent for the life of a transactiontransactionShared locks are NOT released after the data is Shared locks are NOT released after the data is processed – does not allow writers (does allow processed – does not allow writers (does allow other readers)other readers)Does not protect entire set (phantoms may Does not protect entire set (phantoms may occur)occur)

SERIALIZEABLE (Level 3)SERIALIZEABLE (Level 3)All reads are consistent for the life of a All reads are consistent for the life of a transactiontransactionAvoids phantoms – no new recordsAvoids phantoms – no new records

Snapshot Isolation – 2005Snapshot Isolation – 2005 Transaction-Level consistency using snapshot Transaction-Level consistency using snapshot New non-blocking, non-locking, version-based New non-blocking, non-locking, version-based transactionstransactions

Page 93: Fy06 SQL Intro

Controlling Isolation LevelsControlling Isolation LevelsTable-level changesTable-level changes

From Clause, per table (no spaces)From Clause, per table (no spaces)Level 0 – READUNCOMMITTED, NOLOCKLevel 0 – READUNCOMMITTED, NOLOCKLevel 1 – READCOMMITTED (locking)Level 1 – READCOMMITTED (locking)Level 1 – READCOMMITTED (versioning)Level 1 – READCOMMITTED (versioning)

Only in 2005 and Only in 2005 and onlyonly if the database option to if the database option to READ_COMMITTED_SNAPSHOT is onREAD_COMMITTED_SNAPSHOT is onCan be overridden with READCOMMITTEDLOCKCan be overridden with READCOMMITTEDLOCK

Level 2 – REPEATABLEREAD Level 2 – REPEATABLEREAD Level 3 – SERIALIZABLE, HOLDLOCKLevel 3 – SERIALIZABLE, HOLDLOCK

FROM dbo.titles WITH(READUNCOMMITTED)FROM dbo.titles WITH(READUNCOMMITTED) JOIN dbo.publishers WITH(SERIALIZABLE)JOIN dbo.publishers WITH(SERIALIZABLE)

Page 94: Fy06 SQL Intro

Controlling Isolation Levels Controlling Isolation Levels Session-level changes (cont’d)Session-level changes (cont’d)

Session level settings impact entire session Session level settings impact entire session but can be overridden with table-level but can be overridden with table-level settingssettings

Level 0 – READ UNCOMMITTEDLevel 0 – READ UNCOMMITTEDLevel 1 – READ COMMITTEDLevel 1 – READ COMMITTEDLevel 2 – REPEATABLE READLevel 2 – REPEATABLE READLevel 3 – SERIALIZABLELevel 3 – SERIALIZABLE

SET TRANSACTION ISOLATION LEVEL SET TRANSACTION ISOLATION LEVEL optoptREAD UNCOMMITTEDREAD UNCOMMITTEDREAD COMMITTEDREAD COMMITTEDREPEATABLE READREPEATABLE READSERIALIZABLESERIALIZABLE

SNAPSHOTSNAPSHOTOnly in 2005 and Only in 2005 and onlyonly if the database option is on if the database option is on

Page 95: Fy06 SQL Intro

Allowing Read Committed using Allowing Read Committed using Statement-level SnapshotStatement-level Snapshot

Database optionDatabase optionALTER DATABASE <database_name> ALTER DATABASE <database_name>

SET READ_COMMITTED_SNAPSHOT ONSET READ_COMMITTED_SNAPSHOT ONWITH ROLLBACK AFTER 5WITH ROLLBACK AFTER 5

No other changes necessary…No other changes necessary…No changes to your queries or your applications – No changes to your queries or your applications – they automatically return with statement-level they automatically return with statement-level consistency consistency ((NOTE: You may need to modify your applications if you NOTE: You may need to modify your applications if you depend on locking – re: queues, readers to wait for changedepend on locking – re: queues, readers to wait for change))

Changes to blocking…Changes to blocking…However, if this is NOT your performance problem However, if this is NOT your performance problem (meaning concurrency isn’t your bottleneck) then (meaning concurrency isn’t your bottleneck) then you may hinder performance not improve you may hinder performance not improve Expect this change in behavior at a costExpect this change in behavior at a cost

Page 96: Fy06 SQL Intro

Allowing Snapshot IsolationAllowing Snapshot Isolation

Database optionDatabase optionALTER DATABASE <database_name>ALTER DATABASE <database_name>SET ALLOW_SNAPSHOT_ISOLATION ONSET ALLOW_SNAPSHOT_ISOLATION ON

Session settingSession settingSET TRANSACTION ISOLATION LEVEL SET TRANSACTION ISOLATION LEVEL SNAPSHOTSNAPSHOT

Changes to applications:Changes to applications:Request snapshot isolationRequest snapshot isolation

Test for conflict detectionTest for conflict detection

Expect this change in behavior at a Expect this change in behavior at a higherhigher cost cost

Page 97: Fy06 SQL Intro

Potential IssuesPotential Issues

Cost in row overhead – when Cost in row overhead – when enabled, enabled, 14 bytes added to row14 bytes added to row

If snapshot, do you depend on If snapshot, do you depend on locking?locking?

OK, most of you will say no but what OK, most of you will say no but what about about status queues…status queues…

Tip: Use READCOMMITTEDLOCK hintTip: Use READCOMMITTEDLOCK hint

If Snapshot Isolation, could you have If Snapshot Isolation, could you have conflicts?conflicts?

Be sure to have proper conflict detection Be sure to have proper conflict detection and error handling, see Kim’s Snapshot and error handling, see Kim’s Snapshot Isolation whitepaper for details and Isolation whitepaper for details and examplesexamples

Page 98: Fy06 SQL Intro

Management/MonitoringManagement/Monitoring

Version Store in TempDBVersion Store in TempDB

Versions removed when no longer neededVersions removed when no longer needed

Read committed using statement-level Read committed using statement-level snapshot won’t hold versions as long – in snapshot won’t hold versions as long – in theory because only statement-leveltheory because only statement-level

Snapshot isolation may have more impact Snapshot isolation may have more impact on TempDB as versions held for life of on TempDB as versions held for life of transactiontransaction

Lots of long running transactions may Lots of long running transactions may stress TempDBstress TempDB

See Kim’s Snapshot Isolation whitepaper See Kim’s Snapshot Isolation whitepaper for monitoring with DMVs (Dynamic for monitoring with DMVs (Dynamic Management Views)Management Views)

Page 99: Fy06 SQL Intro

Database DesignDatabase Design

Data TypesData Types

The Data RowThe Data Row

IndexesIndexes

PartitioningPartitioning

Security in DesignSecurity in Design

Page 100: Fy06 SQL Intro

EntitiesEntities

Basic principle behind a table – the Basic principle behind a table – the way we describe some specific way we describe some specific person, place or thing as a base person, place or thing as a base object within the databaseobject within the databaseFocus on a single element - Person, Focus on a single element - Person, place, thing…place, thing…

Authors – Each author is an Entity Authors – Each author is an Entity described by a row in the authors table.described by a row in the authors table.Titles – Each title, separate from the Titles – Each title, separate from the author(s) that wrote the title is described author(s) that wrote the title is described by a row in the titles table.by a row in the titles table.Publishers – Each publisher, separate Publishers – Each publisher, separate from any title(s) they may have from any title(s) they may have published is described by a row in the published is described by a row in the publishers table.publishers table.

Page 101: Fy06 SQL Intro

AttributesAttributes

Basic principle behind a column – the Basic principle behind a column – the way we describe some specific way we describe some specific element about the entityelement about the entity

Each column of a table should Each column of a table should describe ONLY that entity…describe ONLY that entity…

Core elements of an attributeCore elements of an attributeAttribute NameAttribute Namepriceprice

Data typeData type moneymoney

NullabilityNullability NULLNULL

Every attribute should describe the primary key, the whole primary key, and nothing but the primary key.

Page 102: Fy06 SQL Intro

Data TypesData Types

System Supplied Data TypesSystem Supplied Data TypesCharacterCharacter

Unicode CharacterUnicode Character

NumericNumeric

Date and TimeDate and Time

Exact NumericExact Numeric

UniqueidentifierUniqueidentifier

BinaryBinary

MiscellaneousMiscellaneous

Find the right data type for the job:Find the right data type for the job:

Use Unicode over ASCiiUse Unicode over ASCii

If the datatype varies:If the datatype varies: < 5 chars should be nchar< 5 chars should be nchar

5-20 chars – questionable5-20 chars – questionable

> 20 char – lean more towards> 20 char – lean more towardsnvarcharnvarchar

For numeric data:For numeric data: Find the right rangeFind the right range Standardize on decimal or Standardize on decimal or

numericnumericUse uniqueidentifier sparinglyUse uniqueidentifier sparinglyUnderstand precision and rangeUnderstand precision and range

Page 103: Fy06 SQL Intro

Data TypeData Type Max/Exact Max/Exact StorageStorage NotesNotes

char(n)char(n)varchar(n)varchar(n)

8000 8000 bytesbytes

1 byte per char - 8000 Characters1 byte per char - 8000 Characters

nchar(n)nchar(n)nvarchar(n)nvarchar(n)

8000 8000 bytesbytes

2 bytes per char - 4000 Characters2 bytes per char - 4000 Characters

tinyinttinyint 1 byte1 byte 0-2550-255

smallintsmallint 2 bytes2 bytes -32768 to 32767-32768 to 32767

intint 4 bytes4 bytes -2-23131 to 2 to 23131-1-1

bigintbigint 8 bytes8 bytes -2-26363 to 2 to 26363-1-1

uniqueidentifuniqueidentifierier

16 bytes16 bytes 98E94963-F193-4E69-9262-98E94963-F193-4E69-9262-7B692125557F7B692125557F

smalldatetimesmalldatetime 4 bytes4 bytes Jan 1, 1900 with precision to the Jan 1, 1900 with precision to the minuteminute

datetimedatetime 8 bytes8 bytes Jan 1, 1753 with precision to a Jan 1, 1753 with precision to a timetick (3.33 ms)timetick (3.33 ms)

smallmoneysmallmoney 4 bytes4 bytes - - 214,748.3648 through 214,748.3648 through +214,748.3647+214,748.3647

moneymoney 8 bytes8 bytes -922,337,203,685,477.5808 through -922,337,203,685,477.5808 through +922,337,203,685,477.5807+922,337,203,685,477.5807

Page 104: Fy06 SQL Intro

SQL Server 2005 Data TypesSQL Server 2005 Data Types

Large Value Data Types, Large Value Data Types, n/varchar(max) and varbinary(max)n/varchar(max) and varbinary(max)

In row storage limit 8000 bytesIn row storage limit 8000 bytes

Work like typical varchar but support Work like typical varchar but support LOB values (231 bytes)LOB values (231 bytes)

Custom Types – CLR basedCustom Types – CLR based

Be aware that LOB types added to a Be aware that LOB types added to a table prevent online index rebuildstable prevent online index rebuilds

May want to consider vertical May want to consider vertical partitioning for special typespartitioning for special types

Page 105: Fy06 SQL Intro

Character DataCharacter Data

Unicode – nchar, nvarchar, ntextUnicode – nchar, nvarchar, ntextApplication (web data) is stored nativelyApplication (web data) is stored natively

Supports wide variety of client environmentsSupports wide variety of client environments

Make sure to ALWAYS use 'N'Make sure to ALWAYS use 'N'

Examples:Examples:SELECT * FROM tname WHERE col = N'value'SELECT * FROM tname WHERE col = N'value'DECLARE @variable nvarchar(100)DECLARE @variable nvarchar(100)SELECT @variable = N'value'SELECT @variable = N'value'SELECT @Str = N'string' + N'string' + N'string'SELECT @Str = N'string' + N'string' + N'string'

CLR Extensibility requires unicode!CLR Extensibility requires unicode!

ASCii – char, varchar, textASCii – char, varchar, textThink before using if new data/tables use Unicode Think before using if new data/tables use Unicode insteadinstead

Page 106: Fy06 SQL Intro

NullabilityNullability

No specific value required at INSERTNo specific value required at INSERTNULL values DO NOT mean empty NULL values DO NOT mean empty space (stored separately from the space (stored separately from the column data)column data)Working with NULLsWorking with NULLs

Accessing columns which allow NULL Accessing columns which allow NULL values can cause inconsistencies when values can cause inconsistencies when developers/users are not aware of themdevelopers/users are not aware of themMath with NULL values can produce Math with NULL values can produce interesting results (? – NULL = NULL)interesting results (? – NULL = NULL)ANSI session settings can affect results ANSI session settings can affect results sets when accessing columns that allow sets when accessing columns that allow nullsnulls

NULL values v. DEFAULT valueNULL values v. DEFAULT value

Page 107: Fy06 SQL Intro

Creating TablesCreating Tables

Creating and Dropping a basic TableCreating and Dropping a basic Table

Determining Which Type of Determining Which Type of Constraint to UseConstraint to Use

Enforcing Data IntegrityEnforcing Data Integrity

How SQL Server Organizes Data in How SQL Server Organizes Data in RowsRows

How SQL Server Organizes Rows on a How SQL Server Organizes Rows on a PagePage

Initial Table Structure with Clustered Initial Table Structure with Clustered IndexIndex

Intro to Vertical PartitioningIntro to Vertical Partitioning

Generating ScriptsGenerating Scripts

Page 108: Fy06 SQL Intro

Creating a TableCreating a Table

Specifying NULL or NOT NULLSpecifying NULL or NOT NULLNot requiredNot required

Defaults to session settingDefaults to session settingSET ANSI_NULL_DFLT_ON ONSET ANSI_NULL_DFLT_ON ON

Creating and Dropping a TableCreating and Dropping a Table

Column nameColumn name Data typeData typeNULL or NULL or NOT NULLNOT NULL

CREATE TABLE dbo.publishers ( pub_id

pub_namecitystatecountry

)

char(4) varchar(40) varchar(20)char(2)varchar(30)

NOT NULL,NULL,NULL,NULL,NULL

Page 109: Fy06 SQL Intro

The Data RowThe Data Row

Header Fixed Data NB VB Variable Data

NullBlock

VariableBlock

4 bytes

Data

Page 110: Fy06 SQL Intro

Rows on a PageRows on a Page

Page HeaderPage Header

Row ARow ARow ARow A

Row CRow CRow CRow C

Row BRow BRow BRow B

ABC

Data rows

Row Offset Table2 bytes each

96bytes

8,096bytes

Page 111: Fy06 SQL Intro

Optimal Row WidthOptimal Row Width

Consider Table Usage above all elseConsider Table Usage above all elseEstimate Average Row LengthEstimate Average Row Length

OverheadOverheadFixed Width ColumnsFixed Width ColumnsEstimate Average from realistic sample Estimate Average from realistic sample datadataSELECT avg(datalength(columnname)) FROM SELECT avg(datalength(columnname)) FROM

tnametname

Calculate Rows/PageCalculate Rows/Page8096 Bytes/Page8096 Bytes/Page

??? Bytes/Row??? Bytes/Row

Calculate Wasted Bytes—on Disk Calculate Wasted Bytes—on Disk in Memoryin Memory

= Rows/Page

Page 112: Fy06 SQL Intro

Row Size/Table StructureRow Size/Table Structure

Rows CAN span pages in SQL Server Rows CAN span pages in SQL Server 20052005Is that always the best choice?Is that always the best choice?Table access/usage patterns should Table access/usage patterns should dictate structuredictate structureKeep active/OLTP columns in primary Keep active/OLTP columns in primary tabletableSeparate LOB types to secondary Separate LOB types to secondary table for online table/index rebuildstable for online table/index rebuildsCreate initial table structure to Create initial table structure to minimize the need for rebuilds minimize the need for rebuilds ((Designing for Performance – more Designing for Performance – more info coming up!info coming up!))

Page 113: Fy06 SQL Intro

Design Techniques SummaryDesign Techniques Summary

Rows CAN span pages however, Rows CAN span pages however, record size impacts disk space, cache record size impacts disk space, cache utilization, concurrency and utilization, concurrency and maintenance restrictionsmaintenance restrictionsThe lowest granularity of locking The lowest granularity of locking holds an entire row – partitioning a holds an entire row – partitioning a table vertically or horizontally can table vertically or horizontally can improve concurrency, minimize improve concurrency, minimize restrictions on online rebuilds and restrictions on online rebuilds and improve cache utilizationimprove cache utilizationOptimizer is aware of constraints – Optimizer is aware of constraints – efficient horizontal partitioning efficient horizontal partitioning requires constraintsrequires constraintsDuplicating Keys which define Duplicating Keys which define relationships can improve join relationships can improve join performance by giving the optimizer performance by giving the optimizer more options for performing a joinmore options for performing a join

Page 114: Fy06 SQL Intro

Key PointsKey Points

Minimize data redundancy by using Minimize data redundancy by using attributes specific only to the entity table attributes specific only to the entity table definitiondefinitionSpecify Appropriate Data Types and Data Specify Appropriate Data Types and Data Type Sizes – try to keep row sizes compactType Sizes – try to keep row sizes compactAlways Specify Column Characteristics in Always Specify Column Characteristics in CREATE TABLE – especially column CREATE TABLE – especially column NullabilityNullabilityGenerate Scripts as backup scripts and for Generate Scripts as backup scripts and for learninglearningTake into account the usage of the data Take into account the usage of the data when defining tables, consider partitioning when defining tables, consider partitioning tables based on usagetables based on usageDefine Constraints on your tables – for Define Constraints on your tables – for performance as well as centralizationperformance as well as centralization

Page 115: Fy06 SQL Intro

Why partition?Why partition?

Resource Contention LimitationsResource Contention Limitations

Varying Access patternsVarying Access patterns

Maintenance RestrictionsMaintenance Restrictions

Availability RequirementsAvailability Requirements

To remove resource blocking or To remove resource blocking or minimize maintenance costsminimize maintenance costs

But:But:Partitioning does not automatically mean Partitioning does not automatically mean Distributed Partitioned Views (DPVs)Distributed Partitioned Views (DPVs)

DPVs are a form of scale-out partitioningDPVs are a form of scale-out partitioning

Page 116: Fy06 SQL Intro

Table’s Shared Data/IndexesTable’s Shared Data/Indexes

A Single Large Table…A Single Large Table… Presents different types of problemsPresents different types of problems

ManagementManagement

Index create/build or rebuildsIndex create/build or rebuilds

Backup/Restore and RecoveryBackup/Restore and Recovery

Escalation Escalation

May have different access patternsMay have different access patterns Some data new – OLTP (inserts/updates for Some data new – OLTP (inserts/updates for new and current sales)new and current sales)

Some data old for historical lookups – small Some data old for historical lookups – small singleton for OTLP lookups or larger for singleton for OTLP lookups or larger for analysisanalysis

Does NOT have to be one table!Does NOT have to be one table!

Page 117: Fy06 SQL Intro

Indexes – Smaller TablesIndexes – Smaller Tables

Index Creation/BuildingIndex Creation/BuildingSmaller Tables – take less timeSmaller Tables – take less time

Different index decisionsDifferent index decisionsOLTP – less indexesOLTP – less indexes

DSS – more indexesDSS – more indexes

Index Maintenance Index Maintenance Smaller Tables – takes less time to Smaller Tables – takes less time to rebuild/defragrebuild/defrag

Different Access patterns – require less Different Access patterns – require less frequent maintenance or NONEfrequent maintenance or NONE

OLTP – rebuild/defrag more frequentlyOLTP – rebuild/defrag more frequently

DSS – build once!DSS – build once!

Page 118: Fy06 SQL Intro

Backup Strategies VaryBackup Strategies Vary

Transaction ProcessingTransaction ProcessingBackups (should be) more frequentBackups (should be) more frequent

More critical dataMore critical data

Should be highly availableShould be highly available

Usually smaller portion of VLDB Usually smaller portion of VLDB

Decision SupportDecision SupportBackups needed less frequentlyBackups needed less frequently

Less critical dataLess critical data

Should be available but not “as” criticalShould be available but not “as” critical

Usually a larger portion of VLDBUsually a larger portion of VLDB

Page 119: Fy06 SQL Intro

AvailabilityAvailability

SQL Server 2000SQL Server 2000Database online or notDatabase online or notEntire database offline, if damaged…Entire database offline, if damaged…

SQL Server 2005SQL Server 2005Database has an associated “state”Database has an associated “state”

Online, restoring, recovering, …Online, restoring, recovering, …Database in only one state at any given timeDatabase in only one state at any given time

Files/Filegroups also have statesFiles/Filegroups also have statesSecondary filegroups have independent stateSecondary filegroups have independent stateFilegroups can be offline and/or recovering while Filegroups can be offline and/or recovering while remainder of database is availableremainder of database is available

State of the database State of the database State of the primary filegroupState of the primary filegroupState of the transaction logState of the transaction log

Page 120: Fy06 SQL Intro

Access PatternsAccess Patterns

Transaction ProcessingTransaction ProcessingSmaller INSERT/UPDATE/DELETE activitySmaller INSERT/UPDATE/DELETE activity

Speed/durability of highest importanceSpeed/durability of highest importance

Decision SupportDecision SupportLarger Range QueriesLarger Range Queries

Less critical dataLess critical data

More indexesMore indexes

Query access patterns may Query access patterns may conflict with write activityconflict with write activity

Page 121: Fy06 SQL Intro

Partitioning StrategiesPartitioning Strategies

Vertical PartitioningVertical PartitioningMultiple tables – “subsets” of columnsMultiple tables – “subsets” of columns

Different column setsDifferent column setsSeparate sets based on usage patternsSeparate sets based on usage patterns

Primary key is redundant, possibly other Primary key is redundant, possibly other columnscolumns

Horizontal Range Partitioning Horizontal Range Partitioning Multiple tables – all columnsMultiple tables – all columns

Different row setsDifferent row setsSeparate sets based on date rangeSeparate sets based on date range

Separate sets based on lists – also “range” Separate sets based on lists – also “range” partitionspartitions

Page 122: Fy06 SQL Intro

Consider Customer Table Consider Customer Table with 1,600,000 Rowswith 1,600,000 Rows

14 Columns1000 Bytes/Row8 Rows/Page200,000 Pages1.6GB Table

CustomerPersonal

18 Columns*1600 Bytes/Row5 Rows/Page320,000 Pages2.5GB Table

CustomerProfessional

17 Columns*2000 Bytes/Row4 Rows/Page400,000 Pages3.2 GB Table

CustomerMisc

47 Columns4600 Bytes/RowOnly 1 Row/Page

3400+ Bytes Wasted1.6 Million Pages12.8 GB Table

Customer

*The Primary key column must be made redundant for these two additional tables. 47 Columns in Employee. 49 Columns total between 3 tables.

Customer= 12.8 GB

Partitioned Tables= 7.3 GB

•Savings in Overall Disk Space (5.5 GB Saved)•Not reading data into cache when not necessary•Locks are table specific therefore less contention at the row level

Page 123: Fy06 SQL Intro

Vertical PartitioningVertical Partitioning

Optimizing row size for…Optimizing row size for…CachingCaching

More rows on a page, more rows in memoryMore rows on a page, more rows in memory

LockingLockingOnly locking the columns that are of interest. Only locking the columns that are of interest. Minimize row based conflictsMinimize row based conflicts

Usage Defines Vertical PartitionsUsage Defines Vertical PartitionsLogically Group Columns to minimize Logically Group Columns to minimize joinsjoins

Consider Read Only vs. OLTP ColumnsConsider Read Only vs. OLTP Columns

Consider Columns often used togetherConsider Columns often used together

Page 124: Fy06 SQL Intro

Sales2002_01

Sales2002_02

Sales2002_03

Sales2002_04...

Table or View

Sales2000

Table or View

Sales2001

Table or View

Sales2002

Horizontal PartitioningHorizontal Partitioning

874 Million RowsAll Sales Since January 2000•Current Sales (INSERTs) often

blocked by DSS queries. •Index Rebuilds take quite a

bit of time•Weekly FULL Database Backups are backing up the same sales

(2000 & 2001) repeatedly

Sales

Sales2000_01

Sales2000_02

Sales2000_03

Sales2000_04...

Sales2000_12

Sales2001_01

Sales2001_02

Sales2001_03

Sales2001_04...

Sales2001_12

Using CHECK ConstraintsCHECK (SalesDate >= '20010101'

AND SalesDate < '20010201')Sales2002_12

Page 125: Fy06 SQL Intro

Horizontal PartitioningHorizontal Partitioning

More manageable chunksMore manageable chunksMinimize the impact of maintenance Minimize the impact of maintenance

Improve loading options for sliding range Improve loading options for sliding range scenarioscenarioIndex Maintenance (Smaller/Less Index Maintenance (Smaller/Less Frequent)Frequent)Backup/Restore (File/Filegroup strategies Backup/Restore (File/Filegroup strategies to minimize backup frequency of to minimize backup frequency of predominantly ReadOnly tables – more predominantly ReadOnly tables – more options for restore when isolated options for restore when isolated disk/RAID array corruption)disk/RAID array corruption)Lock Escalation (Modifications to one Lock Escalation (Modifications to one partition do not cause escalation against partition do not cause escalation against the others – as they likely would if the others – as they likely would if everything were in one table!)everything were in one table!)

Page 126: Fy06 SQL Intro

““Partitioning” StrategiesPartitioning” Strategies

Proportional Fill using FilegroupsProportional Fill using Filegroups

One table on one filegroup with four One table on one filegroup with four filesfiles

Easy to administerEasy to administer

Data is proportionally filled among four Data is proportionally filled among four filesfiles

Data on each file is not “similar” in the Data on each file is not “similar” in the sense that it’s not 1st Q, 2nd Q, etc.sense that it’s not 1st Q, 2nd Q, etc.

All Objects and all dates are interleavedAll Objects and all dates are interleaved

Page 127: Fy06 SQL Intro

First Glance…First Glance…

Proportionally filledProportionally filled

Single Single Table Table

Single Single FGFG

Four Four Tables Tables

Four Four FGsFGs

Page 128: Fy06 SQL Intro

Functional Partitioning Functional Partitioning StrategiesStrategiesData-based partitioningData-based partitioning

SQL Server 2000 – Partitioned Views Four tables created on four separate Four tables created on four separate filegroupsfilegroupsMore complex to administer/createMore complex to administer/createData is NOT proportionally filledData is NOT proportionally filledData on each file IS “similar” as the data Data on each file IS “similar” as the data is filled based on date – i.e. 1is filled based on date – i.e. 1stst Q, 2 Q, 2ndnd Q, Q, etc.etc.The same quarter’s dates are on the The same quarter’s dates are on the same file (index and table are storage same file (index and table are storage aligned)aligned)

SQL Server 2005 – Partitioned TablesOne table created on a partition schemeOne table created on a partition schemeData placement and breakdown is same Data placement and breakdown is same as PVsas PVsEasier to setup!Easier to setup!

Page 129: Fy06 SQL Intro

Functional PartitioningFunctional PartitioningMore granular controlMore granular control – both 2000/2005 – both 2000/2005

Varying access patternsVarying access patterns

Varying index defrag patternsVarying index defrag patterns

Very fast sliding window control (Very fast sliding window control (more details more details next!next!))

More backup/restore choicesMore backup/restore choices – in 2000 – in 2000 but many restrictions removed in 2005but many restrictions removed in 2005

File/filegroup backupsFile/filegroup backups

Even support for file/filegroup in Simple Even support for file/filegroup in Simple Recovery Model (backups separated between Recovery Model (backups separated between RW and RO backups and RO backups can be RW and RO backups and RO backups can be performed at any frequency!)performed at any frequency!)

Higher AvailabilityHigher Availability – in 2005 because of – in 2005 because of partial database availability featurespartial database availability features

Page 130: Fy06 SQL Intro

Active Sales in OLTP systemActive Sales in OLTP system

Secondary VLDB used for reportingSecondary VLDB used for reporting

Range of two year’s dataRange of two year’s data

Monthly the active OLTP data is copied Monthly the active OLTP data is copied over and the oldest month is backed over and the oldest month is backed up and archivedup and archived

SepSep20022002

OctOct20022002

NovNov20022002

DecDec20022002 ……

AprApr20032003

MayMay20032003 ……

JunJun20042004

JulJul20042004

AugAug20042004

SepSep20042004

SepSep20022002

OctOct20022002

NovNov20022002

DecDec20022002 ……

AprApr20032003

MayMay20032003 ……

JunJun20042004

JulJul20042004

AugAug20042004

SepSep20042004

On October 1, 2004On October 1, 2004

OctOct20042004

VLDB – Sales ReportingVLDB – Sales Reporting

Page 131: Fy06 SQL Intro

Rolling Range (or Sliding Rolling Range (or Sliding Window)Window)Key ComponentsKey ComponentsData LoadData Load

Single TableSingle TableActive Table impactedActive Table impacted

Indexes need to be updatedIndexes need to be updated

Partitioned Object (PV in 2000/PT in 2005)Partitioned Object (PV in 2000/PT in 2005)Table outside of active view manipulatedTable outside of active view manipulated

Indexes can be built separately of active tablesIndexes can be built separately of active tables

Data RemovalData RemovalSingle Table – same problemSingle Table – same problem

Active Table impactedActive Table impacted

Indexes need to be updatedIndexes need to be updated

Partitioned Object (PV in 2000/PT in 2005)Partitioned Object (PV in 2000/PT in 2005)Table can be removed from PO and then droppedTable can be removed from PO and then dropped

Page 132: Fy06 SQL Intro

"Single" table process:"Single" table process:1.1. Alter the table's check constraint to support new Alter the table's check constraint to support new

quarter (drop the old constraint)quarter (drop the old constraint)2.2. Load the data using BULK INSERTLoad the data using BULK INSERT3.3. Delete the first quarter of F04Delete the first quarter of F044.4. Alter the table's check constraint to show new Alter the table's check constraint to show new

year range (drop the old constraint)year range (drop the old constraint)

Partitioned table process:Partitioned table process:1.1. Create New table - with constraintsCreate New table - with constraints2.2. Load the data using BULK INSERTLoad the data using BULK INSERT3.3. Build Indexes – CL + 3 NC on FYQ1 FilegroupBuild Indexes – CL + 3 NC on FYQ1 Filegroup4.4. Change the view to use FY05Q1 and remove Change the view to use FY05Q1 and remove

FY04Q1FY04Q15.5. Drop the table which represents FY04Q1 (could Drop the table which represents FY04Q1 (could

archived/backup)archived/backup)

1 min 36 sec1 min 36 sec

40+ minutes40+ minutes

Total TimesTotal Times

Page 133: Fy06 SQL Intro

Range Partitioned TablesRange Partitioned Tables

Scripts from Whitepaper on Partitioned Scripts from Whitepaper on Partitioned TablesTables

Step 1: Create FilegroupsStep 1: Create Filegroups

Step 2: Create Files in FilegroupsStep 2: Create Files in Filegroups

Step 3: Create Partition FunctionStep 3: Create Partition Function

Step 4: Create Partition SchemeStep 4: Create Partition Scheme

Step 5: Create Table(s) on SchemeStep 5: Create Table(s) on Scheme

Step 6: Verify Data using system table Step 6: Verify Data using system table (optional)(optional)

Step 7: Add data to tables – SQL Server Step 7: Add data to tables – SQL Server redirects data and queries to appropriate redirects data and queries to appropriate partitionpartition

Page 134: Fy06 SQL Intro

New Data must come inNew Data must come in

Old data must be removedOld data must be removed

The table should only change it’s overall The table should only change it’s overall data set… with simple metadata changesdata set… with simple metadata changes

To create metadata ONLY changesTo create metadata ONLY changesCreate a new table on the same filegroup where Create a new table on the same filegroup where the partition residesthe partition resides

Structure it IDENTICALLY to that of the Structure it IDENTICALLY to that of the partitioned tablepartitioned table

SWITCH the partition to live in the new table and SWITCH the partition to live in the new table and the table to become the partition… this either the table to become the partition… this either brings data IN or OUT of the tablebrings data IN or OUT of the table

The Sliding Window ScenarioThe Sliding Window Scenario

Page 135: Fy06 SQL Intro

Developing on SQL Server 2005Developing on SQL Server 2005

Choices, Choices and More ChoicesChoices, Choices and More Choices

Secure DevelopmentSecure Development

Performance and ScalabilityPerformance and Scalability

Page 136: Fy06 SQL Intro

Development DilemmaDevelopment Dilemma

T-SQLT-SQL

XMLXML

CLRCLR

Computation Computation & Framework & Framework

accessaccess

Relational Relational data accessdata access

Semi-structuredSemi-structureddata accessdata access

Page 137: Fy06 SQL Intro

SQL Server 2005 ToolkitSQL Server 2005 Toolkit

SQLSQL

ServiceServiceBrokerBroker

TransactTransactSQLSQL

CLRCLRPrograms & TypesPrograms & Types

Relational DataRelational Data

SQLSQL XMLXML FTSFTS

Page 138: Fy06 SQL Intro

SQL Server 2005 ToolkitSQL Server 2005 Toolkit

SQLSQL XMLXML FTSFTSXMLXML

XMLXMLtypetype

X/QueryX/Query XML IndexesXML Indexes

Semi-Structured Semi-Structured DataData

XML XML SchemaSchema

Page 139: Fy06 SQL Intro

SQL Server 2005 ToolkitSQL Server 2005 Toolkit

SQLSQL XMLXML FTSFTSFTSFTS

MicrosoftMicrosoftSearchSearch

OfficeOfficeDocumentsDocuments

Integrated with Integrated with database backupdatabase backup

Unstructured Unstructured DataData

Page 140: Fy06 SQL Intro

Good Scenario for CLR UsageGood Scenario for CLR Usage

Data validation & network traffic reductionData validation & network traffic reduction

Writing general purpose functions:Writing general purpose functions:Data passed as argumentsData passed as arguments

Little/no additional data accessLittle/no additional data access

Complex computation applied on a row by row basis to Complex computation applied on a row by row basis to the datathe data

Scalar types & custom aggregationsScalar types & custom aggregations

Leveraging the power of the .NET FrameworkLeveraging the power of the .NET FrameworkAccess to a rich set of pre-built functionalityAccess to a rich set of pre-built functionality

Replacing Extended Stored Procedures (XP)Replacing Extended Stored Procedures (XP)The CLR is safer: The CLR is safer:

No access violations making SQL Server crash No access violations making SQL Server crash

No leaks making SQL Server slow down & crashNo leaks making SQL Server slow down & crash

Better performance & scalability (managed memory model)Better performance & scalability (managed memory model)

No security issues…No security issues…

Page 141: Fy06 SQL Intro

Bad Scenario for CLR UsageBad Scenario for CLR Usage

Heavy data access – Transact-SQL set Heavy data access – Transact-SQL set based access will be fasterbased access will be faster

Don’t write SELECT statements as CLR Don’t write SELECT statements as CLR procedures!procedures!

Complex typesComplex types8K size limitation8K size limitation

All data is read/re-written when updatedAll data is read/re-written when updated

Pre-Aggregation for ReportsPre-Aggregation for ReportsCLR Aggregates cannot be used in Indexed CLR Aggregates cannot be used in Indexed ViewsViews

Your application must support previous Your application must support previous versions of SQL Serverversions of SQL Server

Technology for technology’s sake…Technology for technology’s sake…

Page 142: Fy06 SQL Intro

XML DevelopmentXML Development<xs:complexType name ="myelement" <xs:complexType name ="myelement"

mixed="true">mixed="true">  <xs:sequence minOccurs="0">  <xs:sequence minOccurs="0">    <xs:element name="child1"     <xs:element name="child1" type="xs:integer" />type="xs:integer" />    <xs:element name="child2"     <xs:element name="child2" type="xs:integer" />type="xs:integer" />  </xs:sequence>  </xs:sequence></xs:complexType></xs:complexType>

<xs:element name="myelement" <xs:element name="myelement" type="myelement" />type="myelement" />

  <myelement><myelement>  Some text  Some text  <child1>3</child1>  <child1>3</child1>  Some more text  Some more text  <child2>4</child2>  <child2>4</child2>  Some more text  Some more text</myelement></myelement>

<xs:complexType name <xs:complexType name ="myelementWithChildren">="myelementWithChildren">  <xs:sequence>  <xs:sequence>    <xs:element name="child1"     <xs:element name="child1" type="xs:integer" />type="xs:integer" />    <xs:element name="child2"     <xs:element name="child2" type="xs:integer" />type="xs:integer" />  </xs:sequence>  </xs:sequence></xs:complexType></xs:complexType>

<xs:element name="myelement" <xs:element name="myelement" type="xs:anyType" />type="xs:anyType" />

Native XML datatypeNative XML datatypeOptimized on-disk Optimized on-disk structure (not “just a structure (not “just a blob”)blob”)

XML IndexesXML IndexesEntities & AttributesEntities & Attributes

X/PATH Query Engine X/PATH Query Engine Integrated with core Integrated with core relational operatorsrelational operators

Schema SupportSchema SupportStrongly typed XMLStrongly typed XML

CLR/XML IntegrationCLR/XML IntegrationXMLTextReaderXMLTextReaderAccess to XSLTAccess to XSLT

Full Text SearchFull Text Search

Page 143: Fy06 SQL Intro

XML IndexesXML Indexes

Primary Index:Primary Index:Requires a clustered primary key on the base Requires a clustered primary key on the base tabletable

B-Tree of element and attribute names, node B-Tree of element and attribute names, node values, and node types, retains document order values, and node types, retains document order and structure, as well as the path from the root and structure, as well as the path from the root of the XML instance to each node for efficient of the XML instance to each node for efficient evaluation of path expressionsevaluation of path expressions

Secondary Indexes off of the primary XML Secondary Indexes off of the primary XML index index

PATH PATH columns (path, value) columns (path, value)

PROPERTY PROPERTY columns (PK, path, value) columns (PK, path, value)

VALUE VALUE columns (value, path) columns (value, path)

Page 144: Fy06 SQL Intro

Scenario for XML DevelopmentScenario for XML Development

Good Scenario:Good Scenario:Data is semi-structured, Data is semi-structured, small core of fixed data small core of fixed data with many, sparsely with many, sparsely populated extended populated extended attributes attributes

Multi-value Property Multi-value Property bagsbags

Complex Property bagsComplex Property bags

““WordXML”WordXML”

Documents are large but Documents are large but rarely updatedrarely updated Indexing will pay off Indexing will pay off

Data is hierarchicalData is hierarchical regular expressions regular expressions are well suited for finding are well suited for finding datadata

Bad Scenario:Bad Scenario:““Database in a Cell”Database in a Cell”

Documents are large and Documents are large and updated frequentlyupdated frequently

Document update Document update contention is likelycontention is likely

Data is fully structured & Data is fully structured & populated populated candidate candidate for conversion to for conversion to relational schemarelational schema

Data contains large Data contains large binary objects (2GB binary objects (2GB limitation)limitation)

Page 145: Fy06 SQL Intro

Transact-SQL EnhancementsTransact-SQL Enhancements

ROW_NUMBERROW_NUMBER

RANK, DENSE_RANK RANK, DENSE_RANK

Common Table ExpressionsCommon Table Expressions

PIVOT/UNPIVOTPIVOT/UNPIVOT

CROSS APPLY and OUTER APPLYCROSS APPLY and OUTER APPLY

Error handling – TRY/CATCH Error handling – TRY/CATCH

DDL Triggers (synchronous)DDL Triggers (synchronous)

Event Notifications (asynchronous)Event Notifications (asynchronous)

Parameterized TOPParameterized TOP

Page 146: Fy06 SQL Intro

Whose Fault was SQL Injection?Whose Fault was SQL Injection?

OK. We got your attention. It was not OK. We got your attention. It was not you anyway, of course.you anyway, of course.

Security for a database developer Security for a database developer was a very rude wake-up call.was a very rude wake-up call.

We We feel feel your pain.your pain.

We can help you – in this session!We can help you – in this session!

Page 147: Fy06 SQL Intro

Defense in DepthDefense in DepthUsing a layered approach:Using a layered approach:

Increases an attacker’s risk of detection Increases an attacker’s risk of detection

Reduces an attacker’s probability of Reduces an attacker’s probability of successsuccess

Policies, Procedures, & Awareness

Policies, Procedures, & Awareness

OS hardening, SQL Server patchingOS hardening, SQL Server patching

Firewalls, packet filtersFirewalls, packet filters

Guards, locks, tracking devices, Guards, locks, tracking devices, HSM, tamper-evident labelsHSM, tamper-evident labels

SSL, IPSec, No problemsSSL, IPSec, No problems

Application hardening, authorizationApplication hardening, authorization

Permissions, encryptionPermissions, encryption

DBA education against social DBA education against social engineeringengineering

Physical SecurityPhysical Security

PerimeterPerimeter

Internal NetworkInternal Network

HostHost

ApplicationApplication

DataData

Page 148: Fy06 SQL Intro

Application ProtectionApplication Protection

Protection against attacks against the Protection against attacks against the applicationapplication

Techniques and practices used by the Techniques and practices used by the database code (modules, stored database code (modules, stored procedures) to avoid being exploited procedures) to avoid being exploited by callersby callers

Permissions and ACLs on the data are Permissions and ACLs on the data are also part of application protection from also part of application protection from the perspective of the calling applicationthe perspective of the calling application

Page 149: Fy06 SQL Intro

Secure DevelopmentSecure Development

How things used to work?How things used to work?

User-schema separationUser-schema separation

Granular permissionsGranular permissions

EXECUTE ASEXECUTE AS

SQL Injection (Classic)SQL Injection (Classic)

SQL Injection (Future)SQL Injection (Future)

Page 150: Fy06 SQL Intro

Security through EncapsulationSecurity through Encapsulation

Granting access to a process or a Granting access to a process or a method without direct access to base method without direct access to base objectsobjects

How?How?Grant access to the stored procedures, Grant access to the stored procedures, views and/or functions without granting views and/or functions without granting access to the base objectaccess to the base object

RequiresRequiresThe objects in the dependency chain The objects in the dependency chain cannot be broken (object ownership cannot be broken (object ownership chaining)chaining)

The user has to have direct base object The user has to have direct base object access (which is what you’re trying to access (which is what you’re trying to avoid)avoid)

Page 151: Fy06 SQL Intro

Do NOT Give Base Object Do NOT Give Base Object AccessAccessThe SQL Server 2000 WayThe SQL Server 2000 Way

Only works when the Only works when the object dependency object dependency chain is NOT brokenchain is NOT brokenOnly works when the Only works when the commands inside of commands inside of the stored procedure the stored procedure are not built/executed are not built/executed dynamicallydynamically

dbo.Salesdbo.Salesdbo.Salesdbo.Sales

UserUserUserUser

dbo.DeleteASalesdbo.DeleteASalesdbo.DeleteASalesdbo.DeleteASales

Granted execute accessto procedure

has NO direct permissionsmay even be DENIED permissions to Sales

Page 152: Fy06 SQL Intro

User-Schema SeparationUser-Schema Separation

Database can contain Database can contain multiple schemasmultiple schemasEach schema has an Each schema has an owning principal – user owning principal – user or roleor roleEach user has a default Each user has a default schema schema for name resolutionfor name resolutionDatabase objects live Database objects live in schemasin schemasObject creation inside Object creation inside schema requires schema requires CREATE permission CREATE permission andand ALTER or CONTROL ALTER or CONTROL permission permission on the schemaon the schemaOwnership chaining still Ownership chaining still based on owners not based on owners not schemas schemas

Role1Role1

OwnsOwns has default has default schema2schema2

OwnsOwns

OwnsOwns

Schema1Schema1 Schema2Schema2

Schema3Schema3

SP1SP1Fn1Fn1

Tab1Tab1

DatabaseDatabase

User1User1 Approle1Approle1

Page 153: Fy06 SQL Intro

Default SchemaDefault Schema

Used for name resolution purposesUsed for name resolution purposes

Common name resolution across multiple Common name resolution across multiple usersusers

No need to rely on DBO schemaNo need to rely on DBO schema

Using DBO schema may result in security Using DBO schema may result in security issuesissues

Object Creation requires higher privilegesObject Creation requires higher privileges

Mitigates concerns resulting from ownership Mitigates concerns resulting from ownership chainingchaining

Instead create “buckets” of objects Instead create “buckets” of objects through “schemas” where schemas have through “schemas” where schemas have owners and developers have default owners and developers have default schemas and/or control on needed schemas and/or control on needed schemasschemas

Page 154: Fy06 SQL Intro

Do NOT Give Base Object Do NOT Give Base Object AccessAccessThe SQL Server 2005 WayCreate a schema – for example one Create a schema – for example one

which contains procedures (and which contains procedures (and possibly even base tables) and then possibly even base tables) and then GRANT EXECUTE at the schema levelGRANT EXECUTE at the schema level

Add objects to appropriate schemaAdd objects to appropriate schema

Grant access to the schema or the Grant access to the schema or the individual objects – if chaining is individual objects – if chaining is required, it is still based on the required, it is still based on the owner… the owner of the schemaowner… the owner of the schema

Page 155: Fy06 SQL Intro

What if Dynamic String What if Dynamic String ExecutionExecution

By default – and for better security – By default – and for better security – if the stored procedure has a if the stored procedure has a statement which is built dynamically statement which is built dynamically (using EXEC('string') or (using EXEC('string') or EXEC(@variable)) then the context EXEC(@variable)) then the context under which the dynamically under which the dynamically constructed string executes is constructed string executes is ALWAYS the callerALWAYS the caller

Which is what helps to prevent some Which is what helps to prevent some forms of SQL Injectionforms of SQL Injection

This is really a This is really a goodgood thing BUT… thing BUT…

Can be limitingCan be limitingEnter: Enter: EXECUTE ASEXECUTE AS

Page 156: Fy06 SQL Intro

Module Execution ContextModule Execution Context

Execute AS Execute AS CALLERCALLERDefault behavior, same as SQL Server 2000Default behavior, same as SQL Server 2000Use when caller’s permission needs to be checked – or Use when caller’s permission needs to be checked – or when ownership chaining will sufficewhen ownership chaining will suffice

Execute AS Execute AS 'UserName''UserName'Statements execute as the username specifiedStatements execute as the username specifiedImpersonate permission required on user specified Impersonate permission required on user specified

Execute AS Execute AS OWNEROWNERStatements execute as the current owner of the moduleStatements execute as the current owner of the moduleImpersonate privileges on owner required, at setting Impersonate privileges on owner required, at setting timetimeOn ownership change, context is new ownerOn ownership change, context is new owner

Execute AS Execute AS SELFSELFStatements execute as the person specifying the execute Statements execute as the person specifying the execute as clause for the module – even if the ownership changesas clause for the module – even if the ownership changesMay be useful in application scenarios where calling May be useful in application scenarios where calling context may changecontext may change

Page 157: Fy06 SQL Intro

EXECUTE AS…EXECUTE AS…

Solves problemsSolves problemsAllows permissions to be granted where Allows permissions to be granted where never possible (e.g. granting truncate never possible (e.g. granting truncate table)table)

Wrap ANYTHING inside a stored Wrap ANYTHING inside a stored procedure and set the context to run as procedure and set the context to run as someone who has permissions – even someone who has permissions – even dynamic string execution – then give dynamic string execution – then give execute permissionexecute permission

Creates potential for further SQL Creates potential for further SQL InjectionInjection

What if you’re code is not well tested What if you’re code is not well tested and uses dynamically executed stringsand uses dynamically executed strings

Page 158: Fy06 SQL Intro

SQL InjectionSQL Injection

SQL statement(s) or clause(s) SQL statement(s) or clause(s) injected into an existing SQL injected into an existing SQL commandcommand

Injected string appended to Injected string appended to application inputapplication input

Text boxesText boxes

Query stringsQuery strings

Manipulated values in HTML Manipulated values in HTML

Why SQL injection works?Why SQL injection works?Application accepts arbitrary user inputApplication accepts arbitrary user input

Connection made in context of higher Connection made in context of higher privileged accountprivileged account

Page 159: Fy06 SQL Intro

SQL Injection MitigationSQL Injection Mitigation

Do not trust the user’s input!Do not trust the user’s input!Look for valid input and reject everything Look for valid input and reject everything elseelse

Protect identifiers with QUOTENAME()Protect identifiers with QUOTENAME()

Regular expressions are your friend!Regular expressions are your friend!

Do not use string concatenation Do not use string concatenation Use parameterized queries to build Use parameterized queries to build queriesqueries

Restrict information in error Restrict information in error messagesmessages

Use a low-privileged accountUse a low-privileged accountDO NOT use sa or sysadmin role memberDO NOT use sa or sysadmin role member

Page 160: Fy06 SQL Intro

Data ProtectionData Protection

Use of cryptography to protect the Use of cryptography to protect the data layer from theft or manipulationdata layer from theft or manipulation

Particularly important if offline data is in Particularly important if offline data is in transittransitImportant for regulatory reasonsImportant for regulatory reasons

Prevent admin from accessPrevent admin from accessSarbanes OxleySarbanes Oxley

Manipulation (integrity) protection Manipulation (integrity) protection uses digital signaturesuses digital signatures

Not implemented for data in SQL Server Not implemented for data in SQL Server 20052005

Can overcome this with judicious use of Can overcome this with judicious use of encryption aloneencryption alone

Implemented for code signingImplemented for code signing

Page 161: Fy06 SQL Intro

Check Your OSCheck Your OS

Verify what algorithms Verify what algorithms and key lengths are and key lengths are supported by your OS, supported by your OS, as this depends on as this depends on your CSP your CSP (Cryptographic (Cryptographic Services Provider)Services Provider)

Generate keys for all Generate keys for all algorithms then look in algorithms then look in the sys.symmetric_keysthe sys.symmetric_keys

For example, we For example, we found the following found the following during testing – this during testing – this may change in the may change in the futurefuture

At present, there is no At present, there is no way to change the way to change the CSP…CSP…

XP SP2XP SP2 WS200WS20033

DESDES 56 (64)56 (64) 56 (64)56 (64)3DES3DES 128128 128128AES12AES1288

-- 128128

AES19AES1922

-- 192192

AES25AES2566

-- 256256

RC2RC2 128128 128128RC4RC4 4040 4040

Page 162: Fy06 SQL Intro

Before You BeginBefore You Begin

Your DBA must ensure that the Your DBA must ensure that the server has a Service Master Key server has a Service Master Key (SMK) that agrees with your disaster (SMK) that agrees with your disaster recovery strategy and a Database recovery strategy and a Database Master Key (DMK) has been created Master Key (DMK) has been created for each database that will use for each database that will use encryptionencryption

There are serious implications on There are serious implications on security (use strong password for SQL security (use strong password for SQL service account) and for disaster service account) and for disaster recoveryrecovery

ALTER SERVICE MASTER KEY can be used to ALTER SERVICE MASTER KEY can be used to change recovery optionschange recovery options

CREATE MASTER KEY ENCRYPTION BY CREATE MASTER KEY ENCRYPTION BY PASSWORD =PASSWORD =

Use a strong password, write it down, keep it Use a strong password, write it down, keep it safesafe

Page 163: Fy06 SQL Intro

Key GenerationKey Generation

The key should be impossible to The key should be impossible to guess. Preferably random.guess. Preferably random.

CREATE SYMMETRIC… will generate a CREATE SYMMETRIC… will generate a fairly random key for you – good!fairly random key for you – good!

You can base the key on data supplied You can base the key on data supplied by you, use KEY_SOURCE* clause – good by you, use KEY_SOURCE* clause – good for generating identical keys from a for generating identical keys from a high-quality passwordhigh-quality password

Note: KEY_SOURCE may be renamed to Note: KEY_SOURCE may be renamed to DERIVED_FROM in the released versionDERIVED_FROM in the released version

Page 164: Fy06 SQL Intro

PasswordsPasswords

Make sure passwords used to protect or create Make sure passwords used to protect or create keys are very strongkeys are very strongIn our opinion, it is better to create a very long In our opinion, it is better to create a very long and complex password that you will have to write and complex password that you will have to write down and store in a well-protected safe in your down and store in a well-protected safe in your companycompany

You can divide it into two halves and store in separate You can divide it into two halves and store in separate envelopes in different safesenvelopes in different safes

E.g.: *87(HyfdlkRM?_764#{(**%GRtj*(NS£”_+^$(E.g.: *87(HyfdlkRM?_764#{(**%GRtj*(NS£”_+^$(No dictionary words, more than 20 characters, many No dictionary words, more than 20 characters, many non-printing charactersnon-printing charactersChallenge: usability!Challenge: usability!

Consider also:Consider also:Not keeping passwords as text in your code, but store Not keeping passwords as text in your code, but store and retrieve them through DPAPI and a .NET componentand retrieve them through DPAPI and a .NET componentUsing good quality password generatorsUsing good quality password generators

Page 165: Fy06 SQL Intro

Key ProtectionKey Protection

SQL Server 2005 insists that the key you SQL Server 2005 insists that the key you create is further encrypted for its create is further encrypted for its protectionprotection

Yes, that’s double-encryption, but that is not Yes, that’s double-encryption, but that is not double-security. Actually, it reduces security a double-security. Actually, it reduces security a little in some caseslittle in some cases

CREATE SYMMETRIC…ENCRYPT BYCREATE SYMMETRIC…ENCRYPT BYPASSWORDPASSWORD

Your (v. good) password generates a key to 3DES Your (v. good) password generates a key to 3DES encrypt the key you are protectingencrypt the key you are protectingNote, 3DES is less secure than AES, so this Note, 3DES is less secure than AES, so this

CERTIFICATECERTIFICATEYour key is encrypted using the public key of a Your key is encrypted using the public key of a certificatecertificateThis, in essence, is hybrid encryptionThis, in essence, is hybrid encryptionIf private key is kept secure (and offline), this is a very If private key is kept secure (and offline), this is a very good way to protect a symmetric keygood way to protect a symmetric key

Or another SYMMETRIC or ASYMMETRIC key – Or another SYMMETRIC or ASYMMETRIC key – less useful but interestingless useful but interesting

Page 166: Fy06 SQL Intro

EncryptionEncryption

1.1. Create or retrieve the keyCreate or retrieve the key

2.2. Open the key – this means decrypt it Open the key – this means decrypt it with its (secondary) password or with its (secondary) password or certificate or other keycertificate or other key

3.3. Use function ENCRYPTBYKEY inside Use function ENCRYPTBYKEY inside DMLDML

Page 167: Fy06 SQL Intro

DecryptionDecryption

1.1. Create or retrieve the keyCreate or retrieve the key

2.2. Open the keyOpen the key

3.3. Use function DECRYPTBYKEY inside Use function DECRYPTBYKEY inside SELECT and all other DMLSELECT and all other DML

Page 168: Fy06 SQL Intro

Performance IssuesPerformance Issues

As you know, security is expensiveAs you know, security is expensive

AsymmetricAsymmetric encryption alone is not viable for encryption alone is not viable for large amounts of datalarge amounts of data

SymmetricSymmetric cryptography can be both strong and cryptography can be both strong and fast: AESfast: AES

HybridHybrid encryption is fast enough and solves the encryption is fast enough and solves the problem of symmetric key distribution logisticsproblem of symmetric key distribution logistics

Don’t encrypt everything, this rarely makes sense Don’t encrypt everything, this rarely makes sense and would be very costlyand would be very costly

But, our (suspect) performance tests showed that But, our (suspect) performance tests showed that all algorithms were as fast as each other. This is all algorithms were as fast as each other. This is not what we expected, and either signifies other not what we expected, and either signifies other overheads which cancel out the differences, or we overheads which cancel out the differences, or we need to re-test on a released product (RTM).need to re-test on a released product (RTM).

Page 169: Fy06 SQL Intro

Top 10 Best Practices in Top 10 Best Practices in SecuritySecurity1.1. Risk assessment and threat analysisRisk assessment and threat analysis2.2. Trap “developer quality” error Trap “developer quality” error

messages – avoid indecent disclosuremessages – avoid indecent disclosure3.3. Avoid using dynamic string execution Avoid using dynamic string execution 4.4. Think encryption for sensitive dataThink encryption for sensitive data5.5. Do not create objects in DBO schemaDo not create objects in DBO schema6.6. Hide underlying application schemaHide underlying application schema7.7. Use Roles for managing permissionsUse Roles for managing permissions8.8. Account for catalog securityAccount for catalog security9.9. Avoid encryption as it precludes Avoid encryption as it precludes

indexing and performanceindexing and performance10.10.Know your CSP. Check allowed key Know your CSP. Check allowed key

lengths if needed. Chose carefully!lengths if needed. Chose carefully!

Page 170: Fy06 SQL Intro

Performance and ScalabilityPerformance and Scalability

Using IndexesUsing Indexes

Index ConceptsIndex Concepts

Table and Index InternalsTable and Index Internals

Initial Table DesignInitial Table DesignThe Clustered Index DebateThe Clustered Index Debate

Secondary Non-clustered IndexesSecondary Non-clustered Indexes

Letting ITW/DTA Help!Letting ITW/DTA Help!

Page 171: Fy06 SQL Intro

Index UsageIndex UsageConceptualConceptual

Allows faster access to data, improving:Allows faster access to data, improving:Lookup/query timeLookup/query time

Insert/Update time – record location definedInsert/Update time – record location defined

Delete time – record location also definedDelete time – record location also defined

Con: overhead for modificationsCon: overhead for modifications

Index Strategy ConceptsIndex Strategy ConceptsFewer indexes are better than lots of indexesFewer indexes are better than lots of indexes

Wider indexes have more usesWider indexes have more usesCan be used for point queriesCan be used for point queries

Can be used by NARROW low selectivity queriesCan be used by NARROW low selectivity queries

Write Effective Queries to better take Write Effective Queries to better take advantage of indexes…advantage of indexes…

Page 172: Fy06 SQL Intro

Accessing Data: Limit DataAccessing Data: Limit Data

Subset of columns = ProjectionSubset of columns = ProjectionDo not use * (unless against view)Do not use * (unless against view)Optimizer has more chances for optimizing Optimizer has more chances for optimizing query when result set is NARROW (only the query when result set is NARROW (only the required columns)required columns)

Subset of rows = SelectionSubset of rows = SelectionUse positive search argumentsUse positive search argumentsIsolate the column to one side of the expressionIsolate the column to one side of the expression

USEUSE: MonthlySalary > value/12 (constant, seekable): MonthlySalary > value/12 (constant, seekable)DO NOT USEDO NOT USE: MonthlySalary * 12 > value (must scan): MonthlySalary * 12 > value (must scan)

Be cautious with LEADING wildcardsBe cautious with LEADING wildcardsUSEUSE: LastName LIKE 'S%': LastName LIKE 'S%'Try not to just append %val% to every applicationTry not to just append %val% to every application

Consider using Views, Stored Procedures Consider using Views, Stored Procedures and Functions to limit the columns/rowsand Functions to limit the columns/rows

Page 173: Fy06 SQL Intro

Index ConceptsIndex ConceptsBook analogyBook analogy

Think of a book—with indexes in the backThink of a book—with indexes in the back

The book has one form of logical orderingThe book has one form of logical ordering

For references, you use the indexes in the back…For references, you use the indexes in the back…to find the data in which you are interested you to find the data in which you are interested you look up the keylook up the key

When you find the key, you must lookup the data When you find the key, you must lookup the data based on its location… i.e., a “bookmark” lookupbased on its location… i.e., a “bookmark” lookup

The bookmark always depends on the (book) The bookmark always depends on the (book) content ordercontent order

IndexIndex – Species Common Name

IndexIndex – Species Scientific Name

IndexIndex – Animal by Country, Name

IndexIndex – Animal by Continent, Country, Name

IndexIndex – Animal by Type, Name Bird, Mammal, Reptile,

etc…

IndexIndex – Animals by Habitat, Name Air, Land, Water

Page 174: Fy06 SQL Intro

Index ConceptsIndex ConceptsTree AnalogyTree Analogy

If a tree were data and you were looking for If a tree were data and you were looking for leaves with a certain property, you would leaves with a certain property, you would have two options to find that data:have two options to find that data:

1) Touch every leaf – interrogating 1) Touch every leaf – interrogating each one to determine if they each one to determine if they held that property…held that property…SCANSCAN

2) If those leaves (which 2) If those leaves (which had that property) were had that property) were grouped such that you grouped such that you could start at the root, move could start at the root, move to the branch and then directly to the branch and then directly to those leaves…to those leaves…SEEKSEEK

Page 175: Fy06 SQL Intro

Table StructureTable Structure

HEAP:HEAP: A table without a clustered A table without a clustered index index

Clustered Table:Clustered Table: A table with a A table with a clustered indexclustered index

Non-clustered Indexes Non-clustered Indexes do notdo not affect affect the base table’s structurethe base table’s structureHowever, Non-clustered Indexes are However, Non-clustered Indexes are affected by whether or not the table affected by whether or not the table is Clustered…is Clustered…

Hint: The non-clustered index dependency on the clustered Hint: The non-clustered index dependency on the clustered index should impact your choice for the clustering key!index should impact your choice for the clustering key!

Page 176: Fy06 SQL Intro

Heap: ProsHeap: Pros

Effective for Staging DataEffective for Staging Data

Indexes can be created after load and Indexes can be created after load and index creation can be parallelizedindex creation can be parallelized

Efficient for SCANs ONLY – when no Efficient for SCANs ONLY – when no UPDATEs (otherwise, forwarding pointers – UPDATEs (otherwise, forwarding pointers – scans become scans become significantlysignificantly less efficient) less efficient)

Space Efficient – Space from Deletes is Space Efficient – Space from Deletes is reused on subsequent Inserts (at the cost reused on subsequent Inserts (at the cost of performance)of performance)

Best used when table is not typical OLTP or Best used when table is not typical OLTP or combo OLTP/DSS tablecombo OLTP/DSS table

Forward only “logging” tableForward only “logging” table

Page 177: Fy06 SQL Intro

Table StructureTable StructureClustered tableClustered table

Clustered Index defines order – applied at Clustered Index defines order – applied at CREATIONCREATION

Expensive—in time and space—to build Expensive—in time and space—to build (an additional 1-2 times the index size for (an additional 1-2 times the index size for (re)build)(re)build)

Table is a doubly-linked list – order Table is a doubly-linked list – order maintained LOGICALLYmaintained LOGICALLY

MightMight be expensive to maintain be expensive to maintain

…4000 Pages of Employees in Clustering Key Order

1, Griffith, …

2, Ulaska, …

3, Johnson, …

20, Morrisson, …

21, Ambers, …

22, Johany, …

23, Smith, …

40, Griffen, …

41, Shen, …

42, Alberts, …

43, Landon, …

60, Lynne, …

79981, Geller, …

79982, Smith, …

79983, Jones, …

80000, Kirkert, …

79961, Morris, …

79962, Simon, …

79963, Jones, …

79980, Debry, …

79941, Baker, …

79942, Shehy, …

79943, Laws, …

79960, Miller, …

File1, Page 5982 File1, Page 5983 File1, Page 5984 File1, Page 11231 File1, Page 11232 File1, Page 11233

Page 178: Fy06 SQL Intro

Clustered EmployeeIDClustered EmployeeIDEmployee TableEmployee Table

Consider a clustered index on an identity column (even Consider a clustered index on an identity column (even better if this is the table’s Primary Key)better if this is the table’s Primary Key)

…1, Griffith, …

2, Ulaska, …

3, Johnson, …

20, Morrisson, …

21, Ambers, …

22, Johany, …

23, Smith, …

40, Griffen, …

41, Shen, …

42, Alberts, …

43, Landon, …

60, Lynne, …

79981, Geller, …

79982, Smith, …

79983, Jones, …

80000, Kirkert, …

79961, Kiesow, …

79962, Simon, …

79963, Gellock, …

79980, Debry, …

79941, Baker, …

79942, Shehy, …

79943, Laws, …

79960, Miller, …

File1, Page 5982 File1, Page 5983 File1, Page 5984 File1, Page 11231 File1, Page 11232 File1, Page 11233

…1, 1, 5982

21, 1, 5983

41, 1, 5984

~16000

~64001

79941, 1, 11231

79961, 1, 11232

79981, 1, 11233

File1, Page 12982 File1, Page 12986

1, 1, 12982

16001, 1, 12983

32001, 1, 12984

48001, 1, 12985

64001, 1, 12986

File1, Page 12987

Root = 1 Page

Intermediate Level = 5 Pages

Balanced-TreeTotal Overhead in terms of Disk

Space

= 6 Pages or < 1%

Leaf Level4000

Pages

Page 179: Fy06 SQL Intro

Non-Clustered IndexesNon-Clustered IndexesPhysicalPhysical

Depend on whether the table is a Heap or Depend on whether the table is a Heap or ClusteredClustered

Clustered TableClustered TableRows use the Clustering KeyRows use the Clustering Key

No additional OH to add this column – may use No additional OH to add this column – may use actual data to define the clustering keyactual data to define the clustering key

Minimizes NC Index manipulation if rows move Minimizes NC Index manipulation if rows move (the NC still points to the CL Key)(the NC still points to the CL Key)

Clustering key should be static and narrowClustering key should be static and narrow

HeapHeapGenerally, not recommendedGenerally, not recommended

Page 180: Fy06 SQL Intro

…000-00-0001, 497

000-00-0002, 349

000-00-0003, 5643

000-00-0539, 12

999-99-9999, 42

File1, Page 16897 File1, Page 16898 File1, Page 16899 File1, Page 11810 File1, Page 18111 File1, Page 18112

539 entries

File1, Page 19197

539 entries

539 entries

539 entries

228 entries

149 entries

Root = 1 Page

Leaf Level149

Pages

Total Overhead in terms of Disk Space

= 150 Pages or < 4%

Non-Clustered IndexNon-Clustered IndexUnique Key SSNUnique Key SSN

Leaf level contains the non-clustered key(s) to Leaf level contains the non-clustered key(s) to define the orderdefine the order

Also includes either the Heap’s Fixed RID or the Also includes either the Heap’s Fixed RID or the Table’s Clustering KeyTable’s Clustering Key

Page 181: Fy06 SQL Intro

Clustered Index CriteriaClustered Index Criteria

UniqueUniqueYes – No overhead, data takes care of this criteria Yes – No overhead, data takes care of this criteria

NO – SQL Server must “uniquify” the rows on INSERT. This NO – SQL Server must “uniquify” the rows on INSERT. This costs time and space. Each duplicate has a 4-byte costs time and space. Each duplicate has a 4-byte uniquifier. In SQL Server 2000, all uniquifiers are uniquifier. In SQL Server 2000, all uniquifiers are regenerated when the CL index is rebuild. This is not true regenerated when the CL index is rebuild. This is not true in SQL Server 2005.in SQL Server 2005.

NarrowNarrowYes – Keeps the NC indexes narrowYes – Keeps the NC indexes narrow

NO – Possibly wastes spaceNO – Possibly wastes space

StaticStaticYes – Improves PerformanceYes – Improves Performance

NO – Costly to maintain during updates to the key NO – Costly to maintain during updates to the key especially if row movement and/or splitsespecially if row movement and/or splits

In fact, an identity column that’s ever-increasing is ideal…In fact, an identity column that’s ever-increasing is ideal…

Page 182: Fy06 SQL Intro

Clustering on an IdentityClustering on an Identity

Naturally UniqueNaturally Unique (although not guaranteed – should combine with key (although not guaranteed – should combine with key constraint)constraint)

Naturally StaticNaturally Static (although should be enforced through permissions (although should be enforced through permissions and/or trigger)and/or trigger)

Naturally NarrowNaturally Narrow (only numeric values possible, whole numbers scale (only numeric values possible, whole numbers scale = 0)= 0)

Naturally creates a hot spot…Naturally creates a hot spot…Needed pages for INSERT already in cacheNeeded pages for INSERT already in cacheMinimizes cache requirementsMinimizes cache requirementsHelps reduce fragmentation due to INSERTsHelps reduce fragmentation due to INSERTsHelps improve availability by naturally needing Helps improve availability by naturally needing less defragless defrag

Page 183: Fy06 SQL Intro

Clustering Key ChoicesClustering Key Choices

IdentityIdentity column columnDate, identity Date, identity

Not date alone as not uniqueNot date alone as not uniqueGUIDGUID

Populated by client-side call to .NET client Populated by client-side call to .NET client GUID generator (NOT a good CL key but GUID generator (NOT a good CL key but ok as NC Primary Key)ok as NC Primary Key)Populated by server-side newid() function Populated by server-side newid() function (no pattern, not ever-increasing)(no pattern, not ever-increasing)Populated by NEW server-side Populated by NEW server-side newsequentialid() function (Creates an newsequentialid() function (Creates an ever-increasing GUID)ever-increasing GUID)

Key points: Narrow, static, unique, ever-increasingKey points: Narrow, static, unique, ever-increasing

Page 184: Fy06 SQL Intro

Key Constraints Create IndexesKey Constraints Create Indexes

Primary Key ConstraintPrimary Key ConstraintDefaults to Unique ClusteredDefaults to Unique Clustered

Only One Per TableOnly One Per Table

ALTER TABLE EmployeeALTER TABLE Employee ADD CONSTRAINT EmployeePK ADD CONSTRAINT EmployeePK PRIMARY KEY CLUSTERED (EmployeeID) PRIMARY KEY CLUSTERED (EmployeeID)

Unique Key ConstraintsUnique Key ConstraintsDefault to Unique Non-ClusteredDefault to Unique Non-Clustered

Up to 249 Per TableUp to 249 Per Table

ALTER TABLE EmployeeALTER TABLE Employee ADD CONSTRAINT EmployeeSSNUK ADD CONSTRAINT EmployeeSSNUK UNIQUE NONCLUSTERED (SSN) UNIQUE NONCLUSTERED (SSN)

Page 185: Fy06 SQL Intro

Optimal Table Structures Optimal Table Structures Poor clustered index choicePoor clustered index choice

LastName or non-sequential GUIDLastName or non-sequential GUIDFragmentation Fragmentation poor performance poor performance

Non-unique Non-unique poor performance poor performanceUniquifier wastes space and timeUniquifier wastes space and time

SQL 2000 only, rebuilding a clustered index on a non-SQL 2000 only, rebuilding a clustered index on a non-unique unique clustered index forces non-clustered indexes to be clustered index forces non-clustered indexes to be rebuiltrebuilt

Volatile Volatile poor performance poor performanceMost-duplicated valueMost-duplicated value

Could cause record relocation and therefore Could cause record relocation and therefore fragmentationfragmentation

Wide Wide poor performance poor performanceWastes space/time – wide keys take longer to Wastes space/time – wide keys take longer to maintain/insert.maintain/insert.

The wider the key the wider the non-clustered The wider the key the wider the non-clustered indexesindexes

TIP: Don’t do this! TIP: Don’t do this!

Page 186: Fy06 SQL Intro

Optimal Table Structures Optimal Table Structures (cont’d)(cont’d)Heap is betterHeap is betterNo clustered IndexNo clustered Index

No data row Fragmentation No data row Fragmentation forwarding rows forwarding rows

Could have LOB fragmentation (new Could have LOB fragmentation (new LOB_Compaction can help this type of LOB_Compaction can help this type of fragmentation)fragmentation)

Wastes space in non-clustered indexes by using fixed-Wastes space in non-clustered indexes by using fixed-RIDRID

Scans (due to locking/consistency requirements) can Scans (due to locking/consistency requirements) can be more expensive!be more expensive!

Optimized for space over speedOptimized for space over speed

TIP: Create HEAPs for staging tables as an TIP: Create HEAPs for staging tables as an interim table for high performance data interim table for high performance data loading!loading!

See presentation “High Performance Data See presentation “High Performance Data Loading” Loading” on on www.SQLDev.Netwww.SQLDev.Net for more tips! for more tips!

Page 187: Fy06 SQL Intro

Optimal Table Structures Optimal Table Structures (cont’d)(cont’d)The The RIGHTRIGHT Clustered Index Clustered IndexThe RIGHT Clustered Index The RIGHT Clustered Index

Cluster on identity column or possibly add one for Cluster on identity column or possibly add one for clusteringclustering

Cluster on composite key (date, identity) for tables Cluster on composite key (date, identity) for tables that that also have this patternalso have this pattern

Key Criteria: Static, Narrow and UniqueKey Criteria: Static, Narrow and Unique

Effective Hot Spot(s) using Identity (or date, id) or Effective Hot Spot(s) using Identity (or date, id) or composite key to create multiple (but not random) composite key to create multiple (but not random) hot spots – possibly consider partitioning for large hot spots – possibly consider partitioning for large tables/high volume OLTPtables/high volume OLTP

TIP: If this doesn’t exist, add a surrogate column solely to cluster TIP: If this doesn’t exist, add a surrogate column solely to cluster on it!on it!

TIP: Make sure to find the RIGHT Clustered index for TIP: Make sure to find the RIGHT Clustered index for youryour environmentenvironment

TIP: Make sure to AUTOMATE defragmentation if/when TIP: Make sure to AUTOMATE defragmentation if/when necessarynecessary

Page 188: Fy06 SQL Intro

Clustering for PerformanceClustering for Performance

Cluster for internal benefits which Cluster for internal benefits which lead to:lead to:

Better insert/update performance re: Better insert/update performance re: fewer splitsfewer splits

Better Availability as full table Better Availability as full table maintenance may not be neededmaintenance may not be needed

Reliance on Nonclustered indexes is Reliance on Nonclustered indexes is greater:greater:

Faster access to narrow/low selectivity Faster access to narrow/low selectivity range queriesrange queries

NC Indexes are used in numerous non-NC Indexes are used in numerous non-obvious ways (multiple NC Indexes can obvious ways (multiple NC Indexes can be joined to cover a query)be joined to cover a query)

Page 189: Fy06 SQL Intro

Reliance on Nonclustered Reliance on Nonclustered Indexes…Indexes…

Faster access to narrow/low selectivity range Faster access to narrow/low selectivity range queriesqueriesNC Indexes are used in numerous non-NC Indexes are used in numerous non-obvious ways (multiple NC Indexes can be obvious ways (multiple NC Indexes can be joined to cover a query)joined to cover a query)More flexible in the definition (i.e., Indexed More flexible in the definition (i.e., Indexed Views can include computations, substrings, Views can include computations, substrings, etc.) etc.) New INCLUDE clause can allow wider New INCLUDE clause can allow wider covering indexes!covering indexes!Non-clustered indexes are easier/faster to Non-clustered indexes are easier/faster to rebuild rebuild (re: smaller) when they become fragmented(re: smaller) when they become fragmentedNon-clustered indexes are easier to keep less Non-clustered indexes are easier to keep less fragmented – i.e., rebuilds with FILLFACTOR fragmented – i.e., rebuilds with FILLFACTOR has greater benefit since the index row is has greater benefit since the index row is narrowernarrower

Page 190: Fy06 SQL Intro

Non-Clustered IndexesNon-Clustered Indexes

Internally, non-clustered indexes always Internally, non-clustered indexes always store an entry for EVERY row of the table store an entry for EVERY row of the table If the columns requested in the query are IN If the columns requested in the query are IN the index (regardless of order) then the the index (regardless of order) then the index COVERs the queryindex COVERs the queryIf queries are narrow this is more likely to If queries are narrow this is more likely to happenhappenReal-world queries…for example, accessing Real-world queries…for example, accessing a large number of tables in a join are a large number of tables in a join are typically narrow (the join condition, maybe a typically narrow (the join condition, maybe a search argument and possibly one or two search argument and possibly one or two columns in the select list)columns in the select list)Is it likely that you would ever execute:Is it likely that you would ever execute:

SELECT * FROM 8-table-joinSELECT * FROM 8-table-join

Page 191: Fy06 SQL Intro

…000-00-0001, 497

000-00-0002, 349

000-00-0003, 5643

000-00-0539, 12

999-99-9999, 42

File1, Page 16897 File1, Page 16898 File1, Page 16899 File1, Page 11810 File1, Page 18111 File1, Page 18112

539 entries

File1, Page 19197

539 entries

539 entries

539 entries

228 entries

149 entries

Root = 1 Page

Leaf Level149

Pages

What kind of structure is this?What kind of structure is this?

A non-clustered index on SSN on a table that has a A non-clustered index on SSN on a table that has a clustered index on EmpID (as we’ve seen)clustered index on EmpID (as we’ve seen)A clustered in on SSN n a table that has ONLY two A clustered in on SSN n a table that has ONLY two columns, SSN and EmpIDcolumns, SSN and EmpID

NOTE:NOTE: A non-clustered index is EXACTLY the same as a clustered A non-clustered index is EXACTLY the same as a clustered index – but only on the columns that make up the NC index index – but only on the columns that make up the NC index (with the RID or CL Key).(with the RID or CL Key).

Page 192: Fy06 SQL Intro

Seems Like…Seems Like…

Non-clustered indexes are only useful Non-clustered indexes are only useful for relatively selective queries…for relatively selective queries…

What about the following queries?What about the following queries?

oror

SELECT EmpID, SSN SELECT EmpID, SSN FROM Employee FROM Employee WHERE SSN between x and yWHERE SSN between x and y

SELECT EmpID, SSN SELECT EmpID, SSN FROM Employee FROM Employee WHERE SSN between x and yWHERE SSN between x and y

SELECT EmpID, SSN SELECT EmpID, SSN FROM Employee FROM Employee WHERE EmpID < 10000WHERE EmpID < 10000

SELECT EmpID, SSN SELECT EmpID, SSN FROM Employee FROM Employee WHERE EmpID < 10000WHERE EmpID < 10000

Think about the access patterns:• Table Scan (always an option)• Index Seek w/partial scan The nc index on SSN “covers” the query

Think about the access patterns:• SARG is on clustering key

The nc index on SSN “covers” the query

Page 193: Fy06 SQL Intro

Finding the Right BalanceFinding the Right Balance

Start with a minimal number of Start with a minimal number of indexesindexes

Clustered IndexClustered IndexPrimary Key (nonclustered, if not Primary Key (nonclustered, if not clustered)clustered)Unique KeysUnique Keys

ManuallyManually index foreign keys index foreign keysNon-unique indexesNon-unique indexesSpeed up join performanceSpeed up join performance

Use Database Tuning AdvisorUse Database Tuning AdvisorManually index based on either:Manually index based on either:

Specific query tuning where DTA didn’t Specific query tuning where DTA didn’t helphelpQuery frequencyQuery frequency

Page 194: Fy06 SQL Intro

Non-Clustered IndexesNon-Clustered IndexesFor optimal query performanceFor optimal query performance

Non-Clustered Indexes are BETTER for low Non-Clustered Indexes are BETTER for low selectivity/range queries because:selectivity/range queries because:

Non-clustered indexes ARE EXACTLY the same as a Non-clustered indexes ARE EXACTLY the same as a clustered index—but narrowerclustered index—but narrower

If your query only needs columns IN an index then the If your query only needs columns IN an index then the index is said to “cover” your queryindex is said to “cover” your query

No – this does No – this does notnot mean that you need to cover every mean that you need to cover every query!!!query!!!

What it does mean: You can make one or two existing What it does mean: You can make one or two existing indexes slightly wider and you probably will not hurt indexes slightly wider and you probably will not hurt INSERT, UPDATE, and DELETE performance and may INSERT, UPDATE, and DELETE performance and may RADICALLY improve SELECT performanceRADICALLY improve SELECT performance

Non-Clustered Indexes are narrower and easier to Non-Clustered Indexes are narrower and easier to maintain (let ITW/DTA help…)maintain (let ITW/DTA help…)

Use INCLUDE to create narrower trees, more Use INCLUDE to create narrower trees, more relevant leaf levels for covering!relevant leaf levels for covering!

Page 195: Fy06 SQL Intro

Covering an AggregateCovering an Aggregate

Increasing Gains with different types Increasing Gains with different types of indexes can always be achievedof indexes can always be achieved

Each query can be isolated and tuned Each query can be isolated and tuned as if it were in a sandbox – not the as if it were in a sandbox – not the goalgoal

Finding the right balance starts with:Finding the right balance starts with:Base Indexes (CL, PK, UK, FKs)Base Indexes (CL, PK, UK, FKs)

Capturing a workload, tuning with DTACapturing a workload, tuning with DTA

Prioritizing the queries that are still Prioritizing the queries that are still performing poorly and then choose the performing poorly and then choose the optimal level of gain by understanding optimal level of gain by understanding the trade-offsthe trade-offs

Page 196: Fy06 SQL Intro

Indexing for AggregationsIndexing for Aggregations

Two types of Aggregates: Two types of Aggregates: Stream and HashStream and Hash

Try to Achieve Stream to Minimize Try to Achieve Stream to Minimize Overhead in temp table creationOverhead in temp table creation

Computation of the Aggregate Still Computation of the Aggregate Still RequiredRequired

Lots of Users, Contention and/or Lots of Users, Contention and/or Minimal Cache can Aggravate the Minimal Cache can Aggravate the problem!problem!

Page 197: Fy06 SQL Intro

Aggregate QueryAggregate Query

Member has 10,000 RowsMember has 10,000 Rows

Charge has 1,600,000 RowsCharge has 1,600,000 Rows

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

Page 198: Fy06 SQL Intro

Aggregate Query (cont’d)Aggregate Query (cont’d)Table scan + hash aggregateTable scan + hash aggregate

Table Scan of Charge TableTable Scan of Charge TableLargest structure to evaluateLargest structure to evaluateWorst case scenarioWorst case scenario

Worktable created to store intermediate Worktable created to store intermediate aggregated results – OUT OF ORDER aggregated results – OUT OF ORDER (HASH)(HASH)Data Returned OUT OF ORDER – unless Data Returned OUT OF ORDER – unless ORDER BY addedORDER BY addedAdditional ORDER BY causes another stepAdditional ORDER BY causes another stepfor SORT – sorting can be expensive!for SORT – sorting can be expensive!

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

Page 199: Fy06 SQL Intro

Worst CaseWorst Case

Clustered Index Clustered Index ScanScan(table scan)(table scan)1,600,000 rows1,600,000 rowsHash AggregateHash Aggregateyields 9,114 yields 9,114 rows out of rows out of orderorderSortSortonly has to sort only has to sort 9,114 rows 9,114 rows instead of instead of 1,600,000 rows1,600,000 rowsReturn DataReturn Data

Table 'charge'. Table 'charge'. Logical reads 9335Logical reads 9335

Page 200: Fy06 SQL Intro

Aggregate QueryAggregate QueryIndex scan + hash aggregateIndex scan + hash aggregate

Out of Order Covering Index on Charge Out of Order Covering Index on Charge TableTable

Index Exists which is narrower than base tableIndex Exists which is narrower than base tableUsed instead of table – to cover the queryUsed instead of table – to cover the query

Worktable still created to store Worktable still created to store intermediate aggregated results – OUT OF intermediate aggregated results – OUT OF ORDER (HASH)ORDER (HASH)Data Returned OUT OF ORDER – unless Data Returned OUT OF ORDER – unless ORDER BY addedORDER BY addedAdditional ORDER BY causes another step Additional ORDER BY causes another step for SORT – sorting can be expensive!for SORT – sorting can be expensive!

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

Page 201: Fy06 SQL Intro

Not as BadNot as Bad

COVERING COVERING Index ScanIndex Scan1,600,000 1,600,000 narrower rowsnarrower rows

Hash AggregateHash Aggregateyields 9,114 yields 9,114 rows out of rows out of orderorder

SortSortonly has to sort only has to sort 9,114 rows 9,114 rows instead of instead of 1,600,000 rows1,600,000 rows

Return DataReturn Data

Table 'charge'. Table 'charge'. Logical reads 3770Logical reads 3770

Page 202: Fy06 SQL Intro

Aggregate QueryAggregate QueryIndex scan + stream aggregateIndex scan + stream aggregate

Covering Index on Charge Table (GROUP BY Covering Index on Charge Table (GROUP BY first)first)

Index Exists which is narrower than base tableIndex Exists which is narrower than base tableUsed instead of table – to cover the queryUsed instead of table – to cover the queryCovers the GROUP BY so data is groupedCovers the GROUP BY so data is grouped

Less work to aggregate results IN ORDERLess work to aggregate results IN ORDERData Returned IN ORDER – unless ORDER BY Data Returned IN ORDER – unless ORDER BY or other joins addedor other joins addedAdding an ORDER BY identical to the GROUP Adding an ORDER BY identical to the GROUP BY does NOT cause any additional step for BY does NOT cause any additional step for sorting!sorting!

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

SELECT c.member_no AS MemberNo, SELECT c.member_no AS MemberNo, sum(c.charge_amt) AS TotalSalessum(c.charge_amt) AS TotalSales

FROM dbo.charge AS cFROM dbo.charge AS cGROUP BY c.member_noGROUP BY c.member_no

Page 203: Fy06 SQL Intro

Much Better!Much Better!

COVERING COVERING Index ScanIndex Scan1,600,000 1,600,000 narrower narrower rowsrows

Stream Stream AggregateAggregatealso yields also yields 9,114 rows 9,114 rows IN ORDERIN ORDERNO SORT NO SORT REQUIREDREQUIRED

Return DataReturn Data

Table 'charge'. Table 'charge'. Logical reads 3770Logical reads 3770

Page 204: Fy06 SQL Intro

See the See the DifferenceDifference

??

Page 205: Fy06 SQL Intro

ConcernsConcerns

More temp tablesMore temp tablesMore contention in tempdbMore contention in tempdbLarger tempdb requiredLarger tempdb requiredPerformance Varies on each Performance Varies on each executionexecutionAggregate needs to be computedAggregate needs to be computed

Is there a better way?Is there a better way?Indexed Views!Indexed Views!

Page 206: Fy06 SQL Intro

Table 'charge'. Table 'charge'. Scan count 1, Scan count 1,

logical reads 34logical reads 34

Table Table 'SumOfAllCharges'. 'SumOfAllCharges'.

Logical reads 35Logical reads 35

Aggregate Aggregate QueryQuery

Indexed Indexed viewview

Page 207: Fy06 SQL Intro

Query with NO Query with NO useful indexesuseful indexesQuery with NO Query with NO useful indexesuseful indexes

Query with covering Query with covering Index in wrong orderIndex in wrong orderQuery with covering Query with covering Index in wrong orderIndex in wrong order

Query with covering Query with covering index in correct orderindex in correct orderQuery with covering Query with covering index in correct orderindex in correct order

Query w/Indexed View Query w/Indexed View

and no computations!and no computations!

Query w/Indexed View Query w/Indexed View

and no computations!and no computations!

See the See the DifferenceDifference

??

Page 208: Fy06 SQL Intro

Indexed View for AggregatesIndexed View for AggregatesKey pointsKey points

TempDB access not necessaryTempDB access not necessary

NO worktables are necessaryNO worktables are necessary

Aggregated set should be small but not too Aggregated set should be small but not too small as to create a hot ROW spot of small as to create a hot ROW spot of activity (which can create excessive activity (which can create excessive blocking)blocking)

GROUP BY member_no – probably OKGROUP BY member_no – probably OK

GROUP BY state – too few rows in aggregateGROUP BY state – too few rows in aggregate

GROUP BY country – GROUP BY country – avoidavoid like the plague! like the plague!

Performance of data modification Performance of data modification statements should be testedstatements should be tested

Page 209: Fy06 SQL Intro

INCLUDE Non-key ColumnsINCLUDE Non-key ColumnsCompared to Indexed ViewsCompared to Indexed Views

Indexed Views allow aggregates – adding Indexed Views allow aggregates – adding interesting columns in the leaf level of an interesting columns in the leaf level of an index offers “creative covering”index offers “creative covering”

With new INCLUDE clause, leaf level of With new INCLUDE clause, leaf level of index can include non-key columnsindex can include non-key columns

Index key [has been since 7.0] limited to Index key [has been since 7.0] limited to 900 bytes/16 columns – this is to keep tree 900 bytes/16 columns – this is to keep tree structure and non-leaf levels optimal/smallstructure and non-leaf levels optimal/small

Allows more covering indexesAllows more covering indexes

Indexed View v. Include – depends on what Indexed View v. Include – depends on what needs to be in the leaf levelneeds to be in the leaf level

Page 210: Fy06 SQL Intro

Finding the Right BalanceFinding the Right BalanceIndex strategies: SummaryIndex strategies: Summary

Determine Primary Usage of Table – OLTP Determine Primary Usage of Table – OLTP vs. OLAP vs. Combo? This determines vs. OLAP vs. Combo? This determines Clustered IndexClustered IndexCreate Constraints – Primary Key and Create Constraints – Primary Key and Alternate/Candidate KeysAlternate/Candidate KeysManually Add Indexes to Foreign Key Manually Add Indexes to Foreign Key ConstraintsConstraintsCapture a Workload(s) and Run through Capture a Workload(s) and Run through Index Tuning WizardIndex Tuning WizardAdd additional indexes to help improve Add additional indexes to help improve SARGs, Joins, AggregationsSARGs, Joins, Aggregations

Are you done?Are you done? NO!NO!NO!NO!

Page 211: Fy06 SQL Intro

Keeping Performance OptimalKeeping Performance Optimal

Write effective queries to help Write effective queries to help optimizer use indexesoptimizer use indexes

Make sure statistics are accurateMake sure statistics are accurateKeep auto update statistics ONKeep auto update statistics ON

Keep auto create statistics ONKeep auto create statistics ON

Make sure indexes don’t become Make sure indexes don’t become overly fragmentedoverly fragmented

Rebuild Table/Index StructuresRebuild Table/Index Structures

Defrag Table/Index StructuresDefrag Table/Index Structures

Page 212: Fy06 SQL Intro

Indexing WebcastsIndexing Webcasts

MSDN Webcast:MSDN Webcast: Indexing for Performance - Indexing for Performance - Finding the Right Balance (SQL Server 2000)Finding the Right Balance (SQL Server 2000)

http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032254503&Culture=en-Uhttp://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032254503&Culture=en-USS

MSDN Webcast:MSDN Webcast: Indexing for Performance - Indexing for Performance - Index Maintenance Best Practices (SQL Index Maintenance Best Practices (SQL Server 2000)Server 2000)http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032256511&Culture=en-Uhttp://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032256511&Culture=en-USS

TechNet It’s Sh0wtime Webcast:TechNet It’s Sh0wtime Webcast: Index Creation Index Creation Best Practices with SQL Server 2005Best Practices with SQL Server 2005http://www.microsoft.com/uk/technet/itsshowtime/sessionh.aspx?videoid=29http://www.microsoft.com/uk/technet/itsshowtime/sessionh.aspx?videoid=29

TechNet It’s Sh0wtime Webcast:TechNet It’s Sh0wtime Webcast: Index Index Defragmentation Best Practices with SQL Defragmentation Best Practices with SQL Server 2005Server 2005http://www.microsoft.com/uk/technet/itsshowtime/sessionh.aspx?videoid=30http://www.microsoft.com/uk/technet/itsshowtime/sessionh.aspx?videoid=30

MSDN Webcast Series:MSDN Webcast Series: Effectively Designing a Effectively Designing a Scalable and Reliable DatabaseScalable and Reliable Database

See Kimberly’s blog See Kimberly’s blog for PDC for PDC

Resources & LinksResources & Links

Page 213: Fy06 SQL Intro

CachingCaching

The Zen of CachingThe Zen of Caching

Caching at the Middle TierCaching at the Middle Tier

Caching on the ClientCaching on the Client

Page 214: Fy06 SQL Intro

Caching ZenCaching Zen

Moving information to a location Moving information to a location where it can be accessed very quicklywhere it can be accessed very quickly

From main memory to high-speed From main memory to high-speed memory on the processormemory on the processor

From the hard drive to main memoryFrom the hard drive to main memory

From main memory on one machine to From main memory on one machine to another machineanother machine

Caching is framework for managing Caching is framework for managing statestate

State being data + statusState being data + status

Page 215: Fy06 SQL Intro

Caching Zen: Why?Caching Zen: Why?

PerformancePerformanceStore the data as close as possible to Store the data as close as possible to consumerconsumer

ScalabilityScalabilityCommon data is often processed the Common data is often processed the same way for multiple userssame way for multiple users

Process it once and store the resultsProcess it once and store the results

AvailabilityAvailabilityProvide access to data even when the Provide access to data even when the primary data store or service is primary data store or service is unavailableunavailable

Page 216: Fy06 SQL Intro

Caching Zen: When?Caching Zen: When?

On-demand: cache the data after it On-demand: cache the data after it has been requested and formattedhas been requested and formatted

A priori: preload the cache with both A priori: preload the cache with both raw and formatted dataraw and formatted data

Combine the twoCombine the two

Page 217: Fy06 SQL Intro

Caching Zen: Where?Caching Zen: Where?

Generally as close to the data Generally as close to the data consumer as possible—wherever consumer as possible—wherever accessing the cache can help withaccessing the cache can help with

PerformancePerformance

ScalabilityScalability

AvailabilityAvailability

Actual cache storage can beActual cache storage can beMemory resident—often on serversMemory resident—often on servers

Disk resident—often on clientDisk resident—often on client

Page 218: Fy06 SQL Intro

Caching Zen: IssuesCaching Zen: Issues

StalenessStaleness

CoherencyCoherency

SecuritySecurity

Page 219: Fy06 SQL Intro

StalenessStaleness

Data that is cached becomes Data that is cached becomes obsoleteobsolete

Staleness is the difference between Staleness is the difference between the master data’s current value and the master data’s current value and the current version that is cachedthe current version that is cached

Staleness is relative to how often the Staleness is relative to how often the master copy of the data changes and master copy of the data changes and how tolerant an application is to stale how tolerant an application is to stale datadata

Page 220: Fy06 SQL Intro

CoherencyCoherency

Large systems often have multiple Large systems often have multiple cachescaches

Often seen with web server farmsOften seen with web server farms

Caches need to kept in sync Caches need to kept in sync otherwise clients can get inconsistent otherwise clients can get inconsistent resultsresults

Page 221: Fy06 SQL Intro

SecuritySecurity

Data stored in cache needs to Data stored in cache needs to protectedprotected

Application logic can enforce access Application logic can enforce access rules to in-memory cache on serversrules to in-memory cache on servers

Client-side cache might require both Client-side cache might require both in-memory and disk level protectionin-memory and disk level protection

SpywareSpyware

Unsafe networksUnsafe networks

UsersUsers

Page 222: Fy06 SQL Intro

Staleness SolutionStaleness Solution

Cache expiration settings based upon Cache expiration settings based upon the application’s tolerance to stale the application’s tolerance to stale datadata

Can be time driven where cache is Can be time driven where cache is reloaded based upon an expiration reloaded based upon an expiration settingsetting

Can be data driven where cache is Can be data driven where cache is updated when master data is updated when master data is updatedupdated

Page 223: Fy06 SQL Intro

Coherency SolutionCoherency Solution

Depending upon the number of Depending upon the number of distinct cache stores, updates can distinct cache stores, updates can occuroccur

At the same timeAt the same timeWorks when cache is small and speed of Works when cache is small and speed of update is quickupdate is quick

At defined intervalsAt defined intervalsWorks when cache is large and cache is load-Works when cache is large and cache is load-balancedbalanced

Page 224: Fy06 SQL Intro

Security SolutionSecurity Solution

Store cache data encryptedStore cache data encrypted

Ensure communications between Ensure communications between cache and data source are securecache and data source are secure

Page 225: Fy06 SQL Intro

Cache SyncCache Sync

New feature of SQL Server 2005New feature of SQL Server 2005

Cache source creates data based Cache source creates data based subscription that server will monitorsubscription that server will monitor

When data changes, server notifies When data changes, server notifies subscriber of change and reasonsubscriber of change and reason

Uses new Service Broker infrastructureUses new Service Broker infrastructure

Exposed in SQLClient 2.0 and Exposed in SQLClient 2.0 and ASP.NET 2.0ASP.NET 2.0

High-level and low-level implementations High-level and low-level implementations availableavailable

Page 226: Fy06 SQL Intro

State Transformation PipelineState Transformation Pipeline

State can be presented in many State can be presented in many formatsformats

Raw data object to business object to Raw data object to business object to presentation objectpresentation object

Caching possible at any and all points Caching possible at any and all points in the pipelinein the pipeline

Page 227: Fy06 SQL Intro

Caching at the Middle TierCaching at the Middle Tier

Prefer existing host processes and Prefer existing host processes and servicesservices

COM+ components and object poolingCOM+ components and object pooling

IIS with ASP.NETIIS with ASP.NET

Use custom services as neededUse custom services as neededWindows Services with custom endpoint Windows Services with custom endpoint and cache enginesand cache engines

Page 228: Fy06 SQL Intro

Caching at the Client TierCaching at the Client Tier

Local servicesLocal servicesSQL Server ExpressSQL Server Express

Custom Windows ServicesCustom Windows Services

In-memoryIn-memorySystem.Web.CachingSystem.Web.Caching

DataSetsDataSets

Custom objectsCustom objects

Page 229: Fy06 SQL Intro

Tuning, Monitoring, and Diagnosing System Tuning, Monitoring, and Diagnosing System ProblemsProblems

SQL Server ProfilerSQL Server Profiler

System MonitorSystem Monitor

Database Tuning AdvisorDatabase Tuning Advisor

Page 230: Fy06 SQL Intro

SQL Server ProfilerSQL Server Profiler

Can analyze the SQL Server Database Can analyze the SQL Server Database Engine and Analysis ServicesEngine and Analysis ServicesSignificantly easier to setup (Events, Data Significantly easier to setup (Events, Data Columns and Filter dialogs combined)Columns and Filter dialogs combined)

Special Events: Service Broker, Notification Special Events: Service Broker, Notification Services, etc…Services, etc…Special Event types: Showplan XML and Special Event types: Showplan XML and Deadlock Graph – can be saved to one (each Deadlock Graph – can be saved to one (each type) or individual files for later review and type) or individual files for later review and interaction!interaction!

Events that do/do not produce values for Events that do/do not produce values for given data columns are obvious at given data columns are obvious at selection (check boxes) and viewing selection (check boxes) and viewing (yellowed)(yellowed)Can Profile SQL Server 2000 and 2005Can Profile SQL Server 2000 and 2005Permissions to profile are grantablePermissions to profile are grantable

Page 231: Fy06 SQL Intro

Profiler EnhancementsProfiler Enhancements

SQL Server 2000 profiler scripts run SQL Server 2000 profiler scripts run with out change on 2005with out change on 2005Uses XML based definitionsUses XML based definitionsSave showplan in XML formatSave showplan in XML formatSave results in XML formatSave results in XML formatCan replay one or more rollover filesCan replay one or more rollover filesAllows profiling of Allows profiling of

Analysis ServicesAnalysis ServicesData Transformation ServicesData Transformation Services

Correlation of Events to Performance Correlation of Events to Performance Monitor dataMonitor data

Page 232: Fy06 SQL Intro

Profiler UI CondensedProfiler UI Condensed

Page 233: Fy06 SQL Intro

System Monitor (PerfMon) System Monitor (PerfMon) Integration Integration Performance Counter LogsPerformance Counter Logs

Create a Profiler TraceCreate a Profiler Trace

Create a Performance Monitor LogCreate a Performance Monitor Log

Works solely based on time – make Works solely based on time – make sure the two clients (if different) are sure the two clients (if different) are time correlatedtime correlated

Open Trace (complete load), Use File Open Trace (complete load), Use File – Import Performance Data, Select – Import Performance Data, Select Objects/Counters…Objects/Counters…

Works with SQL Server 2000 and SQL Works with SQL Server 2000 and SQL Server 2005 as it’s client side!Server 2005 as it’s client side!

Page 234: Fy06 SQL Intro

Performance Monitor LogPerformance Monitor Log

DefineDefineObjects Objects (complete (complete groups)groups)

CountersCounters

Sample IntervalSample Interval

File LocationFile Location

Start log (time)Start log (time)

Stop log (time)Stop log (time)

Page 235: Fy06 SQL Intro

Profiler with PerfMon Profiler with PerfMon IntegrationIntegration

Page 236: Fy06 SQL Intro

Database Tuning AdvisorDatabase Tuning Advisor

Index Tuning Wizard replaced by DTAIndex Tuning Wizard replaced by DTA

Adds:Adds:Partitioning recommendationsPartitioning recommendations

Time-bound tuningTime-bound tuning

Indexes with Included columns -> more Indexes with Included columns -> more efficient coveringefficient covering

XML Input/Output XML Input/Output

Drop ONLY modeDrop ONLY mode

Parameterized command line executionParameterized command line execution

Page 237: Fy06 SQL Intro

Database Tuning AdvisorDatabase Tuning Advisor

Can be launched from-Can be launched from-SQL Server management StudioSQL Server management Studio

ProfilerProfiler

Import previously saved Session Import previously saved Session DefinitionDefinition

Stored in XML formatStored in XML format

Workload optionsWorkload optionsCan be a *.trc, *.sql or *.xml formatCan be a *.trc, *.sql or *.xml format

Can be a SQL Server TableCan be a SQL Server TableFor table saved in 2000 need to launch DTA For table saved in 2000 need to launch DTA while connected to 2000 serverwhile connected to 2000 server

Page 238: Fy06 SQL Intro

Database Tuning AdvisorDatabase Tuning Advisor

Workload optionsWorkload optionsClone SessionClone Session

Actions Actions clone session clone sessionLaunches duplicate DTA window while Launches duplicate DTA window while maintaining chosen settingsmaintaining chosen settings

Tuning optionsTuning optionsPhysical Design Structures (PDS)Physical Design Structures (PDS)

To use in databaseTo use in databaseTo keep in databaseTo keep in database

Partitioning StrategyPartitioning StrategyOnline index recommendationsOnline index recommendations

Default is OFFDefault is OFFChange in Advanced Tuning OptionsChange in Advanced Tuning Options

Page 239: Fy06 SQL Intro

Database Tuning AdvisorDatabase Tuning Advisor

Optimization ProcessOptimization ProcessConfiguration infoConfiguration infoReading workloadReading workloadPerforming analysisPerforming analysisReportsReports

Tuning summary reportTuning summary reportEvent frequencyEvent frequencyQuery detailQuery detailQuery-index relationsQuery-index relationsQuery cost rangeQuery cost rangeIndex usageIndex usage

RecommendationsRecommendationsIndex, partition schemesIndex, partition schemes

Page 240: Fy06 SQL Intro

INF: How to Create a SQL Server 2000 TraceINF: How to Create a SQL Server 2000 Trace ((283790283790))HOW TO: Programmatically Load Trace Files into HOW TO: Programmatically Load Trace Files into TablesTables ( (270599270599))How To: Stop a Server-Side Trace in SQL Server How To: Stop a Server-Side Trace in SQL Server 20002000 ( (822853822853))INF: How to Monitor SQL Server 2000 TracesINF: How to Monitor SQL Server 2000 Traces (283786)(283786)INF: Stored Procedure to Create a SQL Server INF: Stored Procedure to Create a SQL Server 2000 Blackbox Trace2000 Blackbox Trace (281671) (281671)BUG: BOL Incorrectly States That Users Do Not BUG: BOL Incorrectly States That Users Do Not Need to Be Sysadmin to Use Profiler or SQL Need to Be Sysadmin to Use Profiler or SQL Profiler SPs Profiler SPs (this is a SQL Server 2000 limitation – (this is a SQL Server 2000 limitation – 310175310175))INF: Job to Monitor SQL Server 2000 Performance INF: Job to Monitor SQL Server 2000 Performance and Activityand Activity ( (283696283696))

Minimizing the Impact of Minimizing the Impact of ProfilingProfilingServer-side Trace QueuesServer-side Trace Queues

Page 241: Fy06 SQL Intro

Session SummarySession Summary

Building a scalable, reliable, and Building a scalable, reliable, and performant real-world system takes performant real-world system takes MANY steps…MANY steps…There’s no magic bullet!There’s no magic bullet!But, it is possible…But, it is possible…

using these techniques!using these techniques!

Join Indonesia SQL Server CommunityJoin Indonesia SQL Server CommunitySubscribeSubscribe

[email protected]@netindonesia.net

Un-SubscribeUn-Subscribedotnet-unsubscribe@[email protected]

Page 242: Fy06 SQL Intro

© 2005 Microsoft Corporation. All rights reserved.This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.