ado.net 2.0: advanced data access patterns pablo castro dat408 ado.net technical lead microsoft...
TRANSCRIPT
ADO.NET 2.0: ADO.NET 2.0: Advanced Data Access Advanced Data Access PatternsPatterns
Pablo CastroPablo CastroDAT408 DAT408 ADO.NET Technical LeadADO.NET Technical LeadMicrosoft CorporationMicrosoft Corporation
2
PRESENTATION
ASP.NETSystem.Web (2.0)CompilationConfigurationHandlersHostingMailManagementSecurityUI (2.0)Util
Presentation FrameworkSystem.Windows (3.0)
AnnotationsAutomationControlsDataDocumentsInkInteropNavigationResourcesSerializationShapesThreading
System.Windows.Media (3.0)
3DAnimationImagine
Windows FormsSystem.Windows.Forms (2.0)
DesignLayoutVisualStyles
Printing SubsystemSystem.Printing (3.0)GDI+System.Drawing
DesignDrawing2DPrinting
ASP.NET Data ManagementSystem.WebCachingProfileSessionState
XML Data System.XmlSchemaSerializationXPathXsl
Windows File SystemSystem.Storage (F)CoreAudioCalendarContactsDocumentsImageMediaMessagesRulesSyncVideo
XPS DocumentsSystem.Windows.Xps (3.0)System.IO.Packaging (3.0)Speech IntegrationSystem.Speech (3.0)RecognitionSynthesis
Language Integrated QuerySystem.Query (F)System.Data.DLinq (F)System.Xml.XLinq (F)System.Expressions (F)
Windows Workflow FoundationSystem.Workflow (3.0)System.Workflow.Activities (3.0)
RulesSystem.Workflow.ComponentModel (3.0)
ComplierDesingerSerializaztion
System.Workflow.Runtime (3.0)HostingMessaging
Windows Communications FoundationSystem.ServiceModel (3.0)
ChannelsConfigurationDiagnosticsIntegrationQueueHelper
System.ServiceModel.Security(3.0)
ProtocolsTokens
Network Class LibrarySystem.Net CacheMail (2.0)Network Information (2.0)Security (2.0)Sockets
.NET RemotingSystem.Runtime.Remoting
ASMX Web ServicesSystem.Web.Services
Identity ManagementMicrosoft.InfoCards (3.0)
MSMQSystem.Messaging
Directory ServicesSystem.DirectoryServices
COMMUNICATION
FUNDAMENTALS
BASE CLASS LIBRARIESSystem
System.CodeDomSystem.ComponentModelSystem.Diagonostics
System.IO (2.0)
System.Resrouces
System.Text
System.ServiceProcess
System.ThreadingSystem.TimersSystem.EnterpriseServicesSystem.Transactions (2.0)
.NET RemotingSystem.Runtime.Remoting
System.TextGeneric (2.0)
System.Reflection
EmailSystem.Configuration
System.TextAccessControl (2.0)Cryptography (2.0)PermissionsPolicyPrincipal (2.0)
Managed Add-In FrameworkSystem.Addins (F)
Contact
Microsoft.Build (2.0)
System.RuntimeCompilerServicesConstrainedExecution(2.0)InteropServicesHostingSerializationVersioning
“ClickOnce” DeploymentSystem.Deployoment (2.0)
WINDOWS VISTA
(2.0) (3.0) - New
(2.0) - Substantially Improved
- Windows Presentation Foundation(formerly codenamed “Avalon”)
- Windows Communication Foundation
(formerly codenamed “Indigo”)
- Windows Workflow Foundation
KEY
ADO.NETSystem.DataCommonOdbcOleDbOracleClientSqlSqlClient
DATA
3
AgendaAgenda
It’s all about moving data around…It’s all about moving data around…
From the application to the databaseFrom the application to the databaseMoving tons of data fastMoving tons of data fast
From the database to the cacheFrom the database to the cacheDataSet as a cacheDataSet as a cache
Maintaining a cache in syncMaintaining a cache in sync
From the cache to the applicationFrom the cache to the applicationQuerying the cacheQuerying the cache
yes, query over DataSet :)yes, query over DataSet :)
4
Moving Tons of DataMoving Tons of Data
Getting data from the server fast is Getting data from the server fast is easyeasy
DataReader objects do most of the workDataReader objects do most of the work
Going back to the server fast is Going back to the server fast is harderharder
Sending INSERT/UPDATE/DELETE Sending INSERT/UPDATE/DELETE statements separately is very slowstatements separately is very slow
Batching statements helpsBatching statements helpsUpdateBatchSize property in the adapterUpdateBatchSize property in the adapter
SqlClient/OracleClient – DataSet-onlySqlClient/OracleClient – DataSet-only
Parameter arraysParameter arraysOn databases/providers that support itOn databases/providers that support it
5
Client Client Server: Server: BCP+DMLBCP+DML
Useful when need to send lots of Useful when need to send lots of changeschanges
Use SqlBulkCopy in 2.0Use SqlBulkCopy in 2.0
Update sequence:Update sequence:First, bulk-insert all the changes into the First, bulk-insert all the changes into the serverserver
Use a temporary tableUse a temporary table
Second, run a DML statement per Second, run a DML statement per change typechange type
One INSERTs, one for UPDATEs, one for One INSERTs, one for UPDATEs, one for DELETEsDELETEs
Do everything in the same transactionDo everything in the same transaction
6
Client Client Server: Server: BCP+DMLBCP+DML
Serious performance gainsSerious performance gainsEven several Even several timestimes faster in some faster in some scenariosscenarios
Fine printFine printFor SQL Server, the database needs to For SQL Server, the database needs to be in simple or bulk-logged recovery be in simple or bulk-logged recovery modemode
tempdb is in simple mode by default so it’s tempdb is in simple mode by default so it’s okok
It works in “full” mode, but BCP is slowerIt works in “full” mode, but BCP is slower
Talk to your DBA about the implicationsTalk to your DBA about the implications
7
Putting All TogetherPutting All Together
For example… in a web application:For example… in a web application:
Consume HTTP request as a streamConsume HTTP request as a streamBuild a DataReader over the requestBuild a DataReader over the request
Feed the DataReader to BCPFeed the DataReader to BCPSends data to the server in streaming Sends data to the server in streaming modemode
Kick off the DML statementsKick off the DML statementsUse an asynchronous ASP.NET pageUse an asynchronous ASP.NET page
Execute an asynchronous ADO.NET Execute an asynchronous ADO.NET commandcommand
8
AgendaAgenda
It’s all about moving data around…It’s all about moving data around…
From the application to the databaseFrom the application to the databaseMoving tons of data fastMoving tons of data fast
From the database to the cacheFrom the database to the cacheDataSet as a cacheDataSet as a cache
Maintaining a cache in syncMaintaining a cache in sync
From the cache to the applicationFrom the cache to the applicationQuerying the cacheQuerying the cache
yes, query over DataSet :)yes, query over DataSet :)
9
DataView update time
4000
8000
1600
0
3200
0
6400
0
1280
00
2560
00
Rows
Tim
e
.NET 1.1 .NET 2.0 B2
Using dataset as a cacheUsing dataset as a cache
DataSet has been DataSet has been greatly enhancedgreatly enhanced
Better scalability Better scalability across the board across the board Incremental Incremental index updatesindex updatesLow-overhead Low-overhead DataView DataView maintenancemaintenanceFaster, more Faster, more compact serializatiocompact serializationn
We can now handle We can now handle huge cacheshuge caches
Cache granularity Cache granularity becomes importantbecomes important
10
SqlDependency For SqlDependency For CachingCaching
SqlDependency can notify of changesSqlDependency can notify of changesTrack queries and fire an event when the Track queries and fire an event when the results would changeresults would change
It tracks whole result-setsIt tracks whole result-sets
SqlDependency and cachingSqlDependency and cachingHaving change notifications helps in Having change notifications helps in cachingcaching
No need for cache expiration policiesNo need for cache expiration policiesExpire cache when the underlying data Expire cache when the underlying data changedchanged
11
SqlDependency + DataSetSqlDependency + DataSet
Middle-tier cacheMiddle-tier cacheCache data in a DataSetCache data in a DataSet
Track the queries used to load the Track the queries used to load the DataSet using SqlDependencyDataSet using SqlDependency
When change notifications comeWhen change notifications comeSimply get rid of the DataSetSimply get rid of the DataSet
Load a new one on the next requestLoad a new one on the next request
12
Controlling GranularityControlling Granularity
For large caches, getting rid of all the For large caches, getting rid of all the data in the cache is not an optiondata in the cache is not an option
SqlDependency doesn’t have SqlDependency doesn’t have granularity control, but…granularity control, but…
13
Controlling GranularityControlling Granularity
Use a large Use a large DataSetDataSet
Load one Load one “segment” at a “segment” at a timetime
e.g., on cache misse.g., on cache miss
On change On change notificationnotification
Use .Select to find Use .Select to find the invalidated the invalidated segmentsegment
Remove it from the Remove it from the cachecache
Next request will fill Next request will fill it up againit up again
DataTableDataTable
SqlDependencSqlDependencyy
SqlDependencSqlDependencyy
SqlDependencSqlDependencyy
SqlDependencSqlDependencyy
ChangeChange SqlDependencSqlDependencyy
14
AgendaAgenda
It’s all about moving data around…It’s all about moving data around…
From the application to the databaseFrom the application to the databaseMoving tons of data fastMoving tons of data fast
From the database to the cacheFrom the database to the cacheDataSet as a cacheDataSet as a cache
Maintaining a cache in syncMaintaining a cache in sync
From the cache to the applicationFrom the cache to the applicationQuerying the cacheQuerying the cache
Yes, query over DataSet :)Yes, query over DataSet :)
15
Using DataSet as a CacheUsing DataSet as a Cache
DataSet enhancements enable huge DataSet enhancements enable huge cachescaches
e.g. millions of rowse.g. millions of rows
With lots of data…With lots of data…It’s hard to navigate through dataIt’s hard to navigate through data
It’s more practical to query over the dataIt’s more practical to query over the data
DataSet does some query…DataSet does some query………but it’s often not enoughbut it’s often not enough
16
Query Over DataSetQuery Over DataSet
We get requests for this every We get requests for this every week :)week :)
What does it mean?What does it mean?Today, you can filter/sort over a single Today, you can filter/sort over a single tabletable
Joins are the main missing pieceJoins are the main missing piece
What does it take to do it?What does it take to do it?Most base services are there in .NET 2.0Most base services are there in .NET 2.0
Enough to build basic scenarios, some Enough to build basic scenarios, some cases will be hard to do efficientlycases will be hard to do efficiently
17
Representing a QueryRepresenting a Query
““Algebra tree” is a simple optionAlgebra tree” is a simple optionNodes represent logical query operatorsNodes represent logical query operators
Free from syntax constructsFree from syntax constructs
Full query syntax is “nice” but not Full query syntax is “nice” but not neededneeded
Easy to extend this sample with syntax Easy to extend this sample with syntax supportsupport
Create a compiler that produces algebraic Create a compiler that produces algebraic treestrees
18
Algebraic OperatorsAlgebraic Operators
ProjectProject
FilterFilter
JoinJoin
SortSort
……
Project Project (name, date)(name, date)
Join (c.id = Join (c.id = o.customerId)o.customerId)
SELECT c.name, o.dateFROM Customers c INNER JOIN Orders oON c.id = o.customerIdWHERE o.status = ‘Pending’
Scan Scan (Customers)(Customers)
Filter Filter (status=‘…’)(status=‘…’)
ScanScan(Orders)(Orders)
19
Compiling A QueryCompiling A Query
Project Project (name, (name, date)date)
Join (c.id = Join (c.id = o.customerIo.customerI
d)d)
Scan Scan (Customers)(Customers)
Filter Filter (status=‘…’(status=‘…’
))
ScanScan(Orders)(Orders)
Projection Projection iteratoriterator
Nested-loop Nested-loop join iteratorjoin iterator
Base-table Base-table scan scan
iteratoriterator
Filtered-Filtered-scan scan
iteratoriterator
+ Index + Index (DataVie(DataVie
w)w)
Indexed join Indexed join iteratoriterator
Indexed Indexed accessaccess
20
Executing A QueryExecuting A Query
Traditional approach using “iterators”Traditional approach using “iterators”
Build an iterator tree from the logical Build an iterator tree from the logical treetree
We’re skipping the physical plan for We’re skipping the physical plan for simplicitysimplicity
Iterators represent physical Iterators represent physical operatorsoperators
I.E. how the operator is implementedI.E. how the operator is implemented
Consume from the top-level iteratorConsume from the top-level iterator
21
Operators: Operators: Projection/FilterProjection/Filter
Projection (“select”)Projection (“select”)Simply keep track of the required Simply keep track of the required columnscolumns
Expressions implemented with DataSet Expressions implemented with DataSet expressionsexpressions
Filter (“where”)Filter (“where”)Implemented with DataTable.SelectImplemented with DataTable.Select
Indexes created on-demand by .SelectIndexes created on-demand by .SelectIf not already thereIf not already there
A DataView can be used as a permanent A DataView can be used as a permanent indexindex
22
Operators: SortOperators: Sort
Sort (“order by”)Sort (“order by”)DataSet surfaces this together with filterDataSet surfaces this together with filter
DataTable.Select or a DataViewDataTable.Select or a DataView
Will also use indexesWill also use indexes
Need to collapse operatorsNeed to collapse operatorsOr do one (filter/sort) manuallyOr do one (filter/sort) manually
23
Operators: JoinOperators: Join
JoinJoinNot supported by DataSet out-of-the-boxNot supported by DataSet out-of-the-box
Different strategiesDifferent strategiesDo a merge join if inputs already sortedDo a merge join if inputs already sorted
Do a relationship-based sort if found a Do a relationship-based sort if found a DataRelationDataRelation
Do a index-based join if one side of the join Do a index-based join if one side of the join has an “index” (DataView)has an “index” (DataView)
Do a nested-loop join as the last resortDo a nested-loop join as the last resort
Optionally, create indexes on demandOptionally, create indexes on demand
24
Operators: The hard onesOperators: The hard ones
DataView/Select don’t take DataView/Select don’t take comparison operator as inputcomparison operator as input
Only “==“ or “!=“Only “==“ or “!=“
So indexed range queries cannot take So indexed range queries cannot take advantage of indexesadvantage of indexes
Not many workaroundsNot many workaroundsNon-indexed queriesNon-indexed queries
Roll your own index :(Roll your own index :(
25
SummarySummary
ADO.NET 2.0 provides tons of ADO.NET 2.0 provides tons of flexible servicesflexible services
Easy scenarios are straightforwardEasy scenarios are straightforward
Hard scenarios possible with some extra Hard scenarios possible with some extra codecode
ADO.NET continues to evolve to ADO.NET continues to evolve to match modern application match modern application requirementsrequirements
26
ResourcesResources
BlogsBlogsTeam blog: Team blog: http://http://blogs.msdn.com/dataaccessblogs.msdn.com/dataaccess
http://http://blogs.msdn.com/aconradblogs.msdn.com/aconrad
http://http://blogs.msdn.com/angelsbblogs.msdn.com/angelsb
http://http://blogs.msdn.com/sushilcblogs.msdn.com/sushilc
Data-access MSDN siteData-access MSDN sitehttp://http://msdn.microsoft.commsdn.microsoft.com/data/data
http://http://msdn.microsoft.com/data/DataAccess/Whidbeymsdn.microsoft.com/data/DataAccess/Whidbey
NewsgroupsNewsgroupsNNTP server: NNTP server: msnews.microsoft.commsnews.microsoft.com
Group: Group: microsoft.public.dotnet.framework.adonetmicrosoft.public.dotnet.framework.adonet
© 2005 Microsoft Corporation. All rights reserved.This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.