S.M.A.R.T. Biml - Standardize, Model, Automate, Reuse and Transform (SQLSaturday Oregon)
Have you ever wanted to build a Data Warehouse simply by pushing a button? It might not be quite that easy yet, but gone are the days of repetitive development. Stop wasting your time dragging, dropping, connecting, aligning and creating the same SSIS package over and over and over again, and start working S.M.A.R.T. with Biml.
You already know how to build a staging environment in an hour, so let us dive straight into some advanced features of Biml. We will start by looking at how to create our own C# classes and methods, and how to centralize and reuse code. Then we will explore the metadata modeling feature in Mist. Finally, we will create a framework of transformers that allow you to modify existing objects both interactively and automatically.
If you already think Biml is powerful, just wait until you have a toolbox full of transformers ready to do the heavy lifting for you!
(No Autobots were harmed in the making of this session.)
Session Description
Cathrine Wilhelmsen
@cathrinew
cathrinewilhelmsen.net
Data Warehouse Architect
Business Intelligence Developer
Know basic Biml and BimlScript
Completed BimlScript.com lessons
Have created a staging environment
You…
…?
Yes, but how does it actually work?
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<# foreach (var table in RootNode.Tables) { #>
<Package Name="Load<#=table.Name#>"></Package>
<# } #>
</Packages>
</Biml>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<Packages>
<Package Name="LoadCustomer"></Package>
<Package Name="LoadProduct"></Package>
<Package Name="LoadSales"></Package>
</Packages>
</Biml>
Move common code to separate files
Centralize and reuse in many projects
Update code once for all projects
Don't Repeat Yourself
Solve logical dependencies and simulate manual workflows by using tiers
Tiers instruct the BimlCompiler to compile files from lowest to highest tier
<#@ template tier="1" #>
Higher tiers can use and might depend on objects from lower tiers
Example:
Tier 1 - Create database connections
Tier 2 - Create loading packages
Tier 3 - Create master package to execute loading packages
Split and Combine Biml Files
Include common code in multiple files and projects
Can include many file types: .biml .txt .sql .cs
Use the include directive
<#@ include file="CommonCode.biml" #>
The include directive will be replaced by the content of the included file
Include pulls code from the included file into the main file
Include Files
Works like a parameterized include
File to be called (callee) specifies the input parameters it accepts
<#@ property name="Table" type="AstTableNode" #>
File that calls (caller) passes input parameters
<#=CallBimlScript("CommonCode.biml", Table)#>
CallBimlScript pushes parameters from the caller to the callee, and the callee returns code
CallBimlScript with Parameters
BIDS Helper vs. Mist
BIDS Helper:
"Black Box"
Only SSIS packages visible
Save Biml to file for debugging
Mist:
Visual IDE
All in-memory objects visible
Preview Expanded BimlScript
One language to query:
SQL Server Databases
XML Documents
Datasets
Collections
LINQ (Language-Integrated Query)
Two ways to write queries:
SQL-like Syntax
Extension Methods
LINQ Extension Methods
Sort: OrderBy, ThenBy
Filter: Where, OfType
Group: GroupBy
Aggregate: Count, Sum
Check Collections: All, Any, Contains
Get Elements: First, Last, ElementAt
Project Collections: Select, SelectMany
..and many, many more!
var numConnections = RootNode.Connections.Count()
foreach (var table in RootNode.Tables.Where(…))
if (RootNode.Packages.Any(…))
LINQ Extension Methods
Use lambda expressions to filter or specify values:
.Where(table => table.Schema.Name == "Production")
.OrderBy(table => table.Name)
LINQ and Lambda expressions
For each element in the collection, evaluate a criterion or get a value:
.Where(table => table.Schema.Name == "Production")
.OrderBy(table => table.Name)
LINQ and Lambda expressions
You can name the element anything…
.Where(x => x.Schema.Name == "Production")
.OrderBy(x => x.Name)
LINQ and Lambda expressions
…but try to avoid confusing code
.Where(cat => cat.Schema.Name == "Production")
.OrderBy(cat => cat.Name)
LINQ and Lambda expressions
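The lambda slides above can be tried outside Biml. The following is a minimal sketch in plain C#, assuming two hypothetical stand-in classes (Schema and Table) in place of the real Biml nodes behind RootNode.Tables; the class LambdaDemo and its method name are also made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for Biml's schema and table nodes, so the
// example compiles without the Biml assemblies.
public class Schema { public string Name { get; set; } }
public class Table { public Schema Schema { get; set; } public string Name { get; set; } }

public static class LambdaDemo
{
    // Same chain as on the slides: keep tables in the "Production" schema,
    // sort them by name, then project to the table names.
    public static List<string> ProductionTableNames(IEnumerable<Table> tables)
    {
        return tables
            .Where(table => table.Schema.Name == "Production")
            .OrderBy(table => table.Name)
            .Select(table => table.Name)
            .ToList();
    }
}
```

Each lambda runs once per element: Where keeps the element when the expression is true, and OrderBy uses the returned value as the sort key.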
LINQ: Filter collections
Where()
Returns the filtered collection with all elements that meet the criteria
RootNode.Tables.Where(t => t.Schema.Name == "Production")
OfType()
Returns the filtered collection with all elements of the specified type
RootNode.Connections.OfType<AstExcelOleDbConnectionNode>()
LINQ: Sort collections
OrderBy()
Returns the collection sorted by key…
RootNode.Tables.OrderBy(t => t.Name)
ThenBy()
…then sorted by secondary key
RootNode.Tables.OrderBy(t => t.Schema.Name).ThenBy(t => t.Name)
LINQ: Sort collections
OrderByDescending()
Returns the collection sorted by key…
RootNode.Tables.OrderByDescending(t => t.Name)
ThenByDescending()
…then sorted by secondary key
RootNode.Tables.OrderBy(t => t.Schema.Name).ThenByDescending(t => t.Name)
LINQ: Sort collections
Reverse()
Returns the collection with its elements in reverse order (no sorting is applied)
RootNode.Tables.Reverse()
LINQ: Group collections
GroupBy()
Returns a collection of key-value pairs where each value is a new collection
RootNode.Tables.GroupBy(t => t.Schema.Name)
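As a rough illustration of what GroupBy returns, the sketch below counts tables per schema in plain C#. Schema, Table and GroupDemo are hypothetical stand-ins, not Biml types; each group exposes its key through Key and can itself be enumerated or aggregated.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for Biml's schema and table nodes.
public class Schema { public string Name { get; set; } }
public class Table { public Schema Schema { get; set; } public string Name { get; set; } }

public static class GroupDemo
{
    // Groups tables by schema name, then counts the tables in each group.
    public static Dictionary<string, int> TableCountPerSchema(IEnumerable<Table> tables)
    {
        return tables
            .GroupBy(t => t.Schema.Name)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```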
LINQ: Aggregate collections
Count()
Returns the number of elements in the collection
RootNode.Tables.Count()
RootNode.Tables.Count(t => t.Schema.Name == "Production")
LINQ: Aggregate collections
Sum()
Returns the sum of the (numeric) values in the collection
RootNode.Tables.Sum(t => t.Columns.Count)
Average()
Returns the average value of the (numeric) values in the collection
RootNode.Tables.Average(t => t.Columns.Count)
LINQ: Aggregate collections
Min()
Returns the minimum value of the (numeric) values in the collection
RootNode.Tables.Min(t => t.Columns.Count)
Max()
Returns the maximum value of the (numeric) values in the collection
RootNode.Tables.Max(t => t.Columns.Count)
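The aggregate methods above can be combined in one helper. This is a sketch only, assuming a hypothetical Table class whose Columns property is a plain list of column names rather than Biml column nodes; AggregateDemo and ColumnStats are illustrative names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml table node; Columns holds column names.
public class Table
{
    public string Name { get; set; }
    public List<string> Columns { get; set; } = new List<string>();
}

public static class AggregateDemo
{
    // Returns the table count plus sum, average, minimum and maximum of the
    // column counts. Average, Min and Max throw on an empty collection.
    public static (int Count, int Sum, double Average, int Min, int Max) ColumnStats(IEnumerable<Table> tables)
    {
        var list = tables.ToList(); // materialize once, since it is read several times
        return (
            list.Count(),
            list.Sum(t => t.Columns.Count),
            list.Average(t => t.Columns.Count),
            list.Min(t => t.Columns.Count),
            list.Max(t => t.Columns.Count));
    }
}
```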
LINQ: Check collections
All()
Returns true if all elements in the collection meet the criteria
RootNode.Databases.All(d => d.Name.StartsWith("A"))
Any()
Returns true if any element in the collection meets the criteria
RootNode.Databases.Any(d => d.Name.Contains("DW"))
LINQ: Check collections
Contains()
Returns true if collection contains element
RootNode.Databases.Contains(AdventureWorks2014)
LINQ: Get elements
First()
Returns the first element in the collection (that meets the criteria)
RootNode.Tables.First()
RootNode.Tables.First(t => t.Schema.Name == "Production")
FirstOrDefault()
Returns the first element in the collection (that meets the criteria), or the default value if there is none
RootNode.Tables.FirstOrDefault()
RootNode.Tables.FirstOrDefault(t => t.Schema.Name == "Production")
LINQ: Get elements
Last()
Returns the last element in the collection (that meets the criteria)
RootNode.Tables.Last()
RootNode.Tables.Last(t => t.Schema.Name == "Production")
LastOrDefault()
Returns the last element in the collection (that meets the criteria), or the default value if there is none
RootNode.Tables.LastOrDefault()
RootNode.Tables.LastOrDefault(t => t.Schema.Name == "Production")
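The practical difference between First() and FirstOrDefault() shows up when nothing matches. The sketch below uses hypothetical stand-in classes (Schema, Table, ElementDemo) rather than Biml types; the same behavior applies to Last() and LastOrDefault().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for Biml's schema and table nodes.
public class Schema { public string Name { get; set; } }
public class Table { public Schema Schema { get; set; } public string Name { get; set; } }

public static class ElementDemo
{
    // First() throws InvalidOperationException when no element matches.
    public static bool FirstThrows(IEnumerable<Table> tables, string schemaName)
    {
        try
        {
            tables.First(t => t.Schema.Name == schemaName);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // FirstOrDefault() returns the default value (null for classes) instead.
    public static string FirstNameOrFallback(IEnumerable<Table> tables, string schemaName, string fallback)
    {
        var match = tables.FirstOrDefault(t => t.Schema.Name == schemaName);
        return match == null ? fallback : match.Name;
    }
}
```

Use the OrDefault variants when an empty result is expected and check for the default value before using the element.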
LINQ: Get elements
ElementAt()
Returns the element in the collection at the specified index
RootNode.Tables.ElementAt(42)
ElementAtOrDefault()
Returns the element in the collection at the specified index, or the default value if the index is out of range
RootNode.Tables.ElementAtOrDefault(42)
LINQ: Project collections
Select()
Creates a new collection from one collection
A list of table names:
RootNode.Tables.Select(t => t.Name)
A list of table and schema names:
RootNode.Tables.Select(t => new {TableName = t.Name, SchemaName = t.Schema.Name})
LINQ: Project collections
SelectMany()
Creates a new collection from many collections and merges the collections
A list of all columns from all tables:
RootNode.Tables.SelectMany(t => t.Columns)
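To make the flattening concrete, here is a small sketch in plain C#. Table is a hypothetical stand-in whose Columns property holds column names; ProjectDemo and QualifiedColumnNames are illustrative names, not part of Biml.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml table node; Columns holds column names.
public class Table
{
    public string Name { get; set; }
    public List<string> Columns { get; set; } = new List<string>();
}

public static class ProjectDemo
{
    // The inner Select yields one list per table; SelectMany merges those
    // lists into a single flat list of "Table.Column" names.
    public static List<string> QualifiedColumnNames(IEnumerable<Table> tables)
    {
        return tables
            .SelectMany(t => t.Columns.Select(c => t.Name + "." + c))
            .ToList();
    }
}
```

With Select instead of SelectMany the result would be a collection of collections, one per table.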
C# Classes and Methods
BimlScript and LINQ not enough?
Need to reuse C# code?
Create your own classes and methods!
public static class HelperClass {
public static bool AnnotationTagExists(AstNode node, string tag) {
if (node.GetTag(tag) != "") {
return true;
} else {
return false;
}
}
}
C# Classes and Methods: From this…
public static class HelperClass {
public static bool AnnotationTagExists(AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
C# Classes and Methods: …to this
* For bools you can just use:
return (node.GetTag(tag) != "");
But in this example we'll use the verbose, SSIS-like syntax because it can be reused with other data types, like…
public static class HelperClass {
public static string AnnotationTagExists(AstNode node, string tag) {
return (node.GetTag(tag) != "") ? "Yes" : "No";
}
}
C# Classes and Methods: …or this
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables) { #>
<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>
...
<# } #>
<# } #>
</Biml>
C# Classes and Methods
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables) { #>
<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>
...
<# } #>
<# } #>
</Biml>
<#+
public static class HelperClass {
public static bool AnnotationTagExists(AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
#>
C# Classes and Methods: Inline
<#@ include file="HelperClass.biml" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables) { #>
<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>
...
<# } #>
<# } #>
</Biml>
C# Classes and Methods: Included Files
<#+
public static class HelperClass {
public static bool AnnotationTagExists(AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
#>
<#@ code file="..\Code\HelperClass.cs" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables) { #>
<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>
...
<# } #>
<# } #>
</Biml>
C# Classes and Methods: Code Files
public static class HelperClass {
public static bool AnnotationTagExists(AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
<#@ code file="..\Code\HelperClass.cs" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables) { #>
<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>
...
<# } #>
<# } #>
</Biml>
Extension Methods: From this…
public static class HelperClass {
public static bool AnnotationTagExists(AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
<#@ code file="..\Code\HelperClass.cs" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables) { #>
<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>
...
<# } #>
<# } #>
</Biml>
Extension Methods: …to this
public static class HelperClass {
public static bool AnnotationTagExists(this AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
<#@ code file="..\Code\HelperClass.cs" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables) { #>
<# if (table.AnnotationTagExists("SourceSchema")) { #>
...
<# } #>
<# } #>
</Biml>
Extension Methods: …to this
public static class HelperClass {
public static bool AnnotationTagExists(this AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
<#@ code file="..\Code\HelperClass.cs" #>
<Biml xmlns="http://schemas.varigence.com/biml.xsd">
<# foreach (var table in RootNode.Tables.Where(t =>
t.AnnotationTagExists("SourceSchema"))) { #>
...
<# } #>
</Biml>
Extension Methods: …to this
public static class HelperClass {
public static bool AnnotationTagExists(this AstNode node, string tag) {
return (node.GetTag(tag) != "") ? true : false;
}
}
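The extension method pattern above can be tested without Biml. The sketch below assumes a hypothetical FakeNode class in place of AstNode; its GetTag returning an empty string for a missing tag is an assumption that mirrors the check on the slides, not a statement about the Biml API. ExtensionDemo is likewise made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml node. GetTag returning "" for a missing
// tag is an assumption that mirrors the check used on the slides.
public class FakeNode
{
    public string Name { get; set; }
    public Dictionary<string, string> Tags { get; } = new Dictionary<string, string>();
    public string GetTag(string tag)
    {
        return Tags.TryGetValue(tag, out var value) ? value : "";
    }
}

public static class HelperClass
{
    // "this" on the first parameter turns the static method into an
    // extension method, callable as node.AnnotationTagExists(tag).
    public static bool AnnotationTagExists(this FakeNode node, string tag)
    {
        return node.GetTag(tag) != "";
    }
}

public static class ExtensionDemo
{
    // The extension method reads naturally inside a LINQ filter.
    public static List<string> NamesWithTag(IEnumerable<FakeNode> nodes, string tag)
    {
        return nodes.Where(n => n.AnnotationTagExists(tag)).Select(n => n.Name).ToList();
    }
}
```

The static call HelperClass.AnnotationTagExists(node, tag) still works; the extension syntax is only a more readable way to write the same call.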
Lesson learned:
Start small, start simple
If you try to create a complete and perfect metadata model from day 1…
Metadata Models
Entities
Relationships
Properties
Metadata Modeling in Mist
Metadata Instances
Data Items
Properties
<#@ target type="Package" mergemode="LocalMerge" #>
<Node>
<Variables>
<Variable Name="NewRows" DataType="Int32">0</Variable>
</Variables>
</Node>
Transformers syntax
<#@ target type="Package" mergemode="LocalMerge" #>
<Node>
<Variables>
<Variable Name="NewRows" DataType="Int32">0</Variable>
</Variables>
</Node>
Transformers syntax: Target Type
<#@ target type="Package" mergemode="LocalMerge" #>
<Node>
<Variables>
<Variable Name="NewRows" DataType="Int32">0</Variable>
</Variables>
</Node>
Transformers syntax: Merge Mode
LocalMerge (update object)
Keep original object and merge changes
LocalReplace (replace object)
Replace original object with new object(s)
LocalMergeAndTypeReplace (convert object)
Replace original object, but copy shared properties
Merge Modes
@cathrinew
cathrinewilhelmsen.net
no.linkedin.com/in/cathrinewilhelmsen
slideshare.net/cathrinewilhelmsen
Biml resources and references:
cathrinewilhelmsen.net/biml