s.m.a.r.t. biml - standardize, model, automate, reuse and transform (sqlsaturday oregon)

S.M.A.R.T. Biml Cathrine Wilhelmsen October 24th 2015

Upload: cathrine-wilhelmsen

Post on 15-Apr-2017

Have you ever wanted to build a Data Warehouse simply by pushing a button? It might not be quite that easy yet, but gone are the days of repetitive development. Stop wasting your time on dragging, dropping, connecting, aligning and creating the same SSIS package over and over and over again and start working S.M.A.R.T. with Biml.

You already know how to build a staging environment in an hour, so let us dive straight into some advanced features of Biml. We will start by looking at how to create our own C# classes and methods, and how to centralize and reuse code. Then we will explore the metadata modeling feature in Mist. Finally, we will create a framework of transformers that allow you to modify existing objects both interactively and automatically.

If you already think Biml is powerful, just wait until you have a toolbox full of transformers ready to do the heavy lifting for you!

(No Autobots were harmed in the making of this session.)

Session Description

Cathrine Wilhelmsen

@cathrinew

cathrinewilhelmsen.net

Data Warehouse Architect

Business Intelligence Developer

Know basic Biml and BimlScript

Completed BimlScript.com lessons

Have created a staging environment

You…

…?

S.M.A.R.T. Biml

Standardize • Model • Automate • Reuse • Transform

C# Classes and Methods

Metadata Modeling

Transformers and Frameworks

Quick Recap

Biml Tools

Biml: flat XML, "just text"

BimlScript: automate, control and manipulate Biml with C#

Biml vs. BimlScript

How does it work?

Yes, but how does it work?

Yes, but how does it actually work?

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<Packages>

<# foreach (var table in RootNode.Tables) { #>

<Package Name="Load<#=table.Name#>"></Package>

<# } #>

</Packages>

</Biml>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<Packages>

<Package Name="LoadCustomer"></Package>

<Package Name="LoadProduct"></Package>

<Package Name="LoadSales"></Package>

</Packages>

</Biml>
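To make the expansion step concrete, here is a minimal C# sketch of what the loop above amounts to: the text outside the code nuggets is copied as-is, and one Package element is emitted per table. The class and method names (ExpansionSketch, ExpandPackages) are hypothetical stand-ins, and the table names are passed in as plain strings instead of coming from RootNode.Tables. This is not how the Biml compiler is implemented.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the expansion step; not part of the Biml API.
public static class ExpansionSketch
{
    // Emits one <Package> element per table name, mirroring the
    // foreach code nugget in the BimlScript above.
    public static string ExpandPackages(IEnumerable<string> tableNames)
    {
        var biml = new StringBuilder();
        biml.AppendLine("<Biml xmlns=\"http://schemas.varigence.com/biml.xsd\">");
        biml.AppendLine("<Packages>");
        foreach (var name in tableNames)
        {
            // The <#=table.Name#> nugget is replaced by the table's name.
            biml.AppendLine($"<Package Name=\"Load{name}\"></Package>");
        }
        biml.AppendLine("</Packages>");
        biml.AppendLine("</Biml>");
        return biml.ToString();
    }
}
```

Calling ExpansionSketch.ExpandPackages(new[] { "Customer", "Product", "Sales" }) returns flat Biml with the LoadCustomer, LoadProduct and LoadSales packages.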

Don't Repeat Yourself

Move common code to separate files

Centralize and reuse in many projects

Update code once for all projects

Don't Repeat Yourself

Solve logical dependencies and simulate manual workflows by using tiers

Tiers instruct the BimlCompiler to compile files from lowest to highest tier

<#@ template tier="1" #>

Higher tiers can use and might depend on objects from lower tiers

Example:

Tier 1 - Create database connections

Tier 2 - Create loading packages

Tier 3 - Create master package to execute loading packages
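Conceptually, compiling from lowest to highest tier is just an ordering of the files. The C# sketch below illustrates that ordering only. BimlFile and TierSketch are hypothetical names made up for this example, and the real compiler also expands each file and makes its objects available to higher tiers.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a Biml file and its tier; not a Biml API type.
public record BimlFile(string Name, int Tier);

public static class TierSketch
{
    // Returns the file names lowest tier first.
    // OrderBy is a stable sort, so files in the same tier keep their input order.
    public static List<string> CompileOrder(IEnumerable<BimlFile> files) =>
        files.OrderBy(f => f.Tier).Select(f => f.Name).ToList();
}
```

With files for tier 3 (master package), tier 1 (connections) and tier 2 (loading packages) passed in that order, CompileOrder returns connections, loading packages, master package.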

Split and Combine Biml Files

Include common code in multiple files and projects

Can include many file types: .biml .txt .sql .cs

Use the include directive

<#@ include file="CommonCode.biml" #>

The include directive will be replaced by the content of the included file

Include pulls code from the included file into the main file

Include Files

Works like a parameterized include

File to be called (callee) specifies the input parameters it accepts

<#@ property name="Table" type="AstTableNode" #>

File that calls (caller) passes input parameters

<#=CallBimlScript("CommonCode.biml", Table)#>

CallBimlScript pushes parameters from the caller to the callee, and the callee returns code

CallBimlScript with Parameters

BIDS Helper vs. Mist

BIDS Helper: "Black Box"

Only SSIS packages visible

Save Biml to file for debugging

Mist: Visual IDE

All in-memory objects visible

Preview Expanded BimlScript

C# Classes and Methods

One language to query:

SQL Server Databases

XML Documents

Datasets

Collections

LINQ (Language-Integrated Query)

Two ways to write queries:

SQL-like Syntax

Extension Methods

LINQ Extension Methods

Sort: OrderBy, ThenBy

Filter: Where, OfType

Group: GroupBy

Aggregate: Count, Sum

Check Collections: All, Any, Contains

Get Elements: First, Last, ElementAt

Project Collections: Select, SelectMany

…and many, many more!

var numConnections = RootNode.Connections.Count();

foreach (var table in RootNode.Tables.Where(…))

if (RootNode.Packages.Any(…))

LINQ Extension Methods

Use lambda expressions to filter or specify values:

.Where(table => table.Schema.Name == "Production")

.OrderBy(table => table.Name)

For each element in the collection, the lambda evaluates a criterion or gets a value.

You can name the element anything…

.Where(x => x.Schema.Name == "Production")

.OrderBy(x => x.Name)

…but try to avoid confusing code:

.Where(cat => cat.Schema.Name == "Production")

.OrderBy(cat => cat.Name)

LINQ and Lambda expressions
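A minimal, self-contained sketch of the same Where and OrderBy calls outside of Biml. The Table record is a hypothetical stand-in for a Biml table node (the real AstTableNode exposes Schema.Name; here the schema is a plain string), so this can be run in any C# project.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml table node; the schema is a plain string here.
public record Table(string Schema, string Name);

public static class LambdaSketch
{
    public static List<string> ProductionTableNames(IEnumerable<Table> tables) =>
        tables
            .Where(table => table.Schema == "Production") // criterion: keep matching elements
            .OrderBy(table => table.Name)                 // value: the key to sort by
            .Select(table => table.Name)                  // value: what to return per element
            .ToList();
}
```

The lambda parameter (table) is only a local name for "the current element", which is why it can be renamed freely.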

LINQ: Filter collections

Where()

Returns the filtered collection with all elements that meet the criteria

RootNode.Tables.Where(t => t.Schema.Name == "Production")

OfType()

Returns the filtered collection with all elements of the specified type

RootNode.Connections.OfType<AstExcelOleDbConnectionNode>()

LINQ: Sort collections

OrderBy()

Returns the collection sorted by key…

RootNode.Tables.OrderBy(t => t.Name)

ThenBy()

…then sorted by secondary key

RootNode.Tables.OrderBy(t => t.Schema.Name).ThenBy(t => t.Name)

OrderByDescending()

Returns the collection sorted by key…

RootNode.Tables.OrderByDescending(t => t.Name)

ThenByDescending()

…then sorted by secondary key

RootNode.Tables.OrderBy(t => t.Schema.Name).ThenByDescending(t => t.Name)

Reverse()

Returns the collection with its elements in reverse order (it inverts the current order; it does not sort)

RootNode.Tables.Reverse()

LINQ: Group collections

GroupBy()

Returns a collection of groups, where each group has a key and contains the elements that share that key

RootNode.Tables.GroupBy(t => t.Schema.Name)
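As a sketch of how the grouped result can be consumed, the example below counts tables per schema. Each group exposes its key through Key and can itself be enumerated or aggregated. The Table record and the GroupBySketch class are hypothetical stand-ins so the example runs without Biml.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml table node; the schema is a plain string here.
public record Table(string Schema, string Name);

public static class GroupBySketch
{
    // Builds a lookup from schema name to the number of tables in that schema.
    public static Dictionary<string, int> TableCountPerSchema(IEnumerable<Table> tables) =>
        tables
            .GroupBy(t => t.Schema)                    // one group per distinct schema name
            .ToDictionary(g => g.Key, g => g.Count()); // g.Key is the schema, g holds its tables
}
```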

LINQ: Aggregate collections

Count()

Returns the number of elements in the collection

RootNode.Tables.Count()

RootNode.Tables.Count(t => t.Schema.Name == "Production")

Sum()

Returns the sum of the (numeric) values in the collection

RootNode.Tables.Sum(t => t.Columns.Count)

Average()

Returns the average value of the (numeric) values in the collection

RootNode.Tables.Average(t => t.Columns.Count)

Min()

Returns the minimum value of the (numeric) values in the collection

RootNode.Tables.Min(t => t.Columns.Count)

Max()

Returns the maximum value of the (numeric) values in the collection

RootNode.Tables.Max(t => t.Columns.Count)

LINQ: Check collections

All()

Returns true if all elements in the collection meet the criteria

RootNode.Databases.All(d => d.Name.StartsWith("A"))

Any()

Returns true if any element in the collection meets the criteria

RootNode.Databases.Any(d => d.Name.Contains("DW"))

Contains()

Returns true if the collection contains the specified element

RootNode.Databases.Contains(AdventureWorks2014)
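The aggregate and check methods can be tried out on plain objects. In this sketch, Table, Summary and AggregateSketch are hypothetical names, and the column count is stored directly on the table instead of being read from t.Columns.Count.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml table node, reduced to a name and a column count.
public record Table(string Name, int ColumnCount);

// Result holder for the sketch below.
public record Summary(int Tables, int Columns, double AverageColumns,
                      int MinColumns, int MaxColumns,
                      bool AllHaveColumns, bool AnyStaging);

public static class AggregateSketch
{
    // Average, Min and Max throw InvalidOperationException on an empty
    // sequence, so this sketch expects at least one table.
    public static Summary Summarize(IEnumerable<Table> tables) =>
        new Summary(
            tables.Count(),
            tables.Sum(t => t.ColumnCount),
            tables.Average(t => t.ColumnCount),
            tables.Min(t => t.ColumnCount),
            tables.Max(t => t.ColumnCount),
            tables.All(t => t.ColumnCount > 0),
            tables.Any(t => t.Name.StartsWith("Staging")));
}
```

One thing to watch for: Count and Sum return 0 for an empty collection, and All returns true, so an "all tables are valid" check passes when there are no tables at all.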

LINQ: Get elements

First()

Returns the first element in the collection (that meets the criteria)

RootNode.Tables.First()

RootNode.Tables.First(t => t.Schema.Name == "Production")

FirstOrDefault()

Returns the first element in the collection (that meets the criteria), or the default value if there is none

RootNode.Tables.FirstOrDefault()

RootNode.Tables.FirstOrDefault(t => t.Schema.Name == "Production")

Last()

Returns the last element in the collection (that meets the criteria)

RootNode.Tables.Last()

RootNode.Tables.Last(t => t.Schema.Name == "Production")

LastOrDefault()

Returns the last element in the collection (that meets the criteria), or the default value if there is none

RootNode.Tables.LastOrDefault()

RootNode.Tables.LastOrDefault(t => t.Schema.Name == "Production")

ElementAt()

Returns the element in the collection at the specified index

RootNode.Tables.ElementAt(42)

ElementAtOrDefault()

Returns the element at the specified index, or the default value if the index is out of range

RootNode.Tables.ElementAtOrDefault(42)
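The difference between the plain methods and their OrDefault variants shows up when nothing matches. First, Last and ElementAt throw an exception, while the OrDefault variants return the default value, which is null for reference types such as table nodes. A minimal sketch with a hypothetical Table record standing in for a Biml table node:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml table node; the schema is a plain string here.
public record Table(string Schema, string Name);

public static class GetElementSketch
{
    // FirstOrDefault returns null (the default for a reference type) when nothing matches.
    public static string FirstTableNameOrFallback(IEnumerable<Table> tables, string schema, string fallback)
    {
        var match = tables.FirstOrDefault(t => t.Schema == schema);
        return match != null ? match.Name : fallback;
    }

    // ElementAtOrDefault returns null when the index is out of range.
    public static string TableNameAtOrFallback(IEnumerable<Table> tables, int index, string fallback)
    {
        var match = tables.ElementAtOrDefault(index);
        return match != null ? match.Name : fallback;
    }

    // First throws InvalidOperationException when nothing matches.
    public static bool FirstThrowsWhenMissing(IEnumerable<Table> tables, string schema)
    {
        try
        {
            tables.First(t => t.Schema == schema);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

In BimlScript this usually means: use the OrDefault variant and check for null when an object might not exist, and use the plain method when a missing object should stop the build.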

LINQ: Project collections

Select()

Creates a new collection from one collection

A list of table names:

RootNode.Tables.Select(t => t.Name)

A list of table and schema names:

RootNode.Tables.Select(t => new { TableName = t.Name, SchemaName = t.Schema.Name })

SelectMany()

Creates a new collection from many collections and merges the collections

A list of all columns from all tables:

RootNode.Tables.SelectMany(t => t.Columns)
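A small sketch of both projections on plain objects. Table and ProjectionSketch are hypothetical names, and columns are plain strings instead of Biml column nodes. Note that the members of an anonymous type need distinct names, so the schema name and table name are named explicitly here.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a Biml table node with a list of column names.
public record Table(string Schema, string Name, List<string> Columns);

public static class ProjectionSketch
{
    // Select: one output element per input element.
    public static List<string> QualifiedNames(IEnumerable<Table> tables) =>
        tables
            .Select(t => new { SchemaName = t.Schema, TableName = t.Name })
            .Select(x => $"{x.SchemaName}.{x.TableName}")
            .ToList();

    // SelectMany: each input element yields a collection, and the results are flattened into one.
    public static List<string> AllColumns(IEnumerable<Table> tables) =>
        tables.SelectMany(t => t.Columns).ToList();
}
```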

C# Classes and Methods

BimlScript and LINQ not enough?

Need to reuse C# code?

Create your own classes and methods!

public static class HelperClass {

public static bool AnnotationTagExists(AstNode node, string tag) {

if (node.GetTag(tag) != "") {

return true;

} else {

return false;

}

}

}

C# Classes and Methods: From this…

public static class HelperClass {

public static bool AnnotationTagExists(AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}

C# Classes and Methods: …to this

* For bools you can just use: return (node.GetTag(tag) != "");

But in this example we'll use the verbose, SSIS-like syntax because it can be reused with other data types, like…

public static class HelperClass {

public static string AnnotationTagExists(AstNode node, string tag) {

return (node.GetTag(tag) != "") ? "Yes" : "No";

}

}

C# Classes and Methods: …or this

C# Classes and Methods

Inline code blocks

Included Biml files with code blocks

Reference code files

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<# foreach (var table in RootNode.Tables) { #>

<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>

...

<# } #>

<# } #>

</Biml>

<#+

public static class HelperClass {

public static bool AnnotationTagExists(AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}

#>

C# Classes and Methods: Inline

<#@ include file="HelperClass.biml" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<# foreach (var table in RootNode.Tables) { #>

<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>

...

<# } #>

<# } #>

</Biml>

C# Classes and Methods: Included Files

<#+

public static class HelperClass {

public static bool AnnotationTagExists(AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}

#>

<#@ code file="..\Code\HelperClass.cs" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<# foreach (var table in RootNode.Tables) { #>

<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>

...

<# } #>

<# } #>

</Biml>

C# Classes and Methods: Code Files

public static class HelperClass {

public static bool AnnotationTagExists(AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}

"Make it look like the method belongs to an object instead of a helper class"

Extension Methods

<#@ code file="..\Code\HelperClass.cs" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<# foreach (var table in RootNode.Tables) { #>

<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>

...

<# } #>

<# } #>

</Biml>

Extension Methods: From this…

public static class HelperClass {

public static bool AnnotationTagExists(AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}

<#@ code file="..\Code\HelperClass.cs" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<# foreach (var table in RootNode.Tables) { #>

<# if (HelperClass.AnnotationTagExists(table, "SourceSchema")) { #>

...

<# } #>

<# } #>

</Biml>

Extension Methods: …to this

public static class HelperClass {

public static bool AnnotationTagExists(this AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}

<#@ code file="..\Code\HelperClass.cs" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<# foreach (var table in RootNode.Tables) { #>

<# if (table.AnnotationTagExists("SourceSchema")) { #>

...

<# } #>

<# } #>

</Biml>

Extension Methods: …to this

public static class HelperClass {

public static bool AnnotationTagExists(this AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}

<#@ code file="..\Code\HelperClass.cs" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

<# foreach (var table in RootNode.Tables.Where(t =>

t.AnnotationTagExists("SourceSchema"))) { #>

...

<# } #>

</Biml>

Extension Methods: …to this

public static class HelperClass {

public static bool AnnotationTagExists(this AstNode node, string tag) {

return (node.GetTag(tag) != "") ? true : false;

}

}
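To try the extension method pattern outside of Biml, here is a self-contained sketch. The Node class is a hypothetical stand-in for AstNode with a very simple tag store, and it assumes that a missing tag yields an empty string, which is what the check in the slides relies on. The only change that turns the helper into an extension method is the this keyword on the first parameter.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a Biml node with annotation tags; not the real AstNode.
public class Node
{
    private readonly Dictionary<string, string> tags = new Dictionary<string, string>();

    public void SetTag(string tag, string value) => tags[tag] = value;

    // Assumption for this sketch: a missing tag yields "".
    public string GetTag(string tag) => tags.TryGetValue(tag, out var value) ? value : "";
}

// Extension methods must live in a static, non-nested class.
public static class NodeExtensions
{
    // "this" on the first parameter lets callers write node.AnnotationTagExists(...).
    public static bool AnnotationTagExists(this Node node, string tag) =>
        node.GetTag(tag) != "";
}
```

Both call styles compile to the same thing: node.AnnotationTagExists("SourceSchema") and NodeExtensions.AnnotationTagExists(node, "SourceSchema").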

C# Classes and Methods

Metadata Modeling

Lesson learned:

Start small, start simple

If you try to create a complete and perfect metadata model from day 1…

…there is a risk you might not get anything done

Metadata Models

Entities

Relationships

Properties

Metadata Modeling in Mist

Metadata Instances

Data Items

Properties

Metadata Modeling

Transformers and Frameworks

Transformers: Modify existing Biml

<#@ target type="Package" mergemode="LocalMerge" #>

<Node>

<Variables>

<Variable Name="NewRows" DataType="Int32">0</Variable>

</Variables>

</Node>

Transformers syntax

Target Type: type="Package" in the target directive selects which kind of object the transformer modifies

Merge Mode: mergemode="LocalMerge" in the target directive controls how the changes are applied to the original object

LocalMerge (update object)

Keep original object and merge changes

LocalReplace (replace object)

Replace original object with new object(s)

LocalMergeAndTypeReplace (convert object)

Replace original object, but copy shared properties

Merge Modes

Transformers and Frameworks

Get things done

Start small

Start simple

Start with ugly code

Keep going

Expand

Improve

Deliver often

@cathrinew

cathrinewilhelmsen.net

no.linkedin.com/in/cathrinewilhelmsen

slideshare.net/cathrinewilhelmsen

Biml resources and references:

cathrinewilhelmsen.net/biml