TRANSCRIPT
Using Quadstone’s Data Build Manager
Thursday, September 15, 2005: 9am Pacific, 12pm Eastern, 5pm UK/Ireland
Friday, September 16, 2005: 2pm UK/Ireland, 3pm Central European, 9am Eastern US
Please join the teleconference call now; if you have any difficulty, contact [email protected].
© 2005 Quadstone
How to ask questions
Use Q&A (not Chat please):
• Click on the Q&A Panel icon at the bottom-right of your screen:
• Type in your question:
Using Data Build Manager
• Presenter: Patrick Surry, VP Customer Services
• Overview: The Data Build Manager (also known as qsbuild) is a powerful tool to manage all of the interdependent steps in real-world data-preparation, including parameterization for automated scheduling and the ability to run tasks in parallel.
• Audience: Existing Quadstone data architects, looking to improve the processing speed and their productivity in creating customer analysis datasets.
• Format:
  • A live demo with slides for sign-posting
  • Downloadable exercises in the form of a workbook and dataset
• Duration: 1 hour, including Q&A
A simple data preparation process
[Diagram labels: Transaction data; Customer data; Customer IDs; Measurement table (to be filled); steps SORT, SORT, MEASURE, JOIN, DERIVE, DERIVE]
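As a rough illustration of the sort, measure, join and derive steps named on this slide, here is a minimal Python sketch on toy data. It is not Quadstone code: the field names, the sample records and the derived field are all invented for illustration, and the real tools operate on foci rather than Python lists.

```python
from collections import defaultdict

# Hypothetical toy inputs; the field names are illustrative only.
transactions = [
    {"customer_id": 2, "amount": 30.0},
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 15.0},
]
customers = [
    {"customer_id": 2, "age": 41},
    {"customer_id": 1, "age": 29},
    {"customer_id": 3, "age": 55},
]

# SORT: order both inputs by the customer key.
transactions_sorted = sorted(transactions, key=lambda r: r["customer_id"])
customers_sorted = sorted(customers, key=lambda r: r["customer_id"])

# MEASURE: aggregate transactions to one row per customer.
measures = defaultdict(lambda: {"n_trans": 0, "total_amount": 0.0})
for row in transactions_sorted:
    m = measures[row["customer_id"]]
    m["n_trans"] += 1
    m["total_amount"] += row["amount"]

# JOIN: attach the measures to each customer
# (customers with no transactions get zeros).
joined = []
for cust in customers_sorted:
    m = measures.get(cust["customer_id"], {"n_trans": 0, "total_amount": 0.0})
    joined.append({**cust, **m})

# DERIVE: compute a new field from existing ones.
for row in joined:
    row["avg_amount"] = row["total_amount"] / row["n_trans"] if row["n_trans"] else 0.0

for row in joined:
    print(row)
```

Each step consumes the output of the previous one, which is the kind of interdependency a build plan has to manage.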
Quadstone data preparation tools
• Efficient modular utilities operating primarily on foci
• Run via Quadstone System Explorer, the command line, or an XML build plan
[Diagram labels: sources and destinations (RDBMS1, RDBMS2, flat files, third-party); FOCUS; operations (Measure, Sort, Join, Enhance, etc.); XML Build Plan]
Data-build commands
[Diagram: qsbuild and the data-build commands, grouped by function around FOCUS]
• COMBINING: qsjoin, qsappendfields, qsmerge
• IMPORTING: qsdbaccess, qsimportdb, qsgenfdd, qsimportflat, qsimportstat, qsimportfocus
• REPORTING: qsdescribe, qsdescribestat, qsaudit, qsdtsnapshot, qsscsnapshot, qsxt, qsxt2spec, qsmapgen, [qsinfo]
• EXPORTING: qsdbcreatetable, qsdbinsert, qsdbupdate, qsexportflat, qsexportstat
• MANAGING: qscopy, qslink, qsmove, qsremove, [qsremoveflat], qstml
• TRANSFORMING: qssort, qsrenamefields, qsselect, qsderive, qsmeasure, qstrack
• ENHANCING: qsimportmetadata, qsupdate, [qsinterp], [qsexportmetadata]
See Quadstone System data-build command and TML reference
What does Data Build Manager do?
• Flexible environment for implementing data-builds
  • Keep simple builds simple but support advanced requirements
  • XML build plan; qsbuild DBC, point & click build execution
• Key features:
  • Simple & robust – simple structure with many different tasks
  • Complete – everything in one place, including inline TML/FDL/SQL (if desired), and/or non-Quadstone tasks
  • Modular & portable – structure, reuse and move builds easily
  • Parameters – no code changes for similar builds
  • Incremental builds – failure recovery, only do what’s needed
  • Concurrency – run multiple jobs at the same time
  • Logging – various ways to track build status and performance
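To make the concurrency and logging features concrete, here is a small Python sketch of running two independent jobs at the same time and reporting how long each took. It only illustrates the general idea: the job names and durations are made up, and nothing here reflects how qsbuild itself schedules or logs work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(name, seconds):
    """Stand-in for an independent build step (hypothetical)."""
    start = time.perf_counter()
    time.sleep(seconds)  # pretend to do some work
    return name, time.perf_counter() - start


# Two jobs that do not depend on each other, so they can overlap.
jobs = [("sort_transactions", 0.2), ("sort_customers", 0.2)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:
    # pool.map returns results in the same order as the inputs.
    results = list(pool.map(lambda job: run_job(*job), jobs))
elapsed = time.perf_counter() - start

# Simple performance log: per-job duration and total wall-clock time.
for name, duration in results:
    print(f"{name}: {duration:.2f}s")
print(f"total wall time: {elapsed:.2f}s")
```

Because the two jobs overlap, the total wall time should be close to the duration of one job rather than the sum of both.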
How do I launch it?
• Double-click a .qsb file (Build Plan) in the Quadstone System Explorer (QSE)
What does a build plan look like?
• Right-click a .qsb file in the QSE and choose View or Edit
What is XML?
1. XML is for structuring data
Structured data includes things like spreadsheets, address books, configuration parameters, financial transactions, and technical drawings. … XML makes it easy for a computer to generate data, read data, and ensure that the data structure is unambiguous. …
2. XML looks a bit like HTML
Like HTML, XML makes use of tags (words bracketed by '<' and '>') and attributes (of the form name="value"). While HTML specifies what each tag and attribute means, and often how the text between them will look in a browser, XML uses the tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it. …
3. XML is text, but isn't meant to be read
… One advantage of a text format is that it allows people, if necessary, to look at the data without the program that produced it; in a pinch, you can read a text format with your favorite text editor. Text formats also allow developers to more easily debug applications. Like HTML, XML files are text files that people shouldn't have to read, but may when the need arises. …
Borrowed from: http://www.w3.org/XML/1999/XML-in-10-points
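To show what "tags delimit the data, the application interprets it" means in practice, here is a short Python sketch that reads a tiny XML document with the standard library. The document is a made-up address book (one of the examples of structured data mentioned above); its element and attribute names are invented and have nothing to do with the build plan schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: tags (<contact>) and attributes (name="...") only mark up
# the data; what a "contact" means is up to the program reading it.
doc = """
<addressbook>
  <contact name="Ada" city="London"/>
  <contact name="Grace" city="New York"/>
</addressbook>
"""

root = ET.fromstring(doc)

# The application decides how to interpret each element and attribute.
contacts = [(c.get("name"), c.get("city")) for c in root.findall("contact")]

for name, city in contacts:
    print(f"{name} lives in {city}")
```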
How can I change it?
• Extend a target with new steps (tasks)
  • Cut & paste examples from documentation
  • Cut & paste from command-line or focus history
• Create new targets for logical separation
  • Note build’s default target (and initial, final); dependencies
  • Nest targets if desired
• Increase efficiency
  • Conditional execution with ‘unless’ to avoid rework
  • Temporary outputs to avoid clutter
  • Inline or external scripts (TML, FDL, SQL, …)
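The ideas of target dependencies and conditional execution can be sketched in a few lines of Python. This is a conceptual model only, not the qsbuild implementation: the target names and output files are hypothetical, and the "unless" condition is modelled here as "skip the step if its output file already exists", which is an assumption about one common way to avoid rework.

```python
import os
import tempfile


def run_target(name, targets, done, log):
    """Run a target's dependencies first, then the target itself,
    unless its output already exists."""
    if name in done:
        return
    target = targets[name]
    for dep in target["depends"]:
        run_target(dep, targets, done, log)
    if os.path.exists(target["output"]):
        log.append(f"skip {name}")  # output is already there: no rework
    else:
        with open(target["output"], "w") as out:
            out.write(name)  # stand-in for the real work
        log.append(f"run {name}")
    done.add(name)


workdir = tempfile.mkdtemp()
# Hypothetical targets, each with dependencies and an output file.
targets = {
    "sort": {"depends": [], "output": os.path.join(workdir, "sorted.out")},
    "measure": {"depends": ["sort"], "output": os.path.join(workdir, "measured.out")},
    "join": {"depends": ["measure"], "output": os.path.join(workdir, "joined.out")},
}

first_log, second_log = [], []
run_target("join", targets, set(), first_log)   # first build: everything runs
run_target("join", targets, set(), second_log)  # rebuild: everything is skipped

print(first_log)
print(second_log)
```

Asking for the final target pulls in its dependencies in order, and a second run does no work because every output is already present. The same mechanism gives failure recovery: after fixing a failed step, only the missing outputs are rebuilt.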
Making it reusable
• Use properties to avoid repetitive changes
  • Like variables but can’t change once set
  • Tasks to set and manipulate in many ways
• Parameters are user-visible properties
  • E.g. User selects build snapshot date
  • E.g. User selects full or sample datasets
• Example:
  • Parameterize build with target month
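A minimal Python sketch of the "set once" behaviour of properties, and of a user-supplied parameter taking precedence over a default, might look like this. The class, the property names and the naming template are all hypothetical. Silently ignoring a later assignment follows the convention used by Ant; the slides only say a property cannot change once set, so whether qsbuild ignores or rejects a second assignment is an assumption here.

```python
class Properties:
    """Set-once name/value store: the first value assigned to a name wins."""

    def __init__(self, overrides=None):
        # Values supplied by the user (parameters) are set before anything else.
        self._values = dict(overrides or {})

    def set(self, name, value):
        # setdefault leaves an existing value untouched.
        self._values.setdefault(name, value)

    def get(self, name):
        return self._values[name]

    def expand(self, template):
        # Substitute {name} placeholders with property values.
        return template.format(**self._values)


# The user chose a target month when launching the build.
props = Properties({"month": "2005-08"})

# Defaults written in the build: "month" is already set, so this is ignored.
props.set("month", "2005-09")
props.set("sample", "full")

output_name = props.expand("customers_{month}_{sample}")
print(output_name)
```

The same build definition then serves any month: only the parameter value changes, not the build itself.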
More flexible ways of editing
• Some editors know XML (& schema!): very helpful
• See documentation for how to set it up, e.g. jEdit
Going further
• RTFM – good overview of capabilities
• Concurrency
• Default values for common attributes
• Date manipulation via qsdateproperty, e.g. today
• Debugging techniques
• Running from the command-line
• Good practices: tips & traps
• Other resources on XML, Ant, etc
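As an illustration of the kind of date manipulation a scheduled build typically needs, the Python sketch below derives a few date values from "today". It does not show qsdateproperty itself, whose attributes are not covered in these slides; the function and property names are hypothetical, and using the previous calendar month as the target month is an assumption.

```python
from datetime import date, timedelta


def month_properties(today):
    """Derive date values for a monthly build from a given 'today'."""
    first_of_this_month = today.replace(day=1)
    # The day before the 1st is the last day of the previous month.
    last_of_prev_month = first_of_this_month - timedelta(days=1)
    return {
        "today": today.isoformat(),
        "target_month": last_of_prev_month.strftime("%Y-%m"),
        "snapshot_date": last_of_prev_month.isoformat(),
    }


# A fixed date keeps the example reproducible; a scheduled run would
# pass date.today() instead.
props = month_properties(date(2005, 9, 15))
print(props)
```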
Where to get help
• Start>All Programs>Quadstone
  • Using Data Build Manager
• Also see: Data-build command and TML reference; Data-build command Tutorial
• Latest at support.quadstone.com/documentation
• Quadstone System Support:
  • Web Site: support.quadstone.com/
  • Email: [email protected]
  • Tel: US 1-800-335-3860; UK 0131 240 3140; All +44 131 240 3140
After the webinar
• These slides, a workbook and data are available via www.quadstone.com/training/webinars/
• Audio and video recordings of this webinar are available via the same site
• Any problems or questions, please contact [email protected]
• For more in-depth training (our ½-day Automating Data Preparation course), contact [email protected]
Questions and answers
Upcoming webinars
Ideas:
• Based around a real-life scenario (possibly Uplift Analysis)?
• Decision trees, scorecards and quality measures: the gory math internals?
• What's new in 5.2
• Cluster Builder?
• More on TML?
• Re-run previous webinars
See www.quadstone.com/training/webinars/.
If there’s a webinar topic you’d like to see, please let us know via [email protected].