Handling Large Datasets by Using Cross Tables “When Turning 11 million rows into 1 billion can be a good thing”


Page 1

Handling Large Datasets by Using Cross Tables
“When Turning 11 million rows into 1 billion can be a good thing”

Page 2

Requirements

• Query and filter rows from a dataset of 550 columns and 11 million rows, and display them in a table view (straight table)

• Customise the table view by selecting specific columns

• Search for specific unique IDs by pasting the ID(s) into the application

Page 3

What’s the challenge?

• Users wanted to query very wide datasets with no aggregation expressions – we simply need to expose the individual cell values in table form (everything at the transactional level).

• Having hundreds of conditional expressions on a straight table degrades performance and user experience, even though the expressions are not aggregations.

• Each chart expression has up to three parts – the conditional expression, the label and the actual expression. This adds calculation overhead, and managing potentially hundreds of expressions is painful, or at best boring! (A typical conditional “show column” expression is sketched after this list.)

• QlikView performs better when calculating across rows than across columns.

• The app needs to be intuitive even though it contains only table view(s).
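
For context, a minimal sketch of one such conditional “show column” expression on the straight table – one of these per column, hundreds of times over. The island table (ColumnSelector) and the column name (Colour) are illustrative, not from the original deck:

    // Show the Colour column only when the user has selected it in a
    // list box bound to an island table of column names.
    =SubStringCount('|' & Concat(DISTINCT ColumnSelector, '|') & '|', '|Colour|') > 0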

Page 4


Name-Value Pair or CrossTable Approach

• Extract the data from the database and store it as QVD(s)

• Add a dummy row to tackle the NULL issue (using AUTOGENERATE 1) – without it, a field that is NULL on every row would disappear from the transposed table altogether

• Transpose the data, keeping the Unique ID and the key fields (to join the dimension tables)

The tables below show the same sample data without and with the dummy row; a load-script sketch follows the tables.

Without dummy row – source table:

Unique ID | Type | Owner | Colour | Home
34785     | Cat  | NULL  | Black  | London
34786     | Cat  | NULL  | White  | NULL

Without dummy row – transposed table (NULL cells produce no rows, so Owner disappears entirely):

Unique ID | Field Name | Field Value
34785     | Type       | Cat
34785     | Colour     | Black
34785     | Home       | London
34786     | Type       | Cat
34786     | Colour     | White

With dummy row – source table (dummy row 99999 added via AUTOGENERATE 1):

Unique ID | Type | Owner | Colour | Home
34785     | Cat  | NULL  | Black  | London
34786     | Cat  | NULL  | White  | NULL
99999     | -    | -     | -      | -

With dummy row – transposed table (every field name now appears at least once):

Unique ID | Field Name | Field Value
34785     | Type       | Cat
34785     | Colour     | Black
34785     | Home       | London
34786     | Type       | Cat
34786     | Colour     | White
99999     | Type       | -
99999     | Owner      | -
99999     | Colour     | -
99999     | Home       | -
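
A minimal load-script sketch of the whole approach, assuming the wide data was already stored as Animals.qvd (the table, file and field names are illustrative, not from the deck). Per the deck’s row arithmetic, NULL values are assumed to produce no rows after the transpose:

    // 1. Reload the wide data from the stored QVD.
    Animals:
    LOAD [Unique ID], Type, Owner, Colour, Home
    FROM Animals.qvd (qvd);

    // 2. Add the dummy row (AUTOGENERATE 1) so that every field
    //    survives the transpose, even fields that are NULL on
    //    every real row.
    Concatenate (Animals)
    LOAD
        99999 AS [Unique ID],
        '-'   AS Type,
        '-'   AS Owner,
        '-'   AS Colour,
        '-'   AS Home
    AUTOGENERATE 1;

    // 3. Transpose: the first field ([Unique ID]) is kept as the
    //    qualifier; every remaining column becomes a name-value pair.
    Transposed:
    CrossTable([Field Name], [Field Value], 1)
    LOAD * RESIDENT Animals;

    // The wide table is no longer needed.
    DROP TABLE Animals;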

Page 5

Name-Value Pair or CrossTable Approach (contd.)

• Once the data transformation is complete (i.e. after transposing the data), the resulting number of rows is:

(Actual Rows × Transposed Columns) – (Total Number of NULL Values)

For the sample above: (3 × 4) – 3 = 9 rows. At full scale, the same arithmetic turns 11 million rows × 550 columns, minus the NULLs, into the roughly 1 billion rows of the title.

• Create a pivot table with two dimensions – [Unique ID] and [Field Name] – and use [Field Value] in the expression (a sketch of the chart definition follows this list)

• Pivot (drag) the [Field Name] column to the horizontal axis and the familiar wide table appears

• While users think they are making column selections, they are actually making row selections on [Field Name]
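
A minimal sketch of that chart definition, assuming the transposed table from the script above. The deck does not say which aggregation it used in the expression; Only() is one reasonable choice, since each [Unique ID]/[Field Name] pair holds at most one value:

    Chart type:  Pivot table
    Dimensions:  [Unique ID], [Field Name]
    Expression:  =Only([Field Value])
    Layout:      drag [Field Name] to the horizontal (column) axis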

Page 6

Demo