delivering excellence through innovation & …...2018/03/22  · delivering excellence through...

Delivering Excellence Through Innovation & Technology

ee.ricardo.com

Openair and R-Data Analysis Training

3-day intensive openair and R data training course

• Become an expert user of openair software.

• Write and test your own analysis queries.

• Critically analyse data sets using specialist tools and techniques.

• Share and collaborate with other users to develop new tools and solutions.

For professionals working in air pollution management and data analysis.

[Cover figures: wind roses showing frequency of counts by wind direction (%); a plot of gridded differences (90th percentile); points plotted by day, 2010−04−15 to 2010−04−21; polar plots of the BC ~ PM2.5 robust slope at (a) London Marylebone Road and (b) London North Kensington; normalised level by hour of day.]

Openair and R data analysis

Openair is a package of software tools dedicated to the analysis of air quality data. Written in R, an open-source data-analysis programming language, openair is used extensively throughout the world in academia, industry, and the public and private sectors. With almost 100,000 downloads worldwide, it is in the top 10% of downloaded packages written in R.

R has undergone extensive development since its inception and now has almost endless capabilities and specialist packages.

Air pollution experts face a challenge in knowing how best to develop new understanding and insights from the analysis of air pollution data. In response to this challenge, Ricardo has developed a 3-day intensive openair and R data training course.

Practical training

Led by Dr David Carslaw, the lead developer of openair, and his team of air quality experts, this technical training course will enable users of openair to derive greater value from their data. It will equip them with the skills and knowledge necessary to identify the most effective data management and analysis tools and techniques for their needs.

A large proportion of the course consists of practical exercises that provide delegates with the opportunity to practise advanced analysis techniques and approaches using their own data. Learning within a familiar context encourages better understanding and knowledge retention.

The benefit of using delegates’ own data during the course is that they retain any insights they discover as a result of improved, more powerful data analysis that is relevant to their own air quality management challenges. This knowledge and learning can then be applied in their professional roles.

Tutors will be on hand to support this process by working closely with participants and providing one-to-one support throughout the course.

[Figure: polar plots of the BC ~ PM2.5 robust slope by wind speed (ws) and wind direction at (a) London Marylebone Road and (b) London North Kensington.]

Who should attend?

The course was originally developed for local authorities, academia, government and industry. However, it will benefit anyone with data analysis needs, as it will help them to critically analyse and better interpret their air pollution data.

This course is designed at foundation level for beginners who have no previous experience of using R, but will also benefit existing users who require some refresher training or one-to-one guidance.

[Figure: points plotted by day, 2010−04−15 to 2010−04−21.]

Course overview

The main areas covered by the course are:

• Working with air quality data
Approaches to data manipulation, data grouping and summarising using modern ‘tidyverse’ solutions appropriate to air quality data and openair convenience functions. Participants will be introduced to key data processing steps and the dedicated R packages that support them.

• An introduction to the main openair functions
Delegates will be introduced to the main directional analysis functions (e.g. ‘windRose’ and ‘pollutionRose’) and time-based analyses/trends (e.g. ‘timeVariation’, ‘smoothTrend’, ‘TheilSen’ and ‘calendarPlot’). An important and powerful aspect of openair is the flexible conditional analyses that are possible. This flexibility enables users to rapidly develop sophisticated analyses in an interactive, question-led way.

• Bivariate polar plots
Learn about the purpose and uses of bivariate polar plots (with extensive examples drawn from real-world situations) and how they can be used to inform users about dominant source characteristics. These analysis techniques are popular worldwide and are used in a wide range of applications, many of which go beyond the original development aims. Participants will be introduced to recent extensions to the basic plots, including clustering, conditional bivariate probability functions and two-pollutant relationships (e.g. correlation and regression statistics).
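As an illustration of the functions named in the course overview, the sketch below calls several of them on `mydata`, the example data set bundled with openair. It is a minimal sketch rather than course material: it assumes openair is installed from CRAN, and the choice of pollutants and of the 90th percentile for the conditional probability plot is arbitrary.

```r
# Minimal sketch, assuming the openair package is installed.
# 'mydata' is the example hourly data set shipped with openair
# (columns include date, ws, wd, nox, o3 and so2).
has_openair <- requireNamespace("openair", quietly = TRUE)

if (has_openair) {
  library(openair)

  # Directional analysis: wind speed/direction frequencies, then NOx by direction
  windRose(mydata)
  pollutionRose(mydata, pollutant = "nox")

  # Time-based analysis: diurnal, weekday and monthly variation of NOx
  timeVariation(mydata, pollutant = "nox")

  # Conditional analysis: the same wind rose, one panel per season
  windRose(mydata, type = "season")

  # Bivariate polar plot: mean NOx by wind speed and wind direction
  polarPlot(mydata, pollutant = "nox")

  # Conditional bivariate probability function for the highest SO2 values
  # (the 90th percentile is an arbitrary choice for this example)
  polarPlot(mydata, pollutant = "so2", statistic = "cpf", percentile = 90)
}
```

The `type` argument is what makes the conditional analyses possible: most openair plotting functions accept it and draw one panel per level of the chosen variable.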

[Figure: mean PM10 by wind direction (N, E, S, W) and season (January to December); colour scale from 20 to 50.]

COURSE RATINGS

Attendees of previous openair and R data training courses, run on behalf of the Natural Environment Research Council (NERC), gave feedback scores of 97.6% or above for the following criteria:

• Overall opinion of the course (including organisation, interest and enjoyment).

• Overall opinion of the lecturers (including knowledge, ability to stimulate interest and ability to explain concepts).

• Course materials (including lectures and online material).

Learning outcomes

By the end of the course delegates will:

• Be able to identify the most appropriate tools for their analysis needs – saving them time and improving the efficiency of their processes.

• Be confident in their knowledge of coding and data analysis – reducing their reliance and expenditure on external support.

• Know how to extract more insight from their data – improving the credibility and robustness of their reporting.

• Be able to conduct analyses in a robust and reproducible way.


Course format

Each day is divided into morning and afternoon sessions.

Day 1

• Introduction and welcome.
• Overview of analysing air quality data – aims and benefits.
• Setting up R/RStudio on attendees’ computers.
• Outline of course materials and approaches used.

Getting started with R:
• Key concepts and differences will be explored and contrasted with traditional approaches in packages such as Microsoft® Excel®.
• Common data management issues and how to tackle them in R.
• Data cleaning and data checking.

Day 2

• Data manipulation using ‘dplyr’ and ‘tidyverse’ approaches.
• Common data aggregation problems with air quality data.
• Overview of openair – aims and scope.
• Introduction to the main openair functions, and their purpose and use.
• Directional analyses using the ‘windRose’, ‘pollutionRose’ and ‘percentileRose’ functions.

• Basic bivariate polar plots.
• Example uses in published journals.
• Introduction to the conditional bivariate probability function (CBPF).

Day 3

• Bivariate polar plots continued.
• Clustering features.
• Comparing two pollutants, correlation and regression surfaces.
• Begin own data analysis.

• One-to-one help using own data with individual tutors.
• Question and answer session.
• Training concludes.
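One common aggregation problem of the kind listed for Day 2 is averaging hourly data to daily means when some hours are missing. The sketch below shows one possible dplyr approach. The data values, the column names `hourly`, `no2` and `capture`, and the 75% data-capture threshold are all invented for this illustration and are not taken from the course.

```r
library(dplyr)

# Hypothetical hourly NO2 series covering two days (values invented).
# Day 1 is complete; day 2 has only 6 valid hours out of 24.
hourly <- tibble(
  date = seq(as.POSIXct("2024-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = 48),
  no2  = c(rep(40, 24), rep(20, 6), rep(NA_real_, 18))
)

# Daily means, reported only where at least 75% of the hours are valid.
# The 75% threshold is an assumption made for this sketch.
daily <- hourly %>%
  mutate(day = as.Date(date, tz = "UTC")) %>%
  group_by(day) %>%
  summarise(
    capture  = mean(!is.na(no2)),        # fraction of valid hours in the day
    no2_mean = mean(no2, na.rm = TRUE),  # mean of the valid hours only
    .groups  = "drop"
  ) %>%
  mutate(no2_mean = if_else(capture >= 0.75, no2_mean, NA_real_))

daily
```

Without the capture check, the second day would be reported as a daily mean based on only six hours. openair's `timeAverage()` function offers a `data.thresh` argument for the same purpose.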

library(openair)  # provides pollutionRose() and the example data set mydata

pollutionRose(mydata, pollutant = "nox")

FIGURE 12.4 NOx pollution rose produced using pollutionRose and default pollutionRose settings. [Frequency of counts by wind direction (%); NOx intervals from 0 to 50 up to 350 to 1092; mean = 178.62, calm = 0.3%.]

pollutionRose(mydata, pollutant = "nox", type = "so2", layout = c(4, 1))

[Figure: four NOx pollution roses, one per SO2 interval (0 to 2.17, 2.17 to 4, 4 to 6.5 and 6.5 to 63.2), with panel means of 71.389, 134.61, 212.11 and 313.26.]


Tutors

The course will be led by David Carslaw, who holds a joint position with Ricardo and the Department of Chemistry at the University of York. David has extensive experience in the use of R and its application to the atmospheric sciences. This experience includes primary research into techniques for receptor modelling, and working with a wide range of academics, regulators and consultants (nationally and internationally). David has run similar courses on behalf of NERC as part of its Advanced Training Short Courses for PhD students studying atmospheric sciences.

The course will be supported by other Ricardo openair experts who have extensive knowledge of air pollution science and data science, including the development of many R packages of relevance to air pollution problems that are available on GitHub (an online development platform).

Course materials

All course materials are produced using R Markdown, a notebook interface, which allows delegates to practise and recreate any analyses provided as part of the course. The course materials will include detailed information on all course topics in a clear format, with all coding steps shown, so that all analyses can be reproduced easily. This approach supports continued learning following completion of the 3-day course.

Venue and transport links

The course will be held at Ricardo’s central London office close to Paddington station. Paddington is a major transport hub serving several underground lines in London, connecting passengers to nearly every point in the city.

With excellent transport links to London Heathrow airport and other mainline railway stations in London, the venue is ideally located for UK and international participants. The Heathrow Express runs daily between Paddington and the airport, with a journey time of between 15 and 20 minutes.

For details of accommodation options in and around Paddington, please visit https://ee.ricardo.com/air-quality/openair-and-r-data-analysis-training/venue-and-transport-links-en

© Ricardo-AEA Ltd 2018. EED/127/Mar18/V13 www.ricardo.com

For more information about our openair and R data analysis training, please contact one of our experts at [email protected] or +44 (0) 1235 753000