
Upload: regina-lane

Post on 25-Dec-2015


Correlated subqueries

Tutorial 3, Q2
d) Produce a table showing the total population for each LA and the area it covers. Compute the population density. Order your output from highest to lowest density.
e) Extend the previous query to include the total number of crimes in each LA.

Part e) is not a simple extension of d). Bringing in violent_crime in a three-way join will not work: each ward row would be repeated once for every crime in that ward, so the population and ward area would be counted multiple times in the sums. The answer is to use a subquery:

SELECT LA_code, sum(population)/sum(wardarea) AS density,
       (SELECT count(*)
        FROM wards_by_LA WLA2, violent_crime
        WHERE WLA2.ward_code = violent_crime.ward_code
          AND WLA2.LA_code = WLA1.LA_code) AS n_crimes
FROM ward_profile, wards_by_LA WLA1
WHERE ward_profile.ward_code = WLA1.ward_code
GROUP BY LA_code
ORDER BY sum(population)/sum(wardarea) DESC;

Note the use of the subquery as an output field. This type of subquery is called a correlated subquery because it makes an external reference (WLA1) to the main query, and has to be re-evaluated for each iteration of the main query, i.e. each value of LA_code.
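To see why the three-way join goes wrong, here is a minimal sketch that runs both versions against a tiny made-up dataset. It assumes SQLite (via Python's sqlite3 module) rather than Access, and the table contents are invented purely for illustration; only the table and column names come from the tutorial.

```python
import sqlite3

# Hypothetical toy data: two wards in LA1, one ward in LA2, four crimes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ward_profile (ward_code TEXT PRIMARY KEY, population REAL, wardarea REAL);
CREATE TABLE wards_by_LA (ward_code TEXT PRIMARY KEY, LA_code TEXT);
CREATE TABLE violent_crime (crime_id INTEGER PRIMARY KEY, ward_code TEXT);
INSERT INTO ward_profile VALUES ('W1', 1000, 10), ('W2', 3000, 10), ('W3', 500, 50);
INSERT INTO wards_by_LA VALUES ('W1', 'LA1'), ('W2', 'LA1'), ('W3', 'LA2');
INSERT INTO violent_crime VALUES (1, 'W1'), (2, 'W1'), (3, 'W1'), (4, 'W2');
""")

# Naive three-way join: W1 appears three times (once per crime),
# so its population is summed three times, and LA2 (no crimes) vanishes.
naive = conn.execute("""
SELECT LA_code, sum(population) AS total_pop, count(*) AS n_crimes
FROM ward_profile, wards_by_LA WLA1, violent_crime
WHERE ward_profile.ward_code = WLA1.ward_code
  AND WLA1.ward_code = violent_crime.ward_code
GROUP BY LA_code
""").fetchall()

# Correlated subquery: the crime count is computed separately per LA,
# so the population and area sums are not inflated.
correct = conn.execute("""
SELECT LA_code, sum(population)/sum(wardarea) AS density,
       (SELECT count(*)
        FROM wards_by_LA WLA2, violent_crime
        WHERE WLA2.ward_code = violent_crime.ward_code
          AND WLA2.LA_code = WLA1.LA_code) AS n_crimes
FROM ward_profile, wards_by_LA WLA1
WHERE ward_profile.ward_code = WLA1.ward_code
GROUP BY LA_code
ORDER BY sum(population)/sum(wardarea) DESC
""").fetchall()

print(naive)
print(correct)
```

In this toy data the true population of LA1 is 4000, but the naive join reports 6000 because ward W1 is counted once per crime.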

More SQL

All the joins we have considered so far are called inner joins. SQL provides an alternative syntax:

SELECT *
FROM violent_crime INNER JOIN ward_profile
     ON violent_crime.ward_code = ward_profile.ward_code

Inner joins only include combinations of rows when there is a match in both tables. This can be problematic: the above join, for instance, would not produce any rows for wards where no crimes were committed. Sometimes we may want to include all the rows from one table even if there is no matching row in the other table. This is called an outer join. For example:

SELECT ward_profile.ward_code, count(crime_id) AS n_crime
FROM ward_profile LEFT OUTER JOIN violent_crime
     ON ward_profile.ward_code = violent_crime.ward_code
GROUP BY ward_profile.ward_code
ORDER BY count(crime_id) DESC

LEFT just means that all rows from the first table are included; RIGHT includes all rows from the second table instead.
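The difference between the two join types is easiest to see side by side. The sketch below assumes SQLite through Python's sqlite3 module (not Access) and uses invented rows: ward W3 has no crimes, so it should appear only in the outer join result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ward_profile (ward_code TEXT PRIMARY KEY);
CREATE TABLE violent_crime (crime_id INTEGER PRIMARY KEY, ward_code TEXT);
INSERT INTO ward_profile VALUES ('W1'), ('W2'), ('W3');
INSERT INTO violent_crime VALUES (1, 'W1'), (2, 'W1'), (3, 'W2');
""")

# The same counting query, parameterised only by the join keyword.
query = """
SELECT ward_profile.ward_code, count(crime_id) AS n_crime
FROM ward_profile {join} violent_crime
     ON ward_profile.ward_code = violent_crime.ward_code
GROUP BY ward_profile.ward_code
ORDER BY count(crime_id) DESC
"""

inner_counts = conn.execute(query.format(join="INNER JOIN")).fetchall()
outer_counts = conn.execute(query.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_counts)  # W3 is missing: it has no matching crime rows
print(outer_counts)  # W3 is present with a count of 0
```

Note that count(crime_id) rather than count(*) is what makes the crime-free ward show 0: the outer join fills crime_id with NULL for W3, and count ignores NULLs.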

FINALLY – note that once you have stored an Access query, you can query it just like a table, e.g. SELECT * from Query1. This allows very complex queries to be built up. You can think of the query as providing a particular VIEW of the database.
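Outside Access, the closest equivalent of a stored query is a view. As a rough sketch, assuming SQLite via Python's sqlite3 module and invented data, the stored query below is named Query1 to match the slide, and a second query is then built on top of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE violent_crime (crime_id INTEGER PRIMARY KEY, ward_code TEXT);
INSERT INTO violent_crime VALUES (1, 'W1'), (2, 'W1'), (3, 'W2');

-- The stored query: crimes per ward, saved under the name Query1.
CREATE VIEW Query1 AS
SELECT ward_code, count(*) AS n_crime
FROM violent_crime
GROUP BY ward_code;
""")

# Query1 can now be queried just like a table, here to find the busiest ward.
busiest = conn.execute("""
SELECT ward_code, n_crime
FROM Query1
WHERE n_crime = (SELECT max(n_crime) FROM Query1)
""").fetchall()

print(busiest)
```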

MORE ERM

Relationships can be a) three-way and b) recursive.

a) Three-way relationship.

[Diagram: SUPPLY relationship connecting the entities SUPPLIER, PART and PROJECT]

b) Recursive relationship. Note that the double line means every employee has to have a supervisor. This is called TOTAL PARTICIPATION in the relationship.

[Diagram: SUPERVISOR relationship connecting EMPLOYEE to itself, with cardinality labels N and 1]

Chen is not the only notation
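One common way to implement the recursive SUPERVISOR relationship is a table that refers to itself. The sketch below is only an illustration under stated assumptions: it uses SQLite via Python's sqlite3 module, the column names emp_id, name and supervisor_id are hypothetical, and total participation is approximated with NOT NULL, which forces the top employee to be recorded as their own supervisor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_id        INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    -- NOT NULL approximates total participation: no employee without a supervisor.
    supervisor_id INTEGER NOT NULL REFERENCES employee(emp_id)
);
-- Ann is at the top, so (as a workaround) she is listed as supervising herself.
INSERT INTO employee VALUES (1, 'Ann', 1), (2, 'Bob', 1), (3, 'Cy', 2);
""")

# The recursive relationship is followed with a self-join:
# the same table is used twice under the aliases e and s.
pairs = conn.execute("""
SELECT e.name, s.name
FROM employee e INNER JOIN employee s ON e.supervisor_id = s.emp_id
ORDER BY e.emp_id
""").fetchall()

# An employee with no supervisor violates the NOT NULL constraint.
try:
    conn.execute("INSERT INTO employee VALUES (4, 'Dee', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(pairs, rejected)
```

The N and 1 on the diagram correspond to the fact that each row holds exactly one supervisor_id, while one supervisor can be referenced by many rows.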

ERM – finale: Wozzie’s Gym

[Diagram: USER (attributes USER ID and NAME) BOOKS FACILITY; USER has the subclasses CASUAL MEMBER and FULL MEMBER, connected through a circle marked "d" with "U" symbols on the subclass lines; BANK DETAILS also appears as an attribute]

This notation is used to indicate subclasses of an entity. Note that the subclasses “inherit” all the attributes of the “parent class”.

Two implementation options:
1) Combine casual and full members in the same table, with an extra attribute to indicate status.
2) Alternatively, have a separate table for each and then use the UNION operator to combine them:

SELECT user_id FROM full_member
UNION
SELECT user_id FROM casual_member
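The separate-table option can be tried out with a small sketch. It assumes SQLite via Python's sqlite3 module; the name and bank_details columns and all the rows are invented for illustration. The second query shows how a status column can be added during the UNION, which recovers the combined-table shape of the first option.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Full members carry the extra subclass attribute; casual members do not.
CREATE TABLE full_member   (user_id INTEGER PRIMARY KEY, name TEXT, bank_details TEXT);
CREATE TABLE casual_member (user_id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO full_member   VALUES (1, 'Ann', 'ACC-001'), (3, 'Cy', 'ACC-003');
INSERT INTO casual_member VALUES (2, 'Bob'), (4, 'Dee');
""")

# All users, regardless of subclass.
all_ids = conn.execute("""
SELECT user_id FROM full_member
UNION
SELECT user_id FROM casual_member
ORDER BY user_id
""").fetchall()

# The same UNION with a literal status column added to each branch.
with_status = conn.execute("""
SELECT user_id, 'full' AS status FROM full_member
UNION
SELECT user_id, 'casual' AS status FROM casual_member
ORDER BY user_id
""").fetchall()

print(all_ids)
print(with_status)
```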

Business Intelligence (BI)

• A rational approach to continuous improvement based on:
  – Gathering/analyzing operational (transactional) data
  – Making decisions & taking actions based on that data
  – Measuring results according to predetermined metrics (Key performance indicators – KPIs)
  – Feeding lessons from one decision into the next

The production of timely, accurate, high value and actionable information

David Wastell

Data warehousing

A Data Warehouse is a centrally managed and integrated database containing structured data from operational sources in an organization. Data extracts are validated, cleansed and transformed into a common, stable, relatable view of data.

ETL = Extract Transform Load
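As a very small illustration of the three ETL steps, the sketch below pulls rows from two made-up operational extracts that use different field names and date formats, validates and cleanses them into one common shape, and loads them into a warehouse table. Everything here (the source layouts, the incident table, the function names) is a hypothetical example, with SQLite standing in for the warehouse.

```python
import sqlite3

# EXTRACT: two hypothetical operational sources with inconsistent conventions.
police_rows = [
    {"ward": " w1 ", "date": "2003-01-05"},
    {"ward": "", "date": "2003-01-06"},       # invalid: no ward recorded
]
ambulance_rows = [
    {"ward_code": "W2", "incident_date": "05/01/2003"},  # dd/mm/yyyy
]

def transform_police(row):
    """Validate and cleanse one police row; return None if it is unusable."""
    ward = row["ward"].strip().upper()
    if not ward:
        return None
    return ("police", ward, row["date"])

def transform_ambulance(row):
    """Convert an ambulance row to the common layout (ISO dates)."""
    day, month, year = row["incident_date"].split("/")
    return ("ambulance", row["ward_code"].strip().upper(), f"{year}-{month}-{day}")

# TRANSFORM: apply the per-source rules and drop rows that fail validation.
cleaned = [t for t in map(transform_police, police_rows) if t is not None]
cleaned += [transform_ambulance(r) for r in ambulance_rows]

# LOAD: insert the common, relatable rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE incident (source TEXT, ward_code TEXT, incident_date TEXT)")
warehouse.executemany("INSERT INTO incident VALUES (?, ?, ?)", cleaned)

loaded = warehouse.execute("SELECT * FROM incident ORDER BY source").fetchall()
print(loaded)
```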

Dashboards and KPIs

Business dashboards provide a “control panel” for monitoring the vital functions of the business, supplying immediate information, indicating when and where performance is lagging…..

BI in Action: Crime Policy

[Diagram: data sources (Police, County Council, Fire service, Ambulance Service, Probation, Other) feed through ETL into MADE, which supports monthly reports and data-mining (policy research)]

Data mining example: the aetiology of alcohol-related violence

[Chart: serious violent crime (y-axis, 0 to 600) by time of day (x-axis, 0 to 22); series: Town 1, Town 2, Rural 1, Rural 2, Town 3]

[Chart: no. of serious crimes (y-axis, 0 to 3000) by time of day (x-axis, 0 to 22); series: alcohol involved, alc not involved, total]

Correlation = 0.67, very sig.

Evidence-based Policy: crime control

Street drinking ban

Ambulance Incidents (monthly)

              Target zone   County demand
Before ban        9.1           516.2
After ban         8.3           541.1
Change            -9%           +5%

% total serious violent crime committed within target zone (reduced from 14.8% to 12.3%)

All effects stat. sig.

BUT … any validity concerns, alternative explanations??
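The Change row in the ambulance table can be reproduced from the before and after figures. The short sketch below does the arithmetic; the function name pct_change is just an illustrative helper. It also makes one point relevant to the validity question visible: the target zone fell while the county as a whole rose, so the county trend alone does not account for the drop.

```python
def pct_change(before, after):
    """Relative change from before to after, as a percentage."""
    return (after - before) / before * 100

# Monthly ambulance incidents, taken from the table on the slide.
target_zone = pct_change(9.1, 8.3)     # roughly -8.8, reported as -9%
county = pct_change(516.2, 541.1)      # roughly +4.8, reported as +5%

# Share of serious violent crime in the target zone: 14.8% -> 12.3%.
share_points = 12.3 - 14.8             # change in percentage points
share_relative = pct_change(14.8, 12.3)  # relative change in the share

print(round(target_zone), round(county))
print(round(share_points, 1), round(share_relative, 1))
```

Note the difference between the last two figures: the share dropped by 2.5 percentage points, which is a relative fall of about 17%.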

[Chart: % of total in target zone (y-axis, 0 to 25) by month (x-axis, 2002-04 to 2004-03)]

Guess what…..

Crime data: relational view

OLAP: Online analytical processing

Crime as a multi-dimensional “cube”

Dimensions are typically hierarchies of categories

Slicing and dicing – and drilling down
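The cube operations named above can be imitated on a relational table with WHERE and GROUP BY. The sketch below is an assumption-laden illustration: it uses SQLite via Python's sqlite3 module, a hypothetical crime_fact table with invented counts, and three dimensions (area as LA then ward, time as year then month, and crime type). A dedicated OLAP tool would precompute such aggregates rather than run them on demand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per cell of the cube; n is the number of crimes in that cell.
CREATE TABLE crime_fact (LA_code TEXT, ward_code TEXT, year INTEGER,
                         month INTEGER, crime_type TEXT, n INTEGER);
INSERT INTO crime_fact VALUES
  ('LA1', 'W1', 2003, 1, 'assault', 5),
  ('LA1', 'W1', 2003, 2, 'assault', 3),
  ('LA1', 'W2', 2003, 1, 'robbery', 2),
  ('LA2', 'W3', 2003, 1, 'assault', 4),
  ('LA2', 'W3', 2004, 1, 'robbery', 1);
""")

# Top-level view: totals at the coarsest level of two hierarchies (LA, year).
by_la_year = conn.execute("""
SELECT LA_code, year, sum(n) FROM crime_fact
GROUP BY LA_code, year ORDER BY LA_code, year
""").fetchall()

# Slice: fix one dimension (crime type) and look at what remains.
assault_slice = conn.execute("""
SELECT LA_code, sum(n) FROM crime_fact
WHERE crime_type = 'assault'
GROUP BY LA_code ORDER BY LA_code
""").fetchall()

# Dice: restrict several dimensions at once to carve out a sub-cube.
assault_jan_2003 = conn.execute("""
SELECT LA_code, sum(n) FROM crime_fact
WHERE crime_type = 'assault' AND year = 2003 AND month = 1
GROUP BY LA_code ORDER BY LA_code
""").fetchall()

# Drill down: move one level down the area hierarchy, from LA1 to its wards.
la1_by_ward = conn.execute("""
SELECT ward_code, sum(n) FROM crime_fact
WHERE LA_code = 'LA1'
GROUP BY ward_code ORDER BY ward_code
""").fetchall()

print(by_la_year, assault_slice, assault_jan_2003, la1_by_ward)
```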