CHAPTER 5: Data sampling design, data quality, calibration, presentation
Sampling
Physical variables are continuous, but samples (water samples) and digital techniques are discrete.
How often, how frequently, how long should one sample? E.g. temperature in the Strait of Gibraltar?
“Never measure the same place twice”?
Most of our sampling does not resolve necessary or interesting processes…
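May want to obtain a stable average. Example: uncertainty of average = σ / √N (σ: standard deviation of the fluctuations, N: number of independent samples). The record length needed depends on:
• expected variance of deep flow fluctuations
• estimate of integral timescale
• desired accuracy of mean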
approx. 5 years data needed in each box
In some ocean regions a stable mean of deep flow and its variance can already be constructed...
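As a rough sketch of how variance, integral timescale and desired accuracy combine, the record length for a stable mean can be estimated from σ / √N when N counts independent samples. The convention that one independent sample is gained every two integral timescales is an assumption made here, not something stated on the slides; the function name and the numbers in the usage lines are purely illustrative.

```python
import math

def record_length_for_mean(sigma, integral_timescale, target_error):
    """Record length needed so that sigma / sqrt(N) <= target_error.

    Assumption (illustrative convention): one independent sample per
    2 * integral_timescale, i.e. N = record_length / (2 * integral_timescale).
    The result has the same time unit as integral_timescale.
    """
    n_independent = math.ceil((sigma / target_error) ** 2)
    return n_independent * 2.0 * integral_timescale

# Hypothetical numbers: 5 cm/s fluctuations, 10-day integral timescale,
# 0.5 cm/s desired accuracy of the mean.
days = record_length_for_mean(sigma=5.0, integral_timescale=10.0, target_error=0.5)
print(f"{days:.0f} days, about {days / 365.25:.1f} years")
```

With these made-up inputs the estimate comes out at several years, which shows why long records are needed even for a simple mean.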
Obtaining desired accuracy by spatial and temporal averaging
Example ARGO floats: the limiting factor is not the accuracy of the individual T measurement (0.003°C) but the sampling of ocean variability:
Largest noise comes from mesoscale eddies, which are not resolved since they are just 50-100 km in size, i.e.
• small-scale “noise“, several 0.1°C in upper layers (one order smaller in deep ocean)
• accuracy in observing large-scale variability depends on number of single observations that are averaged over
For ARGO, simulations were carried out for several climate phenomena using altimeter data, which also represent a measure of heat content:
black: actual climate signal from altimetry
Red: estimate of the same signal using a 300x300km sampling
Aliasing: if a process exists with period T, have to sample at least every Δt = T/2, i.e. at least two samples per period.
If frequencies higher than the Nyquist frequency fN = 1/(2Δt) are present (fN = 1/T for Δt = T/2), they are aliased into lower frequencies. In other words, the sampling interval must be T/2 or shorter to resolve the signal and avoid aliasing.
Signal period T, sampled at intervals T/4 and 3T/4
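The two sampling intervals in the figure can be checked with a small sketch that folds the true frequency back into the range resolved by the sampling. The function name is hypothetical, and the calculation assumes a pure sinusoid sampled at perfectly regular intervals.

```python
def apparent_period(signal_period, dt):
    """Period that a sinusoid of period signal_period appears to have
    when it is sampled every dt (regular sampling assumed)."""
    f = 1.0 / signal_period                 # true frequency
    fs = 1.0 / dt                           # sampling frequency
    f_alias = abs(f - round(f / fs) * fs)   # fold into the band 0 .. fs/2
    return float("inf") if f_alias == 0 else 1.0 / f_alias

T = 1.0
resolved = apparent_period(T, T / 4)       # interval shorter than T/2
aliased = apparent_period(T, 3 * T / 4)    # interval longer than T/2
print(resolved, aliased)
```

Sampling every T/4 returns the true period, while sampling every 3T/4 makes the signal look like an oscillation with a period of about 3T.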
Baroclinic transport timeseries from CTD (diamonds) and XBT sections
Baroclinic transport timeseries from altimeter (thin line), filtered (blue)
Frequency resolution: a record of length T can resolve frequency intervals of Δf = 1/T (Fourier analysis delivers frequencies 1/T, 2/T, 3/T, ..., n/T).
So record length may (in addition to stable mean, long period signals) also be dictated by resolving close frequencies.
Example: semidiurnal tides at 12.0 and 12.4 hours, i.e. Δf = 1/12.0 h - 1/12.4 h ≈ 0.0027 h^-1. Resolving these requires a record of length T = 1/(0.0027 h^-1) ≈ 370 h ≈ 15 days.
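The tidal example can be reproduced with a few lines. This is only a sketch of the Δf = 1/T rule; the function name is made up for illustration, and using the unrounded Δf gives a record length slightly longer than the rounded 15 days.

```python
def record_length_to_resolve(period_a, period_b):
    """Minimum record length T (same unit as the periods) such that
    1/T equals the frequency separation |1/period_a - 1/period_b|."""
    delta_f = abs(1.0 / period_a - 1.0 / period_b)
    return 1.0 / delta_f

hours = record_length_to_resolve(12.0, 12.4)
print(f"{hours:.0f} h = {hours / 24:.1f} days")
```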
Ideal case
Linear and cubic spline interpolation
polynomial interpolation
Interpolation
More advanced “frequency based” methods exist (see special class).
Note: objective analysis requires prior knowledge, and also often generates spurious max/min or “bull’s eyes”… (can be avoided with good choice of uncertainty and scales)
Decimation
Problem (aliasing) if the data are not low-pass filtered first!
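A minimal sketch of why filtering must come before decimation. The running mean used here is only a stand-in for a proper low-pass filter (an assumption made for brevity), and the test signal is synthetic.

```python
import math

def boxcar(x, width):
    """Running mean over `width` samples (output is shorter by width - 1)."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]

def decimate(x, step):
    """Keep every `step`-th sample, with no filtering."""
    return x[::step]

# Synthetic sinusoid with a period of 5 samples. Keeping every 4th sample
# cannot resolve it, because the shortest resolvable period is then 8 samples.
x = [math.sin(2 * math.pi * t / 5) for t in range(200)]

raw = decimate(x, 4)                   # no filter: signal is aliased, amplitude survives
filtered = decimate(boxcar(x, 5), 4)   # 5-point mean removes the period-5 signal first

raw_amplitude = max(abs(v) for v in raw)
filtered_amplitude = max(abs(v) for v in filtered)
print(raw_amplitude, filtered_amplitude)
```

Without the filter the unresolved oscillation stays in the decimated series at nearly full amplitude, disguised as a slower signal; with the filter it is removed before subsampling.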
Instrumental factors that determine sampling:
1) Battery endurance
2) Data storage
3) Telemetry capability
Quality of data:
Accuracy - Precision - Resolution
Accuracy: Absolute “correctness” relative to a universal/global reference standard.
Precision: Repeatability of a measurement. Does not include systematic or calibration offsets.
Resolution: Smallest difference between 2 samples that can still be recognized as different.
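The three terms can be illustrated on a set of repeated readings of a known reference. This is a sketch only: the readings are invented, the function names are hypothetical, and estimating resolution from the smallest step between distinct readings is an assumption that only works if the sensor output is visibly quantized.

```python
import statistics

def accuracy_offset(readings, reference):
    """Mean offset from the reference standard (systematic error)."""
    return statistics.mean(readings) - reference

def precision(readings):
    """Scatter of repeated measurements (sample standard deviation)."""
    return statistics.stdev(readings)

def apparent_resolution(readings):
    """Smallest step between distinct readings (needs at least two distinct values)."""
    levels = sorted(set(readings))
    return min(b - a for a, b in zip(levels, levels[1:]))

# Invented readings of a bath held at a reference temperature of 10.00 °C.
readings = [10.02, 10.03, 10.01, 10.02, 10.03, 10.01]
reference = 10.00

print(accuracy_offset(readings, reference))
print(precision(readings))
print(apparent_resolution(readings))
```

In this made-up case the sensor repeats itself to within about 0.01 °C but reads about 0.02 °C too high, so it is more precise than it is accurate.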
Drift and stability of sensors
In the short term, measurements may have high accuracy or precision.
Long-term drift or sudden jumps do occur and are very difficult to track and correct. Even with a post-calibration, one still does not know WHEN the change happened.
(It really means that precision depends on time-scale, but manufacturers often quote precision and stability separately.)
Approaches:
1) Fit smooth curve, but this will also remove long-term signals/real trends
2) Use prior knowledge about sensor behaviour
3) Compare to other data
4) Ideally want self-calibrating instruments (e.g. chemical standards, pressure standard, etc)
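One simple way to use pre- and post-deployment calibrations is to assume that the sensor offset changed linearly in time between them. That linearity is an assumption made for this sketch (a form of prior knowledge about sensor behaviour); if the change was actually a sudden jump, this correction would be wrong. The function name and data are invented.

```python
def remove_linear_drift(times, values, offset_start, offset_end):
    """Subtract a sensor offset assumed to grow linearly in time from
    offset_start (first sample) to offset_end (last sample)."""
    t0, t1 = times[0], times[-1]
    corrected = []
    for t, v in zip(times, values):
        frac = (t - t0) / (t1 - t0)   # 0 at deployment, 1 at recovery
        offset = offset_start + frac * (offset_end - offset_start)
        corrected.append(v - offset)
    return corrected

# Invented record: a constant true value of 2000.0 seen by a sensor whose
# offset grew from 0.0 (pre-calibration) to 0.4 (post-calibration).
times = [0, 100, 200, 300, 400]
values = [2000.0, 2000.1, 2000.2, 2000.3, 2000.4]
corrected = remove_linear_drift(times, values, offset_start=0.0, offset_end=0.4)
print(corrected)
```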
Drift of bottom pressure sensors:
Calibration of ARGO conductivity sensor drift
See slides about ARGO delayed-mode calibration from Section 4d.
Presentation of data
1-D data: profiles (parameter versus depth), timeseries (parameter versus time, incl. trajectory)
1-2 D plots: parameter versus parameter, e.g. T-S diagram
2-D plots: sections, horizontal distributions, series of profiles
3-D fields:
Extract sections, horizontal slices
With time: sequence of sections, or z-t, x-t contour plots
Special type of contour plot: the Hovmöller diagram (time versus location)
Some specialty or interesting plots….
Contour plot or single lines?
Show where data are available
Trajectory in 2-Parameter Plot
Use of colors to emphasize or combine curves
Use of color for scaling vectors
Display of several parameters at once
Temperature and current vectors
Wind speed and direction
Quantity and location sampled
3-D surface plot with color (same or additional quantity)
Temperature on a 3-D surface
Current speed shown by color
One quantity in color, plus vectors (flow, wind, etc), plus distribution along sections