Martin Dubrovsky
Institute of Atmospheric Physics ASCR, Husova 456, 50008
Hradec Kralove, Czech Republic
Jaroslava Kalvova
Faculty of Mathematics and Physics, Charles University,
V Holesovickach 2, 18000 Prague, Czech Republic
The contribution is devoted to (1) establishing a standard reference for assessing the daily values of total ozone (TO) and (2) correlating TO with selected meteorological characteristics. In addressing the first item, robust locally weighted regression (RLWR) is employed to obtain smooth annual cycles of the characteristics representing the `typical' value (mean or median) and the variability (standard deviation or interquartile range) of TO. The analysis is based on the 29-year series (1962-1990) of Dobson measurements in Hradec Kralove, Czech Republic (VANICEK, 1991). The goodness of the normal approximation of the TO distribution is tested. In the second part, TO is correlated with (a) single-site characteristics derived from aerological soundings and (b) large-scale circulation patterns characterised by principal components.
The grant project was started this year as a collaboration between the Institute of Atmospheric Physics and the Czech Hydrometeorological Institute. The aims of the project include implementing an information bulletin giving real-time values of total ozone and UV radiation intensity, together with short-range forecasts of both variables.
This contribution addresses two items of the project: (1) a methodology for assessing the state of daily total ozone and (2) correlating total ozone with selected meteorological variables suitable for forecasting TO. The first item was studied in detail in KALVOVA & DUBROVSKY (1995), and only the main results are given below. Since the project started only this year and the data needed for completing the second item are not yet fully available, only preliminary results are given in this contribution.
Rather than giving the actual value of total ozone, it is often preferred to state whether the value is `normal', `below-normal' or `above-normal' with respect to some standard related to the given day of the year. This type of information requires knowledge of the `typical' value (mean or median) of total ozone as well as of its variability (standard deviation or interquartile range) for that day.
One of the two approaches given in Tab.I is commonly used to classify whether the value of the quantity may be considered normal.
The first approach assumes an approximately normal distribution of the respective quantity. The second approach is more robust and is suitable provided that the quantity has a unimodal distribution.
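The two classification schemes of Tab.I can be written down in a few lines. The following Python sketch is illustrative only: the function names `classify_a` and `classify_b` are hypothetical, values lying exactly on a limit are counted as normal (an assumption, since Tab.I uses strict inequalities), and the extreme categories are tested first because they are subsets of the below- and above-normal ones.

```python
def classify_a(x, mean, std):
    """Approach A: limits built from the mean and the standard deviation."""
    if x < mean - 3 * std:
        return "extremely below-normal"
    if x > mean + 3 * std:
        return "extremely above-normal"
    if x < mean - std:
        return "below-normal"
    if x > mean + std:
        return "above-normal"
    return "normal"


def classify_b(x, q25, q75):
    """Approach B: limits built from the quartiles and the interquartile range."""
    iqr = q75 - q25
    if x < q25 - 1.5 * iqr:
        return "extremely below-normal"
    if x > q75 + 1.5 * iqr:
        return "extremely above-normal"
    if x < q25:
        return "below-normal"
    if x > q75:
        return "above-normal"
    return "normal"
```

In practice the mean, standard deviation and quartiles passed to these functions would be the smoothed values for the given day of the year.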
The second question is how to represent the annual cycles of the characteristics. The unsmoothed daily characteristics are loaded with considerable noise (Fig.1), and their application might lead to absurd results: for example, the value 350 D.U. would be considered extremely above-normal on December 3 but normal only 3 days later. To solve this, robust locally weighted regression (RLWR; SOLOW, 1988; DUBROVSKY, 1993) was used to smooth the annual cycles of the relevant characteristics. In applying the method, the regression function, r(x), is estimated by a polynomial fitted to the data contained in the `window' <x-h,x+h>, h being the halfspan. To estimate the regression coefficients, individual measurements are weighted to account for the distance of the xi's from x (measurements beyond the smoothing window receive zero localisation weight - the method is local) as well as for the distance of the yi's from the estimate of the regression function in the previous iteration step (`outliers' are internally detected and receive zero robustness weight - the method is robust).
Of the optional parameters of the regression method, the value of the halfspan, h, seems to have the greatest effect upon the shape of the fitted function. Two problems arise: (1) too small a value of h may result in a rather noisy curve which follows random fluctuations in the sample data, (2) too great a value of h may smooth out even possible singularities. The effect of the value of the halfspan is demonstrated in Fig.2.
Although the PRESS procedure is available to determine the optimal value of h objectively, it is recommended rather to optimise h subjectively. In this approach, the value of the halfspan was set sufficiently large to get smooth curves without local extremes. Then the significance of deviations of unsmoothed values from the smoothed curves was tested. Thus h=50 was recognised as an optimal value of the halfspan. Concerning the annual course of the mean, the resulting curves are sufficiently smooth (Fig.2) and only 14 (of 366) unsmoothed daily means were found to be statistically different from the smoothed curves (at ALPHA=5%), which is fewer than the 18 (about 5% of 366) `allowed' by the type I error. More interestingly, many of these outliers are grouped in clusters representing potential singularities (circles in Fig.2). The most significant `singularity' occurs in the second half of February and is expressed by a wave on the curve smoothed with h=15. The smoothed annual cycles of the quantile characteristics are displayed in Fig.3. The tests have confirmed that the curves smoothed with h=50 are also representative.
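The local and robust weighting described above can be illustrated with a short Python sketch. It is not the authors' implementation. The tricube localisation weight, the bisquare robustness weight with a scale of six times the median absolute residual, the number of iterations and the optional `period` argument (which wraps distances for a cyclic x such as the day of the year) are all assumptions borrowed from common practice in locally weighted regression, and the names `rlwr`, `tricube`, `bisquare` and `solve` are hypothetical.

```python
import statistics


def solve(a, b):
    """Solve a small dense system a.x = b (Gaussian elimination with pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def tricube(u):
    """Localisation weight (assumed form): zero outside the window, |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def bisquare(u):
    """Robustness weight (assumed form): zero for large scaled residuals."""
    u = abs(u)
    return (1.0 - u ** 2) ** 2 if u < 1.0 else 0.0


def rlwr(xs, ys, h, degree=2, iterations=2, period=None):
    """Return the smoothed value at every x in xs for halfspan h."""
    robust = [1.0] * len(xs)
    fitted = []
    size = degree + 1
    for step in range(iterations + 1):
        fitted = []
        for x0 in xs:
            a = [[0.0] * size for _ in range(size)]
            b = [0.0] * size
            for x, y, rw in zip(xs, ys, robust):
                d = x - x0
                if period is not None:  # shortest distance on a cyclic axis
                    d = (d + period / 2.0) % period - period / 2.0
                u = d / h
                w = tricube(u) * rw
                if w == 0.0:
                    continue
                powers = [u ** k for k in range(2 * degree + 1)]
                for i in range(size):
                    b[i] += w * powers[i] * y
                    for j in range(size):
                        a[i][j] += w * powers[i + j]
            # The polynomial is centred at x0, so its constant term is r(x0).
            fitted.append(solve(a, b)[0])
        if step == iterations:
            break
        residuals = [y - f for y, f in zip(ys, fitted)]
        scale = 6.0 * statistics.median(abs(r) for r in residuals)
        if scale == 0.0:
            break
        robust = [bisquare(r / scale) for r in residuals]
    return fitted
```

For a daily annual cycle one might call `rlwr(days, values, h=50, period=366)`. The sketch fails with a division by zero if a window contains fewer points with non-zero weight than polynomial coefficients.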
To determine which of the two approaches listed in Tab.I is more suitable, we examined the fraction of measurements with TO < AVG + k.STD, where AVG and STD are the mean and standard deviation of TO for the respective day and k is a variable parameter successively set to -2, -1, 0, 1 and 2. Assuming a normal distribution of TO, the respective fractions should equal 0.023, 0.159, 0.5, 0.841 and 0.977 (horizontal lines in Fig.4). The values of the fractions displayed in Fig.4 indicate that the normal approximation is relatively good in the middle part of the distribution [within (AVG-STD; AVG+STD)] but that the model fails in approximating the tails of the TO distribution and thus would be unreliable in identifying extreme values of TO in either direction. It is therefore advisable to prefer quantiles (approach B in Tab.I) for assessing the state of total ozone.
To provide a basis for developing a procedure for predicting UV radiation and total ozone, the correlation between meteorological variables and total ozone was studied. Emphasis was put on variables which are routinely forecast by numerical weather prediction models and thus would be available for operational forecasting of TO. The total ozone data used in the subsequent tests are based on Dobson measurements in Hradec Kralove.
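The expected fractions quoted above are values of the standard normal distribution function at k = -2, ..., 2, and the empirical counterpart is a simple count. A minimal Python sketch follows. The function names and the input layout (a list of `(value, avg, std)` triples, one per observation, with `avg` and `std` taken for the respective day) are assumptions made for illustration.

```python
import math


def normal_cdf(k):
    """Standard normal distribution function evaluated at k."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))


def fraction_below(observations, k):
    """Share of observations with value < avg + k*std.

    observations: iterable of (value, avg, std) triples.
    """
    observations = list(observations)
    hits = sum(1 for value, avg, std in observations if value < avg + k * std)
    return hits / len(observations)


# Fractions expected under the normal model, rounded to three decimals.
expected = {k: round(normal_cdf(k), 3) for k in (-2, -1, 0, 1, 2)}
```

Comparing `fraction_below(observations, k)` with `expected[k]` for each k reproduces the kind of check summarised in Fig.4.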
Correlation of total ozone with the single-site upper-air characteristics was examined in a `Perfect Prog' approach (KLEIN et al., 1959). In developing prognostic equations, the predictand (dependent variable) is the daily average of total ozone, TO. The set of predictors is derived from aerological soundings in Prague (TEMP-A report) and includes geopotential heights, temperatures and wind components at the main pressure levels up to 100 hPa. The aerological data available for the present tests cover May-August of 1981-1991. Forward stepwise linear regression was employed to derive the prognostic equations. The selected predictors and the related reduction of variance, RV, are given in Tab.II. It is seen that: a) the applicable information contained in the aerological sounding (main pressure levels only) is nearly completely represented by the geopotential heights alone, b) incorporating persistence into the set of predictors slightly improves the prediction skill. Our results are in perfect agreement with BURROWS et al. (1994).
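The forward stepwise selection can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the stopping rule (a minimum gain in RV, `min_gain`) is hypothetical because the text does not give the selection criterion, RV is computed as 1 - SSE/SST, RMSE as sqrt(SSE/n), and all function names are invented for the example. Persistence would enter simply as one more candidate predictor.

```python
import math


def solve(a, b):
    """Solve a small dense system a.x = b (Gaussian elimination with pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residual_ss(columns, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    n = len(y)
    design = [[1.0] * n] + columns
    size = len(design)
    a = [[sum(u * v for u, v in zip(design[i], design[j]))
          for j in range(size)] for i in range(size)]
    b = [sum(u * v for u, v in zip(design[i], y)) for i in range(size)]
    beta = solve(a, b)
    return sum((y[k] - sum(beta[i] * design[i][k] for i in range(size))) ** 2
               for k in range(n))


def forward_stepwise(predictors, y, min_gain=0.005):
    """predictors: dict name -> list of values. Returns [(name, RV, RMSE), ...]."""
    n = len(y)
    mean = sum(y) / n
    total_ss = sum((v - mean) ** 2 for v in y)
    chosen, steps, rv = [], [], 0.0
    remaining = list(predictors)
    while remaining:
        trials = []
        for name in remaining:
            sse = residual_ss([predictors[c] for c in chosen + [name]], y)
            trials.append((1.0 - sse / total_ss, name, sse))
        best_rv, best_name, best_sse = max(trials)
        if best_rv - rv < min_gain:  # assumed stopping rule
            break
        chosen.append(best_name)
        remaining.remove(best_name)
        rv = best_rv
        steps.append((best_name, best_rv, math.sqrt(best_sse / n)))
    return steps
```

Each returned step corresponds to one row of a table such as Tab.II: the predictor added, the cumulative RV and the resulting RMSE.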
To account for the large-scale circulation conditions, total ozone was correlated with Hess-Brezowski's circulation patterns in a recent study by KALVOVA & HALENKA (1995). The summary results (Tab.III) show that the direction of the flow plays a more significant role than the type of the flow (AC = anticyclonic, C = cyclonic situations). In the present contribution, the circulation pattern is characterised by geopotential heights at 500 hPa on a 5 x 10 degree (latitude x longitude) grid extending from 40W to 40E and from 35N to 65N (Fig.5). The data for this analysis were available only for DEC+JAN+FEB (winter) and JUN+JUL+AUG (summer) of 1962-1980. To reduce the number of variables characterising the circulation pattern (63 gridpoints), principal component analysis was used to extract the most significant modes of circulation variability (HUTH, in press): 5 (7) principal components were selected for winter (summer) to explain about 89% (93%) of the variance in the original data.
Total ozone was related to the PC scores by stepwise regression analysis; the results are given in Tab.IV. They show that PC5 has the greatest effect in summer and PC3 in winter. PC5-summer corresponds to an anticyclone centred over the Alps with W anticyclonic flow over Central Europe; PC3-winter corresponds to an anticyclone over France with NW anticyclonic flow over Central Europe (Fig.5). In both cases, the minus sign of the regression coefficient indicates that TO decreases with increasing intensity of the anticyclone.
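The reduction of the gridded field to a few principal components can be illustrated with a small, dependency-free Python sketch. It is an assumption-laden stand-in, not the analysis of HUTH (in press): it works on the covariance matrix of the gridpoint values, extracts the leading eigenvectors by power iteration with deflation instead of a library eigen-solver, and uses hypothetical function names and a hypothetical fixed number of sweeps.

```python
import math


def centre(rows):
    """Subtract the column means; rows are days, columns are gridpoints."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(centred):
    """Sample covariance matrix of already centred data."""
    n, p = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1)
             for j in range(p)] for i in range(p)]


def principal_components(cov, count, sweeps=200):
    """Return [(eigenvalue, loading vector, explained fraction), ...]."""
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    work = [row[:] for row in cov]
    result = []
    for _ in range(count):
        v = [1.0 / (i + 1) for i in range(p)]  # arbitrary start vector
        for _ in range(sweeps):
            w = [sum(work[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:
                break
            v = [c / norm for c in w]
        w = [sum(work[i][j] * v[j] for j in range(p)) for i in range(p)]
        eigenvalue = sum(a * b for a, b in zip(v, w))
        result.append((eigenvalue, v, eigenvalue / total))
        # Deflation: remove the mode just found before searching for the next.
        for i in range(p):
            for j in range(p):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return result


def scores(centred, vector):
    """PC scores: projection of each day's centred field on one loading vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in centred]
```

Summing the explained fractions of the retained components gives the share of variance they account for, and the score series are the quantities that enter the subsequent regression. Power iteration converges slowly when neighbouring eigenvalues are close, so a library eigen-solver would be preferable for real data.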
In the first part of the contribution, robust locally weighted regression was used to determine characteristics of the annual cycle of total ozone which could serve as a reference in assessing real-time daily observations. The tests have shown that:
In the second part of the paper, total ozone was correlated with meteorological characteristics:
It is believed that the prediction skill will increase further if both types of information (single-site characteristics and large-scale circulation characteristics) are combined.
Acknowledgement: We thank Radan Huth of the Institute of Atmospheric Physics for providing the circulation pattern data and the necessary explanations. This work was supported by Grant Project GA CR 205/96/1554.
Table I. Limits for assessing normality of event.
x is considered            approach A           approach B
------------------------------------------------------------------
normal:                    A - S < x < A + S    X25 < x < X75
below-normal:              x < A - S            x < X25
above-normal:              x > A + S            x > X75
extremely below-normal:    x < A - 3.S          x < X25 - 1.5.C
extremely above-normal:    x > A + 3.S          x > X75 + 1.5.C
------------------------------------------------------------------
Legend: A = mean; S = standard deviation; X25 and X75 = lower and upper quartiles; C = X75 - X25 = interquartile range.
Table II. Prediction of total ozone (TO) based on single-site upper-air characteristics and (optionally) persistence. Predictors are selected by stepwise regression.
predictand: TO (RMSE = 30.3)
----------------------------------------------------------------
      without persistence              with persistence
----------------------------------------------------------------
Predictors  S   RV     RMSE      Predictors  S   RV     RMSE
----------------------------------------------------------------
set of predictors includes all characteristics from TEMP-A report:
T500        -   0.492  21.4      TO-         +   0.688  16.8
AVG         +   0.672  17.2      T500        -   0.762  14.7
T100        +   0.728  15.7      AVG         +   0.783  14.0
V150        +   0.744  15.2      T100        +   0.804  13.3
U850        -   0.748  15.1      V850        +   0.807  13.2
                                 U850        -   0.810  13.1
only geopotential heights are taken from TEMP-A report:
Z300        -   0.494  21.5      TO-         +   0.684  16.9
AVG         +   0.679  17.1      Z300        -   0.762  14.7
Z100        +   0.707  16.3      AVG         +   0.785  14.0
Z850        +   0.722  15.9      Z100        +   0.804  13.4
Z150        -   0.729  15.8      Z700        +   0.812  13.1
Z200        +   0.731  15.7
----------------------------------------------------------------
Legend: S = sign of the regression coefficient; RV = reduction of variance;
RMSE = root mean square error; Zxxx, Txxx, Uxxx, Vxxx = geopotential
height, temperature, zonal and meridional wind components, where xxx
represents the pressure level; AVG = mean climatology for TO;
TO- = yesterday's value of TO.
Table III. Number of days with above- and below-normal total ozone related to selected groups of Hess-Brezowski circulation types.
H-B          ALL       normal      above-normal   below-normal
               #      #      %       #      %       #      %
---------------------------------------------------------------
C-south      485    338   69.7      31    6.4     116   23.9
AC-south     614    410   66.8      29    4.7     175   28.5
C-north      794    445   56.0     293   36.9      56    7.1
AC-north     520    366   70.4     102   19.6      52   10.0
C           1279    783   61.2     324   25.3     172   13.4
AC          1134    776   68.4     131   11.6     227   20.0
south       1099    748   68.1      60    5.5     291   26.5
north       1314    811   61.7     395   30.1     108    8.2
Table IV. Prediction of total ozone based on large-scale circulation patterns and (optionally) persistence.
      without persistence              with persistence
Predictor   S   RV     RMSE      Predictor   S   RV     RMSE
-----------------------------------------------------------
summer: [RMSE(TO) = 25.9]
PC5         -   0.407  20.0      PC5         -   0.407  20.0
AVG         +   0.576  16.9      AVG         +   0.576  16.9
PC1         +   0.622  15.9      dTO-        +   0.721  13.7
PC6         -   0.633  15.7      PC1         +   0.730  13.5
                                 PC6         -   0.734  13.4
winter: [RMSE(TO) = 48.1]
AVG         +   0.333  39.3      dTO-        +   0.376  38.0
PC3         -   0.455  35.5      AVG         +   0.675  27.5
PC5         -   0.484  34.6      PC3         -   0.691  26.8
PC4         +   0.508  33.8      PC4         +   0.697  26.6
-----------------------------------------------------------
Legend: S = sign of the regression coefficient; RV = reduction of variance;
RMSE = root mean square error; dTO- = yesterday's value of total ozone.
Figure 1. Unsmoothed characteristics of daily total ozone. Curves (from below): a-3s, a-s, a, a+s, a+3s, where a is the mean (arithmetic average) and s is the standard deviation.
Figure 2. Annual cycles of the mean and the standard deviation of total ozone smoothed by robust locally weighted regression using a smoothing polynomial of 2nd degree and halfspan h=15 (thin line) and h=50 (heavy line). The hypothetical singularities are marked by circles.
Figure 3. Annual cycles of the median and the upper and lower quartiles of daily total ozone smoothed by robust locally weighted regression with halfspan h=15 (thin lines) and h=50 (heavy lines).
Figure 4. Percentage of daily observations below avg + k.std. Horizontal lines represent the values expected on the assumption of a normal distribution of total ozone.
Figure 5. Patterns of PC loadings related to the most explanatory circulation modes: PC5 for summer and PC3 for winter. [from HUTH, in press]