Statistical Methods of Very Short Range Forecasting of Meteorological Phenomena


Autoreport on Doctoral thesis

RNDr. Martin Dubrovsky ***



Charles University
Faculty of Mathematics and Physics
Department of Meteorology and Environment Protection

Praha, October 1996


Branch: Meteorology and Climatology
Supervisor: RNDr. Jaroslava Kalvova, CSc, Charles University, Praha


1 Introduction

The present doctoral thesis is focused on basic aspects of statistical methods used in weather forecasting. It was elaborated partly within the frame of grant projects solved in the Institute of Physics of Atmosphere.

The work is divided into two parts. The first part is based upon literature review and consists of two chapters. Chapter 1 gives a summary of main points of statistical methodology used in developing prognostic equations. The review of the applied statistics is based mainly on GLAHN et al. (1991) and WILKS (1995). Introduction to thunderstorm forecasting, which is treated in the second part of the thesis, is given in chapter 2 and is based primarily on HOUZE (1993) and ALFORD et al. (1995). This chapter also contains a review of thunderstorm climatology and forecasting in the Czech Republic.

The second part of the work gives main results obtained by the author. The experiments aimed at development of equations for probabilistic prediction of thunderstorm occurrence in the territory of the Czech Republic by means of statistical analysis of learning data samples. The individual experiments were focused on:

This study started in 1991. Some of its results were already presented in several national and international conferences and published in two papers (DUBROVSKY, 1994; DUBROVSKY, 1995). Most of the results discussed in these conference contributions and papers are included in this work.

2 Statistical methods of weather forecasting - basic terms


Weather forecasting is a process by which the future weather, say in interval <t1,t2>, is estimated based on the information available in time t0 such that t0 < t1. The length of time delay Dt = t1 - t0 is called a lead time, interval <t1,t2> is a target forecast interval and the area for which the forecast is valid is a target forecast area.

The methods used in objective weather forecasting may in principal come from two categories. The first one follows mathematical modelling of the physical processes occurring in the atmosphere. This category includes numerical weather prediction (NWP) models based upon numerical integration of a set of primitive equations. These models are commonly used to forecast weather in lead times greater than about 12 h.

The second category involves statistical models. The prognostic equations are developed by statistical analysis of a sufficiently large learning (= training, developmental) data sample. The statistical methods use to be employed in very short range forecasting (Dt < 12 h) and in objective interpretation of NWP model output. In a latter case, prognostic equations are determined in either Perfect Prog or MOS approach (see further text for explanation). The weather elements forecasted by statistical methods typically include elements not provided by NWP models and site-specific weather conditions with a stress upon elements with high spatial variability (e.g., maximum and minimum surface temperature, cloudiness, visibility, fog occurrence, precipitation amount, occurrence and type, thunderstorm occurrence and severity, clear-air turbulence, surface air quality, local winds).

In statistical forecast models, the variables representing the weather elements to be forecasted are called predictands, and the variables the forecast is based on are called predictors. The definition of the predictand should reflect (i) the nature of the weather element, (ii) the distribution of observations (climatology) of the element, and (iii) the user requirement. Concerning choice of the predictors, it is advantageous to comprise as much as possible information in the lowest number of physically reasonable predictors. This makes the prognostic equations more stable. Although it is possible to employ all available data as predictors in a purely statistical approach, the resultant prognostic function is generally less efficient as it is demonstrated in Fig.1. The dependence of the predictand on the predictors is expressed in terms of the prognostic function which is to be determined from a learning sample. The learning sample consists of the previously observed values of both predictand and predictors. In order the prognostic function is serviceable, the learning sample should be representative and as large as possible.

Two types of forecasts can be generally identified: nonprobabilistic forecasts and probabilistic forecasts. Nonprobabilistic (deterministic, categorical) forecasts are statements indicating that the predictand will take some specific value. The prognostic function takes a form:

y^ = f(x)          (eq.1)

where x ELEMENTOF Xr is an observed predictor vector and y^ ELEMENTOF Y is an estimated value of the predictand. Prognostic function f is thus a projection from the r-dimensional predictor space, Xr, into the one-dimensional predictand space, Y. The disadvantage of the deterministic forecast is that it does not give a notion on the uncertainty involved in forecasting.

Probabilistic forecasts give an estimate of the conditional probability distribution of the predictand, p(y|x). In case the predictand is a continuous variable, the simplest probabilistic forecast may be given by constructing a 95% confidence interval around a point regression forecast through application of the +/-2xMSE0.5 rule, where MSE = E[(y^ - y)2] is a mean square error. In case the predictand takes its values from a finite set, Y = {v1, v2, ..., vry}, an exhaustive estimate of the conditional probability distribution of the predictand is given in terms of ry-1 probabilities:

fi(x) = Pr^(y = vi| x)         (eq.2)

If the predictand is a binary variable, we may set Y = {0,1} (0 for event non-occurrence and 1 for event occurrence) and the probabilistic prognostic function takes a form:

f(x) = Pr^(y=1|x)         (eq.3)

The quality of the probabilistic forecast of event occurrence may be measured by mean square error,

MSE(f) = E[(f - y)2]         (eq.4)

which is minimised if f(x) coincides with Pr(y=1|x) (= E(y|x). The least square regression techniques may be therefore used to estimate f(.). An alternative way for assessing quality of the forecast is a reduction of variance,

RV = 1 - MSE/MSEref  ,      where MSEref = E[(avg(y) - y)2]         (eq.5)

which measures performance of the prognostic function f(.) with respect to the constant forecast based on mean climatology (fref = avg(y)).

Depending on a type of data used in developing and implementing prognostic equations, three formulation methods are recognised:

3 Forecasting thunderstorms

In view of the difficulty in applying dynamical 3D models in forecasting convective weather, alternative techniques use to be employed. In nowcasting and forecasting with lead times up to 6 hours, the emphasis is put on detecting and tracking (with use of surface observations, satellites, radars and other remote sensing techniques) features of convective phenomena and estimating their development and movement for short intervals into the future. As the lead time prolongs beyond the lifetime of thunderstorms, forecasting methodology concentrates on assessment of environmental conditions for initiation of convection and its maintaining - low-level moisture, potential instability, wind shear and presence of triggering mechanism. The desired characteristics, including stability indices, use to be derived from aerological sounding data; for lead times longer than about 12 h, the characteristics are calculated from the NWP model output. A comprehensive summary of stability indices and related thermodynamical parameters used in forecasting convective phenomena may be found in PEPPLER and LAMB (1989).

In forecasting thunderstorms in the Czech Republic, the great stress has been put to an analysis of the aerological conditions. Stability indices of Faust, Showalter and Whiting seem to be the most favourite predictors in the Czech Republic. REZACOVA and MOTL (1990) aimed at an optimal utilisation of the multidimensional information contained in an aerological sounding (in fact, their paper became a starting point of the present study). More recently, REZACOVA and SOKOL (1995) included the larger-scale predictors (with a stress on vorticity characteristics; their data are used also in this study to develop the probabilistic forecast) to develop the categorical forecast of thunderstorm occurrence.

Quite different approach to thunderstorm forecasting was followed by BUKVA and STEKL (1982) who based their method on automated selection of analogous fields.

4 Development of the probabilistic forecast of thunderstorm occurrence for a territory of the Czech Republic

4.1 Data

The developmental procedures were based on two series of observational data:

a) 1981-1989, May-August (referred to as a "9y sample")
The series contains:


The derived learning sample consists of:


b) 1989-1991, May-September (referred to as a "3y sample"):
This series contains:

The derived learning sample consists of:

4.2 Type of the prognostic function

Regarding (a) the superiority of the probabilistic forecasts over categorical forecasts, (b) the ease of determining the probabilistic prognostic function for the binary predictand, and (c) possibility of converting the probabilistic forecast into categorical forecast, only probabilistic prediction (eq.3) was considered in this study. The prognostic function gives an estimate of a conditional probability of thunderstorm occurrence. If the accuracy of the prognostic function is measured by MSE (eq.4), the optimal prognostic function (denoted as f**(.)) is equivalent to the regression function, ry(x) = E(y|x), where y is a binary predictand and x is a multivariate predictor. The prognostic function was determined from the learning data sample by 4 regression methods:

REEP and logistic regression come from the family of the parametric techniques which assume the regression function to be of certain type, the latter two techniques (BDT and kNN) are robust techniques which make no assumption on the shape of the regression function. The shape of the regression function (for a 1D predictor) estimated by the four methods is displayed in Fig.3.

The performance of the latter three regression techniques (LOG, BDT, kNN) was examined on randomly generated data (Monte Carlo methodology) with given joint probability distribution. The results of the tests were found in good agreement with following general findings (schematically displayed in Fig.4):

In result, having learning sample of limited size, it may happen that the parametric technique may provide better results - compared to the robust techniques - even if the parametric model does not exactly fit the structure of the data. The robust technique become more efficient if the size of the learning sample exceeds certain threshold (n* in Fig.4b). It should be noted, that even the robust techniques (kNN and BDT) may perform better if the data are `well behaved' (preferably normally distributed).

When applied to real data (Fig.5), the REEP attained poorest results of the four regression techniques and the logistic regression was found to provide results slightly superior to the two robust techniques, even though the data are not normally distributed. The latter finding corresponds to the situation displayed in Fig.4b: the size of the learning sample is not sufficiently large for the robust techniques may optimally perform. Superiority of the logistic regression was confirmed also on a 1989-1991 sample.

4.3 Selection of predictors

In a 9y series, 3 types of predictors were considered:

The prediction was made for four 6-h, two 12-h and one 24-h target forecast intervals following the respective sounding term. Projections with zero lead time are considered as a Perfect Prog approach. Regarding (i) the climatological probability of thunderstorm occurrence has its maximum in the afternoon and (ii) prediction skill of predictors derived from the midnight soundings is quite insufficient, most of the tests were focused on prediction of thunderstorm occurrence in the afternoon based on midday aerological sounding (Perfect Prog approach).

Of the predictors derived from the TEMP-A report, stability indices of Faust, Adedokun and Showalter are the most significant. Two features were detected to play a role in formulating the thermodynamical predictors: assimilation of the humidity conditions and surface conditions in the predictor definition. The predictors which account for the two mentioned conditions have generally higher prediction skill. The role of the humidity conditions was also confirmed by tests with raw sounding data (geopotential heights, temperature and dew-point deficit at main pressure levels). Persistence and frontal predictors were found to have only low skill scores if considered as standalone predictors but they may improve performance of the prognostic system if the multivariate predictor is allowed. Clearly, both persistence and frontal predictors would be unavailable if the prognostic equations are used in objective interpretation of NWP model output.

The set of predictors included in a 9-y series consists of 38 thermodynamical, 7 persistent, and 7 frontal characteristics. To concentrate prognostic information in the lowest number of predictors, three screening procedures were employed: stepwise linear regression ( X(lin)), stepwise logistic regression ( X(log)) and binary decision tree run in SP mode (X(bdt)):

X(lin) = (FI, POSsfc, F<12,18>, PERS)

X(log) = (FI, HEATsfc, ADED2, F<12,18>, PERS)

X(bdt) = (SICP, POSsfc, PERS, FI, F<12,18>)

The three above sets differ only slightly: The first two predictors always come from the group of the thermodynamical predictors (FI is Faust index, SICP is modified Showalter index, ADED2 is Adedokun index, POSsfc is latent heat energy released by parcel during buoyant rise beyond level of free convection, HEATsfc is energy which must be supplied to parcel by low-level heating to induce convective overturning). The sets are supplemented by one frontal predictor (F<12,18> is binary predictor which takes value 1 if the front, either cold or occluded, passess across Prague during <12,18> GMT) and one persistence predictor (PERS is number of stations with thunderstorm occurrence during previous 24 hours) which increase the value of RV by about 3% compared to the situation when the persistence and frontal predictors are not taken into account (simulating the case when the prognostic equations are to be used for objective interpretation of the NWP model output). Alternative way to account for the front passages is to stratify the data set by occurrence of front passages across Prague. The results show that the stratification of the data may contribute to the prediction skill: using the binary predictor (front passage in this case) to stratify the data was found slightly superior to using it as a component of the multivariate predictor vector.

Based on stepwise logistic regression analysis applied on the 9-y sample, the composite stability index, INSTABIL, was developed:

INSTABIL = Pr^(TS<12,18> | TEMP-A) =
= {1 + exp[0.21 - 0.32FI + 0.0040HEATsfc - 0.32ADED2]}-1

In case the information on persistence and front passages is available, we may use yet more efficient composite predictor:

Pr^(TS<12,18> | TEMP-A, Persistence, Fronts) =
= {1 + exp[0.69 - 0.30FI + 0.0038HEATSFC - 0.33ADED2 - 0.044PERS - 1.1F<12,18>]}-1

where predictand TS<12,18> is defined as thunderstorm occurrence in Prague during <12,18> GMT. The superiority of the composite index INSTABIL over single stability indices was also confirmed on the 3y data.

The predictive power of the larger-scale predictors was examined using the 3y data sample. The larger-scale predictors were derived from objectively analysed upper-air data, the stress was put upon vorticity characteristics. Predictive power of the individual larger-scale predictors was found lower compared to the most efficient stability indices, but a skill score related to the multidimensional larger-scale information is comparable with the skill score related to the vector of stability indices (Fig.6). It should be noted here, however, that the skill scores related to the larger-scale predictors might be somewhat positively biased due to the way the set of candidate larger-scale predictors used in a present study was pre-selected from a great amount of predictors derived from individual pressure levels and individual grid-points.

Of the predictors being tested in all above experiments, the stability indices were confirmed to play a greatest role. The larger-scale predictors, frontal predictors and persistence may further improve the overall performance of the forecast model. The contributions of the individual groups of predictors to the total prediction skill may, however, change if the prognostic equations developed in a Perfect Prog approach would be used in practice. In this case, the predictors are derived from the output of the numerical weather prediction model instead of the objectively analysed aerological data. It is expected that this transition would generally decrease predictive skill of both single station (incl. stability indices) and larger-scale predictors. Nevertheless, due to better predictability of larger-scale features of atmospheric circulation compared to the predictability of stratification of temperature and humidity, it is assumed that the larger-scale predictors would be less affected by the transition.

4.4 Optimisation of the predictand

In a present study, predictand was defined as an occurrence of thunderstorm at least at s* stations of s stations contained in given area (Prague, 100 km surroundings of Prague, whole Bohemia) during given time interval. The question arises: How to define predictand (size of the target forecast area and value of s*) to achieve maximum utility of the thunderstorm forecast? It is suggested by the author to compare utility of differently formulated predictands by means of their potential to assist the local forecast. It is believed that optimisation of the prognostic model by maximising the local skill score may increase the value of the meteorological forecasts if designed for use in any computer-based decision support system.

The station-specific thunderstorm forecast was obtained by a two-step procedure suggested by the author: In a first step, the probabilistic forecast is given for a larger area with a binary predictand TSs*/s set to 1 if thunderstorm is observed at least at s* stations of total s stations contained in selected target forecast area. The first-step forecast is then localised for individual stations contained in the target area. The purpose of the two-step forecast is to provide a) general forecast for a larger area (first step), and b) spatially consistent site-specific forecast (second step). The utility of the forecast was expressed in terms of the station-specific skill scores and skill score integrated over selected area. The utility of the forecast scheme was found to depend on a definition of the first-step predictand (Fig.7). For each size of the first-step forecast area, an optimal value of s* exists for which the forecast scheme gains maximum integrated skill score. For example, considering the first step area coinciding with whole territory (containing 41 stations) optimal value of s* is about 7. Considering only 100 km surroundings of Prague with 14 stations available in the area, optimal value of s* is 3. An alternative way for giving a first-step forecast was also suggested: instead of using a binary predictand and estimating probability of event occurrence, number of stations with thunderstorm occurrence NTSs in the area containg s stations was estimated in a first step. This type of the forecast achieved prediction skill comparable with using an optimal binary predictand for each of the first-step forecast areas. The two-step forecast provides more spatially consistent site-specific forecast and higher local skill scores compared to local models developed separately for individual stations. The dependence of the local prediction skill on a distance of the station from the reference sounding station for various first-step predictands is displayed in Fig.7.

The final experiment discussed in the thesis aimed to objectively demarcate optimally sized regions for giving a first-step forecast. The regions were defined by agglomerative hierarchical cluster analysis. The objects to be clustered were individual stations, the between-station measure of similarity was based on a frequency of coincident occurrence of thunderstorm in the two stations and the between-clusters distance was calculated by average linkage formula. The cluster analysis resulted in three relatively compact regions (Fig.8).

And the future?
In most of the experiments, the predictors were related with predictands in a Perfect Prog approach. To get a realistic view on a performance of the prognostic equations developed by this method, the equations should be validated using NWP model output. Having sufficiently long series of NWP model output data, the set of prognostic equations might be also derived in a MOS approach. It would be then interesting to compare prediction skill of prognostic equations developed in Perfect Prog, MOS and classical approaches.

5. References

Acknowledgements

First of all, I thank Dr. Jaroslava Kalvova, CSc., the supervisor of this work, for her guidance, statistical consultations and incessant moral support. In next, I wish to thank Dr. Daniela Rezacova, CSc. who motivated me to start the research into the statistical methods of thunderstorm forecasting. She provided most of the data for the analysis and was a main `thunderstorm consultant' as well as the opponent to results of my research. I am also obliged to Dr. Zbynek Sokol, CSc., who gave me larger-scale predictors and necessary explanation for using them. Last and most important, I would like to thank my wife Iveta for support, encouragement and endless patience during the time before this thesis has been finished.

Figures

Figure 1. Demonstration of a utility of the physically reasonable predictors over pure statistical approach - comparison of the prediction skill of the raw TEMP-A data and thermodynamical predictors derived from the TEMP-A data. Predictand = occurrence of thunderstorm at least at 1 station in Prague during 12-18 GMT. Each of the four horizontal bars is labelled by the type of predictors considered [RAW = raw single-station TEMP-A data (only geopotential heights, temperatures and dew point deficit at main pressure levels), TD = `physically reasonable' thermodynamical predictors derived from the sounding, including stability indices] and regression method used to relate predictand with predictors (lin = stepwise linear regression, log = stepwise logistic regression). The strips of the bars correspond to the contributions of individual predictors selected by stepwise regression procedure: the contribution to RV due to the first (second, third, ...) selected predictor is represented by first (second, third, ...) strip from the left. Based on 1981-1989 learning sample.


Figure 2. The prognostic binary decision tree with single-predictor splits, developed in a Perfect Prog approach. Predictand = occurrence of thunderstorm in Prague during <12,18> GMT), the definition of the predictors is given in the text. The numbers in terminal nodes (ellipses) give the total number of elements of the learning sample falling into the terminal node (denominator = Nterminal) and number of those with event occurrence (numerator = N(1)terminal). The plus/minus signs below the decision nodes (rectangles) indicate where to proceed if the respective decision rule is satisfied (+) or not satisfied (-). See DUBROVSKY (1995) for details.


Figure 3. Regression estimate of the probabilistic prognostic function (eq.3) for binary predictand and one-dimensional predictor. Regression function is approximated by REEP (heavy dashed line), logistic function (dashed line), binary decision tree (solid step-wise line) and k nearest neighbours (heavy solid line). Predictor = Faust index, predictand = occurrence of thunderstorm in Prague during <12,24> GMT. Lines with symbols indicate conditional frequency function of the Faust index for predictand = 0 (pluses) and 1 (squares).


Figure 4. Convergence of the parametric (dashed funnels, B) and robust (solid funnels, A) estimates of the regression function f in terms of the convergence of their error measure, e(f) (e(f) may coincide with mean square error). (a) Parametric technique converges to the optimal solution [f*(Fi) = f**]. (b) parametric technique does not converge to the optimal solution [e(f*(Fi)) > e(f**)]. Fi is a set of all functions which may be expressed by the parametric model, f*(Fi) is a function selected from Fi which minimises MSE.
a)
b)


Figure 5. Performance of six regression techniques when applied to real data. The vertical bars demarcate dispersion (in terms of <average +/- standard deviation> interval) of sample estimate of RV calculated on the independent verification samples (n = 526) randomly selected from the complete data set in ten runs. The prognostic functions were determined by regression analysis applied to the learning samples of the same size (526 data units). Applied regression techniques: (A) logistic regression; (B) tree growing algorithm run in an SP mode, (C) tree growing algorithm run in a BLDF mode; (D) k nearest neighbours with distance measured on a best linear discriminant function; (E) k nearest neighbours with weighted Mahalanobis distance; (F) REEP. Predictand = thunderstorm occurrence in Prague during <12,18> GMT. The predictor vector is formed by first r variables taken from X(log) = (FI, HEATsfc, ADED2, F<12,18>, PERS) (see text for description of the predictors). The five predictors included in X(log) were selected by stepwise logistic regression applied to the set of 38 thermodynamical, 7 persistent, and 7 frontal characteristics. The complete set of predictors (r = 52) was processed only by tree growing algorithm run in an SP mode.


Figure 6. Comparison of the prediction skill of single-site (SS) and larger-scale (LS) predictors in forecasting thunderstorm occurrence. Predictand = occurrence of thunderstorm at least at 3 stations within 100 km from Prague during <12,18> GMT (lower part of the graph) and <18,24> GMT (upper part of the graph). The bars are denoted by type of predictors considered (SS and LS) and regression method used to relate predictand with predictors (LIN = stepwise linear regression, LOG = stepwise logistic regression). The strips of the bars correspond to the contributions of individual predictors selected by stepwise regression procedure (from left to right - the reduction of the variance due to the first selected predictor is represented by the leftmost strip).


Figure 7. Local prediction skill of the two-step station-specific prognostic model. The first-step target forecast area contains all 41 synoptical stations (displayed in Fig.8); the forecast is localised with use of station-related thunderstorm climatology. Legend: local relates to the local models developed separately for each station; fractions 3/41, 7/41 and 15/41 indicate the variant of the binary first-step predictand (TSs*/41 with s* being 3, 7 and 15, respectively); symbols x mark the skill scores related to the fractional first-step predictand (NTS41). The monotonically increasing line displays the distance of the station from the aerological sounding station.


Figure 8. Hierarchical cluster analysis applied to the set of synoptical stations. The between-station similarity measure is based on concurrent occurrence of thunderstorm within four 6h periods, the between-clusters distance was calculated by average linkage formula.