Published in: Studia geophysica et geodaetica,
vol. 39 (1995), No. 1, p. 84-100
language: English; includes 6 figures
and 1 table
Martin Dubrovsky
Institute of Atmospheric Physics, Hradec Kralove, Czech
Republic
www.ufa.cas.cz/dub/dub@htm
ABSTRACT:
The paper deals with probabilistic prediction of event occurrence
using a binary decision tree grown from a learning sample. The
tree-growing algorithm recursively partitions the predictor space,
either by single-predictor-based (SP) splits or by hyperplanes
perpendicular to the best linear discriminant function (BLDF), so as
to discriminate as effectively as possible between the elements of
the learning sample with event occurrence and those without it. The
predictand is thunderstorm occurrence in the afternoon in Prague.
The set of predictors includes variables derived from midday
single-station TEMP-A data (Perfect Prog approach), persistence
predictors and predictors related to passages of fronts across
Prague. The experiments are designed to test the
performance of the tree-growing algorithm, with emphasis on the
indeterminacy that follows from the limited size of the learning
sample, and to evaluate the predictive potential of the
predictors for thunderstorm forecasting. The stability of the
tree structure, the optimal size of the tree and the related
prognostic skill score increase with increasing size of the
learning sample. BLDF splits partition the predictor space more
quickly and more effectively, provided that the predictor vector
has a lower dimension and is 'well behaved' (preferably normally
distributed). The stability indices
of Faust, Showalter and Adedokun were found to be the most
effective predictors. Persistence and frontal predictors
contribute only slightly to the total prediction skill of the
decision tree. The optimally sized tree has only five splitting
nodes and employs three thermodynamic predictors, one frontal
predictor and one persistence predictor.
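
The recursive partition by SP splits described above can be illustrated with a short Python sketch. The abstract does not state the splitting criterion or the stopping rule, so this is only an illustration under assumptions: Gini impurity is assumed as the criterion, a fixed depth limit and a minimum node size are assumed as stopping rules, and the names `Node`, `gini`, `best_split`, `grow` and `predict` are hypothetical. BLDF splits are not covered.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    events: int                       # learning-sample elements with event occurrence
    total: int                        # all learning-sample elements in this node
    predictor: Optional[int] = None   # index of the splitting predictor (None if terminal)
    threshold: Optional[float] = None
    left: Optional["Node"] = None     # elements with x[predictor] <= threshold
    right: Optional["Node"] = None    # elements with x[predictor] > threshold

    @property
    def probability(self) -> float:
        # Fraction of elements with event occurrence in this node
        return self.events / self.total


def gini(events: int, total: int) -> float:
    # ASSUMPTION: Gini impurity; the paper's actual criterion is not given here
    if total == 0:
        return 0.0
    p = events / total
    return 2.0 * p * (1.0 - p)


def best_split(X: Sequence[Sequence[float]], y: Sequence[int]) -> Optional[Tuple[int, float]]:
    """Return (predictor index, threshold) with the largest impurity decrease."""
    n, n_events = len(y), sum(y)
    parent = gini(n_events, n)
    best, best_gain = None, 0.0
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # candidate threshold between neighbouring values
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            n_left, ev_left = len(left), sum(left)
            n_right, ev_right = n - n_left, n_events - ev_left
            child = (n_left * gini(ev_left, n_left) + n_right * gini(ev_right, n_right)) / n
            gain = parent - child
            if gain > best_gain + 1e-12:
                best, best_gain = (j, t), gain
    return best


def grow(X: List[List[float]], y: List[int], max_depth: int = 3, min_size: int = 2) -> Node:
    """Recursively partition the learning sample with single-predictor splits."""
    node = Node(events=sum(y), total=len(y))
    if max_depth == 0 or len(y) < min_size:
        return node
    split = best_split(X, y)
    if split is None:  # no split reduces impurity: keep the node terminal
        return node
    j, t = split
    left_idx = [i for i, row in enumerate(X) if row[j] <= t]
    right_idx = [i for i, row in enumerate(X) if row[j] > t]
    node.predictor, node.threshold = j, t
    node.left = grow([X[i] for i in left_idx], [y[i] for i in left_idx], max_depth - 1, min_size)
    node.right = grow([X[i] for i in right_idx], [y[i] for i in right_idx], max_depth - 1, min_size)
    return node


def predict(node: Node, x: Sequence[float]) -> float:
    """Drop a predictor vector down the tree and return the terminal-node probability."""
    while node.predictor is not None:
        node = node.left if x[node.predictor] <= node.threshold else node.right
    return node.probability
```

Calling `grow(X, y)` on a list of predictor vectors `X` and a list of 0/1 event indicators `y` returns the root node; `predict(tree, x)` then gives the event probability for a new predictor vector. Limiting `max_depth` is only a stand-in for the choice of an optimal tree size, which the paper determines in relation to the size of the learning sample.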

Figure 3. The optimal binary decision tree with
single-predictor-based splits developed in the Perfect Prog
approach (the tree was built from the learning sample with
values of predictors being derived from the noon aerological
soundings). The tree estimates the probability of thunderstorm
occurrence in Prague in the afternoon. The horizontal
position of each node is proportional to the conditional
probability of thunderstorm occurrence related to the node. Each
terminal node gives the prognostic probability as a fraction: the
number of elements with event occurrence (numerator) over the
total number of elements falling into the terminal node
(denominator).
Predictors: SICP = modified Showalter index,
FI = Faust index, POSSFC = energy
released by a surface parcel during buoyant rise beyond the level
of free convection, PERS = number of stations in
Bohemia reporting TS occurrence within a 24-h interval ending at
06 GMT, F<12,18> = indicator (0 or 1) of the passage of a front
(cold or occluded) across Prague within <12,18> GMT.
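
How a tree of this kind is read can be sketched in Python as follows. The figure itself is not reproduced here, so every threshold, count and branch direction in `TREE` is a made-up placeholder and NOT a value from Figure 3; the dictionary layout and the names `terminal_node` and `probability` are likewise hypothetical. Only the predictor names (SICP, PERS) and the numerator/denominator convention come from the caption.

```python
from fractions import Fraction

# HYPOTHETICAL tree: thresholds, counts and branch directions are placeholders
# chosen for illustration only; they are not taken from Figure 3.
TREE = {
    "predictor": "SICP", "threshold": 2.0,
    "low": {"events": 30, "total": 60},            # terminal node
    "high": {
        "predictor": "PERS", "threshold": 0.5,
        "low": {"events": 2, "total": 100},        # terminal node
        "high": {"events": 8, "total": 40},        # terminal node
    },
}


def terminal_node(tree, case):
    """Follow the splits until a node without a predictor (a terminal node) is reached."""
    node = tree
    while "predictor" in node:
        branch = "low" if case[node["predictor"]] <= node["threshold"] else "high"
        node = node[branch]
    return node


def probability(tree, case):
    """Prognostic probability = elements with event occurrence / all elements in the node."""
    node = terminal_node(tree, case)
    return Fraction(node["events"], node["total"])
```

For a case given as a dictionary of predictor values, for example `{"SICP": 5.0, "PERS": 3}`, `probability(TREE, case)` returns the fraction stored in the terminal node that the case falls into.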