In this article we discuss how to estimate extreme wind speeds without regard to direction.

To within a constant dimensional factor, the time series W_j is the same as the time series max_i [v_j(θ_i)] = V_j (j = 1, 2, ..., N). Extreme idealized wind loads can therefore be obtained from estimates of the extreme variate V inferred from this time series.

The vast majority of structural engineering calculations for wind are based on idealized extreme pressures, rather than actual extreme pressures.

This state of affairs is due to:

(1) The difficulty of codifying the estimation of actual extreme wind loads for most ordinary structures,

(2) The generally inadequate availability of directional aerodynamic and wind climatological data, and

(3) Computational inconvenience.

This last factor carries less weight in the age of personal computers, and it may be that in the near future expert systems with adequate databases will increasingly allow directional effects to be accounted for in the estimation of wind loads. Nevertheless, for the time being, estimating extreme wind speeds without regard for direction remains an important structural engineering problem.

The estimation of nominal design wind speeds (e.g., wind speeds with, say, a 50-year return period) is in general not unduly sensitive to the choice, within reasonable limits, of the statistical estimation procedure and the distributional form assumed to underlie the data. For example, the method of moments is inferior to the probability plot correlation coefficient (ppcc) method, but using it instead of the ppcc method to estimate 50-year wind speeds entails errors of only about 3 to 5 percent.

Similar errors are inherent in the use of the assumption that a Frechet distribution with tail length parameter γ = 9, rather than a Gumbel distribution, best fits the data. However, if ultimate loads (or load factors) are of interest, the results can be sensitive to the choice of estimation procedure and distribution.

Extreme largest value distributions are, strictly speaking, valid only in the asymptotic limit of large extremes. It is nevertheless reasonable to assume that extreme winds are described probabilistically at least approximately by extreme largest value distributions.

There are exactly three such distributions. In order of increasing tail lengths, they are the reverse Weibull distribution, the Gumbel distribution, and the Frechet distribution. A remarkable feature of the reverse Weibull distribution is its finite upper tail.

The American National Standard A58.1-1972 (a predecessor of the current ASCE Standard 7-1993) was based on the assumption that extreme wind speeds are described by a Frechet distribution with tail length parameter γ = 9. As shown by subsequent studies, it may be confidently assumed that the Gumbel distribution, which is shorter-tailed than the Frechet distribution with γ = 9, is a better probabilistic model of the extreme speeds.

However, even studies based on the Gumbel model result in apparently unrealistically high estimates of failure probabilities. This may be explained in part by the fact that those studies do not adequately account for wind direction effects. An additional explanation may be that the extreme speeds are best fitted not by Gumbel distributions, which have infinite upper tails, but rather by reverse Weibull distributions, which, like wind speeds in nature, have finite upper tails.

Recent substantial advances in extreme value theory appear to justify efforts to develop more realistic probabilistic models of extreme wind speeds and, consequently, more realistic wind load factors. This is likely to be true in spite of difficulties such as the limited availability of long-term data, the current insufficiency of comprehensive meteorological models available to the extreme wind analyst, and limitations inherent in statistical procedures. We describe some recent contributions to these efforts.

Classical Extreme Value Theory and ‘Peaks over Threshold Methods’:

Classical extreme value theory is based on the analysis of data consisting of the largest value in each of a number of basic comparable sets called epochs (a set consisting, e.g., of a year of record, or of a sample of data of given size; in wind engineering, it has been customary to define epochs by calendar years). For independent, identically distributed variates with cumulative distribution function F, the distribution of the largest of a set of n values is simply F^n.

With proper choice of the constants a_n and b_n, and for reasonable F’s, F^n(a_n + b_n x) converges to a limiting distribution, known as the asymptotic distribution. A notable result of the theory is that there exist only three types of asymptotic extreme largest value distributions, known, in order of decreasing tail length, as the Frechet (or Fisher-Tippett Type II), Gumbel (Type I), and reverse Weibull (Type III) distributions.

In contrast to classical theory, the theory developed in recent years makes it possible to analyze all data exceeding a specified threshold, regardless of whether they are the largest in the respective sets or not. An asymptotic distribution — the Generalized Pareto Distribution (GPD) — has been developed using the fact that exceedances of a sufficiently high threshold are rare events to which the Poisson distribution applies.

The expression for the GPD is:

$$G(y) = \mathrm{Prob}[Y < y] = 1 - \left\{\left[1 + \frac{cy}{a}\right]^{-1/c}\right\}, \qquad a > 0, \quad 1 + \frac{cy}{a} > 0 \qquad (5)$$

Equation 5 can be used to represent the conditional cumulative distribution of the excess Y = X – u of the variate X over the threshold u, given X > u, for u sufficiently large. The cases c > 0, c = 0 and c < 0 correspond respectively to Frechet, Gumbel, and reverse Weibull (right tail-limited) limiting distributions. For c = 0 the term [1 + (cy/a)]^(-1/c) is understood in a limiting sense as the exponential exp(-y/a).
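As an illustration, Eq. 5 can be evaluated numerically as in the minimal sketch below. The function name gpd_cdf and the tolerance used to detect c = 0 are assumptions made for this example. For c < 0 the excess is bounded above by -a/c, which is where the finite upper tail of the reverse Weibull case comes from.

```python
import math


def gpd_cdf(y, a, c):
    """Generalized Pareto CDF of the excess y = x - u (Eq. 5).

    a is the scale parameter (a > 0) and c the tail length parameter.
    """
    if a <= 0:
        raise ValueError("scale parameter a must be positive")
    if y <= 0:
        return 0.0
    if abs(c) < 1e-12:
        # Limiting case c = 0: exponential tail.
        return 1.0 - math.exp(-y / a)
    base = 1.0 + c * y / a
    if base <= 0:
        # Only possible for c < 0: y lies beyond the upper bound -a/c.
        return 1.0
    return 1.0 - base ** (-1.0 / c)
```

For example, gpd_cdf(2.0, 2.0, -0.5) describes an excess with upper bound -a/c = 4, and any excess above 4 has cumulative probability 1.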

The peaks over threshold approach reflected in Eq. 5 can extend the size of the sample being analyzed. Consider, for example, two successive years in which the respective largest wind speeds were 30 m/s and 45 m/s, and assume that in the second year winds with speeds of 31 m/s, 37 m/s, 41 m/s and 44 m/s were also recorded, at dates separated by sufficiently long intervals (i.e., longer than a week, say) to view the data as independent.

For the purposes of threshold theory the two years would supply six data points. The classical theory would make use of only two data points. In fact it may be argued that, by choosing a somewhat lower threshold, the number of data points used to estimate the parameters of the GPD could be considerably larger than six in our example.
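The difference in sample size can be checked with a short sketch using the wind speeds quoted in this example. The threshold of 29 m/s and the function names are hypothetical, chosen only so that all six values exceed the threshold.

```python
# Independent storm peaks (m/s) from the two-year example, keyed by year.
records = {
    1: [30.0],
    2: [31.0, 37.0, 41.0, 44.0, 45.0],
}


def epochal_sample(records_by_year):
    """Classical approach: keep only the largest value in each year."""
    return [max(values) for values in records_by_year.values()]


def threshold_sample(records_by_year, threshold):
    """Peaks-over-threshold approach: keep every value above the threshold."""
    return [x for values in records_by_year.values() for x in values if x > threshold]


annual_maxima = epochal_sample(records)        # two data points
exceedances = threshold_sample(records, 29.0)  # six data points
```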

Description of CME, Pickands and Dekkers-Einmahl-De Haan Methods:

Several methods have been proposed for estimating GPD parameters: the Conditional Mean Exceedance method (CME), the Pickands method, and the Dekkers-Einmahl-de Haan method (or, for brevity, the de Haan method).

Conditional Mean Exceedance (CME) Method:

The CME (or mean residual life, MRL, as it is usually termed in biometric or reliability contexts) is the expectation of the amount by which a value exceeds a threshold u, conditional on that threshold being attained. If the exceedance data are fitted by the GPD model and c < 1, u > 0, and a+uc > 0, then the CME plot (i.e., CME vs. u) should follow a line with intercept a/(1-c) and slope c/(1-c). The linearity of the CME plot can thus be used as an indicator of the appropriateness of the GPD model, and both c and a can be estimated from the CME plot.
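A minimal sketch of this idea follows, assuming an ordinary least-squares line through the CME plot. The function names are hypothetical, and a practical analysis would also need to decide which range of thresholds to include in the fit. With slope m = c/(1-c) and intercept b = a/(1-c), the parameters are recovered as c = m/(1+m) and a = b(1-c).

```python
def mean_excess(data, u):
    """Sample CME: average of (x - u) over the values x that exceed u."""
    excesses = [x - u for x in data if x > u]
    if not excesses:
        raise ValueError("no data above the threshold")
    return sum(excesses) / len(excesses)


def fit_cme_line(thresholds, cme_values):
    """Least-squares line through the CME plot, converted to (c, a)."""
    n = len(thresholds)
    u_bar = sum(thresholds) / n
    m_bar = sum(cme_values) / n
    sxx = sum((u - u_bar) ** 2 for u in thresholds)
    sxy = sum((u - u_bar) * (m - m_bar) for u, m in zip(thresholds, cme_values))
    slope = sxy / sxx
    intercept = m_bar - slope * u_bar
    c_hat = slope / (1.0 + slope)        # from slope = c / (1 - c)
    a_hat = intercept * (1.0 - c_hat)    # from intercept = a / (1 - c)
    return c_hat, a_hat
```

A clearly nonlinear CME plot would suggest that the GPD model is not appropriate over the chosen range of thresholds, in which case the fitted values should not be relied upon.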

Pickands Method:

Following Pickands’ (1975) notation, let X(1) ≥ X(2) ≥ ... ≥ X(n) denote the order statistics (ordered sample values) of a sample of size n. For s = 1, 2, ..., [n/4] ([ ] denoting the integer part), one computes F_s(x), the empirical estimate of the exceedance CDF

$$F_s(x) = \mathrm{Prob}\left(X - X_{(4s)} < x \mid X > X_{(4s)}\right) \qquad (6)$$

and G_s(x), the Generalized Pareto distribution, with a and c estimated by

$$\hat{c} = \frac{\log\left[\left(X_{(s)} - X_{(2s)}\right) / \left(X_{(2s)} - X_{(4s)}\right)\right]}{\log 2} \qquad (7)$$

$$\hat{a} = \frac{\hat{c}\,\left(X_{(2s)} - X_{(4s)}\right)}{2^{\hat{c}} - 1} \qquad (8)$$

One takes for the Pickands estimators of c and a those values which minimize (for 1 ≤ s ≤ [n/4]) the maximum distance between the empirical exceedance CDF and the GPD model.
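For a single value of s, Eqs. 7 and 8 can be evaluated as in the minimal sketch below. The function name is hypothetical, the order statistics are taken in decreasing order as in the notation above, and the search over s that minimizes the distance between the empirical CDF and the GPD is not included.

```python
import math


def pickands_estimates(sample, s):
    """Evaluate Eqs. 7 and 8 for one s, with X(1) >= X(2) >= ... >= X(n)."""
    x = sorted(sample, reverse=True)  # x[0] is X(1), the largest value
    if s < 1 or 4 * s > len(x):
        raise ValueError("s must satisfy 1 <= s <= n/4")
    x_s, x_2s, x_4s = x[s - 1], x[2 * s - 1], x[4 * s - 1]
    # Tied order statistics make the ratio zero or undefined; this sketch
    # assumes X(s) > X(2s) > X(4s).
    c_hat = math.log((x_s - x_2s) / (x_2s - x_4s)) / math.log(2.0)
    if abs(c_hat) < 1e-12:
        # Limit of c / (2**c - 1) as c tends to zero.
        a_hat = (x_2s - x_4s) / math.log(2.0)
    else:
        a_hat = c_hat * (x_2s - x_4s) / (2.0 ** c_hat - 1.0)
    return c_hat, a_hat
```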

Following a critique of an earlier implementation of the Pickands method, an alternative implementation was developed, which entailed the following steps:

(1) Choose as threshold u an order statistic of the sample;

(2) Compute the empirical exceedance CDF for the data above u;

(3) Fit the GPD model to that empirical CDF by nonlinear least squares to obtain the parameters c and a;

(4) Plot the resulting c estimates against u for each order statistic.

If the plot of c is stable around some horizontal level for most of the order statistic thresholds plotted, then the plot is presumptive evidence for the GPD model being applicable and can be used to yield numerical estimates of c; the distribution is reverse Weibull, Frechet or Gumbel according as c is negative, positive, or fluctuates around zero. The approach just described was suggested by Bingham (1990).

De Haan Method:

Recent work by de Haan (1994) and coworkers provides a moment-based estimator which, like Pickands’ estimator, is asymptotically unbiased for the true tail parameter and, in addition, is asymptotically normal. We now describe this estimator, using the order-statistic notation introduced above.

Let n denote the total number of data and k the number of data above the threshold u. (Note that u is then the (k+1)-th highest data point.)

Compute, for r=1 and r=2, the quantities:
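A minimal sketch of a moment estimator of this kind is given here. It assumes the standard Dekkers-Einmahl-de Haan form, in which the quantities for r = 1 and r = 2 are the means of the r-th powers of the log-excesses log X(i) - log u over the k values above the threshold. The function name and the form of the scale estimate (u times the first moment, multiplied by 1 - c when c is negative) are assumptions for this example and should be checked against the original references.

```python
import math


def de_haan_estimates(sample, k):
    """Moment-type estimates of c and a from the k highest values.

    Assumed form (not quoted from this article):
      M_r   = mean of [log X(i) - log u]**r, i = 1..k, for r = 1, 2
      c_hat = M_1 + 1 - 0.5 / (1 - M_1**2 / M_2)
      a_hat = u * M_1 / rho, with rho = 1 if c_hat >= 0 else 1 / (1 - c_hat)
    """
    x = sorted(sample, reverse=True)
    if not 1 <= k < len(x):
        raise ValueError("k must satisfy 1 <= k < n")
    u = x[k]  # threshold: the (k+1)-th highest data point
    if u <= 0:
        raise ValueError("the threshold must be positive to take logarithms")
    log_excess = [math.log(xi / u) for xi in x[:k]]
    m1 = sum(log_excess) / k
    m2 = sum(v * v for v in log_excess) / k
    # The denominator vanishes if all log-excesses are equal (e.g. k = 1).
    c_hat = m1 + 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    rho = 1.0 if c_hat >= 0 else 1.0 / (1.0 - c_hat)
    a_hat = u * m1 / rho
    return c_hat, a_hat, u
```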

Estimation of Variates with Specified Mean Recurrence Intervals:

For wind engineering purposes the estimates of the wind speeds corresponding to various mean recurrence intervals are of interest. The value of the variate corresponding to any percentage point 1 – 1/(λR), where λ is the mean crossing rate of the threshold u per year (i.e., the average number of data points above the threshold u per year) and R is the mean recurrence interval in years, follows from Eq. 5. Setting G(y) = 1 – 1/(λR) and solving for y gives

$$y = -\frac{a\left[1 - (\lambda R)^{c}\right]}{c}$$

so that the value of the variate with mean recurrence interval R is

$$x_R = y + u$$

where u is the threshold used in the estimation of c and a.
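A minimal sketch of this calculation, obtained by inverting Eq. 5 at the percentage point 1 – 1/(λR), is shown below. The function name is hypothetical, and the c = 0 branch uses the exponential limit of Eq. 5.

```python
import math


def gpd_return_level(u, a, c, rate_per_year, recurrence_years):
    """Value of the variate with mean recurrence interval R (in years).

    Solves G(y) = 1 - 1/(lambda * R) for the excess y and returns u + y.
    """
    lam_r = rate_per_year * recurrence_years
    if lam_r <= 1.0:
        raise ValueError("lambda * R must exceed 1")
    if abs(c) < 1e-12:
        return u + a * math.log(lam_r)          # exponential (c = 0) limit
    return u - a * (1.0 - lam_r ** c) / c
```

For c < 0 the result approaches the upper bound u - a/c as R grows, which is how the finite upper tail shows up in the estimated speeds.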

Relations between Distribution Parameters and Expected Value and Standard Deviation:

Relations between distribution parameters and the expectation E(X) and the standard deviation s(X) for the Gumbel and reverse Weibull distribution are given below. (Subscripts G and W refer to the Gumbel and reverse Weibull distributions, FG(x) and Fw(x), respectively.)
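As a sketch of such relations, the code below uses the standard textbook forms under an assumed parameterization, which may differ from the one intended here: F_G(x) = exp{-exp[-(x - mu)/sigma]} for the Gumbel distribution and F_W(x) = exp{-[(mu - x)/sigma]^gamma}, x < mu, for the reverse Weibull distribution. The symbols mu, sigma and gamma and the function names are assumptions for this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_mean_std(mu, sigma):
    """E(X) and s(X) for F_G(x) = exp(-exp(-(x - mu) / sigma))."""
    mean = mu + EULER_GAMMA * sigma
    std = math.pi * sigma / math.sqrt(6.0)
    return mean, std


def reverse_weibull_mean_std(mu, sigma, gamma):
    """E(X) and s(X) for F_W(x) = exp(-((mu - x) / sigma) ** gamma), x < mu.

    Here (mu - X) / sigma follows a unit-scale Weibull distribution with
    shape gamma, and mu is the finite upper bound of X.
    """
    g1 = math.gamma(1.0 + 1.0 / gamma)
    g2 = math.gamma(1.0 + 2.0 / gamma)
    mean = mu - sigma * g1
    std = sigma * math.sqrt(g2 - g1 * g1)
    return mean, std
```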

Results of Monte Carlo Simulations:

Preliminary Monte Carlo studies reported by Gross et al. (1994) led to the following tentative conclusions:

Comparison of Estimation Methods:

The CME and the de Haan methods are competitive. Both methods are superior to the Pickands method. The de Haan method gives better estimates than the CME method for extremes with large mean recurrence intervals. Note, however, that the de Haan method as described in Gross et al. (1994) was based on the de Haan estimator of the parameter c (Eq. 10a), and on an estimator of the parameter a less precise than Eq. 10b. For this reason it is reasonable to expect that the de Haan method that makes use of both Eqs. 10a and 10b performs better than the CME method.

Optimal Crossing Rate:

A high threshold reduces the bias since it conforms best with the asymptotic assumption on which the GPD is based; however, because it results in a small number of data, it increases the sampling error. It appears that, with no significant error, an approximately optimal threshold corresponds to a mean exceedance rate of 5/yr to 15/yr.
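One simple way to obtain a threshold with a prescribed mean exceedance rate is to take it as an order statistic of the record, as in the sketch below. The function name is hypothetical, and the peaks are assumed to have been thinned beforehand so that they can be treated as independent.

```python
def threshold_for_rate(peaks, record_years, rate_per_year):
    """Pick the threshold giving roughly rate_per_year exceedances per year.

    Returns (u, k), where k = round(rate * years) and u is the (k+1)-th
    highest value, so that exactly k values exceed u if there are no ties.
    """
    k = int(round(rate_per_year * record_years))
    x = sorted(peaks, reverse=True)
    if not 1 <= k < len(x):
        raise ValueError("requested rate is incompatible with the record length")
    return x[k], k
```

Repeating the parameter estimation for several rates in the 5/yr to 15/yr range and comparing the results is one way to see how sensitive the estimates are to the threshold.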

Results of Extreme Wind Speed Analyses:

Results of analyses performed on sets of about 20 to 45 yearly maximum wind speeds recorded at various U.S. sites were reported by Lechner et al. (1992). About one hundred data samples of 20 to 45 years, recorded at stations not affected by hurricanes, were analyzed by the CME procedure. For more than two-thirds of the samples the c values estimated by the modified Pickands method were negative.

The same data were recently analyzed by Simiu and Heckert (1995) by using the de Haan method. These analyses confirmed the results of Lechner et al. (1992). However, because the number of data available in these samples is small, especially for large thresholds, the confidence bands for the estimates tend to be relatively wide.

Analyses were also done for 48 sets of daily data records with lengths of 15 to 24 years. As explained in Gross et al. (1995), the number of data in the sets was reduced by a factor of four to decrease the effect of correlation due to wind speeds recorded in the same storm. Results based on de Haan’s method (Eq. 10a), as opposed to the more inconclusive results based on the CME method reported by Gross et al. (1995), showed an unmistakable tendency of the estimated values of c to be negative (Simiu and Heckert, 1995).

These results are significant. They provide evidence that extreme value statistics reflect the physical fact that wind speeds are bounded. However, it appears that dependable quantitative information for use in structural reliability estimates and the development of wind load factors for building standards would require larger data sets than are presently available. We note that, using a different approach, Kanda (1994) also showed that extreme winds are best fitted by distributions with limited tails.

Sampling Errors in Estimation of Extreme Wind Speeds:

Estimates of sampling errors are available under the assumption that the extreme annual wind speeds have a Gumbel distribution. Based on that assumption, the standard deviation of the sampling errors was estimated to be about 5 to 10 percent of the wind speeds obtained from an approximately 30-year-long sample of maximum yearly data. Sampling errors so estimated are acceptable approximations for use in reliability calculations.
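The order of magnitude of such sampling errors can be explored with a small Monte Carlo sketch. Everything specific in it is an assumption for illustration only: the Gumbel location and scale (20 m/s and 3 m/s), the 50-year recurrence interval, the use of the method of moments, and the function names.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_sample(n, mu, sigma, rng):
    """Draw n Gumbel variates by inverting F(x) = exp(-exp(-(x - mu) / sigma))."""
    values = []
    while len(values) < n:
        p = rng.random()
        if p > 0.0:  # skip the (very unlikely) zero, where the log is undefined
            values.append(mu - sigma * math.log(-math.log(p)))
    return values


def mom_gumbel_quantile(sample, recurrence_years):
    """Method-of-moments estimate of the R-year value of a Gumbel variate."""
    sigma_hat = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    mu_hat = statistics.mean(sample) - EULER_GAMMA * sigma_hat
    return mu_hat - sigma_hat * math.log(-math.log(1.0 - 1.0 / recurrence_years))


def relative_sampling_error(n=30, recurrence_years=50, mu=20.0, sigma=3.0,
                            trials=2000, seed=0):
    """Standard deviation of the estimates divided by their mean."""
    rng = random.Random(seed)
    estimates = [
        mom_gumbel_quantile(gumbel_sample(n, mu, sigma, rng), recurrence_years)
        for _ in range(trials)
    ]
    return statistics.stdev(estimates) / statistics.mean(estimates)
```

Calling relative_sampling_error() with other sample sizes or Gumbel parameters shows how the relative error changes with record length and with the variability of the annual extremes.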

Gust Wind Speeds versus Fastest Mile Speeds:

Peterka (1992) reported results of extreme wind analyses based on peak gust, as opposed to fastest-mile, records, and used a technique to reduce variability due to sampling error by combining stations with short records into “superstations” with long records. The acceptability of this technique is a function of the degree of mutual dependence of the storm occurrences at the various stations being consolidated.