Algebra for the Forward Search

Francesca Torti (European Commission, Joint Research Centre (JRC)), Marco Riani (University of Parma, Italy), Anthony C. Atkinson (London School of Economics, UK), Domenico Perrotta (European Commission, Joint Research Centre (JRC)), Aldo Corbellini (University of Parma, Italy)
which is specified in advance. The LTS estimate minimizes the sum of squares of the residuals of $h$ observations. For LS, $h = n$. In the generalization of Least Median of Squares (LMS, Rousseeuw, 1984) that we monitor, the estimate minimizes the median of $h$ squared residuals. (Both hard-trimming criteria are sketched in code after this list.)
2. Adaptive Hard Trimming. In the Forward Search (FS), the observations are again hard trimmed, but the value of $h$ is determined by the data, being found adaptively by the search. Data analysis starts from a very robust fit to a few, carefully selected, observations found by LMS or LTS with the minimum value of $h$. The number of observations used in fitting then increases until all are included. (See Atkinson and Riani, 2000 and Riani et al., 2014c for regression; Atkinson et al., 2010 for a general survey of the FS, with discussion; and Cerioli et al., 2014 for results on consistency.)
3. Soft Trimming (downweighting). M estimation and derived methods. The intention is that observations near the centre of the distribution retain their value, but the $\rho$ function ensures that increasingly remote observations have a weight that decreases with distance from the centre.
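To make the two hard-trimming criteria in item 1 concrete, here is a minimal MATLAB sketch that evaluates both for a candidate fit. The variable names (y, X, beta, h) are illustrative, and the LMS line follows the generalized form described above; this is an interpretation of the criteria, not the estimation algorithm itself.

```matlab
% Hard-trimming criteria for a candidate coefficient vector (illustrative sketch).
% y: n-by-1 responses; X: n-by-p design; beta: p-by-1 candidate coefficients;
% h: number of observations retained (h = n recovers ordinary least squares).
e2 = sort((y - X*beta).^2);   % squared residuals, in ascending order
ltsCrit = sum(e2(1:h));       % LTS: sum of squares of the h smallest residuals
lmsCrit = median(e2(1:h));    % generalized LMS: median of h squared residuals
```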
We shall consider all three classes of estimator. The FS by its nature provides a series of decreasingly robust fits, which we monitor for outliers in order to determine how to increment the subset of observations used in fitting. For LTS and LMS we fit the regression model to subsets of increasing size $h$. For S estimation, which we use as our example of soft trimming, we look at fits as the breakdown point varies. Here our focus is on SAS programs.
3. Algebra for the Forward Search
Examples and a discussion of monitoring using the MATLAB version of FSDA are in Riani et al. (2014a). Describing the SAS procedures that are the subject of this paper requires fuller details than are given there.
It is convenient to rewrite the regression model Eq. (1) in matrix form as $y = X\beta + \epsilon$, where $y$ is the $n \times 1$ vector of responses, $X$ is an $n \times p$ full-rank matrix of known constants (with $i$th row $x_i^T$), and $\beta$ is a vector of $p$ unknown parameters.
The least squares estimator of $\beta$ is $\hat{\beta}$. Then the vector of $n$ least squares residuals is $e = y - \hat{y} = y - X\hat{\beta} = (I - H)y$, where $H = X(X^T X)^{-1} X^T$ is the 'hat' matrix, with diagonal elements $h_i$ and off-diagonal elements $h_{ij}$. The residual mean square estimator of $\sigma^2$ is $s^2 = e^T e/(n-p) = \sum_{i=1}^{n} e_i^2/(n-p)$.
The forward search fits subsets of observations of size $m$ to the data, with $m_0 \le m \le n$. Let $S(m)$ be the subset of size $m$ found by the forward search, for which the matrix of regressors is $X(m)$. Least squares on this subset of observations yields parameter estimates $\hat{\beta}(m)$ and $s^2(m)$, the mean square estimate of $\sigma^2$ on $m - p$ degrees of freedom. Residuals can be calculated for all observations, including those not in $S(m)$.
The $n$ resulting least squares residuals are
\[
e_i(m) = y_i - x_i^T \hat{\beta}(m). \tag{2}
\]
The search moves forward with the augmented subset $S(m+1)$ consisting of the observations with the $m+1$ smallest absolute values of $e_i(m)$. In the batch algorithm of §8 we explore the properties of a faster algorithm in which we move forward by including $k > 1$ observations.
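A single forward step can be written in a few lines of plain MATLAB. The sketch below assumes a logical index vector S marking the current subset $S(m)$; it is not the FSDA or SAS implementation, only the update rule of Eq. (2) and the paragraph above.

```matlab
% One forward step: from S(m) to S(m+1) (sketch; S is a logical n-by-1 index).
m = nnz(S);                    % current subset size
betaHat = X(S,:) \ y(S);       % least squares fit on S(m)
e = y - X*betaHat;             % residuals e_i(m) for all n observations, Eq. (2)
[~, ord] = sort(abs(e));       % rank every observation by |e_i(m)|
S = false(size(S));
S(ord(1:m+1)) = true;          % S(m+1): the m+1 smallest absolute residuals
```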
To start we take $m_0 = p$ and search over subsets of $p$ observations to find the subset that yields the LMS estimate of $\beta$. However, this initial estimator is not important, provided masking is broken. Our computational experience for regression is that randomly selected starting subsets also yield indistinguishable results over the last one-third of the search, unless there is a large number of structured outliers.
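The initialization can likewise be sketched. Exhaustive search over all $p$-subsets is combinatorial, so the sketch below uses random elemental subsets, in line with the remark above about random starts; nTrials is an assumed tuning constant, not a value from the paper.

```matlab
% Starting subset of size m0 = p via an LMS-type elemental search (sketch).
[n, p] = size(X);
nTrials = 1000;                     % assumed number of random p-subsets
bestCrit = Inf;
for t = 1:nTrials
    idx = randperm(n, p);           % random elemental subset of p observations
    b = X(idx,:) \ y(idx);          % exact fit to the p chosen observations
    crit = median((y - X*b).^2);    % LMS criterion over all n squared residuals
    if crit < bestCrit
        bestCrit = crit;
        S0 = idx;                   % best starting subset found so far
    end
end
```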
To test for outliers, the deletion residual is calculated for the $n - m$ observations not in $S(m)$. These residuals, which form the maximum likelihood tests for the outlyingness of individual observations, are
\[
r_i(m) = \frac{y_i - x_i^T \hat{\beta}(m)}{\sqrt{s^2(m)\{1 + h_i(m)\}}} = \frac{e_i(m)}{\sqrt{s^2(m)\{1 + h_i(m)\}}}, \tag{3}
\]
where the leverage $h_i(m) = x_i^T \{X(m)^T X(m)\}^{-1} x_i$. Let the observation nearest to those forming $S(m)$ be $i_{\min}$, where
\[
i_{\min} = \arg\min_{i \notin S(m)} |r_i(m)|.
\]
To test whether observation $i_{\min}$ is an outlier we use the absolute value of the minimum deletion residual
\[
r_{\min}(m) = \frac{e_{i_{\min}}(m)}{\sqrt{s^2(m)\{1 + h_{i_{\min}}(m)\}}} \tag{4}
\]
as a test statistic. If the absolute value of (4) is too large, the observation $i_{\min}$ is considered to be an outlier, as well as all other observations not in $S(m)$.
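Putting Eqs. (3) and (4) into code gives the outlier test for one step of the search. This is again a minimal sketch under the same assumed variables (y, X, and a logical subset index S), not the FSDA or SAS implementation.

```matlab
% Deletion residuals and the outlier test statistic, Eqs. (3)-(4) (sketch).
p = size(X, 2);
m = nnz(S);
Xm = X(S,:);
betaHat = Xm \ y(S);                             % fit on S(m)
s2 = sum((y(S) - Xm*betaHat).^2) / (m - p);      % s^2(m) on m - p d.o.f.
G = inv(Xm' * Xm);                               % {X(m)' X(m)}^{-1}
outIdx = find(~S);                               % the n - m observations not in S(m)
r = zeros(numel(outIdx), 1);
for j = 1:numel(outIdx)
    xi = X(outIdx(j), :)';                       % x_i as a column vector
    hi = xi' * G * xi;                           % leverage h_i(m)
    r(j) = (y(outIdx(j)) - xi'*betaHat) / sqrt(s2 * (1 + hi));  % Eq. (3)
end
[rmin, k] = min(abs(r));                         % |r_min(m)|, Eq. (4)
imin = outIdx(k);                                % candidate outlier i_min
```

If rmin exceeds the chosen threshold, $i_{\min}$, together with all other observations outside $S(m)$, is flagged as outlying, exactly as stated after Eq. (4).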

To continue reading

Request your trial

VLEX uses login cookies to provide you with a better browsing experience. If you click on 'Accept' or continue browsing this site we consider that you accept our cookie policy. ACCEPT