The FS batch procedure

Francesca Torti (European Commission, Joint Research Centre (JRC)), Marco Riani (University of Parma, Italy), Anthony C. Atkinson (London School of Economics, UK), Domenico Perrotta (European Commission, Joint Research Centre (JRC)), Aldo Corbellini (University of Parma, Italy)
Figure 5: Two artificial datasets generated with MixSim for the assessment.
8. The FS batch procedure
Our SAS library contains a new Forward Search strategy that makes it more practical to treat large datasets. The idea is to reduce the size of the output tables and the amount of memory required through a batch updating procedure.
The standard FS algorithm in §3 produces a sequence of n − m0 subsets with corresponding model parameters and relevant test statistics, used typically to test the presence of outliers. The initial subset size m0 can be as small as p, the minimum number of observations necessary to provide a fit to the data. In the standard algorithm the subset size m0 ≤ m ≤ n is increased by one unit at a time and only the smallest value of the test statistic among the observations outside the subset is retained. The batch version of the algorithm fits instead only one subset every k > 1 steps. The value of k is set by the user through the input parameter fs_steps. The number of subsets to be evaluated therefore reduces to (n − k)/k. For each subset and set of estimated model parameters, the k smallest values of the test statistic are retained: they are assigned to the current step and to the preceding k − 1, in order to obtain the complete vector of minimum test statistics of Eq. (4) to compare with the envelopes. Of course this vector is an approximation to the real one, which would be found by evaluating each of the k steps individually; the approximation is the cost of reducing the number of fits to (n − k)/k while still applying the signal detection, signal validation and envelope superimposition phases described in §5 at each of the n − m0 FS steps.
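To make the batch updating scheme concrete, the following is a minimal MATLAB sketch of the idea; it is not the code of the SAS library or of FSDA. The test statistic is simplified to the absolute scaled residual of the units outside the subset, the initial subset comes from an ordinary least-squares fit, the function name batchFS and its arguments are illustrative, and the k retained statistics are attributed to the fitted step and the k − 1 steps that follow it, a bookkeeping detail that differs slightly from the description above.

    % Minimal sketch of the batch Forward Search updating scheme (illustrative
    % only; not the SAS library or FSDA code). The test statistic is simplified
    % to the absolute scaled residual of units outside the current subset.
    function mdr = batchFS(y, X, m0, k)
    % y   response vector (n x 1)
    % X   design matrix (n x p)
    % m0  initial subset size (>= p)
    % k   batch step: the model is refitted only once every k steps
    %     (k = 1 corresponds to the standard one-step-at-a-time search)
    n   = size(X, 1);
    p   = size(X, 2);
    mdr = NaN(n, 1);                        % minimum test statistic at each step

    % crude initial subset: the m0 units with smallest absolute LS residuals
    [~, ord] = sort(abs(y - X * (X \ y)));
    subset   = ord(1:m0);

    m = m0;
    while m < n
        kk = min(k, n - m);                 % size of the current batch of steps
        b  = X(subset, :) \ y(subset);      % one fit for the whole batch
        e  = abs(y - X * b);
        s  = sqrt(sum(e(subset).^2) / (m - p));
        outside = setdiff((1:n)', subset);
        r  = sort(e(outside) / s);          % scaled residuals outside the subset

        % retain the kk smallest statistics and spread them over the kk steps
        % covered by this single fit (the batch approximation)
        mdr(m : m + kk - 1) = r(1:kk);

        % advance the search by kk units: keep the m + kk units closest to the fit
        [~, ordAll] = sort(e);
        subset = ordAll(1 : m + kk);
        m = m + kk;
    end
    mdr = mdr(m0 : n - 1);                  % the n - m0 values compared with the envelopes
    end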
If the data are contaminated and k is too large, this approach may not be accurate enough to detect the outliers, giving rise to biased estimates. The problem can be appraised by monitoring the statistical properties of the batch algorithm for increasing k. We have conducted such an exploratory assessment using artificial data.
We generated the data using MixSim (Maitra and Melnykov, 2010) in the MATLAB implementation of
the FSDA toolbox (Torti et al., 2018, Section 3); the functions used are MixSimreg.m and simdataset.m.
MixSim allows generation of data from a mixture of linear models on the basis of an average overlap measure ω̄ pre-specified by the user. We generated a dominant linear component containing 95% of the data and a 5% “contaminating” one with small average overlap (ω̄ = 0.01). The generating regression model is without intercept, with random slopes from a Uniform distribution between tan(π/6) = √3/3 and tan(π/3) = √3, and independent variables from a Uniform distribution in the interval [0, 1]. Each slope is equally likely to be that of the dominant component. We took the error variances in the two components to be equal, so that specification of the value of ω̄, together with the values of the slopes, defines the error variance for each sample. We also added uniform contamination, amounting to 3% of the above data, over the rectangle defined by the two slopes and the range of the two independent variables. The plots in Figure 5 are examples of two datasets with 4750 + 250 + 150 units.
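For readers who want to reproduce a dataset with this structure without MixSim, the plain MATLAB sketch below generates data with the same layout. It is not the MixSimreg.m / simdataset.m call used in the paper: sigma is a placeholder for the common error standard deviation, which in our simulations is determined by MixSim from the pre-specified overlap ω̄ = 0.01, and the uniform contamination is spread over one possible reading of the rectangle described above.

    % Plain MATLAB sketch of the simulation design (not the MixSimreg.m /
    % simdataset.m calls used in the paper). sigma is a placeholder for the
    % common error standard deviation, which MixSim would fix from the
    % pre-specified average overlap omega_bar = 0.01.
    rng(1);                                    % reproducibility
    n1 = 4750; n2 = 250; n3 = 150;             % dominant, overlapping and uniform units

    % two random slopes, uniform between tan(pi/6) = sqrt(3)/3 and tan(pi/3) = sqrt(3);
    % which of the two belongs to the dominant component is equally likely
    slopes = tan(pi/6) + (tan(pi/3) - tan(pi/6)) * rand(1, 2);
    sigma  = 0.05;                             % placeholder: equal in the two components

    x1 = rand(n1, 1); y1 = slopes(1) * x1 + sigma * randn(n1, 1);  % dominant (95%)
    x2 = rand(n2, 1); y2 = slopes(2) * x2 + sigma * randn(n2, 1);  % contaminating (5%)

    % 3% additional contamination, uniform over the rectangle spanned by the
    % range [0, 1] of x and the y-range reached by the two regression lines
    x3 = rand(n3, 1);
    y3 = max(slopes) * rand(n3, 1);

    x = [x1; x2; x3]; y = [y1; y2; y3];        % 4750 + 250 + 150 = 5150 units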
The boxplots of Figure 6 show the bias for the slope and intercept obtained from 500 such datasets with 5,150 observations each, for k ∈ {1, 5, 10, 15, 20, 40, 60, 80, 100}. The bias here is simply the difference between the estimated and true parameter values, the latter referring to the dominant generating component.
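A possible outline of the assessment loop is sketched below. Both generateMixture and fitBatchFS are hypothetical helpers, not functions of the SAS library or of FSDA: the first stands for the MixSim-based data generation described above and also returns the true slope of the dominant component, the second runs the batch search with step k, removes the flagged outliers and returns the refitted intercept and slope.

    % Outline of the bias assessment. generateMixture and fitBatchFS are
    % hypothetical helpers: the first generates one contaminated dataset as
    % described above and returns the true dominant slope, the second runs the
    % batch search with step k, drops flagged outliers and refits the model.
    ks   = [1 5 10 15 20 40 60 80 100];
    nrep = 500;
    biasSlope     = NaN(nrep, numel(ks));
    biasIntercept = NaN(nrep, numel(ks));

    for j = 1:numel(ks)
        for r = 1:nrep
            [y, x, trueSlope] = generateMixture();            % hypothetical generator
            bhat = fitBatchFS(y, [ones(size(x)) x], ks(j));   % hypothetical: [intercept; slope]
            biasIntercept(r, j) = bhat(1);                    % true intercept is zero
            biasSlope(r, j)     = bhat(2) - trueSlope;
        end
    end

    % one box per value of k, as in the upper panel of Figure 6
    boxplot(biasSlope);
    set(gca, 'XTickLabel', ks);
    xlabel('batch step k'); ylabel('slope bias');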
The upper panel of the figure shows that the median bias for both the slopes and intercepts is virtually zero. The dispersion of the estimates of both slopes and intercepts remains stable and quite small even for values of k approaching 100 (note that the boxplot whiskers lie in [−0.01, 0.01]). However, the variability of the estimates outside the whiskers increases rapidly for k = 100. The fact that the bottom and top edges