Data processing
Pages | 71-89 |
When survey data collection was completed, a series of
steps was undertaken to ensure the quality and accuracy
of the data collected. The steps covered validation
rules to monitor errors and assess response patterns
reflecting inconsistencies, as well as clean-up strategies
to identify and further handle suspicious entries.
Finally, weighting adjustments were applied to the data
to correct for potential over- or under-representation of
groups.
8.1. Data validation
To validate the correctness and ensure the high quality
of the data collected, a series of steps was undertaken
(see Sections 8.1.1–8.1.5). These included tests to
evaluate the length of time respondents took to answer
the questionnaire, checks for consistency and logic, and
checks to detect falsification attempts and duplicate
responses. Each validation test was applied at respondent
level. A decision about each completed questionnaire’s
validity was made on the basis of the results of the
validation tests and a set of pre-specified criteria
(see Section 8.1.6). Questionnaires that were assessed
as erroneous, suspicious or inconsistent were excluded
from the cleaned dataset.
The uncleaned dataset of 141 621 responses was
validated and edited. This resulted in a cleaned
dataset of 139 799 responses, which constitutes the
final data used for analytical purposes.
The following sections elaborate on the validation tests
and the decision criteria, which were examined in
combination to assess each case.
8.1.1 CAPTCHA score
CAPTCHA is a computer system intended to distinguish
human from machine input. An invisible reCAPTCHA
was used to detect applications from bots (23). For
those users who were actually eligible to participate
in the survey, the CAPTCHA score was recalculated just
before questionnaire submission, and only entries with
scores greater than 0.3 were accepted. Plausible values
ranged from 0 (representing almost certainly an
automated completion of the survey) to 1 (representing
almost certainly an authentic completion of the survey
by a human respondent).
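The acceptance rule described above can be sketched in a few lines. This is only an illustration: the function name and the range check are assumptions, and only the 0.3 threshold and the 0 to 1 score range come from the text.

```python
# Threshold taken from the text: only scores strictly greater than 0.3 pass.
CAPTCHA_THRESHOLD = 0.3


def captcha_accepts(score: float) -> bool:
    """Return True if an entry with this CAPTCHA score would be accepted.

    Hypothetical helper: scores are assumed to lie between 0 (almost
    certainly automated) and 1 (almost certainly human).
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("CAPTCHA score must lie between 0 and 1")
    return score > CAPTCHA_THRESHOLD
```

Note that a score of exactly 0.3 is rejected under this reading, because the text requires scores "greater than 0.3".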
8.1.2 Questionnaire duration
This step was meant to detect respondents completing
the survey too quickly. Short questionnaire durations
raise suspicions of limited accuracy (respondents may
not have read or answered the questions with care).
Total time spent on questionnaire completion is the
difference between the starting time and the time of
submission. Very long times (typically exceeding 100
minutes) can be explained by interrupted completion or
by late submission due to technical issues. In contrast,
short times are suspicious and were subject to further
investigation.
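As a minimal sketch of the duration calculation, assuming each response carries a start timestamp and a submission timestamp (the function names and the example timestamps are hypothetical; the 100-minute figure is the indicative value quoted above):

```python
from datetime import datetime

# Indicative value from the text: "typically exceeding 100 minutes".
LONG_DURATION_MINUTES = 100


def duration_seconds(started_at: datetime, submitted_at: datetime) -> float:
    """Total completion time: submission time minus starting time."""
    if submitted_at < started_at:
        raise ValueError("submission cannot precede the start")
    return (submitted_at - started_at).total_seconds()


def is_very_long(seconds: float) -> bool:
    """Flag durations that suggest interruption or late submission."""
    return seconds > LONG_DURATION_MINUTES * 60


# Illustrative timestamps only, not survey data.
start = datetime(2019, 5, 27, 10, 0, 0)
end = datetime(2019, 5, 27, 10, 6, 8)
print(duration_seconds(start, end))  # 368.0
```

Very long durations are only flagged here, not rejected, in line with the explanation that they can have benign causes.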
(23) A bot or a web robot is a software application that runs
automated tasks over the internet. The malicious deployment
of bots aims to imitate or replace the behaviour of human
users. Similar programmes may be used to imitate and
reproduce, repetitively and at a high rate, the completion
of a questionnaire by a large number of survey respondents
in an attempt to falsify a survey and influence its outcomes
or annul its validity and scope.
A long way to go for LGBTI equality – Technical report
Respondents were divided into categories created as
combinations of respondent category (lesbian, gay
and bisexual; trans; and intersex) and the number of
different types of incidents (physical/sexual attack,
harassment, discrimination). The number of different types
of incidents was expected to affect the questionnaire
duration because incidents experienced increased the
number of survey questions that had to be completed
and thus led to longer expected completion times.
Respondent category was relevant because of the
additional questionnaire sections for trans and intersex
respondents. Intersex respondents who also identified
as trans were asked to complete both the intersex
section and the trans section. Total completion time was
studied separately for each combination of respondent
category and number of incidents. Cut-off times
that defined the minimum duration needed to pass the
total duration test were chosen, based on expertise from
similar surveys, in such a way that the same proportion
of respondents were identified as ‘speeders’ among the
LGBTI groups. A respondent was:
• identified as a speeder (i.e. fail) if the questionnaire
duration was less than or equal to the 0.7 percentile of
the duration distribution;
• flagged with a warning if the questionnaire duration
was greater than the 0.7 percentile and less than or
equal to the 1 percentile;
• identified as a non-speeder (i.e. pass) if the
questionnaire duration was greater than the 1 percentile.
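The three-way rule above can be illustrated as follows. This is a sketch under stated assumptions: the report does not say which percentile definition was used, so the nearest-rank method here is an assumption, as are the function names. Percentiles are held as exact fractions to avoid floating-point rounding when computing ranks.

```python
import math
from fractions import Fraction

# Percentile levels from the text, stored exactly.
FAIL_PERCENTILE = Fraction("0.7")
WARN_PERCENTILE = Fraction(1)


def nearest_rank_percentile(sorted_values, p):
    """Nearest-rank percentile (assumed method): the smallest value such
    that at least p per cent of the observations are at or below it."""
    if not sorted_values:
        raise ValueError("need at least one duration")
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def classify_duration(duration, group_durations):
    """Return 'fail', 'warning' or 'pass' for one respondent, relative to
    the duration distribution of the respondent's own group."""
    ordered = sorted(group_durations)
    fail_cut = nearest_rank_percentile(ordered, FAIL_PERCENTILE)
    warn_cut = nearest_rank_percentile(ordered, WARN_PERCENTILE)
    if duration <= fail_cut:
        return "fail"
    if duration <= warn_cut:
        return "warning"
    return "pass"
```

In practice the cut-offs would be computed once per group and reused; recomputing them per call keeps the sketch short.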
Due to their limited number, intersex respondents were
categorised only according to whether or not they identify
as trans people, and the number of incidents was
not used in the analysis of their survey completion time.
Table 16 shows the selected cut-off points.
Furthermore, the questionnaires were also evaluated
on the basis of partial durations in six questionnaire
sections. The sections were selected in a way that
minimises the effect of routing and includes questions
answered by the majority of respondents. Analogously
to the approach used with the total questionnaire time,
the 0.7 and 1 percentiles were chosen to identify
speeders (fail) and non-speeders (pass) and to give
intermediate warnings. Each respondent ended up with six
flags, showing the outcome of the test for each section.
The flags were combined into a single flag summarising
performance across all sections (Table 17).
Table 16. Cut-off durations in seconds defining speeding: fail if [min, 0.7], warning if (0.7, 1] and pass if (1, max]

Respondent category        Percentile  Number of incidents
                                       0              1              2              3
Lesbian, gay and bisexual  0.7         368 (≈6 min)   420 (≈7 min)   473 (≈8 min)   545 (≈10 min)
                           1           382 (≈6 min)   436 (≈7 min)   490 (≈8 min)   567 (≈10 min)
Trans                      0.7         419 (≈7 min)   460 (≈8 min)   533 (≈9 min)   595 (≈10 min)
                           1           429 (≈7 min)   477 (≈8 min)   552 (≈9 min)   608 (≈10 min)
Intersex                   0.7         393 (≈7 min)
                           1           407 (≈7 min)
Intersex and trans         0.7         331 (≈6 min)
                           1           361 (≈6 min)

For the two intersex categories, a single pair of cut-offs applies regardless of the number of incidents.
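To show how the cut-offs in Table 16 translate into a flag for a single respondent, the table can be encoded as a lookup. The dictionary keys and the function name are hypothetical labels chosen for this sketch; the numbers are the cut-offs in seconds from Table 16.

```python
# (fail_cut, warn_cut) in seconds, i.e. the 0.7 and 1 percentiles,
# indexed by number of incident types (0 to 3). Values from Table 16.
CUTOFFS_BY_INCIDENTS = {
    "lgb": {0: (368, 382), 1: (420, 436), 2: (473, 490), 3: (545, 567)},
    "trans": {0: (419, 429), 1: (460, 477), 2: (533, 552), 3: (595, 608)},
}

# Intersex groups: one pair of cut-offs, number of incidents not used.
INTERSEX_CUTOFFS = {
    "intersex": (393, 407),
    "intersex_trans": (331, 361),
}


def total_duration_flag(category: str, n_incident_types: int, seconds: float) -> str:
    """Return 'fail', 'warning' or 'pass' for the total duration test."""
    if category in INTERSEX_CUTOFFS:
        fail_cut, warn_cut = INTERSEX_CUTOFFS[category]
    else:
        fail_cut, warn_cut = CUTOFFS_BY_INCIDENTS[category][n_incident_types]
    if seconds <= fail_cut:       # [min, 0.7]
        return "fail"
    if seconds <= warn_cut:       # (0.7, 1]
        return "warning"
    return "pass"                 # (1, max]


print(total_duration_flag("lgb", 0, 375))  # warning
```

For example, a lesbian, gay or bisexual respondent reporting no incidents who finished in 375 seconds falls between the 368 and 382 second cut-offs and so receives a warning.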
Table 17. Decision rule for combining partial duration speeder tests in six sections

              Intermediate flags from six sections
Final flag    No of section speeders   No of section warnings
Non-speeder   0                        ≤1
Warning       0                        [2, 3, 4]
Speeder       0                        ≥5
Non-speeder   1                        0
Warning       1                        [1, 2, 3]
Speeder       1                        ≥4
Warning       2                        ≤1
Speeder       2                        ≥2
Speeder       ≥3                       ≥0
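The decision rule in Table 17 can be written out as a small function. This is a sketch, not the survey's actual implementation: the flag labels ('fail', 'warning', 'pass') and the function name are assumptions, while the thresholds follow the table row by row.

```python
def combine_section_flags(flags):
    """Combine six per-section flags into one final flag (Table 17).

    `flags` holds six strings, each 'fail' (section speeder),
    'warning' or 'pass'. Returns 'speeder', 'warning' or 'non-speeder'.
    """
    if len(flags) != 6:
        raise ValueError("expected exactly six section flags")
    speeders = flags.count("fail")
    warnings = flags.count("warning")

    if speeders >= 3:                      # any number of warnings
        return "speeder"
    if speeders == 2:
        return "warning" if warnings <= 1 else "speeder"
    if speeders == 1:
        if warnings == 0:
            return "non-speeder"
        return "warning" if warnings <= 3 else "speeder"
    # No section speeders: the outcome depends on warnings alone.
    if warnings <= 1:
        return "non-speeder"
    return "warning" if warnings <= 4 else "speeder"


print(combine_section_flags(["fail", "warning", "pass", "pass", "pass", "pass"]))
# warning
```

Each additional section speeder lowers the number of warnings needed to reach a given final flag, which the nested conditions make explicit.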