Edinburgh Research Explorer Epidemiology Data from the Scottish Health and Ethnicity Linkage Study (SHELS)

We linked the 2001 Scottish Census, which contains ethnicity, socio-economic and demographic data, to health and death records, creating an anonymised retrospective cohort study of 4.65 million people to assess the association between ethnicity and health outcomes in Scotland. The databases contain data mostly from hospital discharge and mortality records, but also from other registers. The databases are stored in a safe haven at the National Records of Scotland (NRS). NRS is currently exploring the feasibility of making Scottish Health and Ethnicity Linkage Study (SHELS) data open access while ensuring that the same level of confidentiality is maintained. If SHELS becomes open access, it could be reused, with the appropriate approvals, to assess the influence of other socio-economic or demographic measures on the Scottish population’s health.


Overview

Spatial coverage
The data cover the whole of Scotland.
We followed a strict protocol that preserved anonymity and kept personal data from the Census separate from clinical data. Figure 1 shows our approach to linkage. We used computerised probability matching of names, sex, addresses and dates of birth to link the 2001 census for Scotland to the Scottish Community Health Index (CHI), a register of patients using the National Health Service (NHS). At this stage, all other data fields in the two datasets were excluded. The CHI and census unique numbers were then encrypted: the CHI number with a one-way cryptographic ('hashing') algorithm, and the census number with an algorithm developed by the National Records of Scotland (NRS). About 95% (approximately 4.65 million) of the 4.9 million people participating in the 2001 census were linked in this way to the CHI, with 85% or more linked in every ethnic group. The linked records represent about 92% of the 2001 population. This linked file of encrypted CHI and census numbers is the key to any subsequent linkage of health data to the 2001 census records.
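The two steps described above, probability matching on identifying fields followed by one-way encryption of the linked identifiers, can be sketched as follows. This is a minimal illustration under stated assumptions, not the SHELS implementation: the agreement probabilities in FIELD_PROBS, the score threshold, the field names and the use of HMAC-SHA256 are all hypothetical choices. The source does not name the hashing algorithm used for the CHI number, and the NRS algorithm for the census number is not described, so a keyed hash stands in for both.

```python
import hashlib
import hmac
import math

# Hypothetical (m, u) probabilities per field: m = P(fields agree | same person),
# u = P(fields agree | different people). Illustrative values only.
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "forename": (0.93, 0.02),
    "sex": (0.99, 0.50),
    "postcode": (0.85, 0.001),
    "dob": (0.97, 0.003),
}


def match_weight(rec_a, rec_b):
    """Sum log2 agreement/disagreement weights over the identifying fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts it
    return total


def pseudonymise(identifier, secret_key):
    """One-way keyed hash of an identifier (HMAC-SHA256 is an assumption)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def build_key_file(census, chi, threshold, census_key, chi_key):
    """Link each census record to its best-scoring CHI record and return
    pairs of encrypted identifiers only; no names or addresses are kept."""
    links = []
    for person in census:
        best = max(chi, key=lambda candidate: match_weight(person, candidate))
        if match_weight(person, best) >= threshold:
            links.append(
                (
                    pseudonymise(person["census_id"], census_key),
                    pseudonymise(best["chi"], chi_key),
                )
            )
    return links
```

The returned list plays the role of the linked key file: each health extract can be joined to census records through the encrypted CHI value without either side seeing the other's identifiers. A production linkage would also use blocking, partial agreement on names and clerical review of borderline scores, none of which is shown here.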

Data paper
Epidemiology Data from the Scottish Health and Ethnicity Linkage Study (SHELS)
Keywords: epidemiology; health; ethnic variation; hospitalisation; mortality; record linkage; retrospective cohort
Funding Statement: This work was supported by funding from the Scottish Executive (under a special grant for the early phase), the Chief Scientist's Office of the Scottish Government (grant numbers CZH/4/432, CZH/4/648 and CZH 4/878), the British Lung Foundation (RhotN12) for the respiratory work, CRUK (A16594) for bowel cancer screening work, Health Protection Scotland (for work on blood-borne viruses) and NHS Health Scotland for supplementary grants. Judith Fernandez's internship was funded by an ERASMUS scholarship.
Using our retrospective cohort, we were able to analyse ethnic variations in various health and healthcare areas: cardiovascular diseases [2][3][4][5][6], cancer [7][8][9], maternal and child health [10], mental health [11], gastrointestinal diseases and respiratory diseases (to be submitted for publication in 2014). We have also linked primary care records from 10 general practices in Edinburgh and Glasgow.
Both hospitalisation diagnoses and causes of death (see Tables 2 and 3 in the appendix) were available in each health area dataset. Other health datasets were linked for the analyses of maternal and child health and mental health outcomes (see Table 1 in the appendix).
These morbidity and mortality data were examined in relation to ethnicity, adjusting for demographic and socioeconomic measures obtained from the 2001 census. Analytical datasets contain no personal identifiers. Statistical output is subject to an NRS disclosure protocol and to scrutiny by a disclosure committee. Researchers require government baseline security clearance, as well as research governance training, for access to the data in a safe setting at NRS.

Ethics
The work was approved by the Multicentre Research Ethics Committee for Scotland and the Privacy Advisory Committee of NHS National Services Scotland, plus Community Health Index Advisory Group and Caldicott Guardian approval, where required.

Dataset description
Object name
SHELS data is not yet open access; for further information about the datasets, see the tables in the appendix. Detailed metadata and a data dictionary will be produced for each health extract once open access approval is agreed.

Data type
Secondary data, processed data.

Format names and versions
The datasets are stored in the safe haven as SAS datasets (.sas7bdat).

Other
- All-mortality
- All hospitalisation, length of stay and readmission
- Utilisation of morbidity and risk-factor data from primary care