Module Output
SQL:[IDI_Community].[eth_ec_ethnicities].[ec_ethnicities_YYYYMM]
SAS: libname cm_ec_ethnicities dsn=idi_community_srvprd schema=eth_ec_ethnicities; run ;
How to access a code module in the Data Lab:Read here
Purpose
The purpose of this module is to provide IDI users with ethnicty data relating to the Ethnic Communities populations that the Ministry for Ethnic Communities (MEC) serves coded
according to the Stats NZ published Ethnic Communities View of ETHNIC05 V2.1 concordance available here:
https://aria.stats.govt.nz/aria/#ConcordanceView:uri=http://stats.govt.nz/cms/ConcordanceVersion/QtaGtA8eVTvtszOV
The expanded definition of African (African+) developed by MEC which includes ethnicities such as Afrikaans, and Zimbabwean European that are not included in that
standard Stats NZ definition may increase the potential to analyse and output African data or MELAA data using the MELAA+ (Middle Eastern, Latin American and African+) definition.
For example in the 2023 Census there were ten thousand people under “Other” which our African+ definition includes as African, this represents a significant increase in population size.
Similarly, the MEC Continental European grouping is not part of the standard Stats NZ ethnicity approach but is a separate group in the Ethnic Communities View of ETHNIC05 V2.1.
We hope that making this grouping more accessible to IDI users may increase the visibility of this group in IDI analyses and output. We also include ethnicity data for the following
Ethnic Communities ethnicities: Chinese, Indian, South East Asian, Other Asian, Latin American, and Middle Eastern. For completeness we include the following Non Ethnic Communities ethnicities:
Other European, Maori and Pacific Peoples. Finally, we include European and MELAA ethnic groupings based on the Standard Stats NZ definitions to aid in comparisons researchers way
want to make.
Comparison against other sources
Comparison of Census 2023 individual data outputted by Stats NZ using our Ethnic Communities ethnicity definition with Census 2023 data created using this code showed a high degree of similarity in terms of proportions for Asian and MELAA+, and a small discrepancy in the proportion for Continental Europeans (within 1%).
References & Contacts
- Aria concordance between Ethnicity New Zealand Standard Classification 2005 V2.1 and Ethnic Communities View of ETHNIC05 V2.1
https://aria.stats.govt.nz/aria/#ConcordanceView:uri=http://stats.govt.nz/cms/ConcordanceVersion/QtaGtA8eVTvtszOV - Personal details table documentation: “DRAFT IDI Personal Details dervided table - under the hood” provided by Caitlin Henderson (Stats NZ)
- This code built on “prioritised_level2_ethnicity”, SQL code written by Craig Wright and made available via Github:
Social Wellbeing Agency. Definitions library. Source code. GitHub - nz-social-wellbeing-agency/definitions_library: Definitions from SWA's projects suitable for use with the dataset assembly tool. - Fiona Thomson - Director, Analytics Monitoring and Evaluation (AME) - Ministry for Ethnic Communities - @fiona.thomson
- Nick Preval - Senior Analyst - Ministry for Ethnic Communities - @nick.preval
- Deb Potter - Principal Analyst - Ministry for Ethnic Communities - @deb.potter
Code module technical information
The following sections provide detail on how the EC_ethnicity table is built in the IDI.
Community of Interest
| Role | Organisation | Name |
|---|---|---|
| Lead Developer | Ministry for Ethnic Communities | Nick Preval |
| Peer reviewers(code) | Keith McLeod Consulting | Keith McLeod |
| Ministry for Ethnic Communities | Wimcor von Tonder | |
| Keith McLeod Consulting | Keith McLeod | |
| Peer reviewers(documentation) | Ministry for Ethnic Communities | Deb Potter |
| Ministry for Ethnic Communities | Fiona Thomson | |
| Keith McLeod Consulting | Keith McLeod | |
| Stats NZ | Hannes Diener | |
| Policy expert | Ministry for Ethnic Communities | Deb Potter |
| Module steward | Ministry for Ethnic Communities | Fiona Thomson |
Stewardship team
| Role | Organisation | Name |
|---|---|---|
| Lead Developer | Ministry for Ethnic Communities | Nick Preval |
| Policy expert | Ministry for Ethnic Communities | Deb Potter |
| Module steward | Ministry for Ethnic Communities | Fiona Thomson |
Module Business Rules
- The following sources are used to obtain ethnicity information. Group and rank information are taken from the Personal Details table construction methodology.
| Source | Abbreviation | Group | Rank | Eth. data avail. |
|---|---|---|---|---|
| Census 2023 | CEN2023* | 1 | 1 | L4 |
| Census 2018 | CEN2018* | 1 | 2 | L4 |
| Census 2013 | CEN2013 | 1 | 3 | L4 |
| Department of Internal Affairs | DIA | 1 | 4 | L4 |
| Ministry of Health | MOH | 1 | 5 | L2 |
| Ministry of Education | MOE | 1 | 6 | L3 |
| Accident Compensation Corp | ACC | 1 | 7 | L2 |
| Ministry of Social Development | MSD | 1 | 8 | L2 |
| Student loan data from Studylink | SLA_MSD | 2 | 1 | L3 |
| Household Economic Survey | HES | 2 | 2 | L1 |
| Survey of Family Income and Employment | SOFIE | 2 | 3 | L2 |
| Longitudinal Immigration Survey of NZ | LISNZ | 2 | 4 | L4 |
| Household Labour Force Survey | HLFS | 2 | 5 | L4 (L1 pre-2016) |
| Auckland City Mission | ACM | 2 | 6 | L2 |
| General Social survey | GSS | 2 | 7 | L4 |
| PIAAC | PIAAC | 2 | 8 | L4 |
| NZ Crime and Victims Survey | NZCVS | 2 | 9 | L1, L2 (Indian and Chinese only), L4 |
| Working for Families | WFF | 3 | 1 | L1 |
| Student loan data from Ministry of Education | SLA_SLAM_MOE | 3 | 2 | L1 |
| Student loan data from Inland Revenue | SLA_SLAM_IR | 3 | 3 | L1 |
| *Imputed data is not used as per Personal Details table approach |
-
All available ethnicity information from relevant tables is processed into four columns (L1,L2,L3,L4). Lower level values are derived if necessary:
e.g. if a source has L4 data only, L1-L3 are generated by the code. -
Rows which do not contain any L1 values other than 9 are dropped (consistent with Personal Details table business rules).
-
The following ethnicity variables are generated using the four columns:
| Variable name | Rule used |
|---|---|
| Ethnic Community groups | |
| Asian | L1 = 4 |
| Chinese | L2 = 42 (Chinese) |
| Indian | L2 = 43 (Indian) |
| Other Asian | L2 = 40 (Asian nfd ) or |
| L3 = 444 (Other Asian) | |
| South East Asian | L2 = 41 (South East Asian) |
| MELAA+ | L1 = 5 (MELAA) or |
| L4 = 61115 (Mauritian), 61116 (Seychelles), 61117 (Other South African), | |
| 12948 (South African European), 12949 (Afrikaner), 12950 (Zimbabwean European) | |
| African+ | L2 = 53 (African) or |
| L4 = 61115 (Mauritian), 61116 (Seychelles), 61117 (Other South African), | |
| 12948 (South African European), 12949 (Afrikaner), 12950 (Zimbabwean European) | |
| Latin American | L2 = 52 (Latin American) |
| Middle Eastern | L2 = 51 (Middle Eastern) |
| Continental European | L3 = 122 (Dutch), 123 (Greek), 124 (Polish), 125 (South Slav), 126 (Italian), |
| 127 (German) or | |
| L4 = 12911 (Albanian), 12912 (Armenian), 12913 (Austrian), 12914 (Belgian), | |
| 12915 (Bulgarian), 12916 (Belorussian), 12918 (Cypriot nfd), 12919 (Czech), | |
| 12920 (Danish), 12921 (Estonian), 12922 (Finnish), 12923 (Flemish), | |
| 12924 (French), 12926 (Hungarian), 12927 (Icelandic), 12928 (Latvian), | |
| 12929 (Lithuanian), 12930 (Maltese), 12931 (Norwegian), 12932 (Portuguese), | |
| 12933 (Romanian), 12934 (Romani), 12935 (Russian) ,12937 (Slavic), | |
| 12938 (Slovak), 12939 (Spanish), 12940 (Swiss), 12941 (Swedish), | |
| 12942 (Ukrainian) | |
| --------------------------------------------------------------------------------------- | |
| Non-Ethnic Community groups | |
| Other European | L2 = 10 (European NFD) or 11 (New Zealand European) |
| L3 = 121 (British and Irish), 128 (Australian European) | |
| L4 = 12943 (American), 12945 (Canadian), 12947 (New Caledonian), 12999 (European NEC) | |
| Maori | L1 = 2 |
| Pacific Peoples | L1 = 3 |
| --------------------------------------------------------------------------------------- | |
| Standard Stats NZ groups | |
| European | L1 = 1 |
| MELAA | L1 = 5 |
-
Where the code generates multiple rows for a given source and snz_uid, rows are collapsed into a single row, preserving all ethnicity information.
-
The highest ranked source/row for each snz_uid is retained.
-
A “Total Ethnic Communities” variable is derived (if Asian = 1 or MELAA+ = 1 or Continental European = 1)
-
Zeros are replaced with NULL where there is not enough information to be certain that they are accurate. For example, if the top ranked data source for a particular snz_uid
was L1 and the only L1 ethnicity was a 6 (Other) then we would code MELAA+ as NULL rather than zero because the 6 is consistent with L4 variables such as 61115 (Mauritian).
Open Issues/Comments
-
This module draws ethnicity data from a broad range of datasets - the majority of these datasets do not contain Level 4 data. We were concerned to ensure that users do
not conflate cases where there is no information about a given ethnicity and cases where there is but the correct value is 0. We used the approach described above, but more
complex methods which combine the top ranked data sources available for each level for each snz_uid were considered as a different way of addressing this issue. Ultimately
the increased complexity was not felt to be justified. Future iterations of this module might explore this question further and also review whether adopting the Personal
Details table ranking is the best approach. -
Note that “Other Asian” is not a residual category, although it may appear to be so at first glance to users. The Ethnic Communities definition of Other Asian is an L3
grouping and differs from the Stats NZ definition which is a L2 grouping and functions closer to being a residual category (except that it excludes Asian nfd).
Ethnic Communities “Other Asian” includes Asian nfd but does not include Sri Lankan (L3 = 441),Japanese (L3 = 442), or Korean (L3 = 443) and are not
included in the current table due to the relatively small size of these groups.
Parameters
The following parameters should be supplied to this module to run it in the database:
- {targetdb}: The SQL database on which the spell datasets are to be created.
- “{idicleanversion}”: The IDI Clean version that the spell datasets need to be based on.
- {targetschema}: The project schema under the target database into which the spell datasets are to be created.
- {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
Dependencies
- Tables used by code module
| Source | Tables |
|---|---|
| Census 2023 | {idicleanversion}.[cen_clean].[census_individual_2023] |
| Census 2018 | {idicleanversion}.[cen_clean].[census_individual_2018] |
| Census 2013 | {idicleanversion}.[cen_clean].[census_individual_2013] |
| Department of Internal Affairs | {idicleanversion}.[dia_clean].[births] |
| Department of Internal Affairs | {idicleanversion}.[dia_clean].[deaths] |
| Ministry of Health | {idicleanversion}.[moh_clean].[nes_enrolment] |
| Ministry of Health | {idicleanversion}.[moh_clean].[pop_cohort_demographics] |
| Ministry of Health | {idiadhoc}.[clean_read_MOH_CIR].[moh_cir_nhi_20230519] |
| Ministry of Education | {idicleanversion}.[moe_clean].[student_per] |
| Ministry of Education | {idicleanversion}.[moe_clean].[enrolment] |
| Accident Compensation Corp | {idicleanversion}.[acc_clean].[clients] |
| Ministry of Social Development | {idicleanversion}.[msd_clean].[msd_swn] |
| Student loan data from Studylink | {idicleanversion}.[sla_clean].[msd_borrowing] |
| Household Economic Survey | {idiadhoc}.[hes_clean].[hes_person] |
| Household Economic Survey | {idicleanversion}.[hes_clean].[hes_person] |
| Survey of Families Income and Employment | {idicleanversion}.[sofie_clean].[person_waves] |
| Longitudinal Immigration Survey for NZ | {idicleanversion}.[lisnz_clean].[ethnicity] |
| Household Labourforce Survey | {idicleanversion}.[hlfs_clean].[data] |
| Household Labourforce Survey | {idicleanversion}.[hlfs_clean].[nzis_pre2016] |
| Auckland City Mission | {idicleanversion}.[acm_clean].[acm_client] |
| General Social survey | {idicleanversion}.[gss_clean].[gss_identity] |
| PIAAC | {idicleanversion}.[piaac_clean].[data] |
| NZ Crime and Victims Survey | {idicleanversion}.[nzcvs_clean].[core] |
| Working for Families | {idicleanversion}.[wff_clean].[spells] |
| Student loan data from Ministry of Education | {idicleanversion}.[sla_clean].[slam_moe] |
| Student loan data from Inland Revenue | {idicleanversion}.[sla_clean].[slam_ird] |
Outputs
{targetdb}.{targetschema}.{projprefix}_ec_ethnicities
Variable Descriptions
| Column name | Description | Values |
|---|---|---|
| snz_uid | The unique STATS NZ person identifier for the the individual | snz_uid |
| asian | Indicates whether the top ranked data source identifies an individual as Asian | (1,0) |
| chinese | Indicates whether the top ranked data source identifies an individual as Chinese | (1,0,NULL) |
| indian | Indicates whether the top ranked data source identifies an individual as Indian | (1,0,NULL) |
| other_asian | Indicates whether the top ranked data source identifies an individual as Other Asian | (1,0,NULL) |
| se_asian | Indicates whether the top ranked data source identifies an individual as South East Asian | (1,0,NULL) |
| melaa_plus | Indicates whether the top ranked data source identifies an individual as MELAA+ | (1,0,NULL) |
| middle_eastern | Indicates whether the top ranked data source identifies an individual as Middle Eastern | (1,0,NULL) |
| latin_american | Indicates whether the top ranked data source identifies an individual as Latin American | (1,0,NULL) |
| african_plus | Indicates whether the top ranked data source identifies an individual as African+ | (1,0,NULL) |
| continental_euro | Indicates whether the top ranked data source identifies an individual as Continental European | (1,0,NULL) |
| total_ethnic_comm | Indicates whether the top ranked data source identifies an individual as a member of any Ethnic Communities grouping | (1,0,NULL) |
| other_euro | Indicates whether the top ranked data source identifies an individual as Other European as per MEC definition | (1,0,NULL) |
| maori | Indicates whether the top ranked data source identifies an individual as Maori | (1,0) |
| pacific | Indicates whether the top ranked data source identifies an individual as Pacific | (1,0) |
| melaa_snz | Indicates whether the top ranked data source identifies an individual as MELAA as per Stats NZ definition | (1,0) |
| euro_snz | Indicates whether the top ranked data source identifies an individual as European as per Stats NZ definition | (1,0) |
| source_name | Top ranked data source | |
| ethnic_code_l1 | Individual Level 1 ethnicity values extracted from top ranked data source, separated using a “;” | |
| ethnic_code_l2 | Individual Level 2 ethnicity values extracted from top ranked data source, separated using a “;” | |
| ethnic_code_l3 | Individual Level 3 ethnicity values extracted from top ranked data source, separated using a “;” | |
| ethnic_code_l4 | Individual Level 4 ethnicity values extracted from top ranked data source, separated using a “;” |
Module Version & Change History
| Date | Version | Comments |
|---|---|---|
| 10 June 2025 | Initial | Version based on specifications from Commissioning document. |
| 20 October 2025 | Update | Updated NZCVS code reflecting various changes including dropped/ modified tables and new |
| L4 ethnicity data made available for the first time |
Code
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar projprefix "{projprefix}"
:setvar idicleanversion "{idicleanversion}"
:setvar idimetaversion "{idimetaversion}"
USE $(targetdb);
GO
DROP TABLE IF EXISTS #temp_ethnicity_list;
CREATE TABLE #temp_ethnicity_list (
snz_uid INT,
ethnic_code_l1 VARCHAR(45),
ethnic_code_l2 VARCHAR(45),
ethnic_code_l3 VARCHAR(80),
ethnic_code_l4 VARCHAR(100),
source_name VARCHAR(12),
group_ INT,
rank_ INT
);
/* CENSUS 2023 - note imputed data is excluded, no null values in ethnicity columns */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT [snz_uid]
, CONCAT(
SUBSTRING([cen_ind_ethnic_output_concat],1,1),';',
SUBSTRING([cen_ind_ethnic_output_concat],7,1),';',
SUBSTRING([cen_ind_ethnic_output_concat],13,1),';',
SUBSTRING([cen_ind_ethnic_output_concat],19,1),';',
SUBSTRING([cen_ind_ethnic_output_concat],25,1),';',
SUBSTRING([cen_ind_ethnic_output_concat],31,1)) AS ethnic_code_l1
, CONCAT(
SUBSTRING([cen_ind_ethnic_output_concat],1,2),';',
SUBSTRING([cen_ind_ethnic_output_concat],7,2),';',
SUBSTRING([cen_ind_ethnic_output_concat],13,2),';',
SUBSTRING([cen_ind_ethnic_output_concat],19,2),';',
SUBSTRING([cen_ind_ethnic_output_concat],25,2),';',
SUBSTRING([cen_ind_ethnic_output_concat],31,2)) AS ethnic_code_l2
, CONCAT(
SUBSTRING([cen_ind_ethnic_output_concat],1,3),';',
SUBSTRING([cen_ind_ethnic_output_concat],7,3),';',
SUBSTRING([cen_ind_ethnic_output_concat],13,3),';',
SUBSTRING([cen_ind_ethnic_output_concat],19,3),';',
SUBSTRING([cen_ind_ethnic_output_concat],25,3),';',
SUBSTRING([cen_ind_ethnic_output_concat],31,3)) AS ethnic_code_l3
, [cen_ind_ethnic_output_concat] AS ethnic_code_l4
, 'Cen_23' AS source_name
, 1 AS group_
, 1 AS rank_
FROM $(idicleanversion).[cen_clean].[census_individual_2023]
WHERE snz_uid IS NOT NULL AND [cen_ind_ethgr_src_code] IN ('11');
/* CEN2018 - note imputed data is excluded */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT [snz_uid]
, [cen_ind_eth_output_level1] AS ethnic_code_l1
, [cen_ind_eth_output_level2] AS ethnic_code_l2
, [cen_ind_eth_output_level3] AS ethnic_code_l3
, [cen_ind_ethgr_code] AS ethnic_code_l4
, 'Cen_18' AS source_name
, 1 AS group_
, 2 AS rank_
FROM $(idicleanversion).[cen_clean].[census_individual_2018]
WHERE [snz_uid] IS NOT NULL AND [cen_ind_ethgr_impt_ind] in ('11')
;
/* CEN2013 */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT [snz_uid]
, CONCAT(
SUBSTRING([cen_ind_eth_rand6_grp1_code],1,1),';',
SUBSTRING([cen_ind_eth_rand6_grp2_code],1,1),';',
SUBSTRING([cen_ind_eth_rand6_grp3_code],1,1),';',
SUBSTRING([cen_ind_eth_rand6_grp4_code],1,1),';',
SUBSTRING([cen_ind_eth_rand6_grp5_code],1,1),';',
SUBSTRING([cen_ind_eth_rand6_grp6_code],1,1))AS ethnic_code_l1
, CONCAT(
SUBSTRING([cen_ind_eth_rand6_grp1_code],1,2),';',
SUBSTRING([cen_ind_eth_rand6_grp2_code],1,2),';',
SUBSTRING([cen_ind_eth_rand6_grp3_code],1,2),';',
SUBSTRING([cen_ind_eth_rand6_grp4_code],1,2),';',
SUBSTRING([cen_ind_eth_rand6_grp5_code],1,2),';',
SUBSTRING([cen_ind_eth_rand6_grp6_code],1,2)) AS ethnic_code_l2
, CONCAT(
SUBSTRING([cen_ind_eth_rand6_grp1_code],1,3),';',
SUBSTRING([cen_ind_eth_rand6_grp2_code],1,3),';',
SUBSTRING([cen_ind_eth_rand6_grp3_code],1,3),';',
SUBSTRING([cen_ind_eth_rand6_grp4_code],1,3),';',
SUBSTRING([cen_ind_eth_rand6_grp5_code],1,3),';',
SUBSTRING([cen_ind_eth_rand6_grp6_code],1,3)) AS ethnic_code_l3
,CONCAT(
[cen_ind_eth_rand6_grp1_code],';',
[cen_ind_eth_rand6_grp2_code],';',
[cen_ind_eth_rand6_grp3_code],';',
[cen_ind_eth_rand6_grp4_code],';',
[cen_ind_eth_rand6_grp5_code],';',
[cen_ind_eth_rand6_grp6_code]) AS ethnic_code_l4
, 'Cen_13' AS source_name
, 1 AS group_
, 3 AS rank_
FROM $(idicleanversion).[cen_clean].[census_individual_2013]
WHERE [snz_uid] IS NOT NULL;
/* DIA */
/* Birth - child */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT [snz_uid]
,CONCAT(
SUBSTRING([dia_bir_ethnic1_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic2_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic3_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic4_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic5_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic6_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic7_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic8_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic9_snz_code],1,1),';',
SUBSTRING([dia_bir_ethnic10_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([dia_bir_ethnic1_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic2_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic3_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic4_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic5_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic6_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic7_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic8_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic9_snz_code],1,2),';',
SUBSTRING([dia_bir_ethnic10_snz_code],1,2)) AS ethnic_code_l2
,CONCAT(
SUBSTRING([dia_bir_ethnic1_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic2_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic3_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic4_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic5_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic6_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic7_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic8_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic9_snz_code],1,3),';',
SUBSTRING([dia_bir_ethnic10_snz_code],1,3)) AS ethnic_code_l3
,CONCAT(
[dia_bir_ethnic1_snz_code],';',
[dia_bir_ethnic2_snz_code],';',
[dia_bir_ethnic3_snz_code],';',
[dia_bir_ethnic4_snz_code],';',
[dia_bir_ethnic5_snz_code],';',
[dia_bir_ethnic6_snz_code],';',
[dia_bir_ethnic7_snz_code],';',
[dia_bir_ethnic8_snz_code],';',
[dia_bir_ethnic9_snz_code],';',
[dia_bir_ethnic10_snz_code]) AS ethnic_code_l4
, 'DIA' AS source_name
, 1 AS group_
, 4 AS rank_
FROM $(idicleanversion).[dia_clean].[births]
WHERE snz_uid IS NOT NULL AND [dia_bir_ethnic1_snz_code] IS NOT NULL;
/* Birth - parent 1 */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [parent1_snz_uid] as [snz_uid]
,CONCAT(
SUBSTRING([dia_bir_parent1_ethnic1_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic2_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic3_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic4_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic5_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic6_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic7_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic8_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic9_snz_code],1,1),';',
SUBSTRING([dia_bir_parent1_ethnic10_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([dia_bir_parent1_ethnic1_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic2_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic3_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic4_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic5_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic6_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic7_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic8_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic9_snz_code],1,2),';',
SUBSTRING([dia_bir_parent1_ethnic10_snz_code],1,2)) AS ethnic_code_l2
,CONCAT(
SUBSTRING([dia_bir_parent1_ethnic1_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic2_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic3_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic4_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic5_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic6_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic7_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic8_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic9_snz_code],1,3),';',
SUBSTRING([dia_bir_parent1_ethnic10_snz_code],1,3)) AS ethnic_code_l3
,CONCAT(
[dia_bir_parent1_ethnic1_snz_code],';',
[dia_bir_parent1_ethnic2_snz_code],';',
[dia_bir_parent1_ethnic3_snz_code],';',
[dia_bir_parent1_ethnic4_snz_code],';',
[dia_bir_parent1_ethnic5_snz_code],';',
[dia_bir_parent1_ethnic6_snz_code],';',
[dia_bir_parent1_ethnic7_snz_code],';',
[dia_bir_parent1_ethnic8_snz_code],';',
[dia_bir_parent1_ethnic9_snz_code],';',
[dia_bir_parent1_ethnic10_snz_code]) AS ethnic_code_l4
, 'DIA' AS source_name
, 1 AS group_
, 4 AS rank_
FROM $(idicleanversion).[dia_clean].[births]
WHERE snz_uid IS NOT NULL AND [dia_bir_parent1_ethnic1_snz_code] IS NOT NULL and dia_bir_parent1_ethnic1_snz_code is not null;
/* Birth - parent 2 */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct parent2_snz_uid as [snz_uid]
,CONCAT(
SUBSTRING([dia_bir_parent2_ethnic1_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic2_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic3_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic4_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic5_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic6_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic7_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic8_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic9_snz_code],1,1),';',
SUBSTRING([dia_bir_parent2_ethnic10_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([dia_bir_parent2_ethnic1_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic2_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic3_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic4_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic5_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic6_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic7_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic8_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic9_snz_code],1,2),';',
SUBSTRING([dia_bir_parent2_ethnic10_snz_code],1,2)) AS ethnic_code_l2
,CONCAT(
SUBSTRING([dia_bir_parent2_ethnic1_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic2_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic3_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic4_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic5_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic6_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic7_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic8_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic9_snz_code],1,3),';',
SUBSTRING([dia_bir_parent2_ethnic10_snz_code],1,3)) AS ethnic_code_l3
,CONCAT(
[dia_bir_parent2_ethnic1_snz_code],';',
[dia_bir_parent2_ethnic2_snz_code],';',
[dia_bir_parent2_ethnic3_snz_code],';',
[dia_bir_parent2_ethnic4_snz_code],';',
[dia_bir_parent2_ethnic5_snz_code],';',
[dia_bir_parent2_ethnic6_snz_code],';',
[dia_bir_parent2_ethnic7_snz_code],';',
[dia_bir_parent2_ethnic8_snz_code],';',
[dia_bir_parent2_ethnic9_snz_code],';',
[dia_bir_parent2_ethnic10_snz_code]) AS ethnic_code_l4
, 'DIA' AS source_name
, 1 AS group_
, 4 AS rank_
FROM $(idicleanversion).[dia_clean].[births]
WHERE snz_uid IS NOT NULL AND [dia_bir_parent2_ethnic1_snz_code] IS NOT NULL and dia_bir_parent2_ethnic1_snz_code is not null;
/* DIA deaths */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT [snz_uid]
,CONCAT(
SUBSTRING([dia_dth_ethnic1_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic2_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic3_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic4_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic5_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic6_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic7_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic8_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic9_snz_code],1,1),';',
SUBSTRING([dia_dth_ethnic10_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([dia_dth_ethnic1_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic2_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic3_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic4_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic5_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic6_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic7_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic8_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic9_snz_code],1,2),';',
SUBSTRING([dia_dth_ethnic10_snz_code],1,2)) AS ethnic_code_l2
,CONCAT(
SUBSTRING([dia_dth_ethnic1_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic2_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic3_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic4_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic5_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic6_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic7_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic8_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic9_snz_code],1,3),';',
SUBSTRING([dia_dth_ethnic10_snz_code],1,3)) AS ethnic_code_l3
,CONCAT(
[dia_dth_ethnic1_snz_code],';',
[dia_dth_ethnic2_snz_code],';',
[dia_dth_ethnic3_snz_code],';',
[dia_dth_ethnic4_snz_code],';',
[dia_dth_ethnic5_snz_code],';',
[dia_dth_ethnic6_snz_code],';',
[dia_dth_ethnic7_snz_code],';',
[dia_dth_ethnic8_snz_code],';',
[dia_dth_ethnic9_snz_code],';',
[dia_dth_ethnic10_snz_code]) AS ethnic_code_l4
, 'DIA' AS source_name
, 1 AS group_
, 4 AS rank_
FROM $(idicleanversion).[dia_clean].[deaths]
WHERE snz_uid IS NOT NULL AND [dia_dth_ethnic1_snz_code] IS NOT NULL;
/* MOH */
/*MOH PHO enrollment */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([moh_nes_ethnic1_snz_code],1,1),';',
SUBSTRING([moh_nes_ethnic2_snz_code],1,1),';',
SUBSTRING([moh_nes_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([moh_nes_ethnic1_snz_code],1,2),';',
SUBSTRING([moh_nes_ethnic2_snz_code],1,2),';',
SUBSTRING([moh_nes_ethnic3_snz_code],1,2)) AS ethnic_code_l2
, 'NA'AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'MOH' AS source_name
, 1 AS group_
, 5 AS rank_
FROM $(idicleanversion).[moh_clean].[nes_enrolment]
WHERE snz_uid IS NOT NULL and moh_nes_ethnic1_snz_code is not null;
/* MOH NHI population cohort demographics*/
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct
[snz_uid]
,CONCAT(
SUBSTRING([moh_pop_ethnic1_snz_code],1,1),';',
SUBSTRING([moh_pop_ethnic2_snz_code],1,1),';',
SUBSTRING([moh_pop_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([moh_pop_ethnic1_snz_code],1,2),';',
SUBSTRING([moh_pop_ethnic2_snz_code],1,2),';',
SUBSTRING([moh_pop_ethnic3_snz_code],1,2)) AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'MOH' AS source_name
, 1 AS group_
, 5 AS rank_
FROM $(idicleanversion).[moh_clean].[pop_cohort_demographics]
WHERE snz_uid IS NOT NULL;
/* MOH CIR (Covid immunisation register) */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct b.[snz_uid]
,CONCAT(
SUBSTRING(cast([ethnic_code_1] AS varchar),1,1),';',
SUBSTRING(cast([ethnic_code_2] AS varchar),1,1),';',
SUBSTRING(cast([ethnic_code_3] AS varchar),1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING(cast([ethnic_code_1] AS varchar),1,2),';',
SUBSTRING(cast([ethnic_code_2] AS varchar),1,2),';',
SUBSTRING(cast([ethnic_code_3] AS varchar),1,2)) AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'MOH' AS source_name
, 1 AS group_
, 5 AS rank_
FROM [IDI_Adhoc].[clean_read_MOH_CIR].[moh_cir_nhi_20230519] as a
INNER JOIN $(idicleanversion).[security].[concordance] AS b
ON a.snz_moh_uid = b.snz_moh_uid
WHERE a.snz_uid IS NOT NULL and ethnic_code_1 is not null;
/* MOE */
/* MOE STUDENT PER */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([moe_spi_ethnic1_snz_code],1,1),';',
SUBSTRING([moe_spi_ethnic2_snz_code],1,1),';',
SUBSTRING([moe_spi_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([moe_spi_ethnic1_snz_code],1,2),';',
SUBSTRING([moe_spi_ethnic2_snz_code],1,2),';',
SUBSTRING([moe_spi_ethnic3_snz_code],1,2)) AS ethnic_code_l2
,CONCAT(
[moe_spi_ethnic1_snz_code],';',
[moe_spi_ethnic1_snz_code],';',
[moe_spi_ethnic1_snz_code]) AS ethnic_code_l3
,'NA' AS ethnic_code_l4
, 'MOE' AS source_name
, 1 AS group_
, 6 AS rank_
FROM $(idicleanversion).[moe_clean].[student_per]
WHERE snz_uid IS NOT NULL and [moe_spi_ethnic1_snz_code] is not null;
/* MOE ENROLMENTS */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([moe_enr_ethnic1_snz_code],1,1),';',
SUBSTRING([moe_enr_ethnic2_snz_code],1,1),';',
SUBSTRING([moe_enr_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([moe_enr_ethnic1_snz_code],1,2),';',
SUBSTRING([moe_enr_ethnic2_snz_code],1,2),';',
SUBSTRING([moe_enr_ethnic3_snz_code],1,2)) AS ethnic_code_l2
,CONCAT(
[moe_enr_ethnic1_snz_code],';',
[moe_enr_ethnic1_snz_code],';',
[moe_enr_ethnic1_snz_code]) AS ethnic_code_l3
,'NA' AS ethnic_code_l4
, 'MOE' AS source_name
, 1 AS group_
, 6 AS rank_
FROM $(idicleanversion).[moe_clean].[enrolment]
WHERE snz_uid IS NOT NULL and [moe_enr_ethnic1_snz_code] is not null;
/* ACC */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([acc_cli_ethnic1_snz_code],1,1),';',
SUBSTRING([acc_cli_ethnic2_snz_code],1,1),';',
SUBSTRING([acc_cli_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
[acc_cli_ethnic1_snz_code],';',
[acc_cli_ethnic2_snz_code],';',
[acc_cli_ethnic3_snz_code]) AS ethnic_code_l2
,'NA' AS ethnic_code_l3
,'NA' AS ethnic_code_l4
, 'ACC' AS source_name
, 1 AS group_
, 7 AS rank_
FROM $(idicleanversion).[acc_clean].[clients]
WHERE snz_uid IS NOT NULL and acc_cli_ethnic1_snz_code is not null;
/* MSD */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([msd_swn_ucvii_ethnic1_snz_code],1,1),';',
SUBSTRING([msd_swn_ucvii_ethnic2_snz_code],1,1),';',
SUBSTRING([msd_swn_ucvii_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
[msd_swn_ucvii_ethnic1_snz_code],';',
[msd_swn_ucvii_ethnic2_snz_code],';',
[msd_swn_ucvii_ethnic3_snz_code]) AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'MSD' AS source_name
, 1 AS group_
, 8 AS rank_
FROM $(idicleanversion).[msd_clean].[msd_swn]
WHERE snz_uid IS NOT NULL and msd_swn_ucvii_ethnic1_snz_code is not null;
/* SLA_MSD */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([msd_sla_ethnic1_snz_code],1,1),';',
SUBSTRING([msd_sla_ethnic2_snz_code],1,1),';',
SUBSTRING([msd_sla_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([msd_sla_ethnic1_snz_code],1,2),';',
SUBSTRING([msd_sla_ethnic2_snz_code],1,2),';',
SUBSTRING([msd_sla_ethnic3_snz_code],1,2)) AS ethnic_code_l2
,CONCAT(
[msd_sla_ethnic1_snz_code],';',
[msd_sla_ethnic2_snz_code],';',
[msd_sla_ethnic3_snz_code]) AS ethnic_code_l3
,'NA' AS ethnic_code_l4
, 'SLA_MSD' AS source_name
, 2 AS group_
, 1 AS rank_
FROM $(idicleanversion).[sla_clean].[msd_borrowing]
WHERE snz_uid IS NOT NULL and msd_sla_ethnic1_snz_code is not null;
/* HES */
/* HES_PERSON table in IDI_CLEAN */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
, CONCAT(
(case when [hes_per_ethnic_grp1_snz_ind] =1 then 1 else 0 END),';',
(case when [hes_per_ethnic_grp2_snz_ind] =1 then 2 else 0 END),';',
(case when [hes_per_ethnic_grp3_snz_ind] =1 then 3 else 0 END),';',
(case when [hes_per_ethnic_grp4_snz_ind] =1 then 4 else 0 END),';',
(case when [hes_per_ethnic_grp5_snz_ind] =1 then 5 else 0 END),';',
(case when [hes_per_ethnic_grp6_snz_ind] =1 then 6 else 0 END)) AS ethnic_code_l1
, 'NA' AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'HES' AS source_name
, 2 AS group_
, 2 AS rank_
FROM $(idicleanversion).[hes_clean].[hes_person]
WHERE snz_uid IS NOT NULL and hes_per_ethnic_grp1_snz_ind is not null;
/* HES adhoc tables covering years 2015/16- 2023/24 */
DROP TABLE IF EXISTS #temp_hes;
CREATE TABLE #temp_hes (
snz_hes_uid INT,
dvethnicity VARCHAR(45));
INSERT INTO #temp_hes (snz_hes_uid, dvethnicity)
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_person_1516] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_person_1617] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_person_1718] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_person_1819] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_persondem_1819] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_persondem_1920] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_persondem_2021] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_persondem_2122] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_persondem_2223] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL
UNION
SELECT DISTINCT
snz_hes_uid,
dvethnicity
FROM [IDI_ADHOC].[clean_read_HES].[hes_persondem_2324] WHERE snz_hes_uid IS NOT NULL AND dvethnicity IS NOT NULL;
UPDATE #temp_hes
SET dvethnicity = REPLACE (DVETHNICITY,CHAR(34),'');
UPDATE #temp_hes
SET dvethnicity = REPLACE (DVETHNICITY,',',';');
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct b.[snz_uid] as snz_uid
,a.[dvethnicity] as ethnic_code_l1
,'NA' AS ethnic_code_l2
,'NA' AS ethnic_code_l3
,'NA' AS ethnic_code_l4
,'HES' as source_name
, 2 AS group_
, 2 AS rank_
FROM #temp_hes AS A
INNER JOIN $(idicleanversion).[security].[concordance] as B
ON a.[snz_hes_uid] = b.[snz_hes_uid];
DROP TABLE IF EXISTS #temp_hes;
/* SOFIE */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([sofie_wav_ethnic1_snz_code],1,1),';',
SUBSTRING([sofie_wav_ethnic2_snz_code],1,1),';',
SUBSTRING([sofie_wav_ethnic3_snz_code],1,1),';',
SUBSTRING([sofie_wav_ethnic4_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
[sofie_wav_ethnic1_snz_code],';',
[sofie_wav_ethnic2_snz_code],';',
[sofie_wav_ethnic3_snz_code],';',
[sofie_wav_ethnic4_snz_code]) AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'SOFIE' AS source_name
, 2 AS group_
, 3 AS rank_
FROM $(idicleanversion).[sofie_clean].[person_waves]
WHERE snz_uid IS NOT NULL and sofie_wav_ethnic1_snz_code is not null;
/* LISNZ */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
, substring([lisnz_ethnic_snz_code], 1, 1) AS ethnic_code_l1
, substring([lisnz_ethnic_snz_code], 1, 2) AS ethnic_code_l2
, substring([lisnz_ethnic_snz_code], 1, 3) AS ethnic_code_l3
, [lisnz_ethnic_snz_code] AS ethnic_code_l4
, 'LISNZ' AS source_name
, 2 AS group_
, 4 AS rank_
FROM $(idicleanversion).[lisnz_clean].[ethnicity]
WHERE snz_uid IS NOT NULL and [lisnz_ethnic_snz_code] IS NOT NULL;
/* HLFS */
/* HLFF hlfs_clean.nzis */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING([hlfs_urd_ethnic_1_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_2_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_3_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_4_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_5_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_6_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_7_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_8_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_9_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_10_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_11_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_12_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_13_code],1,1),';',
SUBSTRING([hlfs_urd_ethnic_14_code],1,1)) AS ethnic_code_l1
,CONCAT(
SUBSTRING([hlfs_urd_ethnic_1_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_2_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_3_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_4_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_5_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_6_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_7_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_8_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_9_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_10_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_11_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_12_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_13_code],1,2),';',
SUBSTRING([hlfs_urd_ethnic_14_code],1,2)) AS ethnic_code_l2
,CONCAT(
SUBSTRING([hlfs_urd_ethnic_1_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_2_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_3_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_4_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_5_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_6_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_7_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_8_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_9_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_10_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_11_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_12_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_13_code],1,3),';',
SUBSTRING([hlfs_urd_ethnic_14_code],1,3)) AS ethnic_code_l3
, CONCAT(
[hlfs_urd_ethnic_1_code],';',
[hlfs_urd_ethnic_2_code],';',
[hlfs_urd_ethnic_3_code],';',
[hlfs_urd_ethnic_4_code],';',
[hlfs_urd_ethnic_5_code],';',
[hlfs_urd_ethnic_6_code],';',
[hlfs_urd_ethnic_7_code],';',
[hlfs_urd_ethnic_8_code],';',
[hlfs_urd_ethnic_9_code],';',
[hlfs_urd_ethnic_10_code],';',
[hlfs_urd_ethnic_11_code],';',
[hlfs_urd_ethnic_12_code],';',
[hlfs_urd_ethnic_13_code],';',
[hlfs_urd_ethnic_14_code],';') AS ethnic_code_l4
, 'HLFS' AS source_name
, 2 AS group_
, 5 AS rank_
FROM $(idicleanversion).[hlfs_clean].[data]
WHERE snz_uid IS NOT NULL and hlfs_urd_ethnic_1_code is not null;
/* HLFS hlfs_clean.nzis_pre2016 */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,[nzis_is_ethnic_code] as ethnic_code_l1
, 'NA' AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'HLFS' AS source_name
, 2 AS group_
, 5 AS rank_
FROM $(idicleanversion).[hlfs_clean].[nzis_pre2016]
WHERE snz_uid IS NOT NULL and nzis_is_ethnic_code is not null;
/* Auckland City Mission */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
substring([acm_cli_ethnic1_snz_code],1,1),';',
substring([acm_cli_ethnic2_snz_code],1,1),';',
substring([acm_cli_ethnic3_snz_code],1,1)) AS ethnic_code_l1
,CONCAT(
[acm_cli_ethnic1_snz_code],';',
[acm_cli_ethnic2_snz_code],';',
[acm_cli_ethnic3_snz_code]) as ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'ACM' AS source_name
, 2 AS group_
, 6 AS rank_
FROM $(idicleanversion).[acm_clean].[acm_client]
WHERE snz_uid IS NOT NULL and acm_cli_ethnic1_snz_code is not null;
/* GSS */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct
a.snz_uid
,a.e1 AS ethnic_code_l1
, CONCAT(
SUBSTRING(a.e4,1,2),';',
SUBSTRING(a.e4,7,2),';',
SUBSTRING(a.e4,13,2),';',
SUBSTRING(a.e4,19,2),';',
SUBSTRING(a.e4,25,2),';',
SUBSTRING(a.e4,31,2),';',
SUBSTRING(a.e4,37,2),';',
SUBSTRING(a.e4,43,2)) AS ethnic_code_l2
, CONCAT(
SUBSTRING(a.e4,1,3),';',
SUBSTRING(a.e4,7,3),';',
SUBSTRING(a.e4,13,3),';',
SUBSTRING(a.e4,19,3),';',
SUBSTRING(a.e4,25,3),';',
SUBSTRING(a.e4,31,3),';',
SUBSTRING(a.e4,37,3),';',
SUBSTRING(a.e4,43,3)) AS ethnic_code_l3,
e4 as ethnic_code_l4,
'GSS' AS source_name
, 2 AS group_
, 7 AS rank_
FROM (select distinct
snz_uid,
replace([gss_id_dvethnic],', ',';') as e1,
replace([gss_id_dvethnic_code],', ',';') as e4,
gss_id_dvethnic_code
from $(idicleanversion).[gss_clean].[gss_identity]
WHERE snz_uid IS NOT NULL and gss_id_dvethnic is not null ) as a;
/* PIAAC */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING(CAST([piaac_ethmention1_code] as VARCHAR(10)),1,1),';',
SUBSTRING(CAST([piaac_ethmention2_code] as VARCHAR(10)),1,1),';',
SUBSTRING(CAST([piaac_ethmention3_code] as VARCHAR(10)),1,1),';',
SUBSTRING(CAST([piaac_ethmention4_code] as VARCHAR(10)),1,1),';',
SUBSTRING(CAST([piaac_ethmention5_code] as VARCHAR(10)),1,1),';',
SUBSTRING(CAST([piaac_ethmention6_code] as VARCHAR(10)),1,1),';',
SUBSTRING(CAST([piaac_ethmention7_code] as VARCHAR(10)),1,1)) AS ethnic_code_l1
, CONCAT(
SUBSTRING(CAST([piaac_ethmention1_code] as VARCHAR(10)),1,2),';',
SUBSTRING(CAST([piaac_ethmention2_code] as VARCHAR(10)),1,2),';',
SUBSTRING(CAST([piaac_ethmention3_code] as VARCHAR(10)),1,2),';',
SUBSTRING(CAST([piaac_ethmention4_code] as VARCHAR(10)),1,2),';',
SUBSTRING(CAST([piaac_ethmention5_code] as VARCHAR(10)),1,2),';',
SUBSTRING(CAST([piaac_ethmention6_code] as VARCHAR(10)),1,2),';',
SUBSTRING(CAST([piaac_ethmention7_code] as VARCHAR(10)),1,2)) AS ethnic_code_l2
, CONCAT(
SUBSTRING(CAST([piaac_ethmention1_code] as VARCHAR(10)),1,3),';',
SUBSTRING(CAST([piaac_ethmention2_code] as VARCHAR(10)),1,3),';',
SUBSTRING(CAST([piaac_ethmention3_code] as VARCHAR(10)),1,3),';',
SUBSTRING(CAST([piaac_ethmention4_code] as VARCHAR(10)),1,3),';',
SUBSTRING(CAST([piaac_ethmention5_code] as VARCHAR(10)),1,3),';',
SUBSTRING(CAST([piaac_ethmention6_code] as VARCHAR(10)),1,3),';',
SUBSTRING(CAST([piaac_ethmention7_code] as VARCHAR(10)),1,3)) AS ethnic_code_l3
, CONCAT(
[piaac_ethmention1_code],';',
[piaac_ethmention2_code],';',
[piaac_ethmention3_code],';',
[piaac_ethmention4_code],';',
[piaac_ethmention5_code],';',
[piaac_ethmention6_code],';',
[piaac_ethmention7_code]) AS ethnic_code_l4
, 'PIAAC' AS source_name
, 2 AS group_
, 8 AS rank_
FROM $(idicleanversion).[piaac_clean].[data]
WHERE snz_uid IS NOT NULL and piaac_ethmention1_code is not null;
/* NZCVS - see three sources now - core , incident and module, note we obtain Indian and Chinese ethnicity data, 'Other Asian' was judged to be ambiguous so not used */
/* Core*/
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,CONCAT(
SUBSTRING(CAST(nzcvs_core_ETHNICCODE1 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE2 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE3 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE4 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE5 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE6 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE7 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE8 as VARCHAR(10)),1,1),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE9 as VARCHAR(10)),1,1),';',
(CASE WHEN nzcvs_core_ethnic_grp1_snz_ind =1 then 1 else 0 end),';',
(CASE WHEN nzcvs_core_ethnic_grp2_snz_ind =1 then 2 else 0 end),';',
(CASE WHEN nzcvs_core_ethnic_grp3_snz_ind =1 then 3 else 0 end),';',
(CASE WHEN nzcvs_core_ethnic_grp4_snz_ind =1 then 4 else 0 end),';',
(CASE WHEN nzcvs_core_ethnic_grp5_snz_ind =1 then 5 else 0 end),';',
(CASE WHEN nzcvs_core_ethnic_grp6_snz_ind =1 then 6 else 0 end),';',
(CASE WHEN nzcvs_core_DVETHN_CHINESE =1 then 4 else 0 end),';',
(CASE WHEN nzcvs_core_DVETHN_INDIAN =1 then 4 else 0 end))
AS ethnic_code_l1
, CONCAT(
SUBSTRING(CAST(nzcvs_core_ETHNICCODE1 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE2 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE3 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE4 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE5 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE6 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE7 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE8 as VARCHAR(10)),1,2),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE9 as VARCHAR(10)),1,2),';',
(CASE WHEN nzcvs_core_DVETHN_CHINESE =1 then 42 else 0 end),';',
(CASE WHEN nzcvs_core_DVETHN_INDIAN =1 then 43 else 0 end))
AS ethnic_code_l2
,CONCAT(
SUBSTRING(CAST(nzcvs_core_ETHNICCODE1 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE2 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE3 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE4 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE5 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE6 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE7 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE8 as VARCHAR(10)),1,3),';',
SUBSTRING(CAST(nzcvs_core_ETHNICCODE9 as VARCHAR(10)),1,3))
AS ethnic_code_l3
,CONCAT(
nzcvs_core_ETHNICCODE1,';',
nzcvs_core_ETHNICCODE2,';',
nzcvs_core_ETHNICCODE3,';',
nzcvs_core_ETHNICCODE4,';',
nzcvs_core_ETHNICCODE5,';',
nzcvs_core_ETHNICCODE6,';',
nzcvs_core_ETHNICCODE7,';',
nzcvs_core_ETHNICCODE8,';',
nzcvs_core_ETHNICCODE9)
AS ethnic_code_l4
, 'NZCVS' AS source_name
, 2 AS group_
, 9 AS rank_
FROM $(idicleanversion).[nzcvs_clean].[core]
WHERE snz_uid IS NOT NULL and nzcvs_core_ethnic_grp1_snz_ind is not null;
/* WFF */
/*primary caregiver */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
, [wff_spe_l_ethnic_snz_code] AS ethnic_code_l1
, 'NA' AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'WFF' AS source_name
, 3 AS group_
, 1 AS rank_
FROM $(idicleanversion).[wff_clean].[spells]
WHERE snz_uid IS NOT NULL and wff_spe_l_ethnic_snz_code is not null;
/* partner */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [partner_snz_uid] as snz_uid
, [wff_spe_r_ethnic_snz_code] AS ethnic_code_l1
, 'NA' AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'WFF' AS source_name
, 3 AS group_
, 1 AS rank_
FROM $(idicleanversion).[wff_clean].[spells]
WHERE partner_snz_uid IS NOT NULL and wff_spe_r_ethnic_snz_code is not null;
/* SLA_SLAM_MOE - note only one l1 ethnicity response is recorded per snz_uid */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
, [slam_moe_ethnic_snz_code] AS ethnic_code_l1
, 'NA' AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'SLA_SLAM_MOE' AS source_name
, 3 AS group_
, 2 AS rank_
FROM $(idicleanversion).[sla_clean].[slam_moe]
WHERE snz_uid IS NOT NULL and slam_moe_ethnic_snz_code is not null;
/* SLA_SLAM_IRD */
INSERT INTO #temp_ethnicity_list (snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_ , rank_)
SELECT distinct [snz_uid]
,SUBSTRING ([slam_ird_ethnic_snz_code], 1,1 ) AS ethnic_code_l1
, [slam_ird_ethnic_snz_code] AS ethnic_code_l2
, 'NA' AS ethnic_code_l3
, 'NA' AS ethnic_code_l4
, 'SLA_SLAM_IR' AS source_name
, 3 AS group_
, 3 AS rank_
FROM $(idicleanversion).[sla_clean].[slam_ird]
WHERE snz_uid IS NOT NULL and slam_ird_ethnic_snz_code is not null;
/* Process temp_ethnicity_list table */
/* Drop rows which don't have valid l1 ethnicity in l1 */
delete from #temp_ethnicity_list
WHERE iif (ethnic_code_l1 like '%1%'
or ethnic_code_l1 like '%2%'
or ethnic_code_l1 like '%3%'
or ethnic_code_l1 like '%4%'
or ethnic_code_l1 like '%5%'
or ethnic_code_l1 like '%6%'
,1,0)=0;
/* modify table to add EC variables */
drop table if exists #temp_ethnicity_listA;
create table #temp_ethnicity_listA
(
snz_uid INT,
ethnic_code_l1 VARCHAR(45),
ethnic_code_l2 VARCHAR(45),
ethnic_code_l3 VARCHAR(80),
ethnic_code_l4 VARCHAR(100),
source_name VARCHAR(12),
group_ INT,
rank_ INT,
final_rank int,
asian int,
se_asian int,
chinese int,
indian int,
other_asian int,
continental_euro int,
african_plus int,
middle_eastern int,
latin_american int,
melaa_plus int,
other_euro int,
maori int,
pacific int,
euro_snz int,
melaa_snz int,
total_ethnic_comm int
);
GO
INSERT INTO #temp_ethnicity_listA
(snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_, rank_)
SELECT
snz_uid, ethnic_code_l1, ethnic_code_l2, ethnic_code_l3, ethnic_code_l4, source_name, group_, rank_
FROM #temp_ethnicity_list;
drop table if exists #temp_ethnicity_list;
update #temp_ethnicity_listA
set asian = IIF (ethnic_code_l1 like '%4%', 1, 0),
se_asian = IIF (ethnic_code_l2 like '%41%', 1, 0),
chinese = IIF (ethnic_code_l2 like '%42%', 1, 0),
indian = IIF (ethnic_code_l2 like '%43%', 1, 0),
other_asian= IIF (ethnic_code_l3 like '%444%' or ethnic_code_l2 like '%40%', 1, 0),
continental_euro = IIF
(ethnic_code_l3 like '%122%'
or ethnic_code_l3 like '%123%'
or ethnic_code_l3 like '%124%'
or ethnic_code_l3 like '%125%'
or ethnic_code_l3 like '%126%'
or ethnic_code_l3 like '%127%'
or ethnic_code_l4 like '%12911%'
or ethnic_code_l4 like '%12912%'
or ethnic_code_l4 like '%12913%'
or ethnic_code_l4 like '%12914%'
or ethnic_code_l4 like '%12915%'
or ethnic_code_l4 like '%12916%'
or ethnic_code_l4 like '%12918%'
or ethnic_code_l4 like '%12919%'
or ethnic_code_l4 like '%12920%'
or ethnic_code_l4 like '%12921%'
or ethnic_code_l4 like '%12922%'
or ethnic_code_l4 like '%12923%'
or ethnic_code_l4 like '%12924%'
or ethnic_code_l4 like '%12926%'
or ethnic_code_l4 like '%12927%'
or ethnic_code_l4 like '%12928%'
or ethnic_code_l4 like '%12929%'
or ethnic_code_l4 like '%12930%'
or ethnic_code_l4 like '%12931%'
or ethnic_code_l4 like '%12932%'
or ethnic_code_l4 like '%12933%'
or ethnic_code_l4 like '%12934%'
or ethnic_code_l4 like '%12935%'
or ethnic_code_l4 like '%12937%'
or ethnic_code_l4 like '%12938%'
or ethnic_code_l4 like '%12939%'
or ethnic_code_l4 like '%12940%'
or ethnic_code_l4 like '%12941%'
or ethnic_code_l4 like '%12942%', 1 , 0),
middle_Eastern= IIF (ethnic_code_l2 like '%51%', 1, 0),
latin_American= IIF (ethnic_code_l2 like '%52%', 1, 0),
african_plus = IIF (ethnic_code_l2 like '%53%'
or ethnic_code_l4 like '%61115%'
or ethnic_code_l4 like '%61116%'
or ethnic_code_l4 like '%61117%'
or ethnic_code_l4 like '%12948%'
or ethnic_code_l4 like '%12949%'
or ethnic_code_l4 like '%12950%', 1, 0),
melaa_plus = IIF(ethnic_code_l1 like '%5%'
or ethnic_code_l4 like '%61115%'
or ethnic_code_l4 like '%61116%'
or ethnic_code_l4 like '%61117%'
or ethnic_code_l4 like '%12948%'
or ethnic_code_l4 like '%12949%'
or ethnic_code_l4 like '%12950%', 1,0),
other_euro = IIF
(ethnic_code_l2 like '%10%'
or ethnic_code_l2 like '%11%'
or ethnic_code_l3 like '%121%'
or ethnic_code_l3 like '%128%'
or ethnic_code_l4 like '%12943%'
or ethnic_code_l4 like '%12945%'
or ethnic_code_l4 like '%12947%'
or ethnic_code_l4 like '%12999%', 1 , 0),
maori = IIF (ethnic_code_l1 like '%2%', 1, 0),
pacific = IIF (ethnic_code_l1 like '%3%', 1, 0),
melaa_snz = IIF (ethnic_code_l1 like '%5%', 1, 0),
euro_snz = IIF (ethnic_code_l1 like '%1%', 1, 0)
;
/* Collapse multiple rows from same source */
DROP TABLE IF EXISTS #temp_ethnicity_list2;
select snz_uid,
max(asian) as asian,
max(se_asian) as se_asian,
max(chinese) as chinese,
max(indian) as indian,
max(other_asian) as other_asian,
max(continental_euro) as continental_euro,
max(middle_eastern) as middle_eastern,
max(latin_american) as latin_american,
max(african_plus) as african_plus,
max(melaa_plus) as melaa_plus,
max(other_euro) as other_euro,
max(maori) as maori,
max(pacific) as pacific,
max(melaa_snz) as melaa_snz,
max(euro_snz) as euro_snz,
source_name,
group_ ,
rank_,
final_rank,
total_ethnic_comm,
string_agg(ethnic_code_l1, ';') as ethnic_code_l1,
string_agg(ethnic_code_l2, ';') as ethnic_code_l2,
string_agg(ethnic_code_l3, ';') as ethnic_code_l3,
string_agg(ethnic_code_l4, ';') as ethnic_code_l4
into #temp_ethnicity_list2
from #temp_ethnicity_listA
group by snz_uid, source_name, group_, rank_, final_rank, total_ethnic_comm;
/* Identify top ranked row for each snz_uid */
with ranked as (select
*,
Row_number () OVER (PARTITION BY snz_uid ORDER BY group_ ASC, rank_ ASC) AS rankval
FROM #temp_ethnicity_list2
)
update b
set final_rank= r.rankval
FROM #temp_ethnicity_list2 as b
JOIN ranked as r on b.snz_uid = r.snz_uid and b.group_ = r.group_ and r.rank_ = b.rank_;
/* Make final table "ec_ethnicity" by only keeping top row */
DROP TABLE IF EXISTS #EC_ethnicity;
select
*
--into [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
into #EC_ethnicity
FROM #temp_ethnicity_list2
where final_rank=1;
/* update ethnicity variables to NULL where there is incomplete information */
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set chinese = NULL where chinese = 0 and ethnic_code_l2 = 'NA' and ethnic_code_l1 like '%4%' ;
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set indian = NULL where indian = 0 and ethnic_code_l2 = 'NA' and ethnic_code_l1 like '%4%';
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set se_asian = NULL where se_asian = 0 and ethnic_code_l2 = 'NA' and ethnic_code_l1 like '%4%';
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set other_asian = NULL where other_asian = 0 and (ethnic_code_l2 = 'NA' and ethnic_code_l1 like '%4%')
or (ethnic_code_l3 = 'NA' and (ethnic_code_l2 like '%44%'));
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set middle_eastern = NULL where middle_eastern = 0 and ethnic_code_l2 = 'NA' and ethnic_code_l1 like '%5%';
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set latin_american = NULL where latin_american = 0 and ethnic_code_l2 = 'NA' and ethnic_code_l1 like '%5%';
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set continental_euro = NULL
where continental_euro = 0 and ((ethnic_code_l1 like '%1%' and ethnic_code_l2 ='NA') or (ethnic_code_l2 like '%12%' and ethnic_code_l3 ='NA') or (ethnic_code_l3 like '%129%' and ethnic_code_l4 ='NA'));
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set other_euro = NULL
where other_euro = 0 and ((ethnic_code_l1 like '%1%' and ethnic_code_l2 ='NA') or (ethnic_code_l2 like '%12%' and ethnic_code_l3 ='NA') or (ethnic_code_l3 like '%129%' and ethnic_code_l4 ='NA'));
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set african_plus = NULL
where african_plus = 0 and (((ethnic_code_l1 like '%1%' or ethnic_code_l1 like '%6%') and ethnic_code_l2 = 'NA') or
((ethnic_code_l2 like '%12%' or ethnic_code_l1 like '%61%') and ethnic_code_l3 = 'NA') or
((ethnic_code_l3 like '%129%' or ethnic_code_l1 like '%611%') and ethnic_code_l4 = 'NA'));
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set melaa_plus = NULL
where melaa_plus = 0 and (((ethnic_code_l1 like '%1%' or ethnic_code_l1 like '%6%') and ethnic_code_l2 = 'NA') or
((ethnic_code_l2 like '%12%' or ethnic_code_l1 like '%61%') and ethnic_code_l3 = 'NA') or
((ethnic_code_l3 like '%129%' or ethnic_code_l1 like '%611%') and ethnic_code_l4 = 'NA'));
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set total_ethnic_comm = CASE WHEN melaa_plus =1 or continental_euro =1 or asian = 1 THEN 1
ELSE 0 END;
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set total_ethnic_comm = NULL where melaa_plus is NULL and asian = 0 and continental_euro is NULL;
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set total_ethnic_comm = NULL where melaa_plus is NULL and asian = 0 and continental_euro = 0;
--update [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity]
update #EC_ethnicity
set total_ethnic_comm = NULL where melaa_plus = 0 and asian = 0 and continental_euro is NULL;
/* select distinct total_ethnic_comm, melaa_plus, asian, continental_euro from [IDI_Sandpit].[DL-MAA2020-47].[EC_ethnicity] */
/* select distinct total_ethnic_comm, melaa_plus, asian, continental_euro from #EC_ethnicity */
/*clean up l1-l4 columns*/
/* code below removes excess ';'s and double ups within a column for ethicity l1-l4 */
/* L1 */
WITH cleanvals AS (
SELECT snz_uid, string_agg(VALUE, ';') as cleaned
FROM (
SELECT DISTINCT a.snz_uid,
TRIM(value) as value
FROM #EC_ethnicity a
CROSS APPLY STRING_SPLIT(ethnic_code_l1, ';') as split
where TRIM(split.value) <>''
) as filtered
GROUP BY snz_uid
)
UPDATE a
SET a.ethnic_code_l1 = b.cleaned
FROM #EC_ethnicity a
JOIN cleanvals b on a.snz_uid = b.snz_uid;
/* L2 */
WITH cleanvals AS (
SELECT snz_uid, string_agg(VALUE, ';') as cleaned
FROM (
SELECT DISTINCT a.snz_uid,
TRIM(value) as value
FROM #EC_ethnicity a
CROSS APPLY STRING_SPLIT(ethnic_code_l2, ';') as split
where TRIM(split.value) <>''
) as filtered
GROUP BY snz_uid
)
UPDATE a
SET a.ethnic_code_l2 = b.cleaned
FROM #EC_ethnicity a
JOIN cleanvals b on a.snz_uid = b.snz_uid;
/* L3 */
WITH cleanvals AS (
SELECT snz_uid, string_agg(VALUE, ';') as cleaned
FROM (
SELECT DISTINCT a.snz_uid,
TRIM(value) as value
FROM #EC_ethnicity a
CROSS APPLY STRING_SPLIT(ethnic_code_l3, ';') as split
where TRIM(split.value) <>''
) as filtered
GROUP BY snz_uid
)
UPDATE a
SET a.ethnic_code_l3 = b.cleaned
FROM #EC_ethnicity a
JOIN cleanvals b on a.snz_uid = b.snz_uid;
/* L4 */
WITH cleanvals AS (
SELECT snz_uid, string_agg(VALUE, ';') as cleaned
FROM (
SELECT DISTINCT a.snz_uid,
TRIM(value) as value
FROM #EC_ethnicity a
CROSS APPLY STRING_SPLIT(ethnic_code_l4, ';') as split
where TRIM(split.value) <>''
) as filtered
GROUP BY snz_uid
)
UPDATE a
SET a.ethnic_code_l4 = b.cleaned
FROM #EC_ethnicity a
JOIN cleanvals b on a.snz_uid = b.snz_uid;
/* drop temporary tables */
DROP TABLE IF EXISTS #temp_ethnicity_listA;
DROP TABLE IF EXISTS #temp_ethnicity_list2;
/* Final select statement */
drop table if exists $(targetdb).$(targetschema).$(projprefix)_ec_communities;
SELECT
snz_uid,
source_name,
asian,
se_asian,
chinese,
indian,
other_asian,
continental_euro,
african_plus,
middle_eastern,
latin_american,
melaa_plus,
total_ethnic_comm,
other_euro,
maori,
pacific,
euro_snz,
melaa_snz,
ethnic_code_l1,
ethnic_code_l2,
ethnic_code_l3,
ethnic_code_l4
INTO $(targetdb).$(targetschema).$(projprefix)_ec_communities
FROM #EC_ethnicity
;