Outputs:
SQL: [IDI_Community].[edu_highest_nqflevel_spells].[highest_nqflevel_spells_202603]
SAS: libname cm_highest_nqflevel_spells dsn=IDI_community_srvprd schema=edu_highest_nqflevel_spells; run ;
How to access a code module in the Data Lab: Read here
Inputs:
| Dependency | Dependency Type |
|---|---|
[edu_quals_census_2013].[quals_census_2013] |
Code module |
[edu_quals_census_2018].[quals_census_2018] |
Code module |
[edu_quals_census_2023].[quals_census_2023] |
Code module |
[edu_quals_moe_enr_prereq].[quals_moe_enr_prereq] |
Code module |
[edu_quals_moe_industry_train].[quals_moe_industry_train] |
Code module |
[edu_quals_moe_school_leavers].[quals_moe_school_leavers] |
Code module |
[edu_quals_moe_secondary].[quals_moe_secondary] |
Code module |
[edu_quals_moe_tec_prereq].[quals_moe_tec_prereq] |
Code module |
[edu_quals_moe_tertiary].[quals_moe_tertiary] |
Code module |
[edu_quals_moe_ttr_prereq].[quals_moe_ttr_prereq] |
Code module |
[edu_quals_msd].[quals_msd] |
Code module |
[tbd_cen2013_quals].[cen2013_quals] |
Code module |
[tbd_cen2018_quals].[cen2018_quals] |
Code module |
[tbd_cen2023_quals].[cen2023_quals] |
Code module |
[tbd_moe_ito_quals].[moe_ito_quals] |
Code module |
[tbd_moe_schoolleaver_quals].[moe_schoolleaver_quals] |
Code module |
[tbd_moe_secondary_quals].[moe_secondary_quals] |
Code module |
[tbd_moe_tertiary_quals].[moe_tertiary_quals] |
Code module |
[IDI_Clean_202603].[cen_clean].[census_individual_2013] |
Source |
[IDI_Clean_202603].[cen_clean].[census_individual_2018] |
Source |
[IDI_Clean_202603].[cen_clean].[census_individual_2023] |
Source |
[IDI_Clean_202603].[moe_clean].[completion] |
Source |
[IDI_Clean_202603].[moe_clean].[enrolment] |
Source |
[IDI_Clean_202603].[moe_clean].[student_leavers] |
Source |
[IDI_Clean_202603].[moe_clean].[student_qualification] |
Source |
[IDI_Clean_202603].[moe_clean].[targeted_training] |
Source |
[IDI_Clean_202603].[moe_clean].[tec_it_learner] |
Source |
[IDI_Clean_202603].[msd_clean].[msd_education_history] |
Source |
[IDI_Metadata].[clean_read_CLASSIFICATIONS].[CEN_QFIELD2_V3_0] |
Source |
[IDI_Metadata].[clean_read_CLASSIFICATIONS].[CEN_QFIELD] |
Source |
[IDI_Metadata].[clean_read_CLASSIFICATIONS].[CEN_STDHQ06] |
Source |
[IDI_Metadata_202603].[moe_school].[highest_attainment_code] |
Source |
[IDI_Metadata_202603].[moe_school].[qualification_concord] |
Source |
Script: create_highest_nqflevel_spells.sql
Output Table
SQL: [IDI_Community].edu_highest_nqflevel_spells.highest_nqflevel_spells_YYYYMM
SAS: libname cm_hqual ODBC dsn=IDI_community_srvprd schema=edu_highest_nqflevel_spells;
How to access a code module in the Data Lab: Read here
Context
This script creates highest NZQF level spells for individuals as gathered from multiple education data sources. For every person, this dataset gives the time period for which an NZQF level gained was the highest ever achieved by the person until the date it gets superseded by a higher NZQF. The business key of the dataset is one row per snz_uid per max_NQFlevel_sofar.
Key Business Rules
This dataset is derived off a few other qualification spell datasets - refer the “Dependencies” section for these sources, and the business logic employed in the sources. This dataset combines qualification data gathered from the Ministry of Education (MoE), the Ministry of Social Development (MSD), the Census and from Tertiary Education Commision (TEC). The steps in deriving max NZQF level spells are as follows:
- Combine all NZQF data and dates from the source datasets
- For each individual, sort the available data by qualification attained/recorded date - then assign the highest NZQF level as on the qualification attained date for each record. Hence if a lower level NZQF is attained after a higher NZQF level, this lower value gets superseded by the higher value from the previous records.
- From the above dataset, obtain the earliest date for which a new NZQF level supersedes a lower NZQF level
- Construct spells based on this earliest date.
Entity Counting for output checking
In order to output check the results, entity counts will be required. We have added entities to the output dataset to make this easier.
However, there are some key points to note:
- The entities are only available in some of the input datasets, specifically:
- moe_schoolleaver_quals
- moe_tertiary_quals
- moe_ito_quals
- The entities are not available for the following input tables:
- moe_secondary_quals (there is a field for school but its almost always 9999 so we are treating it as blank)
- msd_quals
- cen2013_quals
- cen2018_quals
- cen2023_quals
- moe_quals_tec_prereq
- moe_quals_enr_prereq
- moe_quals_ttr_prereq
- For the inputs where there are no entities, an entity variable has been added but it is left as NULL
- The rules for output checking are that entity counts are required if the input data contains entities and are not required if the
input data doesn’t include entities. - This module includes both types of sources so we’ve worked out the following approach with Stats:
- An entity ■■■■■■■■■■ or more = data NOT suppressed
- An entity ■■■■■■■■■■ = data suppressed
- An entity ■■■■■■■■■■ = data NOT suppressed
- To make this work, when you are counting entities count NULL as 0.
- Where there is a mixture of actual entities and NULL entities, the code will keep the actual entity.
Inputs
- {targetdb}: The SQL database on which the spell datasets are to be created.
- {targetschema}: The project schema under the target database into which the spell datasets are to be created.
- {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
- {idicleanversion}: The IDI Clean version that the spell datasets need to be based on.
Version Log
| Date | Version Comments |
|---|---|
| 15 September 2020 | Initial version (Vinay Benny) |
| 20 March 2024 | Addition of entities to the output datasets |
| 16 April 2024 | Changed the approach to entities for 1_moe_quals_secondary_qualifications |
| 13 March 2025 | Add in census 2023 |
Community of Interest
| Person/Artifact/Agency | Involvement |
|---|---|
| NA |
Code
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar projprefix "{projprefix}"
:setvar idicleanversion "{idicleanversion}"
USE $(targetdb);
GO
/* Put all the data sources together*/
DROP TABLE IF EXISTS #allspells;
SELECT *
INTO #allspells
FROM (
SELECT snz_uid
, data_source
, NULL AS entity
, [qual_attained_date]
, [qual_expiry_date]
, nzqflevel
FROM $(targetdb).$(targetschema).$(projprefix)_moe_secondary_quals
UNION ALL
SELECT snz_uid
, data_source
, provider_code AS entity
, [school_leaving_date]
, [qual_expiry_date]
, equiv_nqf_level
FROM $(targetdb).$(targetschema).$(projprefix)_moe_schoolleaver_quals
UNION ALL
SELECT snz_uid
, data_source
, provider_code AS entity
, [completion_date]
, [qual_expiry_date]
, [nzqflevel]
FROM $(targetdb).$(targetschema).$(projprefix)_moe_tertiary_quals
UNION ALL
SELECT snz_uid
, data_source
, provider_code AS entity
, [qual_attained_date]
, [qual_expiry_date]
, [nzqflevel]
FROM $(targetdb).$(targetschema).$(projprefix)_moe_ito_quals
UNION ALL
SELECT snz_uid
, data_source
, NULL AS entity
, [qual_recorded_date]
, [qual_expiry_date]
, [nzqflevel]
FROM $(targetdb).$(targetschema).$(projprefix)_quals_msd
UNION ALL
SELECT snz_uid
, data_source
, NULL AS entity
, [qual_recorded_date]
, [qual_expiry_date]
, high_qual_nqf
FROM $(targetdb).$(targetschema).$(projprefix)_cen2013_quals
UNION ALL
SELECT snz_uid
, data_source
, NULL AS entity
, [qual_recorded_date]
, [qual_expiry_date]
, high_qual_nqf
FROM $(targetdb).$(targetschema).$(projprefix)_cen2018_quals
UNION ALL
SELECT snz_uid
, data_source
, NULL AS entity
, [qual_recorded_date]
, [qual_expiry_date]
, high_qual_nqf
FROM $(targetdb).$(targetschema).$(projprefix)_cen2023_quals
UNION ALL
SELECT snz_uid
, data_source
, NULL AS entity
, qual_recorded_date
, qual_expiry_date
, nzqflevel
FROM $(targetdb).$(targetschema).$(projprefix)_quals_moe_enr_prereq
UNION ALL
SELECT snz_uid
, data_source
, NULL AS entity
, qual_recorded_date
, qual_expiry_date
, nzqflevel
FROM $(targetdb).$(targetschema).$(projprefix)_quals_moe_tec_prereq
UNION ALL
SELECT snz_uid
, data_source
, NULL AS entity
, qual_recorded_date
, qual_expiry_date
, nzqflevel
FROM $(targetdb).$(targetschema).$(projprefix)_quals_moe_ttr_prereq
) AS a;
/*Turn it into spells and pick the highest one at each point*/
SELECT snz_uid
, max_nqflevel_sofar
, min_qual_attained_date AS nqf_attained_date
, COALESCE(
DATEADD(dd, -1,
LEAD(min_qual_attained_date) OVER (
PARTITION BY snz_uid
ORDER BY max_nqflevel_sofar
)
)
,DATEFROMPARTS(9999, 12, 31)
) AS until_date
INTO #highest_qual
FROM (
SELECT snz_uid
, max_nqflevel_sofar
, MIN(qual_attained_date) AS min_qual_attained_date
FROM (
SELECT a.snz_uid
, a.data_source
, a.qual_attained_date
, a.qual_expiry_date
, a.nzqflevel
, MAX(a.nzqflevel) OVER (
PARTITION BY a.snz_uid
ORDER BY a.qual_attained_date
) AS max_nqflevel_sofar
FROM #allspells a
) x
GROUP BY snz_uid
, max_nqflevel_sofar
) y ;
/*Take the maximum entity value for each nzqflevel*/
DROP TABLE IF EXISTS #entity_unique ;
SELECT snz_uid
, nzqflevel
, MAX(entity) AS entity
INTO #entity_unique
FROM #allspells
GROUP BY snz_uid
, nzqflevel;
/*Put the nzqflevel spells and entities together for the final output.*/
SELECT a.snz_uid
, a.max_nqflevel_sofar
, a.nqf_attained_date
, a.until_date
, b.entity
INTO [$(targetdb)].[$(targetschema)].[$(projprefix)_highest_nqflevel_spells]
FROM #highest_qual AS a
LEFT JOIN #entity_unique AS b
ON a.snz_uid = b.snz_uid
AND a.max_nqflevel_sofar = b.nzqflevel
;