Module Output
SQL: [IDI_Community].emp_assistance_programmes.assistance_programmes_YYYYMM
SAS: libname cmempas ODBC dsn=idi_community_srvprd schema=emp_assistance_programmes;
How to access a code module in the Data Lab : Read here
MSD-funded Employment Assistance
This code creates spells of participation in Employment Assistance (EA) Programmes managed/contracted by the Ministry of Social Development (MSD). The intended business key of this dataset is person, programme name and start date of participation. However, note that this code module does not cover all kinds of employment assistance programmes and services; for instance, the following are some of the exclusions from this code module-
- Non-MSD EA programmes and services- MSD is not the only agency that delivers EA interventions, other agencies include:
- Government agencies: several agencies provide EA programmes and services. These include: Corrections, Education, ACC, MBIE, Health and Education.
- NGOs: a number of non-government organisations provide their own programmes and services to help people back into work. While these may be funded by the government, if the referral and administration is entirely run by the NGO, then the NGO would need to supply this information to the IDI directly for it to be available to IDI researchers. In the IDI, information on non-MSD funded EA interventions include: Corrections (cor_clean.ra_programmes),
- Case management activities- The EA dataset does not include MSD’s in-house case management activities such as meetings with people to help them prepare and plan for moving back into employment.
- Youth Transition Services- One set of EA programmes are called Youth Transition Services (YTS). These are described in a separate code module. YTS are designed to help people transitioning from school who are at risk of not being in education, employment or training (NEET).
- COVID wage subsidy: The COVID wage subsidy participation data has been ad hoc loaded into the IDI (IDI_Adhoc.clean_read_MSD_CWS.msd_cws_*). When this information is included in a regular refresh, these participation spells will come through in the code module.
Key Concepts
What is MSD Employment Assistance?
Employment Assistance (EA) covers employment and training programmes and services designed to help people prepare for, move into and sustain employment and reduce the time they spend on income support. MSD funded EA is primarily targeted to people eligible for income support assistance. In the academic literature, EA interventions are also called Active Labour Market Programmes (ALMP), to distinguish them from income support programmes (also called Passive Labour Market Programmes), e.g.: main benefits.
Classification of EA interventions
In the literature, EA is classified in a number of different ways. Types of EA programmes can include job search programmes, wage subsidies, or literacy courses, just to name a few examples. MSD maintains its own typology of employment assistance, which is based on the underlying intervention logic for the programmes. The typology gets updated with new types of interventions as well as changes in classifications of specific interventions based on more detailed understanding of the interventions’ logics. Because EA programmes can vary in many ways (for instance - they can be targeted at different segments of the population) analysts may find metadata about EA to be useful. MSD have suggested a typology of EA based on the concept of “staircasing”. Staircasing is based on the idea that people move through a sequence of steps to move into sustained employment. The steps are:
- Availability: becoming motivated to find work.
- Preparation: undertaking activities such as training to meet requirements of the labour market
- Acquisition: process of finding and applying for jobs
- Transition: shifting into employment
- Retention: sustaining employment over the long term
- Advancement: job progression through seniority and earnings
These are the programme types associated with each staircasing step-
-
Availability
- Activation measures - cover programmes and case management techniques designed to maintain job search activities for people expected to move into employment (e.g. people receiving Unemployment-related benefits). If people are judged not to be sufficiently engaged in job search then they can have their income support payments reduced or even cancelled.
- Work confidence - Programmes designed at encouraging and motivating people to have the confidence to begin to move into employment.
- Work experience - Provide people with work experience in either a private sector employer or through placements with not-for-profit organisations to help in social or environmental projects.
-
Preparation
- Career advice - standard service provided by public employment services to help job seekers make informed decisions about their current and future employment choices.
- Training - aim to increase the foundational and vocational skills of clients to enable them to compete in the labour market. Programmes can either be directly contracted with training providers, or provide financial assistance to help low income people participate in secondary or tertiary level education.
-
Acquisition
- Job search assistance - Programmes designed to improve the job search skills of participants and to ensure that job seekers, especially short-term job seekers, are active in looking for work.
- Job placement services - In-house or contracted out services to place people into paid employment. For contracted out services are often based on a fee-for-outcome contracting model.
- Hiring wage subsidies - A temporary subsidy to compensate employers who take on disadvantaged job seekers (i.e., they would not have been the hired by the employer in the absence of the subsidy).
- Training for predetermined employment - Programmes that involve matching job seekers to vacancies by providing short-term training to meet the specific needs of an employer.
- Self-employment assistance - Assistance to help people set up their own business. Self-employment assistance can involve a combination of: training, mentoring, free capital to start up a business, and a temporary subsidy to cover living costs until business cash flow is sufficient to support the participant.
- Incentive payments - Programmes that provide payments to people if they take up employment, in particular to take up employment outside their local area.
-
Transition
- Transition to work financial support - Financial assistance to help cover initial costs of moving into employment (e.g. work clothes and equipment) or to cover the period until the person is paid by the employer.
-
Retention
- In-work support (financial) - Financial assistance to help people with disruptions to employment or pay to ensure they can continue in employment and avoid returning to main benefit.
- In-work support (pastoral) - Programmes that contact people once they are in work to see how things are progressing and to help with any issues that might arise.
- Childcare assistance - Financial payments to low income families to help cover the cost of childcare services.
- Incentive payments - Payments to people who remain in employment for set periods (e.g. 3, 6 and 12 months)
-
Advancement
- No programmes of this type have been tried in New Zealand.
This code module includes this typology information as part of the output dataset. For more details regarding typology, see de Boer & Ku, 2021. Regarding individual EA programmes, the MSD EA evidence catalogue contains detailed information on programme descriptions, current statuses, timelines of policy and design changes, cost, participant profiles, evidence on effectiveness and links to relevant research and evaluation reports.
Overlaps with other IDI data
Because of the integration of employment and income support services through the formation of Work and Income in 1998, several EA programmes are delivered through the income support payment system, in particular grants for starting work or to fund tertiary study.
Regarding the EA Source system data
Because EA information exists in more than seven MSD administrative systems, compiling information about EA interventions is not always straightforward. For this reason, extensive data cleaning is undertaken by MSD when building the EA participation datasets that are supplied to the IDI. Despite this data cleaning, there are several issues with how well EA intervention data are recorded. These issues are covered in more detailed in de Boer & Ku, 2021. Below is a short outline of some of the issues to be aware of:
-
Duplicate participation events: information on a participation event may be recorded more than once in different systems. The data cleaning attempts to remove as many of these as possible, but this has to be balanced against removing genuine repeat participation in the same programme.
-
Inconsistent system information: because recording of information about the same participant event occurs on different systems there can be inconsistencies in the data between systems. The general approach to resolving these inconsistencies is to favour the source that is most closely associated with the event itself. For example, if a contract system end date differs from the front-line system recorded end date, we take the front-line system end date.
-
Participation end dates: One difficult area of EA participation is an accurate recording of participation end dates. Either end dates are missing, or they are mis-keyed, giving either implausibly long participation spells or end dates that are earlier than start dates. Therefore, end date is often imputed either based on related record information such as expected end date or known programme duration.
But if these are not available, then end date is based on durations for participants in the same programme where end date has been entered. -
Intervention cost: the estimated cost of EA interventions is based on a cost allocation model designed to assign MSD costs to individual outputs like EA programme participation (de Boer & Ku, 2017). The module includes direct costs such as grants, contract and subsidy payments, as well as staff time and indirect costs. The metadata table IDI_metadata.clean_read_CLASSIFICATIONS.msd_empa_expenditure provides a breakdown of the specific cost components for each intervention by financial year. The dollar values are nominal, (i.e., unadjusted for inflation). The model is updated regularly. Updates include additional expenditure and outputs, but also include updates to the process of allocating costs to various cost components, based on better information or a better understanding of programme logic and financials. Note that whenever there are any changes to the cost-allocation model, these changes are applied to all financial years from 2001/2002 onwards to ensure comparability of results over time.
-
Cost information is less accurate for older interventions: For interventions starting before July 2006, there is limited financial information on the various costs of interventions and where this information is missing has had to be imputed. Similarly, information on staff time spent on administrating or running EA interventions is available from 1 January 2010 onwards. For these cost-components, the costs for participant spells starting before 1 July 2010 are imputed/estimated values.
-
Intervention names change through time: The official names of interventions do change often in response to an expansion or redesign. To maintain relevancy, the intervention names of existing participation events are updated to match the current intervention name. The EA evidence catalogue is a useful resource to find out about the relationship between old and new names.
-
Intervention names are often derived: some intervention names are not actual programme names, but instead correspond to collections of programmes (e.g. “Driver licence programmes”). This often happens where interventions are funded through a generic programme category such as “New initiative” or “Targeted Training”. These broad classifications are not useful for understanding the type of intervention being funded, and for this reason the data cleaning process assigns a more descriptive intervention name based on more detailed information such as contract descriptions.
-
Provider id: MSD sometimes contract out EA interventions or places participants with employers or sponsors. When this happens, the output table includes an anonymised provider id. This value is used to do confidentiality checks for the identification of results that may indirectly reveal information about the provider.
-
Contract and opportunity ids: a number of EA interventions are linked either to a specific contract (contract id) or an (opportunity id), these can be used to group participants who are participating in the same course (i.e., at the same time and location)
Practical Notes
NA
References & Contacts
- de Boer & Ku (2021) Effectiveness of MSD employment assistance, Technical report for 2019/2020 financial year, Ministry of Social Development, Wellington
- de Boer & Ku (2017) Service Delivery Cost Allocation Model for Individual Outputs: 2017 version, Ministry of Social Development, Wellington (MSD file ref:A9317887)
- EA evidence catalogue
Community of Interest
| Domain | Agency | Person |
|---|---|---|
| Lead SME | MSD | Marc de Boer |
| IDI analysts | MSD | Marc de Boer |
| The Treasury | Sarah Crichton | |
| MSD | Bryan Ku | |
| MSD | Jeffery Azzato | |
| Data supply | MSD | Marc de Boer |
| MSD | Bryan Ku | |
| MSD | Sam Mortlock | |
| Policy | MSD | Anne Riley |
| Operational | MSD | Arron Orr |
Module Business Rules
-
The employment assistance spells data resides in the [msd_clean].msd_employment_assistance table. The intended business key of this table should be person ID (defined by snz_swn_nbr), programme name & start date; but due to various quality issues, there are duplicates on this set of columns. Some of the known reasons are stated below-
- The most common cause for duplicates are records being repeated with different Programme Code values. MSD advises that Programme Code is a deprecated column, and can be ignored. Hence a distinct set of rows after excluding this column is used as the base dataset.
- A second cause for duplicates is individuals with the same programme name and participation start date but with different end dates. MSD advises to use the record with the later end date in such cases - this has been implemented as advised.
- A third cause is that individuals with the same programme name, participation start date & end date that get repeated with different provider IDs. It is unknown why this happens, and since this is a very minor problem, the duplicates resulting from this cause have been left as-is without resolution.
-
This employment assistance spells data has a column for participation costs, but the values are sparsely populated. Hence there is a need to impute these costs from general average values for each programme for better coverage of cost information. These are obtained from IDI_metadata.clean_read_CLASSIFICATIONS.msd_empa_expenditure table which provides the average cost per programme per financial year, broken down to specific cost components.
-
The expenditure table lists “total” costs per programme per financial year. MSD notes that “0” total costs are not actually zero-cost programmes; these should be treated as unavailable/unknown costs. The same is true for indirect costs too. However, this is not true of all kinds of costs - for instance, a “0” or NULL grant/subsidy cost means that there are no grants or subsidy costs for the programme; but only if total costs are available & non-zero.
-
The expenditure table also lists multiple line items per programme per financial year for “indirect” costs, which can be identified from the cost component type.
-
Finally, income transfer costs can be identified by cost component “grant” and “subsidy” which indicates payments made to the programme attendee.
-
The spell data is joined with the expenditure data (total costs, indirect costs & transfer costs) on the basis of programme name and programme start date. Wherever total costs are available directly in the spell data, these are used. Else, total costs are imputed from the expenditure table based on the average for a programme, for the financial year in which the programme start date falls.
-
The indirect and transfer costs are not available at a “spell” level, so the average values from expenditure data is used per spell and added into the final dataset.
-
Finally, the typology information regarding the employment assistance programme is also obtained from the relevant metadata table and added in.
Open Issues/Comments
-
The output of this code module will have a small number of duplicates on the Social Welfare number(SWN), Programme Name & Programme Start date combination, due to unresolved quality issues regarding Provider IDs for the programme.
-
Whenever the cost for a programme participation is not directly available from the [msd_clean].msd_employment_assistance base table, it will be a rough estimate obtained from an average value per person per financial year for that programme. Differences can be expected between the two quantities for a programme, even for the same financial year.
-
There are cases where the cost information (whether direct estimates or derived from average costs) are implausibly large; in the order of hundreds of thousands or even millions. These are most likely incorrect estimates derived from the Cost Allocation Model - due to trailing spend for some programmes with very low number of participants. It is recommended that the users identify outliers and remove those before using the cost information. Work is currently underway to add a Std. Deviation column into the expenditure metadata to allow the user to flag incorrect estimates more accurately.
-
There are a few cases where the end date for participation is a future date (sometimes 9999/12/31). These are most likely a placeholder date, and does not mean that the programme has been completed. MSD advises that the participation end dates for employment assistance spells are unreliable as this information is either missing or mis-keyed. Exceptions in some cases are where the programme has a fixed duration (such as Limited Services Volunteer) and the participant has finished the programme. However, we are also unable to tell whether a person has finished a programme from this dataset. Use these with caution.
-
There are cases where the programme duration is implausibly long; this is also quite likely a result of mis-keyed end dates or high end dates where the programme has not been completed yet.
-
Contract ID and Opportunity ID are currently Null, but will be populated at a future date.
-
Note that all costs of employment programmes are attributed to the programme start date. There may be programmes that can run for a long time, and accrue costs through this time period, but the logic in this code module assigns the costs upfront to the programme start date. This is done partly because the programme end dates are unreliable, and it is not known which events are long running versus have high end dates due to data quality reasons.
Parameters
The following parameters should be supplied to this module to run it in the database:
- {targetdb}: The SQL database on which the spell datasets are to be created.
- “{idicleanversion}”: The IDI Clean version that the spell datasets need to be based on.
- {targetschema}: The project schema under the target database into which the spell datasets are to be created.
- {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
Dependencies
{idicleanversion}.[msd_clean].[msd_employment_assistance]
IDI_Metadata.[clean_read_CLASSIFICATIONS].[msd_empa_information]
IDI_Metadata.[clean_read_CLASSIFICATIONS].[msd_empa_expenditure]
Outputs
This module has been deployed within the Data Lab and is available as refresh-timestamped tables under the following database and schema:
[IDI_Community].[MSD_EMP_ASSISTANCE]
Alternatively, users who wish to run the script themselves will find output created at:
{targetdb}.{targetschema}.{projprefix}_msd_emp_assistance
{targetdb}.{targetschema}.{projprefix}_msd_emp_assistance
Variable Descriptions
| Column name | Description |
|---|---|
| data_source | A tag signifying a source dataset description (hard-coded to MSD_EMPLOYMENT_ASSISTANCE") |
| snz_uid | The unique STATSNZ person identifier for the the individual |
| snz_msd_uid | The MSD identifier for an individual, encrypted from SWN number. Invariant across refreshes. |
| snz_swn_nbr | The MSD source system identifier for an individual. This is an encrypted version of SWN number before STATSNZ applies cleaning & deduplication rules to derive snz_msd_uids, and the closest representation of MSD’s own person identifier. |
| participation_sd | The date on which the individual starts participating in the employment assistance programme |
| participation_ed | The date on which the individual end participating in the employment assistance programme. MSD warns that the quality of this variable is poor due to data entry issues. There are also instances of “high” dates (where the programme has not been completed at the time of data capture) or future dates. |
| prog_name | The name of the employment assistance programme that the individual attended. |
| prog_type_lev1 | A typology or classification of the programme |
| prog_type_lev2 | A typology or classification of the programme, at a higher hierachy. Can be nulls. |
| prog_type_lev3 | A typology or classification of the programme, at a higher hierachy. Can be nulls. |
| intvn_total_cost | The cost of the programme, per person per start. Note that this is a lump-sum cost and not a daily rate. If this data is unavailable for the individual, it is imputed from an average cost value per person for that programme & financial year. If this is also unavailable, then it is NULL. The costs are sourced from an estimation model at MSD - and may be different from the actual cost incurred. |
| intvn_cost_source | If the cost was directly available in the data, the value is “DIRECT_ESTIMATE”. Else if it was imputed from average cost per programme, the value is “FY_AVGCOST”. Else “UNKNOWN_COST”. |
| intvn_avg_indirect_cost | This cost is an estimate obtained from the average “indirect” costs per programme per financial year. The “indirect” costs for a programme pertain to administrative & overhead costs for running the programme, averaged to a per-person per-start basis. Can be NULL when there is no information available regarding this - NULLs should not be interpreted as no/zero indirect cost. |
| intvn_avg_transfer_cost | This cost is an estimate obtained from the average “transfer” costs per programme per financial year. The transfer cost refers to payments made to the programme attendee as subsidies or grants. Can be NULL when there is no information available regarding this - NULLs should not be interpreted as no/zero transfer cost. |
| provider_uid | The employment assistance programmes are either conducted by MSD directly, or contracted out to a service provider or employer. This ID gives the identifier for the provider (which can be used for %p suppression rules if required). |
| msd_empa_opportunity_id | ID to identify a single instance or event of a programme (such as a training course) that the individual attended, at a date and location. Unavailable as of now. |
| msd_empa_contract_id | ID for individual contracts with providers of programmes. Not all programmes are contracted out. Unavailable as of now. There may have been multiple contracts with the same provider over time or at the same time (for different programmes). |
Module Version & Change History
| Date | Version Comments |
|---|---|
| 30 June 2022 | Initial version based on specifications from Commissioning document |
| 24 March 2024 | Update to use the reference tables located in IDI_Metadata_YYYYMM |
Code
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar projprefix "{projprefix}"
:setvar idicleanversion "{idicleanversion}"
:setvar idirefresh "{idirefresh}"
/* Assign the target database to which all the components need to be created in. */
USE $(targetdb);
/* Delete the database object if it already exists */
DROP VIEW IF EXISTS $(targetschema).$(projprefix)_msd_emp_assistance;
GO
/* Create the database object */
create view $(targetschema).$(projprefix)_msd_emp_assistance as
/* Clean up the expenditure data, create proper start & end dates for the financial year.*/
with expenditure as (
select
financialyear
,msd_empa_prog_name_text
,component_type,component
,cast(averagecost as numeric(28,4)) as averagecost
,datefromparts(left(financialyear, 4), 7, 1) as fy_start_date
,datefromparts(right(financialyear, 4), 6, 30) as fy_end_date
from IDI_Metadata_$(idirefresh).msd_empa.expenditure23_ref
)
/* Create a temporary dataset with only the total cost per programme per financial year. Any zero costs should be treated as NULL.*/
,totalexpend as (
select
msd_empa_prog_name_text
,case when averagecost = 0.0 then NULL else averagecost end as totalcost
,fy_start_date
,fy_end_date
,financialyear
from expenditure
where trim(component_type) = 'total'
)
/* MSD advises that all programmes have an indirect cost - if the indirect costs are not available in the metadata table, or if it is zero, treat these as unavailable - represented by NULL.
Obtain a dataset with the total of all "Indirect" costs for each programme per financial year.*/
,indirexpend as (
select
financialyear
,msd_empa_prog_name_text
,fy_start_date
,fy_end_date
,case when sum(averagecost) = 0.0 then NULL else sum(averagecost) end as total_indirect_cost
from expenditure
where trim(component_type) = 'Indirect'
group by financialyear, msd_empa_prog_name_text, fy_start_date, fy_end_date
)
/* Wherever total costs are available (not NULL), obtain a dataset with the total of all "Transfer" costs for each programme per financial year. If transfer costs do not exist where a total cost exists,
this means there are no transfer costs - treat it as 0.0.*/
,transferexpend as (
select
tot.msd_empa_prog_name_text
,tot.financialyear
,tot.fy_start_date
,tot.fy_end_date
,coalesce(trfr.total_transfer_cost, 0.0) as total_transfer_cost
from totalexpend tot
left join
(
select
financialyear
,msd_empa_prog_name_text
,coalesce(sum(averagecost), 0.0) as total_transfer_cost
from expenditure expnd
where trim(component_type) in ('Grant', 'Subsidy')
group by financialyear, msd_empa_prog_name_text
) trfr on (tot.msd_empa_prog_name_text = trfr.msd_empa_prog_name_text and tot.financialyear = trfr.financialyear)
where tot.totalcost is not null
)
/* De-duplicate the employment assistance spells after excluding the programme code and the unique number. If duplicates still exist, take the spell with the longer duration.
If there are multiple spells for the same person for the same start date for the same programme with the same duration - keep those duplicates anyway since there is no good way
to circumvent it.*/
,empast as (
select distinct
snz_uid
,snz_msd_uid
,msd_empa_participation_start_date
,msd_empa_participation_end_date
,msd_empa_prog_name_text
,msd_empa_assistance_amt
,msd_empa_provider_uid
,snz_swn_nbr
,msd_empa_opportunity_id
,msd_empa_contract_id
,rank() over (partition by snz_swn_nbr, msd_empa_prog_name_text, msd_empa_participation_start_date order by msd_empa_participation_end_date desc) as rnk
from $(idicleanversion).msd_clean.msd_employment_assistance
)
/* The full employment assistance spells data excluding the outlier indicator.*/
select
'MSD_EMPLOYMENT_ASSISTANCE' as data_source
,empast.snz_uid
,empast.snz_msd_uid
,empast.snz_swn_nbr
,empast.msd_empa_participation_start_date as participation_sd
,empast.msd_empa_participation_end_date as participation_ed
,empast.msd_empa_prog_name_text as prog_name
,info.catagory_level1 as prog_type_lev1
,case when trim(info.catagory_level2) = '' then NULL else info.catagory_level2 end as prog_type_lev2
,case when trim(info.catagory_level3) = '' then NULL else info.catagory_level3 end as prog_type_lev3
,coalesce(empast.msd_empa_assistance_amt, expndtot.totalcost) as intvn_total_cost
,case
when empast.msd_empa_assistance_amt is null and expndtot.totalcost is not null then 'FY_AVGCOST'
when empast.msd_empa_assistance_amt is null and expndtot.totalcost is null then 'UNKNOWN_COST'
else 'DIRECT_ESTIMATE'
end as intvn_total_cost_source
,expndindr.total_indirect_cost as intvn_avg_indirect_cost
,expndtrfr.total_transfer_cost as intvn_avg_transfer_cost
,empast.msd_empa_provider_uid as provider_uid
,empast.msd_empa_opportunity_id as opportunity_id
,empast.msd_empa_contract_id as contract_id
from (select * from empast where rnk = 1) empast
left join IDI_Metadata_$(idirefresh).msd_empa.information23_concord info
on (empast.msd_empa_prog_name_text = info.msd_emp_prog_name_text)
left join totalexpend expndtot
on (empast.msd_empa_prog_name_text = expndtot.msd_empa_prog_name_text
and empast.msd_empa_participation_start_date between expndtot.fy_start_date and expndtot.fy_end_date)
left join indirexpend expndindr
on (empast.msd_empa_prog_name_text = expndindr.msd_empa_prog_name_text
and empast.msd_empa_participation_start_date between expndindr.fy_start_date and expndindr.fy_end_date)
left join transferexpend expndtrfr
on (empast.msd_empa_prog_name_text = expndtrfr.msd_empa_prog_name_text
and empast.msd_empa_participation_start_date between expndtrfr.fy_start_date and expndtrfr.fy_end_date);
GO