Module output
SQL::[IDI_Community].[inc_self_employment]. self_employment_YYYYMM
SAS: libname cmti ODBC dsn=idi_community_srvprd schema=inc_self_employment; proc print data = cmti.self_employment_YYYYMM
How to access a code module in the Data Lab : Read here
Purpose of Self employment (SE) module
The purpose of this module is to construct the total self-employment (SE) income for people in New Zealand.
Key concepts
Self-employment income
Self-employment income is the net profit or loss received from all current and previous self-employment jobs held over the reference period, including drawings (cash or goods the respondent takes out of the business instead of receiving a ’wage’). Self-employment income covers schedular payments, company director income, partnership income, sole trader income (excluding rental income) and shareholder-employee salary without PAYE deducted earned by business owners.
Total income spell
An income spell is the minimum payment period per income type. SE income in practice may be paid in a daily, weekly, fortnightly, monthly spell etc.
Source frequency
Source frequency is the frequency with which an income type is stored in the data. Source frequency for SE income is monthly when it is paid as schedular payments and identified in EMS and annual when it is filed in IR3 returns.
Gross income amount
For the purpose of this module, gross income is Self -employment income and income before tax for schedular payments.
Observed tax
Tax recorded as paid to IRD as part of the income filing process. Observed tax data for SE income is spread across multiple IR tables in IDI, including EMS and IR3. This is the amount of tax that has been has been deducted from source (eg: PAYE and tax filing).
Note: observed tax deduction is not the same as actual tax liability since this can only be calculated once all income sources have been identified over a tax period. This information has not been included in the initial version of the module.
Granularity
The modules are setup to be event format: One row per event, events have a person, start and end dates, and descriptive information. The variables below will uniquely identify each event in the above table:
Entity id: snz uid
Period: start end, end date
Event information: income source, taxable income indicator
Data sources
IR3: ird_rtns_keypoints_ir3, ir_ir3_2000_to_2014, ir_ir3_2013_to_2020
IR3 is the income tax return for individuals used to confirm the amount of personal income tax to be paid at the end of each tax year with data available from March 2000 to present. The main purpose of IR3 is to include any monetary payments that have not been taxed at source. IR3 is used for self-employment (filed annually by sole traders) and includes non-zero partnership, or shareholder salary income, as well as rental income. IR3 is mandatory for individuals who earned more than $200 (before tax) income other than salary, wages, interest, dividends, and/or taxable Māori authority distributions. IR3 is filed annually.
There may be a substantial lag in reporting IR3 data as individual businesses and those using tax agents have up to a year after the end of the financial year to file an annual tax return with IR.
IR3 data comes from a different table to the EIE data but is still direct information submitted by a customer
Note: 3 IR2 tables (2 ad-hoc tables and 1 clean) are required for creating the timeseries of IR3 data. IR3s belonging to people who reported no self-employment, partnership or shareholder income were excluded for tax years prior to 2019. They are in the ad-hoc tables.
IR4: ird_attachments_ir4s
IR4 is filed by companies and includes remuneration income paid to shareholders, directors, and relatives of shareholders with data available from March 1995 to present. This table holds information on non-zero shareholder income and shows non-PAYE salary payments made to working proprietors of businesses. IR4 and IR4s is required from all active NZ resident companies. IR4S is filed annually.
IR4 data comes from a different table to the EIE data but is still direct information submitted by a customer
IR20 (ird_attachments_ir20
IR7 (formerly IR20) is filed for partnership and look-through companies with data available from March 1995 to present. It is filed annually. This table contains information about zero and non-zero partnership income.
Employer monthly schedule: ir_clean.ird_ems
Employer monthly schedule (EMS) table contains employee level data from the EMS return. Regular earnings of some self-employed persons can be reported in the EMS as wages or salaries. The earnings of independent contractors in the EMS are known as withholding or schedular payments.
EMS is available in the LBD/IDI. This is a monthly tax return filed by all businesses with paid employees, which summarises the monthly wage and salary payments made to each of their employees, and the ‘pay-as-you-earn’ (PAYE) income tax deductions made. EMS is a mandatory reporting requirement for all employing firms.
Since April 2019, Inland Revenue has shifted from monthly filings under the EMS to compulsory payday filing under the Employment Information - Employee (EI-E) system. This follows a transition period since April 2018 in which firms could select whether to file monthly or through the payday filing system. For consistency, Stats NZ has made a processed version of the EI-E data available, allocating payday information to calendar months and including these records as part of the EMS tables.
Note:
- Employers are able to file late returns and/or amend EMS returns relating to prior periods.
- The return period associated with an EMS is not always the same as the month in which a person was employed because it records the month in which they were paid.
- Each record in the EMS corresponds to a job (an employer-employee relationship) and includes the employee’s tax code and employment start or end dates if they are in the month in which they were paid.
- The EMS form is also used by government agencies to report government transfer payments to individuals that are taxed at source. They include income-tested benefits, New Zealand superannuation, student allowances, paid parental leave, and accident compensation payments.
- No ‘cleaning’ is done before the data leaves IR in the EIE file. The data is as it is submitted by employers and/or drawn directly from IRs admin system. If errors in the information provided by employers is identified (either by IR or the employer), then the underlying record will be updated, The next file will show this change. All data points in the files supplied by IR are either drawn directly from a return/other ‘file’ a customer has submitted, or it is system generated. From the EIE file for example, information about the type of return, processing date, return version, customer type/subtype etc is system generated. Information about pay periods, employment periods, earnings, deductions, and contributions come directly from the return.
ird_customers
The IR customers file records all Inland Revenue’s entities, their description, status (active, deceased, etc), commencement/cessation, student loan/tax residency status and business industry code where relevant.
A full file is extracted each month containing all new records and changes to existing records.
The information in this file is not derived from an IR form. It is a combination of different fields that are updated when this specific customer information changes. IR can be notified by post, in person, by phone, email, myIR, secure mail.
The population is all entities that engage with IR. At the highest level it can be:
- a person
- a company
- an incorporated or unincorporated society or club
- a joint venture or partnership
- a trustee or a trust or estate
- a public or local authority.
ird_tax_registrations:
The IR tax registration file records the status of an entities tax type registration with IR. It contains start/end dates, tax return filing frequencies, if the entity has a tax agent, if they are a corporate entity and if they file electronically.
The population is all entities that engage with IR. At the highest level it can be:
- a person
- a company
- an incorporated or unincorporated society or club
- a joint venture or partnership
- a trustee or a trust or estate
- a public or local authority.
ird_cross_reference:
The IR cross reference file links two IRD numbers, documents the period they are linked and describes the relationship between them.
The information in this file is not derived from an IR form. It is a combination of different fields that are updated when relationships between entities change. A change can be submitted by post, in person, by phone, email, myIR, secure mail.
The population is individual to business and business to business relationships between IR entities
Practical notes
Period
Self employment income information is from 1999 onwards.
Identification of Earnings not liable to ACC levy and bonuses
All employees must pay an ACC earners’ levy to cover the cost of non work-related injuries. IR collect this on behalf of the ACC. The earners’ levy has been built into the PAYE tables and is deducted along with the tax. Almost all earnings subject to PAYE are liable for the levy with the exception of schedular payments, retirement payments, redundancy payments, benefits from an employee share scheme, jury fees, witness fees, and taxable pensions. See IR employers guide for more information. Inclusion of earnings not liable for ACC levy is not essential for this module.
Additionally, we will not identify bonus in this module as that is not easy to identify in IDI due to quality of the lumpsum payment flag, as not all employers update it consistently.
Other exclusions from this module
We are not able to identify and quantify income support payment off-sets, debt repayments and re-directions.
Negative amounts
No exclusions made from EMS for negative amounts.
Delay in filing returns by contractors
There is a time lag to pay ACC levy, income tax, and/or Student loan after receiving their income. Those unable to pay the full amount may choose a monthly repayment schedule for a fixed period. The timing of repayments will affect the net income amount. In this module, we only report on gross income and do not capture other deductions to arrive at net income.
Other time lags can be due to individuals delayed filing of income, source agency delay in providing information to Stats NZ, or Stats NZ delay in adding information received from source agency into IDI. eg: Filing happens up to 12 months following tax year and then they have to wait for the IDI refresh. This may explain some dips in an income time series towards the end of the tax year. Due to this, historic information is more likely to be complete. Data from 12 - 18 months in the past it is likely to not have complete income deductions and tax information.
Seasonal workers identification
Not applicable total income module. Refer to employment spells module for more information
Staged approach to include tax within the total income module
Stage 1 of module creation will only include gross income amounts. Stage 2 (pending) will include observed tax deducted at source since they are available in EMS and MSD tables. This can be used to calculate the tax at a tax year level which includes tax deducted at source and from tax filings. Stage 3 (pending) aims to derive tax at a monthly level. This is complex and will require support from SME to apply rules to derive tax. This can be used to calculate tax for any time period.
Sources of Validation
No validation dataset available, hence unable to validate the module output.
References
- IDI Derived tables
- Document : experimental-income-estimates-from-linked-admin-data-methods-and-results-revised.pdf
- Timelines at the end of the tax year
- Individual income tax return - IR3
- ir3-2024.pdf
- Extension of time arrangements
- ir4-2024.pdf
- ir4s-2024.pdf
- Income tax for partnerships
- File a Partnerships and look-through companies income tax return – IR7
- ir7-2024.pdf
- File a Partnership income loss attribution – IR7P
- ir7p-2024.pdf
- Tax types - Choosing the right account type
- ACC levies set for 2023, 2024 and 2025
- Paying your tax
Key Contacts
| Role | Organisation | Person |
|---|---|---|
| Dev lead | MBIE | Eric Krassoi Peach |
| Module Coder | Nicholson Consulting | Linda Martis |
| Code Supply | Stats NZ MSD | Ivan Welsh Marc de Boer |
| Peer review (code) | Stats NZ | Rodney Jer |
| Peer review (documentation) | IRD IRD | Joanne Butterfield Scott Anderson |
| Module Steward | Stats NZ | Cori Qian |
Module business rules
Defining spell start and end dates
Start and end dates are based on EMS, IR3, IR4s, IR7 dates. The return period column in the tables is considered to be the end date and the start of that period (monthly for EMS and annual tax year for other tables) will be the start date.
Person identifier snz_uid
A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh.
Minimum income threshold
Currently no minimum threshold applied.
Interoperability with other modules
This module is in the business landscape so could be that it is most closely associated with business profile modules and employment spells. It is also in the socioeconomic landscape where income is a determinant of positive social outcomes, so associated with social housing, education modules, health modules, driving modules, employment assistance modules etc.
Duplication in data
APC code removes duplication in WHP appearing in IR3 data and WHP, S02, P02, C02 appearing in EMS data. Individuals can have multiple income types (eg: sole trader and partner receiving PAYE deducted income) in one financial year. These are not treated as duplicates as they have different amounts.
Taxable income indicator
Taxable income indicator to be hard coded in modules. Within self employment module, all income is taxable.
Code logic:
The code is based on the rules used for APC and derived data tables in the IDI.
Steps 1 to 3 are mainly collating data from various IDI source tables. Rest of the steps are identifying wages and salary and excluding self-employment paye deducted income.
-
We use ir_ems table to obtain wages and salary data with the intention of identifying and relabelling self-employed income i.e, individuals earning PAYE deducted or withholding tax deducted income.
1.1 Individuals will be relabelled as ‘C01’ - Company director/shareholder receiving PAYE deducted income, ‘P01’ - Partner receiving PAYE deducted income, ‘S01’ - Sole Trader receiving PAYE deducted income, ‘C02’ - Company director/shareholder receiving WHT deducted income, ‘P02’ Partner receiving withholding tax deducted income and ‘S02’ Sole Trader receiving withholding tax deducted income.
1.2 We extract all monthly data from ir_ems that have ir_ems_income_source_code ‘W&S’ and ‘WHP’, which stands for wages and salary and withholding payment, and where return period between 2000 and 2024 (latest year). Period starts at 2000 as first records of the EMS table begin in April 1999.
1.3 Withholding type code will be P (paye deductions) or W (withholding payments). Where an individual has both codes for a given period, we assign the code P.
-
Identify self-employed individuals from 3 IR3 tables. This is for the purpose of relabelling individuals who appear in EMS data, and also to identify other self-employed individuals. To do so, we will
2.1 Extract data from IR3 tables, specifically ir_clean.ird_rtns_keypoints_ir3, ir_ir3_2000_to_2014 and ir_ir3_2013_to_2020 from IDI_Adhoc. Table prioritisation will occur with ir_clean.ird_rtns_keypoints_ir3 being the first source of data, followed by adhoc 2000_to_2014 and adhoc 2013_to_2020.
2.2 We will use data from partnership income and net profit column.
2.3 Remove duplicates in tables by using maximum of location number, return_version_number and ird_timestamp_date.
-
Identify self-employed individuals from ird cross reference, ir20 and ir4s tables, particularly partners and company directors, in order to relabel them in ir_ems data. To do so, we will
3.1 Extract data from ird_cross_reference where ir_xrf_reference_type_code (code for shareholder/partner etc) is PTR (partner), DIR, SHR or EOH (company director), and customer entity type code from ird_customers is ‘I’ (implying payee is an individual, not an organisation). This identified the type of employee-employer relationship.
3.2 Extract data from ird_attachments_ir20 for those individuals whose customer entity type code from ird_customers is ‘I’ (implying payee is an individual, not an organisation). (These individuals are partners). This implies tax form has been filed for payee-payer relationship for period in question and evidence of ‘Company director/shareholder’ relationship exists.
3.3 Extract data from ird_attachments_ir4s for those individuals whose customer entity type code from ird_customers is ‘I’ (implying payee is an individual, not an organisation). (These individuals are company directors). This implies tax form has been filed for payee-payer relationship for period in question and evidence of ‘partner’ relationship exists.
-
Identify sole traders earning PAYE deducted income from ir_ems data (S01-EMS) where all the below conditions are met
4.1 income source code is W&S and withholding type code is ‘P’ in ir_ems data (implying PAYE deductions), and4.2 snz_ird_uid is equal to snz_employer_ird_uid (implying payee equal to payer), and
4.3 customer entity type code from ird_customers is ‘I’ (implying payee is an individual, not an organisation), and
4.4 Either
4.4.1 Tax registration type code is GST in ird_tax_registration table, within the time peroid of interest, or
4.4.2 Net profit exists within IR3 data extracted above in Step 2 within the time period of interest -
Identify company directors and partners earning PAYE deducted income from ir_ems (P01-EMS, C01-EMS) data where all the below conditions are met
5.1 Individuals identified in step 3 exist in the ir_ems data (matched using unique identifiers and unique employer identifiers within the time period of interest), and
5.2 Payer not equal to payee, i.e., the individuals do not have the same unique identifiers (snz_uid) and unique employer identifiers (snz_employer_uid), and
5.3 Withholding type code is ‘P’ in ir_ems data (implying PAYE deductions) and income source code is W&S in ir_ems_data (implying wages and salary)
-
Identify company directors, partners and sole traders earning withholding payments from ir_ems data (S02-EMS, P02-EMS, C02-EMS) where all the below conditions are met
6.1 Individuals identified in step 3 exist in the ir_ems data (identified using unique identifiers and unique employer identifiers within the time period of interest, and
6.2 Withholding type code is ‘W’ in ir_ems data (implying withholding tax) and income source code is ‘WHP’ in ir_ems_data (implying withholding payments), and
6.3 customer entity type code from ird_customers is ‘I’ (implying payee is an individual, not an organisation).
6.4 Those not labelled as company directors and partners (because they do not exist in Step 3), but are identified as individuals in ird_customers will be labelled as sole traders. The rest will continue to be labelled as WHP-EMS based on their EMS data in Step 1.
-
Identification of self-employed individuals from ir_ems data is complete. These individuals are self-employed who earn PAYE or WHT deducted income.
-
Identify sole trader from IR3 data (S00-IR3) using IR3 table created in Step 2. Their income comes from the net profit column, given it is not equal 0.
-
Identify company director/shareholder using IR4s data from ird_attachments_ir4s (C00-IR4). Their income comes from the total shareholder salary column, given it is not equal 0. We exclude any income imputations since that is not reliable. Also criteria for customer entity type code from ird_customers is ‘I’ must also be met, so that we identify individuals and not organisations.
-
Identify partner using IR20 data from ird_attachments_ir20 (P00-IR7). Their income comes from the total share of income column, given it is not equal 0. Also, criteria for customer entity type code from ird_customers is ‘I’ must also be met, so that we identify individuals and not organisations. Additionally, they should exist in IR3 table created in Step 2 and their partnership income should not be 0.
-
Identify self-employed income with withholding tax deductions identified in IR3 table (WHP-IR3) created in Step 2. Their income comes from withholding payments amount column, given it is not equal to 0.
-
Aggregate all the income identified from various tables (highlighted in yellow). These will be aggregated in tax year format. Their incomes will be the sum of their monthly (from ems only) and annual incomes.
-
There may be duplication in data for withholding payments identified through EMS and IR3 datasets. Where the unique identifier and time period is the same for an individual earning withholding payments, delete their EMS data (WHP-EMS, S02-EMS, P02-EMS, C02-EMS) so that only one copy of their income (WHP-IR3) exists.
-
To summarise, all self-employment income have now been identified and tagged as ‘C00-IR4’, ‘C01-EMS’, ‘C02-EMS’, ‘P00-IR7’, ‘P01-EMS’, ‘P02-EMS’, ‘S00-IR3’, ‘S01-EMS’, ‘S02-EMS’, ‘WHP-EMS’, ‘WHP-IR3’. We exclude all remaining data from EMS with income source W&S.
Parameters
The following parameters should be supplied to this module to run it in the database:
- {targetdb}: The SQL database on which the spell datasets are to be created.
- {idicleanversion}: The IDI Clean version that the spell datasets need to be based on.
- {targetschema}: The project schema under the target database into which the spell datasets are to be created.
- {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
Dependencies
[IDI_Clean_YYYYMM].[ir_clean].[ird_ems]
[IDI_Clean_YYYYMM].[ir_clean].[ird_rtns_keypoints_ir3]
[IDI_Adhoc].[clean_read_IR].[ir_ir3_2000_to_2014]
[IDI_Adhoc].[clean_read_IR].[ir_ir3_2013_to_2020]
[IDI_Clean_YYYYMM].[ir_clean].[ird_attachments_ir4s]
[IDI_Clean_YYYYMM].[ir_clean].[ird_attachments_ir20]
[IDI_Clean_YYYYMM].[ir_clean].[ird_customers]
[IDI_Clean_YYYYMM].[ir_clean].[ird_tax_registrations]
[IDI_Clean_YYYYMM].[ir_clean].[ird_cross_reference]
Outputs
{targetdb}.{targetschema}.{projprefix}_self_employment_{refresh}
Variable Descriptions
| Aspect | Variables | Description |
|---|---|---|
| Entity | snz_uid | Current refresh snz_uid |
| Period | period_start_date period_end_date | Payment start date Payment end date |
| Event | source_frequency | Frequency of income recorded in source data Note that self employment is recorded as both monthly and annual since it is sourced from multiple tables with different source frequencies |
| Information | income_source | Type of income being derived in this module |
| Note that: - C00-IR4 is Company director/shareholder income from IR4S table - C01-EMS is Company director/shareholder PAYE deducted income from EMS table - C02-EMS is Company director/shareholder WHT deducted income from EMS table - P00-IR7 is Partnership income from IR20/IR7 table - P01-EMS is Partner PAYE deducted income from EMS table - P02-EMS is Partner WHT deducted income from EMS table - S00-IR3 is Sole trader income from IR3 tables - S01-EMS is Sole Trader PAYE deducted income from EMS table - S02-EMS is Sole Trader WHT deducted income from EMS table - WHP-EMS is Withholding payments from EMS table - WHP-IR3 is Withholding payments from IR3 tables | ||
| taxable_income_indicator | Indicator for whether income is taxable Note that: 1:income is taxable 0:income is not taxable Self employment income is taxable | |
| gross_income | Income before tax |
Module Version & Change History
| Date | Version | Comments |
|---|---|---|
| 2023-07 | Original code sourced from APC code created by Stats NZ (Ivan Welsh) in the IDI | |
| 2024-06 | Initial | streamlining and preparation for code module by Linda Martis, Nicholson Consulting |
Code
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar projprefix "{projprefix}"
:setvar idicleanversion "{idicleanversion}"
/* Assign the target database to which all the components need to be created in. */
use $(targetdb);
/*
This script combines a number of different source of IR3 data into a single table.
Sources:IR3 table, adhoc 2000_to_2014, adhoc 2013_to_2020
Duplicates are removed from the data by filtering for the maximum return_version_nbr,
ird_timestamp_date, source_code, snz_unique_nbr and finally location_nbr
Create the storage table. This follows the format of the refresh IR3 table,
but without the leading ir_ir3_ prefix.
*/
DROP TABLE IF EXISTS #CombinedIR3;
CREATE TABLE #CombinedIR3 (
snz_uid INT NOT NULL
,snz_ird_uid INT NOT NULL
,location_nbr INT NOT NULL
,return_period_date DATE NOT NULL
,snz_unique_nbr BIGINT NOT NULL
,return_version_nbr SMALLINT NULL
,tot_pship_income_amt DECIMAL(13,2) NULL
,net_profit_amt DECIMAL(13,2) NULL
,income_imp_ind VARCHAR(1) NOT NULL
,tot_wholding_paymnts_amt DECIMAL(13,2) NULL
,sum_all_income_amt DECIMAL(13,2) NULL
,ird_timestamp_date DATE NULL
/* 1 for refresh IR3 table, 2 for adhoc 2000_to_2014, 3 for adhoc 2013_to_2020 */
,source_table_code TINYINT NOT NULL
,PRIMARY KEY (snz_ird_uid, return_period_date)
);
/* Combine the source tables into one */
INSERT INTO #CombinedIR3
SELECT con.snz_uid, con.snz_ird_uid, ir_ir3_location_nbr, ir_ir3_return_period_date, ir_ir3_snz_unique_nbr, ir_ir3_return_version_nbr,
ir_ir3_tot_pship_income_amt,
ir_ir3_net_profit_amt,
ir_ir3_income_imp_ind,
ir_ir3_tot_wholding_paymnts_amt,
isnull(ir_ir3_tot_pship_income_amt, 0) +
isnull(ir_ir3_net_profit_amt,0)+
isnull(ir_ir3_tot_wholding_paymnts_amt,0)
AS sum_all_income_amt,
ir_ir3_ird_timestamp_date, source_table_code
FROM(
SELECT *, ROW_NUMBER() OVER (PARTITION BY snz_ird_uid, ir_ir3_return_period_date ORDER BY ir_ir3_return_version_nbr desc, ir_ir3_ird_timestamp_date desc, source_table_code asc, ir_ir3_snz_unique_nbr desc, ir_ir3_location_nbr desc) AS row_nbr
FROM (
SELECT snz_ird_uid, ir_ir3_location_nbr, ir_ir3_return_period_date, ir_ir3_snz_unique_nbr, ir_ir3_return_version_nbr,
ir_ir3_tot_pship_income_amt,
ir_ir3_net_profit_amt,
ir_ir3_income_imp_ind,
ir_ir3_tot_wholding_paymnts_amt,
ir_ir3_ird_timestamp_date, 1 AS source_table_code
FROM $(idicleanversion).ir_clean.ird_rtns_keypoints_ir3
UNION ALL
SELECT snz_ird_uid, ir_ir3_location_nbr, ir_ir3_return_period_date, ir_ir3_snz_unique_nbr, ir_ir3_return_version_nbr,
ir_ir3_tot_pship_income_amt,
ir_ir3_net_profit_amt,
ir_ir3_income_imp_ind,
ir_ir3_tot_wholding_paymnts_amt,
ir_ir3_ird_timestamp_date, 2 AS source_table_code
FROM IDI_Adhoc.clean_read_IR.ir_ir3_2000_to_2014
UNION ALL
SELECT snz_ird_uid, location_number AS ir_ir3_location_nbr, period AS ir_ir3_return_period_date,
ROW_NUMBER() OVER (PARTITION BY snz_ird_uid, period ORDER BY return_version, partnership_income desc, selfemployed_income desc, withholding_payment desc) AS ir_ir3_snz_unique_nbr,
return_version AS ir_ir3_return_version_nbr,
partnership_income AS ir_ir3_tot_pship_income_amt,
selfemployed_income AS ir_ir3_net_profit_amt,
'N' AS ir_ir3_income_imp_ind,
withholding_payment AS ir_ir3_tot_wholding_paymnts_amt,
CAST(timestamp AS DATE) AS ir_ir3_ird_timestamp_date, 3 AS source_table_code
FROM IDI_Adhoc.clean_read_IR.ir_ir3_2013_to_2020
) tu
) t5
INNER JOIN $(idicleanversion).security.concordance con
ON con.snz_ird_uid = t5.snz_ird_uid
WHERE row_nbr = 1
AND YEAR(ir_ir3_return_period_date) BETWEEN 2000 AND year(GETDATE());
RAISERROR('Inserted %d rows from IR3 into #CombinedIR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
This script brings EMS data for specific income sources into one table.
Data is stored in calendar year format mth_01_amt = January and so on
*/
/* Create the equivalent of the data.income_cal_yr table */
DROP TABLE IF EXISTS #IncomeCalYr;
CREATE TABLE #IncomeCalYr (
year_nbr SMALLINT NOT NULL
,snz_uid INT NOT NULL
,snz_ird_uid INT NOT NULL
,snz_employer_ird_uid INT NOT NULL
,income_source_code CHAR(7) NOT NULL
,withholding_type_code CHAR(1) NOT NULL
,mth_01_amt DECIMAL(19,2) NOT NULL
,mth_02_amt DECIMAL(19,2) NOT NULL
,mth_03_amt DECIMAL(19,2) NOT NULL
,mth_04_amt DECIMAL(19,2) NOT NULL
,mth_05_amt DECIMAL(19,2) NOT NULL
,mth_06_amt DECIMAL(19,2) NOT NULL
,mth_07_amt DECIMAL(19,2) NOT NULL
,mth_08_amt DECIMAL(19,2) NOT NULL
,mth_09_amt DECIMAL(19,2) NOT NULL
,mth_10_amt DECIMAL(19,2) NOT NULL
,mth_11_amt DECIMAL(19,2) NOT NULL
,mth_12_amt DECIMAL(19,2) NOT NULL
,tot_yr_amt DECIMAL(19,2) NOT NULL
,PRIMARY KEY (snz_ird_uid, year_nbr, snz_employer_ird_uid, income_source_code, withholding_type_code)
);
/*
STEP 1: Populate the income cal with EMS data
*/
INSERT INTO #IncomeCalYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code, withholding_type_code,
mth_01_amt, mth_02_amt, mth_03_amt, mth_04_amt, mth_05_amt, mth_06_amt,
mth_07_amt, mth_08_amt, mth_09_amt, mth_10_amt, mth_11_amt, mth_12_amt, tot_yr_amt
)
SELECT year_nbr
,snz_uid
,snz_ird_uid
,snz_employer_ird_uid
,income_source_code
,withholding_type_code
,ISNULL(m1, 0) AS mth_01_amt
,ISNULL(m2, 0) AS mth_02_amt
,ISNULL(m3, 0) AS mth_03_amt
,ISNULL(m4, 0) AS mth_04_amt
,ISNULL(m5, 0) AS mth_05_amt
,ISNULL(m6, 0) AS mth_06_amt
,ISNULL(m7, 0) AS mth_07_amt
,ISNULL(m8, 0) AS mth_08_amt
,ISNULL(m9, 0) AS mth_09_amt
,ISNULL(m10, 0) AS mth_10_amt
,ISNULL(m11, 0) AS mth_11_amt
,ISNULL(m12, 0) AS mth_12_amt
,ISNULL(m1, 0) + ISNULL(m2, 0) + ISNULL(m3, 0) + ISNULL(m4, 0) + ISNULL(m5, 0) + ISNULL(m6, 0) +
ISNULL(m7, 0) + ISNULL(m8, 0) + ISNULL(m9, 0) + ISNULL(m10, 0) + ISNULL(m11, 0) + ISNULL(m12, 0) AS tot_yr_amt
FROM (SELECT YEAR(ir_ems_return_period_date) AS year_nbr
,CONCAT('m', MONTH(ir_ems_return_period_date)) AS month_label
,MAX(snz_uid) AS snz_uid
,MAX(snz_ird_uid) AS snz_ird_uid
,snz_employer_ird_uid
,ir_ems_income_source_code + '-EMS' AS income_source_code
,MIN(ir_ems_withholding_type_code) AS withholding_type_code
,SUM(ir_ems_gross_earnings_amt) as amt
FROM $(idicleanversion).ir_clean.ird_ems
WHERE YEAR(ir_ems_return_period_date) BETWEEN 2000 - 1 AND year(GETDATE())
AND ■■■■■■■■■■■
AND ir_ems_income_source_code in ('W&S','WHP')
GROUP BY YEAR(ir_ems_return_period_date)
,MONTH(ir_ems_return_period_date)
,snz_ird_uid
,snz_employer_ird_uid
,ir_ems_income_source_code
) AS month_groups
PIVOT (SUM(amt) FOR month_label IN (m1, m2, m3, m4, m5, m6, m7, m8, m9, m10, m11, m12)) AS pivoted;
RAISERROR('Inserted %d rows from EMS into #IncomeCalYr', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Update W&S income when it is actually self employed income
This step adds the (S01) - Sole Trader recieving PAYE deducted income source
Rules: (A) W&S income
(B) PAYE
(C) Payee = Payer
(D) Paye is an individual
(E) Evidence of self employed: IR3 net profit <> 0 or GST registered
*/
UPDATE inc WITH (TABLOCK)
SET income_source_code = 'S01-EMS'
FROM #IncomeCalYr inc
WHERE inc.income_source_code = 'W&S-EMS' /* Wages & Salaries */
AND inc.withholding_type_code = 'P' /* PAYE */
AND inc.snz_ird_uid = inc.snz_employer_ird_uid /* payEE = payER */
AND EXISTS ( SELECT 1
FROM $(idicleanversion).ir_clean.ird_customers cus
WHERE cus.snz_ird_uid = inc.snz_ird_uid
AND cus.ir_cus_entity_type_code = 'I') /* payEE is an individual */
AND (EXISTS (SELECT 1
FROM $(idicleanversion).ir_clean.ird_tax_registrations reg /* GST registered */
WHERE reg.snz_ird_uid = inc.snz_ird_uid
AND reg.ir_treg_tax_type_code = 'GST'
AND inc.year_nbr BETWEEN YEAR(reg.ir_treg_treg_start_date)
AND YEAR(reg.ir_treg_treg_end_date)
)
OR EXISTS ( SELECT 1
FROM #CombinedIR3 ir3
WHERE ir3.snz_ird_uid = inc.snz_ird_uid
AND ir3.net_profit_amt <> 0 /* net profit non zero */
AND inc.year_nbr BETWEEN YEAR(ir3.return_period_date) - 1
AND YEAR(ir3.return_period_date)
)
)
RAISERROR('Updated %d rows from W&S to S01', 0, 1, @@ROWCOUNT) WITH NOWAIT;
/*
Build temp table to hold self employed income details from
cross reference, IR20 & IR4
*/
DROP TABLE IF EXISTS #tmp_sei_individual;
CREATE TABLE #tmp_sei_individual (
snz_ird_uid INT NOT NULL,
snz_employer_ird_uid INT NOT NULL,
start_cal_year_nbr INT NOT NULL,
end_cal_year_nbr INT NOT NULL,
sei_type_code CHAR(1) NOT NULL
);
INSERT INTO #tmp_sei_individual (
snz_ird_uid, snz_employer_ird_uid, start_cal_year_nbr, end_cal_year_nbr, sei_type_code
)
/* Relationships in cross ref table */
SELECT cln.ir_xrf_to_snz_ird_uid AS snz_ird_uid,
cln.ir_xrf_from_snz_ird_uid AS snz_employer_ird_uid,
cln.ir_xrf_first_year_nbr AS start_cal_year_nbr,
COALESCE(cln.ir_xrf_latest_year_nbr, YEAR(cln.ir_xrf_applied_date)) AS end_cal_year_nbr,
CASE
WHEN cln.ir_xrf_reference_type_code = 'PTR' THEN 'P'
WHEN cln.ir_xrf_reference_type_code IN ('DIR', 'SHR', 'EOH') THEN 'C'
ELSE 'U'
END
FROM $(idicleanversion).ir_clean.ird_cross_reference cln
WHERE cln.ir_xrf_reference_type_code IN ('PTR', 'DIR', 'SHR', 'EOH')
AND EXISTS ( SELECT 1
FROM $(idicleanversion).ir_clean.ird_customers cus
WHERE cus.snz_ird_uid = cln.ir_xrf_to_snz_ird_uid
AND cus.ir_cus_entity_type_code = 'I'
)
/* Relationships in IR20 table */
UNION
SELECT cln.snz_ird_uid AS snz_ird_uid,
cln.snz_employer_ird_uid AS snz_employer_ird_uid,
YEAR(cln.ir_ir20_return_period_date) - 1 AS start_cal_year_nbr,
YEAR(cln.ir_ir20_return_period_date) AS end_cal_year_nbr,
'P' AS sei_type_code
FROM $(idicleanversion).ir_clean.ird_attachments_ir20 cln
WHERE EXISTS ( SELECT 1
FROM $(idicleanversion).ir_clean.ird_customers cus
WHERE cus.snz_ird_uid = cln.snz_ird_uid
AND cus.ir_cus_entity_type_code = 'I'
)
/* Relationships in IR4 table */
UNION
SELECT cln.snz_ird_uid AS snz_ird_uid,
cln.snz_employer_ird_uid AS ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ AS start_cal_year_nbr,
9999 AS end_cal_year_nbr,
'C' AS sei_type_code
FROM $(idicleanversion).ir_clean.ird_attachments_ir4s cln
WHERE EXISTS ( SELECT 1
FROM $(idicleanversion).ir_clean.ird_customers cus
WHERE cus.snz_ird_uid = cln.snz_ird_uid
AND cus.ir_cus_entity_type_code = 'I'
);
RAISERROR('Inserted %d rows for cross referencing', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Update W&S income when it is actually self employed income
This step adds the (C01) - Company director/shareholder recieving PAYE deducted income
and (P01) - Partner recieving PAYE deducted income
Rules: (A) W&S income
(B) PAYE
(C) Payee <> Payer
(D) Payee is an individual
(E) Evidence of self employed: IR20 or IR4 or cross ref
(F) Favour C02 over P02
*/
UPDATE inc WITH (TABLOCk)
SET income_source_code = CASE WHEN sei.sei_type_code = 'C' THEN 'C01-EMS'
WHEN sei.sei_type_code = 'P' THEN 'P01-EMS'
ELSE 'U01-EMS'
END
FROM #IncomeCalYr inc
JOIN ( SELECT snz_ird_uid, snz_employer_ird_uid,
start_cal_year_nbr, end_cal_year_nbr,
sei_type_code = MIN(sei_type_code) /* favour 'C' over 'P' */
FROM #tmp_sei_individual
GROUP BY snz_ird_uid, snz_employer_ird_uid, start_cal_year_nbr, end_cal_year_nbr
) sei ON sei.snz_ird_uid = inc.snz_ird_uid
AND sei.snz_employer_ird_uid = inc.snz_employer_ird_uid
AND inc.year_nbr BETWEEN sei.start_cal_year_nbr AND sei.end_cal_year_nbr
WHERE inc.income_source_code = 'W&S-EMS'
AND inc.withholding_type_code = 'P'
AND inc.snz_ird_uid <> inc.snz_employer_ird_uid;
RAISERROR('Updated %d rows from W&S to C01/P01', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Update WHP income when it is actually self employed income
This step adds the (C02) - Company director/shareholder recieving WHT deducted income
and (P02) - Partner recieving WHT deducted income
and (S02) - Sole Trader recieving WHT deducted income
Rules: (A) WHP income
(B) WHT
(C) Payee is an individual
(D) Evidence of self employed: then C02 or P02 else S02
(E) Favour C02 over P02, THEN S02
*/
UPDATE inc WITH (TABLOCK)
SET income_source_code = CASE WHEN sei.sei_type_code = 'C' THEN 'C02-EMS'
WHEN sei.sei_type_code = 'P' THEN 'P02-EMS'
ELSE 'S02-EMS'
END
FROM #IncomeCalYr inc
LEFT JOIN ( SELECT snz_ird_uid, snz_employer_ird_uid,
start_cal_year_nbr, end_cal_year_nbr,
sei_type_code = MIN(sei_type_code)
FROM #tmp_sei_individual
GROUP BY snz_ird_uid, snz_employer_ird_uid, start_cal_year_nbr, end_cal_year_nbr
) sei ON sei.snz_ird_uid = inc.snz_ird_uid
AND sei.snz_employer_ird_uid = inc.snz_employer_ird_uid
AND inc.year_nbr BETWEEN sei.start_cal_year_nbr AND sei.end_cal_year_nbr
WHERE inc.income_source_code = 'WHP-EMS'
AND inc.withholding_type_code = 'W'
AND EXISTS ( SELECT 1
FROM $(idicleanversion).ir_clean.ird_customers cus
WHERE cus.snz_ird_uid = inc.snz_ird_uid
AND cus.ir_cus_entity_type_code = 'I'
);
RAISERROR('Updated %d rows from W&S to C02/P02/S02', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
This script converts #IncomeCalYr to tax year format mth_01_amt = April and so on
*/
/* Create the equivalent of the data.income_tax_yr table */
DROP TABLE IF EXISTS #IncomeTaxYr;
CREATE TABLE #IncomeTaxYr (
year_nbr SMALLINT NOT NULL
,snz_uid INT NOT NULL
,snz_ird_uid INT NOT NULL
,snz_employer_ird_uid INT NOT NULL
,income_source_code CHAR(7) NOT NULL
,withholding_type_code CHAR(1) NOT NULL
,mth_01_amt DECIMAL(19,2) NULL
,mth_02_amt DECIMAL(19,2) NULL
,mth_03_amt DECIMAL(19,2) NULL
,mth_04_amt DECIMAL(19,2) NULL
,mth_05_amt DECIMAL(19,2) NULL
,mth_06_amt DECIMAL(19,2) NULL
,mth_07_amt DECIMAL(19,2) NULL
,mth_08_amt DECIMAL(19,2) NULL
,mth_09_amt DECIMAL(19,2) NULL
,mth_10_amt DECIMAL(19,2) NULL
,mth_11_amt DECIMAL(19,2) NULL
,mth_12_amt DECIMAL(19,2) NULL
,tot_yr_amt DECIMAL(19,2) NULL
);
/*
Build the income tax year table from cal year table
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code, withholding_type_code,
mth_01_amt, mth_02_amt, mth_03_amt, mth_04_amt, mth_05_amt, mth_06_amt,
mth_07_amt, mth_08_amt, mth_09_amt, mth_10_amt, mth_11_amt, mth_12_amt,
tot_yr_amt
)
SELECT inc.year_nbr AS year_nbr
,inc.snz_uid AS snz_uid
,inc.snz_ird_uid AS snz_ird_uid
,inc.snz_employer_ird_uid AS snz_employer_ird_uid
,inc.income_source_code AS income_source_code
,inc.withholding_type_code AS withholding_type_code
,ISNULL(inc.prev_apr_amt, 0) AS mth_01_amt
,ISNULL(inc.prev_may_amt, 0) AS mth_02_amt
,ISNULL(inc.prev_jun_amt, 0) AS mth_03_amt
,ISNULL(inc.prev_jul_amt, 0) AS mth_04_amt
,ISNULL(inc.prev_aug_amt, 0) AS mth_05_amt
,ISNULL(inc.prev_sep_amt, 0) AS mth_06_amt
,ISNULL(inc.prev_oct_amt, 0) AS mth_07_amt
,ISNULL(inc.prev_nov_amt, 0) AS mth_08_amt
,ISNULL(inc.prev_dec_amt, 0) AS mth_09_amt
,ISNULL(inc.this_jan_amt, 0) AS mth_10_amt
,ISNULL(inc.this_feb_amt, 0) AS mth_11_amt
,ISNULL(inc.this_mar_amt, 0) AS mth_12_amt
,ISNULL(inc.prev_apr_amt, 0) + ISNULL(inc.prev_may_amt, 0) + ISNULL(inc.prev_jun_amt, 0) +
ISNULL(inc.prev_jul_amt, 0) + ISNULL(inc.prev_aug_amt, 0) + ISNULL(inc.prev_sep_amt, 0) +
ISNULL(inc.prev_oct_amt, 0) + ISNULL(inc.prev_nov_amt, 0) + ISNULL(inc.prev_dec_amt, 0) +
ISNULL(inc.this_jan_amt, 0) + ISNULL(inc.this_feb_amt, 0) + ISNULL(inc.this_mar_amt, 0) AS tot_yr_amt
FROM (SELECT ISNULL(this_yr.year_nbr, prev_yr.year_nbr + 1) AS year_nbr
,ISNULL(this_yr.snz_uid, prev_yr.snz_uid) AS snz_uid
,ISNULL(this_yr.snz_ird_uid, prev_yr.snz_ird_uid) AS snz_ird_uid
,ISNULL(this_yr.snz_employer_ird_uid, prev_yr.snz_employer_ird_uid) AS snz_employer_ird_uid
,ISNULL(this_yr.income_source_code, prev_yr.income_source_code) AS income_source_code
,ISNULL(this_yr.withholding_type_code, prev_yr.withholding_type_code) AS withholding_type_code
,ISNULL(prev_yr.mth_04_amt, 0) AS prev_apr_amt
,ISNULL(prev_yr.mth_05_amt, 0) AS prev_may_amt
,ISNULL(prev_yr.mth_06_amt, 0) AS prev_jun_amt
,ISNULL(prev_yr.mth_07_amt, 0) AS prev_jul_amt
,ISNULL(prev_yr.mth_08_amt, 0) AS prev_aug_amt
,ISNULL(prev_yr.mth_09_amt, 0) AS prev_sep_amt
,ISNULL(prev_yr.mth_10_amt, 0) AS prev_oct_amt
,ISNULL(prev_yr.mth_11_amt, 0) AS prev_nov_amt
,ISNULL(prev_yr.mth_12_amt, 0) AS prev_dec_amt
,ISNULL(this_yr.mth_01_amt, 0) AS this_jan_amt
,ISNULL(this_yr.mth_02_amt, 0) AS this_feb_amt
,ISNULL(this_yr.mth_03_amt, 0) AS this_mar_amt
FROM #IncomeCalYr this_yr
FULL JOIN #IncomeCalYr prev_yr
ON prev_yr.year_nbr = this_yr.year_nbr - 1
AND prev_yr.snz_uid = this_yr.snz_uid
AND prev_yr.snz_ird_uid = this_yr.snz_ird_uid
AND prev_yr.snz_employer_ird_uid = this_yr.snz_employer_ird_uid
AND prev_yr.income_source_code = this_yr.income_source_code
AND prev_yr.withholding_type_code = this_yr.withholding_type_code
) inc
WHERE (0 <> ISNULL(inc.prev_apr_amt, 0) + ISNULL(inc.prev_may_amt, 0) + ISNULL(inc.prev_jun_amt, 0) +
ISNULL(inc.prev_jul_amt, 0) + ISNULL(inc.prev_aug_amt, 0) + ISNULL(inc.prev_sep_amt, 0) +
ISNULL(inc.prev_oct_amt, 0) + ISNULL(inc.prev_nov_amt, 0) + ISNULL(inc.prev_dec_amt, 0) +
ISNULL(inc.this_jan_amt, 0) + ISNULL(inc.this_feb_amt, 0) + ISNULL(inc.this_mar_amt, 0))
AND inc.year_nbr BETWEEN 2000 AND year(GETDATE());
RAISERROR('Inserted %d rows converting cal_yr to tax_yr', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
This script adds self employment data from tables other than EMS into #IncomeTaxYr table
*/
/*
Add self employed income from IR3 to tax_yr table - Sole trader
Adds the (S00) - Sole Trader income from IR3
Rules: (A) IR3 net profit amount <> 0
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT YEAR(ir3.return_period_date) AS year_nbr
,MIN(ir3.snz_uid) AS snz_uid
,MIN(ir3.snz_ird_uid) AS snz_ird_uid
,MIN(ir3.snz_ird_uid) AS snz_employer_ird_uid
,'S00-IR3' AS income_source_code
,'X' AS withholding_type_code
,SUM(ir3.net_profit_amt) AS tot_yr_amt
FROM #CombinedIR3 ir3
WHERE ir3.net_profit_amt <> 0
AND YEAR(ir3.return_period_date) BETWEEN 2000 AND year(GETDATE())
GROUP BY YEAR(ir3.return_period_date), ir3.snz_ird_uid;
RAISERROR('Inserted %d rows for S00-IR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Add self employed income from IR4 to tax_yr table
Adds the (C00) - Company director/shareholder income from IR4S
Rules: (A) IR■■■■■■■■ exists
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT YEAR(ir4.ir_ir4_return_period_date) AS year_nbr
,MIN(ir4.snz_uid) AS snz_uid
,MIN(ir4.snz_ird_uid) AS snz_ird_uid
,MIN(ir4.snz_employer_ird_uid) AS snz_employer_ird_uid
,'C00-IR4' AS income_source_code
,'X' AS withholding_type_code
,SUM(ir4.ir_ir4_tot_sholder_sal_809_amt) AS tot_yr_amt
FROM $(idicleanversion).ir_clean.ird_attachments_ir4s ir4
WHERE EXISTS ( SELECT 1
FROM $(idicleanversion).ir_clean.ird_customers cus
WHERE cus.snz_ird_uid = ir4.snz_ird_uid
AND cus.ir_cus_entity_type_code = 'I'
)
AND YEAR(ir4.ir_ir4_return_period_date) BETWEEN 2000 AND year(GETDATE())
AND ir4.ir_ir4_income_imp_ind <> 'Y' /* Exclude the 'imputed' incomes as they be wrong */
AND ir4.ir_ir4_tot_sholder_sal_809_amt <> 0
GROUP BY YEAR(ir4.ir_ir4_return_period_date), ir4.snz_ird_uid, ir4.snz_employer_ird_uid;
RAISERROR('Inserted %d rows for C00-IR4', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Add self employed income from IR20 to tax_yr table
Adds the (P00) - Partnership income from I20
Rules: (A) IR20 tot share income <> 0
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT YEAR(ir20.ir_ir20_return_period_date) AS year_nbr,
MIN(ir20.snz_uid) AS snz_uid,
MIN(ir20.snz_ird_uid) AS snz_ird_uid,
MIN(ir20.snz_employer_ird_uid) AS snz_employer_ird_uid,
'P00-IR7' AS income_source_code,
'X' AS withholding_type_code,
SUM(ir20.ir_ir20_tot_share_of_inc_865_amt) AS tot_yr_amt
FROM $(idicleanversion).ir_clean.ird_attachments_ir20 ir20
WHERE ir20.ir_ir20_tot_share_of_inc_865_amt <> 0
AND EXISTS ( SELECT 1
FROM $(idicleanversion).ir_clean.ird_customers cus
WHERE cus.snz_ird_uid = ir20.snz_ird_uid
AND cus.ir_cus_entity_type_code = 'I'
)
AND EXISTS ( SELECT 1
FROM #CombinedIR3 ir3
WHERE ir3.snz_ird_uid = ir20.snz_ird_uid
AND ir3.return_period_date = ir20.ir_ir20_return_period_date
AND ir3.tot_pship_income_amt <> 0
)
AND YEAR(ir20.ir_ir20_return_period_date) BETWEEN 2000 AND year(GETDATE())
GROUP BY YEAR(ir20.ir_ir20_return_period_date), ir20.snz_ird_uid, ir20.snz_employer_ird_uid;
RAISERROR('Inserted %d rows for P00-IR7', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Add self employed income from IR3 to tax_yr table - witholding payments
Adds the (WHP-IR3) - Other income from IR3
Rules: (A) IR3 tot_wholding_paymnts_amt <> 0
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT YEAR(ir3.return_period_date) AS year_nbr,
MIN(ir3.snz_uid) AS snz_uid,
MIN(ir3.snz_ird_uid) AS snz_ird_uid,
MIN(ir3.snz_ird_uid) AS snz_employer_ird_uid,
'WHP-IR3' AS income_source_code,
'W' AS withholding_type_code,
SUM(ir3.tot_wholding_paymnts_amt) AS tot_yr_amt
FROM #CombinedIR3 ir3
WHERE ir3.tot_wholding_paymnts_amt <> 0
AND YEAR(ir3.return_period_date) BETWEEN 2000 AND year(GETDATE())
GROUP BY YEAR(ir3.return_period_date), ir3.snz_ird_uid;
RAISERROR('Inserted %d rows for WHP-IR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Deleting wages and salary from #IncomeTaxYr table as it is not self employment. All remaining data is self employment
*/
DELETE FROM #IncomeTaxYr
FROM #IncomeTaxYr pop
WHERE pop.income_source_code IN ('W&S-EMS')
RAISERROR('Deleted %d ■■■■■■■■■■■, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Update the table to aggregate income sources from #IncomeTaxYr table
We don't care about the employer or withholding type code
So create a final table that aggregates on these
Note: data stored as tax year mth_01_amt = April and so on
*/
DROP TABLE IF EXISTS #AggregateIRIncome;
CREATE TABLE #AggregateIRIncome (
year_nbr SMALLINT NOT NULL,
snz_uid INT NOT NULL,
income_source_code CHAR(7) NOT NULL
,mth_01_amt DECIMAL(19,2) NOT NULL
,mth_02_amt DECIMAL(19,2) NOT NULL
,mth_03_amt DECIMAL(19,2) NOT NULL
,mth_04_amt DECIMAL(19,2) NOT NULL
,mth_05_amt DECIMAL(19,2) NOT NULL
,mth_06_amt DECIMAL(19,2) NOT NULL
,mth_07_amt DECIMAL(19,2) NOT NULL
,mth_08_amt DECIMAL(19,2) NOT NULL
,mth_09_amt DECIMAL(19,2) NOT NULL
,mth_10_amt DECIMAL(19,2) NOT NULL
,mth_11_amt DECIMAL(19,2) NOT NULL
,mth_12_amt DECIMAL(19,2) NOT NULL
,tot_yr_amt DECIMAL(19,2) NOT NULL
);
INSERT INTO #AggregateIRIncome WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code,
mth_01_amt, mth_02_amt, mth_03_amt, mth_04_amt, mth_05_amt, mth_06_amt,
mth_07_amt, mth_08_amt, mth_09_amt, mth_10_amt, mth_11_amt, mth_12_amt,
tot_yr_amt
)
SELECT year_nbr,
MIN(snz_uid) AS snz_uid,
income_source_code,
sum(ISNULL(mth_01_amt,0)),
sum(ISNULL(mth_02_amt,0)),
sum(ISNULL(mth_03_amt,0)),
sum(ISNULL(mth_04_amt,0)),
sum(ISNULL(mth_05_amt,0)),
sum(ISNULL(mth_06_amt,0)),
sum(ISNULL(mth_07_amt,0)),
sum(ISNULL(mth_08_amt,0)),
sum(ISNULL(mth_09_amt,0)),
sum(ISNULL(mth_10_amt,0)),
sum(ISNULL(mth_11_amt,0)),
sum(ISNULL(mth_12_amt,0)),
SUM(ISNULL(tot_yr_amt,0)) AS tot_yr_amt
FROM #IncomeTaxYr
WHERE snz_uid IS NOT NULL
GROUP BY year_nbr, snz_uid, income_source_code;
RAISERROR('Inserted %d rows when aggregating', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Delete double counting of WHP income from EMS when it exists in IR3.
*/
DELETE FROM #AggregateIRIncome
FROM #AggregateIRIncome pop
WHERE EXISTS (SELECT 1
FROM #AggregateIRIncome pop2
WHERE pop.snz_uid = pop2.snz_uid
AND pop.year_nbr = pop2.year_nbr
AND pop2.income_source_code = 'WHP-IR3')
AND pop.income_source_code IN ('WHP-EMS', 'C02-EMS', 'P02-EMS', 'S02-EMS')
RAISERROR('Deleted %d ■■■■■■■■■■■, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Create a table using #AggregateIRIncome
Data converted from tax year to calendar year format mth_01_amt = January and so on
*/
DROP TABLE IF EXISTS $(targetschema).$(projprefix)_self_employment;
CREATE TABLE $(targetschema).$(projprefix)_self_employment (
snz_uid INT NOT NULL
,period_start_date DATE NOT NULL
,period_end_date DATE NOT NULL
,source_frequency CHAR(7) NOT NULL
,income_source CHAR(7) NOT NULL
,taxable_income_indicator TINYINT NOT NULL
,gross_income DECIMAL(19,2) NULL
);
WITH table_unpivot as (
SELECT
[snz_uid]
,month_nbr
,year_nbr
,CASE WHEN income_source_code LIKE ('%EMS') THEN 'monthly' /* EMS data is monthly format */
WHEN income_source_code NOT LIKE ('%EMS') THEN 'annual' /* IR3, IR4, IR7, IR20 data is tax year format */
END as source_frequency
,income_source_code as income_source
,1 as taxable_income_indicator
,gross_income
FROM (
select
snz_uid
,year_nbr
,income_source_code
,mth_01_amt, mth_02_amt, mth_03_amt, mth_04_amt, mth_05_amt, mth_06_amt,mth_07_amt, mth_08_amt, mth_09_amt, mth_10_amt, mth_11_amt, mth_12_amt,tot_yr_amt
FROM #AggregateIRIncome
) source_table
UNPIVOT
(gross_income FOR month_nbr IN
(mth_01_amt, mth_02_amt, mth_03_amt, mth_04_amt, mth_05_amt, mth_06_amt,mth_07_amt, mth_08_amt, mth_09_amt, mth_10_amt, mth_11_amt, mth_12_amt,tot_yr_amt)
) AS unpvt
)
,table_concat as (
SELECT *
FROM table_unpivot t1
WHERE CONCAT(month_nbr,source_frequency) != 'tot_yr_amtmonthly' /* excluding rows that are just the sum of EMS monthly amounts */
)
,table_months as (
select
tx.*
,CASE WHEN month_nbr = 'mth_01_amt' THEN '04'
WHEN month_nbr = 'mth_02_amt' THEN '05'
WHEN month_nbr = 'mth_03_amt' THEN '06'
WHEN month_nbr = 'mth_04_amt' THEN '07'
WHEN month_nbr = 'mth_05_amt' THEN '08'
WHEN month_nbr = 'mth_06_amt' THEN '09'
WHEN month_nbr = 'mth_07_amt' THEN '10'
WHEN month_nbr = 'mth_08_amt' THEN '11'
WHEN month_nbr = 'mth_09_amt' THEN '12'
WHEN month_nbr = 'mth_10_amt' THEN '01'
WHEN month_nbr = 'mth_11_amt' THEN '02'
WHEN month_nbr = 'mth_12_amt' THEN '03'
WHEN month_nbr = 'tot_yr_amt' THEN '03' /* always annual. so start date will always be 1-Apr prev year */
END as month_cal_yr
FROM table_concat tx
)
,table_calyr as (
select
t2.*
,CASE WHEN month_cal_yr in (04,05,06,07,08,09,10,11,12) THEN year_nbr-1 /* previous year number */
WHEN month_cal_yr in (01,02,03) THEN year_nbr /* current year number */
END as year_cal_yr
FROM table_months t2
)
INSERT INTO $(targetschema).$(projprefix)_self_employment
select snz_uid
,CASE WHEN month_nbr = 'tot_yr_amt' THEN DATEFROMPARTS(year_cal_yr-1, month_cal_yr+1, 01) /* start date is 1 Apr previous year */
WHEN month_nbr != 'tot_yr_amt' THEN DATEFROMPARTS(year_cal_yr, month_cal_yr, 01) /* start date is start of month */
END as 'period_start_date'
,EOMONTH(DATEFROMPARTS(year_cal_yr, month_cal_yr, 01)) as 'period_end_date' /* end date is end of month */
,source_frequency
,income_source
,taxable_income_indicator
,gross_income
FROM table_calyr
WHERE gross_income <> 0;
RAISERROR('Created table in Sandpit with %d ■■■■■■■■■■■, @@ROWCOUNT) WITH NOWAIT;
GO
ALTER TABLE $(targetdb).$(targetschema).$(projprefix)_self_employment
REBUILD PARTITION = ALL WITH (DATA_COMPRESSION=PAGE);
DROP TABLE IF EXISTS #CombinedIR3;
DROP TABLE IF EXISTS #IncomeCalYr;
DROP TABLE IF EXISTS #tmp_sei_individual;
DROP TABLE IF EXISTS #IncomeTaxYr;
DROP TABLE IF EXISTS #AggregateIRIncome