Module output
SQL:[IDI_Community].[inc_investments].investments_YYYYMM
SAS: libname cmti ODBC dsn=idi_community_srvprd schema=inc_investments; proc print data = cmti.investments_YYYYMM
How to access a code module in the Data Lab : Read here
Purpose of Investments module
The purpose of this module is to construct the income through investments for people in New Zealand and their relevant income spells.
Within this module, Investment income includes:
- Rental income
- Dividend income
- Interest income
- Māori Authority Distribution
- Portfolio Entity Investment Income
- Estate and / or Trust Income
Key concepts
Investment income
Investment income is the income, net profit or loss received from investments such as rent, Māori land or other leased land, dividends from New Zealand companies, income as the beneficiary of an estate or trust, royalties, interest from the following: banks, other financial institutions, bonds, stocks, money market funds, debentures or securities.
Total income spell
An income spell is the minimum payment period per income type. Investment income in practice may be earned in a daily, weekly, fortnightly, monthly etc depending on the type of investment.
Source frequency
Source frequency is the frequency with which an income type is stored in the data. Source frequency for investment income is annual.
Gross income amount
Gross income amount is the taxable income before tax that a person receives.
Observed tax
Tax recorded as paid to IRD in the IDI. This is the amount of tax that has been has been deducted from source (eg PAYE and tax filing). Observed tax data for investment income can be sourced from IR3, IR4 and IR7 tables.
Note: observed tax deduction is not the same as actual tax liability since this can only be calculated once all income sources have been identified over a tax period. This information has not been included in the initial version of the module.
Granularity
The modules are setup to be event format: One row per event, events have a person, start and end dates, and descriptive information. The variables below will uniquely identify each event in the above table:
Entity id: snz uid
Period: start end, end date
Event information: income source, taxable income indicator
Data sources
IR3: ird_rtns_keypoints_ir3, ir_ir3_2000_to_2014, ir_ir3_2013_to_2020
IR3 is the income tax return for individuals used to confirm the amount of personal income tax to be paid at the end of each tax year with data available from March 2000 to present. The main purpose of IR3 is to include any monetary payments that have not been taxed at source. IR3 is used for self-employment (filed annually by sole traders) and includes non-zero partnership, or shareholder salary income, as well as rental income. IR3 is mandatory for individuals who earned more than $200 (before tax) income other than salary, wages, interest, dividends, and/or taxable Māori authority distributions. IR3 is filed annually.
Starting from July 2017, IR has extended the supply of IR3 information to the IDI to include further information about investment income, including earnings from interest, dividend, estate trust, overseas, and other sources. Also included are non-taxable income from a range of tax credit entitlements, tax rebate, and student loan-liable income.
There may be a substantial lag in reporting IR3 data as individual businesses and those using tax agents have up to a year after the end of the financial year to file an annual tax return with IR.
Note: 3 IR2 tables (2 ad-hoc tables and 1 clean) are required for creating the timeseries of IR3 data. IR3s belonging to people who reported no self-employment, partnership or shareholder income were excluded for tax years prior to 2019. They are in the ad-hoc tables.
PTS: ir_clean.ird_pts
This table contains information on personal tax summary (PTS) with data available from March 2000 to March 2020. PTS is a tax return for individual taxpayers to show their individual income and tax deduction details for a given tax year. This table can be used to capture interest income and dividend income.
Autocalc: ir_clean.ird_autocalc_information:
This table shows information used by IR to calculate automatic assessments with data available from April 2019 to present. This auto-calc process is applied to people whose income is only salary, wages, interest or dividends. This information is received by IR from employers and payers of investment income. Assessments are calculated and refunds are issued automatically.
Practical notes
Period
The Investment income information is from 1999 onwards.
Negative amounts
No exclusions made for negative amounts.
Staged approach to include tax within the total income module
Stage 1 of module creation will only include gross income amounts. Stage 2 (pending) will include observed tax deducted at source since they are available in EMS and MSD tables. This can be used to calculate the tax at a tax year level which includes tax deducted at source and from tax filings. Stage 3 (pending) aims to derive tax at a monthly level. This is complex and will require support from SME to apply rules to derive tax. This can be used to calculate tax for any time period.
References
- Document : experimental-income-estimates-from-linked-admin-data-methods-and-results-revised.pdf
- When we work out your tax for you: income tax assessments
- 2024 Individual income tax assessment - end of year process
- Investment income reporting
Key Contacts
| Role | Organisation | Person |
|---|---|---|
| Dev lead | MBIE | Eric Krassoi Peach |
| Module Coder | Nicholson Consulting | Linda Martis |
| Code Supply | Stats NZ MSD | Ivan Welsh Marc de Boer |
| Peer review (code) | Independent | Vinay Benny |
| Peer review (documentation) | IRD | Joanne Butterfield |
| IRD | Scott Anderson | |
| Module Steward | Stas NZ | Cori Qian |
Module business rules
Defining spell start and end dates
Start and end dates are based on IR3, AutoCalc and PTS dates. The return period column in the tables is considered to be the end date. Start will be that start of the annual tax year (1-Apr)
Person identifier snz_uid
A global unique identifier created by Statistics NZ. There is a snz_uid for each distinct identity in the IDI. This identifier is changed and reassigned each refresh
Minimum income threshold
Currently no minimum threshold applied
Interoperability with other modules
This module is in the business landscape so could be that it is most closely associated with business profile modules and employment spells. It is also in the socioeconomic landscape where income is a determinant of positive social outcomes, so associated with social housing, education modules, health modules, driving modules, employment assistance modules etc.
Duplication in data
Interest and dividend data comes from IR3, PTS and Autocalc IR tables. APC code applies prioritisation of IR3 > AC > PTS to record interest and dividends and eliminate duplications.
Taxable income indicator
Taxable income indicator to be hard coded in modules. Within investment income module, all income is taxable.
Code logic:
The code is based on the rules used for APC and derived data tables in the IDI.
- Identify individuals earning investment income from 3 IR3 tables. To do so, we will
a. Extract data from IR3 tables, specifically ir_clean.ird_rtns_keypoints_ir3, ir_ir3_2000_to_2014 and ir_ir3_2013_to_2020 from IDI_Adhoc. Table prioritisation will occur with ir_clean.ird_rtns_keypoints_ir3 being the first source of data, followed by adhoc 2000_to_2014 and adhoc 2013_to_2020.
b. We will use data which contains rent amounts, gross interest amounts, dividends and estate trust income amounts.
c. Remove duplicates in tables by using maximum of location number, return_version_number and ird_timestamp_date. - Using the data obtained above in point 1,
a. aggregate rental income by person’s unique identifiers and time period. Relabel income as S03-IR3.
b. aggregate gross interest by person’s unique identifiers and time period. Relabel income as INT-IR3.
c. aggregate gross dividend by person’s unique identifiers and time period. Relabel income as DIV-IR3.
d. aggregate estate trust income by person’s unique identifiers and time period. Relabel income as EST-IR3. - Identify individuals earning interest and dividend income from personal tax summary table (ir_clean.ird_pts table).
a. Remove duplicates in table by using maximum of snz_uid and pts_timestamp_date. - Using the data obtained above in point 3,
a. Relabel interest income as INT-PTS.
b. Relabel dividend income as DIV-PTS. - Identify individuals earning interest, dividend, māori authority distributions and portfolio entity investment income from autocalc table (ir_clean.ird_autocalc_information table).
a. Remove duplicates in table by using maximum of snz_uid, return_version_nbr number and ac_processing_date. - Using the data obtained above in point 5,
a. Relabel interest income as INT-AA.
b. Relabel dividend income as DIV-AA.
c. Relabel māori authority distributions income as MAD-AA.
d. Relabel portfolio entity investment income as PIE-AA. - To summarise, all the labelled income above becomes the investment income.
a. For interest and dividend income, persons with income recorded in IR3, PTS and/or autocalc table within the same time period are considered as duplcates. To ensure only one entry for interest and divident income is recorded, we will prioritise IR3 data first, AA data second and PTS data third (IR3 > AC > PTS)
Parameters
The following parameters should be supplied to this module to run it in the database:
- {targetdb}: The SQL database on which the spell datasets are to be created.
- {idicleanversion}: The IDI Clean version that the spell datasets need to be based on.
- {targetschema}: The project schema under the target database into which the spell datasets are to be created.
- {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
Dependencies
[IDI_Clean_YYYYMM].[ir_clean].[ird_rtns_keypoints_ir3]
[IDI_Adhoc].[clean_read_IR].[ir_ir3_2000_to_2014]
[IDI_Adhoc].[clean_read_IR].[ir_ir3_2013_to_2020]
[IDI_Clean_YYYYMM].[ir_clean].[ird_pts]
[IDI_Clean_YYYYMM].[ir_clean].[ird_autocalc_information]
Outputs
{targetdb}.{targetschema}.{projprefix}_investments_{refresh}
Variable Descriptions
| Aspect | Variables | Description |
|---|---|---|
| Entity | snz_uid | Current refresh snz_uid |
| Period | period_start_date period_end_date | Payment start date Payment end date |
| Event | source_frequency | Frequency of income recorded in source data Note that investments are recorded in IR3, PTS, and Autocalc tables as annual income |
| Information | income_source | Type of income being derived in this module |
| Note that: - INT-IR3 is gross interest income from IR3 tables - INT-AA is gross interest income from Autocalc table - INT-PTS is gross interest income from PTS table - DIV-IR3 is gross dividend income from IR3 tables - DIV-AA is gross dividend income from Autocalc table - DIV-PTS is gross dividend income from PTS table - EST-IR3 is estate trust income from IR3 tables - MAD-AA is Total Maori Authority Distribution from Autocalc table - PIE-AA is Total Portfolio Entity Investment income from Autocalc table - S03-IR3 is rental income income from IR3 tables | ||
| taxable_income_indicator | Indicator for whether income is taxable Note that: 1:income is taxable 0:income is not taxable Investments are taxable | |
| gross_income | Income before tax |
Module Version & Change History
| Date | Version | Comments |
|---|---|---|
| 2023-07 | Original code sourced from APC code created by Stats NZ (Ivan Welsh) in the IDI | |
| 2024-06 | Initial | streamlining and preparation for code module by Linda Martis, Nicholson Consulting |
Code
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar projprefix "{projprefix}"
:setvar idicleanversion "{idicleanversion}"
/* Assign the target database to which all the components need to be created in. */
use $(targetdb);
/*
This script combines a number of different source of IR3 data into a single table.
Sources:IR3 table, adhoc 2000_to_2014, adhoc 2013_to_2020
Duplicates are removed from the data by filtering for the maximum return_version_nbr,
ird_timestamp_date, source_code, snz_unique_nbr and finally location_nbr.
Create the storage table. This follows the format of the refresh IR3 table,
but without the leading ir_ir3_ prefix.
*/
DROP TABLE IF EXISTS #CombinedIR3;
CREATE TABLE #CombinedIR3 (
snz_uid INT NOT NULL
,snz_ird_uid INT NOT NULL
,location_nbr INT NOT NULL
,return_period_date DATE NOT NULL
,snz_unique_nbr BIGINT NOT NULL
,return_version_nbr SMALLINT NULL
,income_imp_ind VARCHAR(1) NOT NULL
,net_rents_826_amt DECIMAL(13,2) NULL
,gross_interest_amt DECIMAL(13,2) NULL
,gross_dividend_amt DECIMAL(13,2) NULL
,estate_trust_income_amt DECIMAL(13,2) NULL
,sum_all_income_amt DECIMAL(13,2) NULL
,ird_timestamp_date DATE NULL
/* 1 for refresh IR3 table, 2 for adhoc 2000_to_2014, 3 for adhoc 2013_to_2020 */
,source_table_code TINYINT NOT NULL
,PRIMARY KEY (snz_ird_uid, return_period_date)
);
/* Combine the source tables into one */
INSERT INTO #CombinedIR3
SELECT con.snz_uid AS snz_uid, con.snz_ird_uid AS snz_ird_uid, ir_ir3_location_nbr AS location_nbr, ir_ir3_return_period_date AS return_period_date, ir_ir3_snz_unique_nbr AS snz_unique_nbr, ir_ir3_return_version_nbr AS return_version_nbr,
ir_ir3_income_imp_ind AS income_imp_ind,
ir_ir3_net_rents_826_amt AS net_rents_826_amt,
ir_ir3_gross_interest_amt AS gross_interest_amt, ir_ir3_gross_dividend_amt AS gross_dividend_amt, ir_ir3_estate_trust_income_amt AS estate_trust_income_amt,
isnull(ir_ir3_net_rents_826_amt,0)+
isnull(ir_ir3_gross_interest_amt,0)+
isnull(ir_ir3_gross_dividend_amt,0)+
isnull(ir_ir3_estate_trust_income_amt,0)
AS sum_all_income_amt,
ir_ir3_ird_timestamp_date as ird_timestamp_date, source_table_code
FROM(
SELECT *, ROW_NUMBER() OVER (PARTITION BY snz_ird_uid, ir_ir3_return_period_date ORDER BY ir_ir3_return_version_nbr desc, ir_ir3_ird_timestamp_date desc, source_table_code asc, ir_ir3_snz_unique_nbr desc, ir_ir3_location_nbr desc) AS row_nbr
FROM (
SELECT snz_ird_uid, ir_ir3_location_nbr, ir_ir3_return_period_date, ir_ir3_snz_unique_nbr, ir_ir3_return_version_nbr,
ir_ir3_income_imp_ind,
ir_ir3_net_rents_826_amt,
ir_ir3_gross_interest_amt, ir_ir3_gross_dividend_amt, ir_ir3_estate_trust_income_amt,
ir_ir3_ird_timestamp_date, 1 AS source_table_code
FROM $(idicleanversion).ir_clean.ird_rtns_keypoints_ir3
UNION ALL
SELECT snz_ird_uid, ir_ir3_location_nbr, ir_ir3_return_period_date, ir_ir3_snz_unique_nbr, ir_ir3_return_version_nbr,
ir_ir3_income_imp_ind,
ir_ir3_net_rents_826_amt,
ir_ir3_gross_interest_amt, ir_ir3_gross_dividend_amt, ir_ir3_estate_trust_income_amt,
ir_ir3_ird_timestamp_date, 2 AS source_table_code
FROM IDI_Adhoc.clean_read_IR.ir_ir3_2000_to_2014
UNION ALL
SELECT snz_ird_uid, location_number AS ir_ir3_location_nbr, period AS ir_ir3_return_period_date,
ROW_NUMBER() OVER (PARTITION BY snz_ird_uid, period ORDER BY return_version, gross_interest desc, rents desc, gross_dividend desc, estate_income desc) AS ir_ir3_snz_unique_nbr,
return_version AS ir_ir3_return_version_nbr,
'N' AS ir_ir3_income_imp_ind,
rents AS ir_ir3_net_rents_826_amt,
gross_interest AS ir_ir3_gross_interest_amt, gross_dividend AS ir_ir3_gross_dividend_amt,
estate_income AS ir_ir3_estate_trust_income_amt,
CAST(timestamp AS DATE) AS ir_ir3_ird_timestamp_date, 3 AS source_table_code
FROM IDI_Adhoc.clean_read_IR.ir_ir3_2013_to_2020
) tu
) t5
INNER JOIN $(idicleanversion).security.concordance con
ON con.snz_ird_uid = t5.snz_ird_uid
WHERE row_nbr = 1
AND YEAR(ir_ir3_return_period_date) BETWEEN 2000 AND year(GETDATE());
RAISERROR('Inserted %d rows from IR3 into #CombinedIR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
This script creates #IncomeTaxYr table
Only annual data stored from #CombinedIR3
*/
/* Create the equivalent of the data.income_tax_yr table */
DROP TABLE IF EXISTS #IncomeTaxYr;
CREATE TABLE #IncomeTaxYr (
year_nbr DATE NOT NULL
,snz_uid INT NOT NULL
,snz_ird_uid INT NOT NULL
,snz_employer_ird_uid INT NOT NULL
,income_source_code CHAR(7) NOT NULL
,withholding_type_code CHAR(1) NOT NULL
,tot_yr_amt DECIMAL(19,2) NOT NULL
);
/*
Add self employed income from IR3 to tax_yr table - Rental income
Adds the (S03) - Rental income from IR3
Rules: (A) IR3 net rents amount <> 0
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT (ir3.return_period_date) AS year_nbr
,MIN(ir3.snz_uid) AS snz_uid
,MIN(ir3.snz_ird_uid) AS snz_ird_uid
,MIN(ir3.snz_ird_uid) AS snz_employer_ird_uid
,'S03-IR3' AS income_source_code
,'X' AS withholding_type_code
,SUM(ir3.net_rents_826_amt) AS tot_yr_amt
FROM #CombinedIR3 ir3
WHERE ir3.net_rents_826_amt <> 0
AND YEAR(ir3.return_period_date) BETWEEN 2000 AND year(GETDATE())
GROUP BY (ir3.return_period_date), ir3.snz_ird_uid;
RAISERROR('Inserted %d rows for S03-IR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Add self employed income from IR3 to tax_yr table - Gross interest
Adds the (INT) - Gross interest from IR3
Rules: (A) IR3 gross_interest_amt <> 0
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT (ir3.return_period_date) AS year_nbr,
MIN(ir3.snz_uid) AS snz_uid,
MIN(ir3.snz_ird_uid) AS snz_ird_uid,
MIN(ir3.snz_ird_uid) AS snz_employer_ird_uid,
'INT-IR3' AS income_source_code,
'X' AS withholding_type_code,
SUM(ir3.gross_interest_amt) AS tot_yr_amt
FROM #CombinedIR3 ir3
WHERE ir3.gross_interest_amt <> 0
AND YEAR(ir3.return_period_date) BETWEEN 2000 AND year(GETDATE())
GROUP BY (ir3.return_period_date), ir3.snz_ird_uid;
RAISERROR('Inserted %d rows for INT-IR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Add self employed income from IR3 to tax_yr table - Gross dividend
Adds the (DIV) - Gross dividend from IR3
Rules: (A) IR3 gross_dividend_amt <> 0
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT (ir3.return_period_date) AS year_nbr,
MIN(ir3.snz_uid) AS snz_uid,
MIN(ir3.snz_ird_uid) AS snz_ird_uid,
MIN(ir3.snz_ird_uid) AS snz_employer_ird_uid,
'DIV-IR3' AS income_source_code,
'X' AS withholding_type_code,
SUM(ir3.gross_dividend_amt) AS tot_yr_amt
FROM #CombinedIR3 ir3
WHERE ir3.gross_dividend_amt <> 0
AND YEAR(ir3.return_period_date) BETWEEN 2000 AND year(GETDATE())
GROUP BY (ir3.return_period_date), ir3.snz_ird_uid;
RAISERROR('Inserted %d rows for DIV-IR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Add self employed income from IR3 to tax_yr table - Estate trust income
Adds the (EST) - Estate trust income from IR3
Rules: (A) IR3 estate_trust_income_amt <> 0
*/
INSERT INTO #IncomeTaxYr WITH (TABLOCK) (
year_nbr, snz_uid, snz_ird_uid, snz_employer_ird_uid, income_source_code,
withholding_type_code, tot_yr_amt
)
SELECT (ir3.return_period_date) AS year_nbr,
MIN(ir3.snz_uid) AS snz_uid,
MIN(ir3.snz_ird_uid) AS snz_ird_uid,
MIN(ir3.snz_ird_uid) AS snz_employer_ird_uid,
'EST-IR3' AS income_source_code,
'X' AS withholding_type_code,
SUM(ir3.estate_trust_income_amt) AS tot_yr_amt
FROM #CombinedIR3 ir3
WHERE ir3.estate_trust_income_amt <> 0
AND YEAR(ir3.return_period_date) BETWEEN 2000 AND year(GETDATE())
GROUP BY (ir3.return_period_date), ir3.snz_ird_uid;
RAISERROR('Inserted %d rows for EST-IR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
This script creates #PTSIncome and #AAIncome table
Only annual data stored from ird.pts and ird_autocalc_information
*/
/*
Deduplicated PTS
Deduplicates the PTS table
*/
DROP TABLE IF EXISTS #PTSIncomeDedup;
CREATE TABLE #PTSIncomeDedup (
year_nbr DATE NOT NULL,
snz_uid INT NOT NULL,
interest_amt DECIMAL(19,2) NULL,
dividend_amt DECIMAL(19,2) NULL,
rwt_amt DECIMAL(19,2) NULL,
dwt_amt DECIMAL(19,2) NULL
,PRIMARY KEY (snz_uid, year_nbr)
);
INSERT INTO #PTSIncomeDedup WITH (TABLOCK) (
year_nbr, snz_uid, interest_amt, dividend_amt, rwt_amt, dwt_amt
)
SELECT year_nbr, snz_uid, interest_amt, dividend_amt, rwt_amt, dwt_amt
FROM (/* Filter for max timestamp */
SELECT *, MAX(ir_pts_snz_unique_nbr) OVER (PARTITION BY snz_uid, year_nbr) AS max_nbr
FROM (/* Filter for valid interest amounts */
SELECT snz_uid,
(ir_pts_return_period_date) AS year_nbr,
ir_pts_tot_interest_amt AS interest_amt,
ir_pts_tot_dividend_amt AS dividend_amt,
ir_pts_tot_rwt_amt AS rwt_amt,
ir_pts_tot_dwt_amt AS dwt_amt,
ir_pts_snz_unique_nbr,
ir_pts_timestamp_date,
MAX(ir_pts_timestamp_date) OVER (PARTITION BY snz_uid,
ir_pts_return_period_date) AS max_timestamp
FROM $(idicleanversion).ir_clean.ird_pts
WHERE YEAR(ir_pts_return_period_date) BETWEEN 2000 AND year(GETDATE())
) t
WHERE ir_pts_timestamp_date = max_timestamp
) t3
WHERE ir_pts_snz_unique_nbr = max_nbr;
RAISERROR('Inserted %d rows for dedpulicated PTS', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/* Create the table for PTS data */
DROP TABLE IF EXISTS #PTSIncome;
CREATE TABLE #PTSIncome (
year_nbr DATE NOT NULL,
snz_uid INT NOT NULL,
income_source_code CHAR(7) NOT NULL,
tot_yr_amt DECIMAL(19,2) NOT NULL,
);
/*
PTS Interest
Adds the INT source from the PTS table. Is deduplicated
first by timestamp then by unique_nbr
*/
INSERT INTO #PTSIncome WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, 'INT-PTS' AS income_source_code, interest_amt
FROM #PTSIncomeDedup
WHERE interest_amt <> 0;
RAISERROR('Inserted %d rows for INT-PTS', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
PTS Dividends
Adds the DIV source from the PTS table. Is deduplicated
first by timestamp then by unique_nbr
*/
INSERT INTO #PTSIncome WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, 'DIV-PTS' AS income_source_code, dividend_amt
FROM #PTSIncomeDedup
WHERE dividend_amt <> 0;
RAISERROR('Inserted %d rows for DIV-PTS', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/* Create the table for AA data */
DROP TABLE IF EXISTS #AAIncome;
CREATE TABLE #AAIncome (
year_nbr DATE NOT NULL,
snz_uid INT NOT NULL,
income_source_code CHAR(7) NOT NULL,
tot_yr_amt DECIMAL(19,2) NOT NULL,
);
/*
Deduplicated AA
Deduplicates the AA table
*/
DROP TABLE IF EXISTS #AAIncomeDedup;
CREATE TABLE #AAIncomeDedup (
year_nbr DATE NOT NULL,
snz_uid INT NOT NULL,
interest_amt DECIMAL(19,2) NULL,
dividend_amt DECIMAL(19,2) NULL,
maori_authority_amt DECIMAL(19,2) NULL,
pie_amt DECIMAL(19,2) NULL,
rwt_amt DECIMAL(19,2) NULL,
dwt_amt DECIMAL(19,2) NULL,
pir_amt DECIMAL(19,2) NULL
,PRIMARY KEY (snz_uid, year_nbr)
);
INSERT INTO #AAIncomeDedup WITH (TABLOCK) (
year_nbr, snz_uid, interest_amt, dividend_amt, maori_authority_amt, pie_amt, rwt_amt, dwt_amt, pir_amt
)
SELECT year_nbr, snz_uid, interest_amt, dividend_amt, maori_authority_amt, pie_amt, rwt_amt, dwt_amt, pir_amt
FROM (/* Filter for max version number */
SELECT *, MAX(ir_ac_snz_unique_nbr) OVER (PARTITION BY snz_uid, year_nbr) AS max_nbr
FROM (/* Filter for max processing date */
SELECT *, MAX(ir_ac_return_version_nbr) OVER (PARTITION BY snz_uid, year_nbr) AS max_version
FROM (/* Filter for valid interest amounts */
SELECT snz_uid,
(ir_ac_return_period_date) AS year_nbr,
ir_ac_tot_gross_intrst_amt AS interest_amt,
ir_ac_tot_gross_dvdnd_amt AS dividend_amt,
ir_ac_tot_ma_authrty_distbtn_amt AS maori_authority_amt,
ir_ac_tot_pie_income_amt AS pie_amt,
ir_ac_rwt_on_interest_amt AS rwt_amt,
ir_ac_rwt_on_dvdnd_amt AS dwt_amt,
ir_ac_tot_pie_deductions_amt AS pir_amt,
ir_ac_snz_unique_nbr,
ir_ac_return_version_nbr,
ir_ac_processing_date,
MAX(ir_ac_processing_date) OVER (PARTITION BY snz_uid,
ir_ac_return_period_date) AS max_processing
FROM $(idicleanversion).ir_clean.ird_autocalc_information
WHERE YEAR(ir_ac_return_period_date) BETWEEN 2000 AND year(GETDATE())
) t
WHERE ir_ac_processing_date = max_processing
) t3
WHERE ir_ac_return_version_nbr = max_version
) t4
WHERE ir_ac_snz_unique_nbr = max_nbr;
RAISERROR('Inserted %d rows for dedpulicated AA', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
AA Interest
Adds the INT source from the AA table. Is deduplicated
first by processing date, then timestamp then by unique_nbr
*/
INSERT INTO #AAIncome WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, 'INT-AA' AS income_source_code, interest_amt
FROM #AAIncomeDedup
WHERE interest_amt <> 0;
RAISERROR('Inserted %d rows for INT-AA', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
AA Dividends
Adds the DIV source from the AA table. Is deduplicated
first by processing date, then timestamp then by unique_nbr
*/
INSERT INTO #AAIncome WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, 'DIV-AA' AS income_source_code, dividend_amt
FROM #AAIncomeDedup
WHERE dividend_amt <> 0;
RAISERROR('Inserted %d rows for DIV-AA', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
AA Maori authority distributions
Adds the MAD source from the AA table. Is deduplicated
first by processing date, then timestamp then by unique_nbr
*/
INSERT INTO #AAIncome WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, 'MAD-AA' AS income_source_code, maori_authority_amt
FROM #AAIncomeDedup
WHERE maori_authority_amt <> 0;
RAISERROR('Inserted %d rows for MAD in AA', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
AA PIE Income
Adds the PIE source from the AA table. Is deduplicated
first by processing date, then timestamp then by unique_nbr
*/
INSERT INTO #AAIncome WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, 'PIE-AA' AS income_source_code, pie_amt
FROM #AAIncomeDedup
WHERE pie_amt <> 0;
RAISERROR('Inserted %d rows for PIE-AA', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Create table #IRCombined to combine investment sources from IR3, AA and PTS
When combining the IR3, PTS and AA INT and DIV sources, the priority order is IR3 > AA > PTS.
Start by adding all #AggregateIR3Income INT/DIV data to the table, followed by #AAIncome and lastly by #PTSIncome and then remaining investments from #AggregateIR3Income and #AAIncome
*/
/*
IR3 INT/DIV data
*/
DROP TABLE IF EXISTS #IRCombined;
CREATE TABLE #IRCombined (
year_nbr DATE NOT NULL,
snz_uid INT NOT NULL,
income_source_code CHAR(7) NOT NULL
,tot_yr_amt DECIMAL(19,2) NOT NULL
);
INSERT INTO #IRCombined WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr
, MIN(snz_uid) AS snz_uid
, income_source_code
, SUM(tot_yr_amt) AS tot_yr_amt
FROM #IncomeTaxYr
WHERE income_source_code IN ('INT-IR3', 'DIV-IR3', 'TAX-IR3')
GROUP BY year_nbr, snz_uid, income_source_code;
RAISERROR('Inserted %d rows of INT-IR3, DIV-IR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
AA INT/DIV data
*/
INSERT INTO #IRCombined WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, income_source_code, tot_yr_amt
FROM #AAIncome aa
WHERE income_source_code IN ('INT-AA ', 'DIV-AA ')
AND NOT EXISTS ( SELECT 1
FROM ( SELECT *
FROM #IRCombined
WHERE income_source_code IN ('INT-IR3', 'DIV-IR3')
) ir3
WHERE aa.year_nbr = ir3.year_nbr
AND aa.snz_uid = ir3.snz_uid
AND LEFT(aa.income_source_code, 3) = LEFT(ir3.income_source_code, 3)
);
RAISERROR('Inserted %d rows of INT-AA, DIV-AA', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
PTS INT/DIV data
*/
INSERT INTO #IRCombined WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, income_source_code, tot_yr_amt
FROM #PTSIncome pts
WHERE income_source_code IN ('INT-PTS', 'DIV-PTS')
AND NOT EXISTS ( SELECT 1
FROM ( SELECT *
FROM #IRCombined
WHERE LEFT(income_source_code, 3) IN ('INT', 'DIV')
) ir3aa
WHERE pts.year_nbr = ir3aa.year_nbr
AND pts.snz_uid = ir3aa.snz_uid
AND LEFT(pts.income_source_code, 3) = LEFT(ir3aa.income_source_code, 3)
);
RAISERROR('Inserted %d rows of INT-PTS, DIV-PTS', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Remaining DerivedIR3 data, excluding zeroes
*/
INSERT INTO #IRCombined WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr
, MIN(snz_uid) AS snz_uid
, income_source_code
, SUM(tot_yr_amt) AS tot_yr_amt
FROM #IncomeTaxYr
WHERE income_source_code NOT IN ('INT-IR3', 'DIV-IR3', 'TAX-IR3',
'INT-AA ', 'DIV-AA ', 'RWT-AA ', 'DWT-AA ',
'INT-PTS', 'DIV-PTS', 'RWT-PTS', 'DWT-PTS')
GROUP BY year_nbr, snz_uid, income_source_code;
RAISERROR('Inserted %d rows of non INT/DIV for DerivedIR3', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Remaining AA data (PTS has no more data)
*/
INSERT INTO #IRCombined WITH (TABLOCK) (
year_nbr, snz_uid, income_source_code, tot_yr_amt
)
SELECT year_nbr, snz_uid, income_source_code, tot_yr_amt
FROM #AAIncome
WHERE income_source_code NOT IN ('INT-AA ', 'DIV-AA ', 'RWT-AA ', 'DWT-AA ');
RAISERROR('Inserted %d rows of non INT/DIV for AA', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
/*
Create a Sandpit table using #IRCombined
Note: data stored as tax year data. No monthy data.
*/
DROP TABLE IF EXISTS $(targetdb).$(targetschema).$(projprefix)_investments;
CREATE TABLE $(targetdb).$(targetschema).$(projprefix)_investments (
snz_uid INT NOT NULL
,period_start_date DATE NOT NULL
,period_end_date DATE NOT NULL
,source_frequency CHAR(7) NOT NULL
,income_source CHAR(7) NOT NULL
,taxable_income_indicator TINYINT NOT NULL
,gross_income DECIMAL(19,2) NULL
);
INSERT INTO $(targetdb).$(targetschema).$(projprefix)_investments
SELECT
snz_uid
,DATEFROMPARTS(YEAR(year_nbr)-1, MONTH(DATEADD(month,1,year_nbr)), 01) as 'period_start_date'
,year_nbr as 'period_end_date'
,'annual' as source_frequency
,income_source_code as income_source
,1 as taxable_income_indicator
,tot_yr_amt as gross_income
FROM #IRCombined
WHERE tot_yr_amt <> 0
RAISERROR('Inserted %d rows in final table in Sandpit', 0, 1, @@ROWCOUNT) WITH NOWAIT;
GO
ALTER TABLE $(targetdb).$(targetschema).$(projprefix)_investments
REBUILD PARTITION = ALL WITH (DATA_COMPRESSION=PAGE);
DROP TABLE IF EXISTS #CombinedIR3;
DROP TABLE IF EXISTS #IncomeTaxYr;
DROP TABLE IF EXISTS #PTSIncomeDedup;
DROP TABLE IF EXISTS #PTSIncome;
DROP TABLE IF EXISTS #AAIncome;
DROP TABLE IF EXISTS #AAIncomeDedup;
DROP TABLE IF EXISTS #IRCombined