Module Output
SQL:
[IDI_Community].[pent_profile].profile_YYYYMM
[IDI_Community].[pent_pent2snzuid].pent2snzuid_YYYYMM
SAS:
libname cmbp ODBC dsn=idi_community_srvprd schema=pent_profile; proc print data = cmbp.profile_YYYYMM; run ;
libname cmbp ODBC dsn=idi_community_srvprd schema=pent_pent2snzuid; proc print data = cmbp.pent2snzuid_YYYYMM; run ;
How to access a code module in the Data Lab : Read here
Context
This module creates a table of firm and individual level data to the permanent enterprise (pent) x financial year level. PENT data can be linked to individuals through the Fabling and Mare Labour Tables. See Fabling and Mare (2015) for detailed information and [IDI_Adhoc].[clean_read_IR].[pent_emp_mth_FTE_IDI_yyyymm_RFabling] for the data. These PENT-Individual level tables can be linked to individual-level information from the relevant instance of the IDI to generate measures of personal characteristics at the firm level (eg, ethnic diversity, employment of individuals with a disability etc). Basic characteristics (age, sex) are included in the labour tables.
Operational Definition
In the LBD, firms are identified in the Longitudinal Business Frame (LBF), a longitudinal unwinding of Stats NZ’s Business Register (BR). The BR is in turn kept up to date based on administrative data from Inland Revenue’s Client Register, which covers all businesses with an active tax registration and is used to identify births, deaths, and reactivations. When a firm is added to the BR it is allocated an Enterprise number (enterprise_nbr), which is used as primary key in most LBD tables.
Firms are marked as ceased on the BR if they deregister from IR’s client register, if they deregister for GST or PAYE, if they have no GST sales or purchases for more than 12 months, or no IR10 income for more than 2 years. Firms which cease remain on the BR and can be reactivated if they re-register or become active again. Firms can also be ceased based on deregistration from the Companies Register, held by the Companies Office.
A distinction is made between firms which are “economically significant” and those which are not. The threshold for economic significance is very low - firms with more than three employees or $30,000 GST sales are counted as significant (see Fabling and Sanderson 2014 for additional triggers). This definition has been consistent over the life of the LBD. However, the threshold at which firms are required to register for GST, and hence appear on IR’s Client Register, has risen over time. This threshold currently sits at $60,000 p.a, leading to potential undercounting of economically significant firms in the 30-$60K revenue range.
Key Notes
-
The unit of observation for the Firm Profile tables is the PENT x financial year. Around 70 percent of firms in New Zealand work to a financial year ending 31 March. Public sector organisations often work to a June 30 year end, while primary sector firms often have a balance date which reflects the seasonal nature of their industry. To create an annual dataset, firm financial year data is aligned to the March year end (dim_year) with the greatest overlap. For example, data for a firm financial year ending June 2022 would be allocated to the 202203 dim_year, while data for a firm financial year ending December 2022 would be allocated to the 202303 dim_year.
-
Firm structure varies from single plant to multi-level hierarchy. Direct majority ownership links between enterprises are identified in the LBF through the [parent_enterprise_number] and [gte_enterprise_nbr]. The former refers to the immediate parent company and the latter refers to the Group Top Enterprise - the highest enterprise in the parent-subsidiary chain. Within firms, there may be more than one geographic unit (GEO) or Kind of Activity Units (KAUs). These structures are identified on the LBF, with the most commonly used identifier being the Permanent Business Number (PBN), which relates to the plant or establishment level, and is effectively synonymous with the GEO.
-
Stats NZ’s Business Frame (BF) tracks and identifies legal entities, which has a subtle difference to the economic concept of a firm. For instance, an owner of an unincorporated business may decide to incorporate. In such a case, Stats NZ will assign a new identifier and this event will be recorded in the business demography statistics as the exit of an existing firm and the entry of a new firm. If this “new” business has almost the same economic characteristics, e.g., employees, location, ownership, this may be considered as the same firm from an economic perspective. To support longitudinal analysis, Fabling (2011) develops a methodology for repairing enterprise number breaks based on plant (PBN) level employment. Two enterprises are linked as a permanent enterprise (PENT) if, at a point in time, all employing plants associated with one enterprise number are simultaneously and permanently transferred to another enterprise number. PENT-Enterprise links are available in the IDI_ADHOC tables. See Fabling (2011) for further information.
-
Measures included in the Firm Profile module have been individually developed and documented in other Modules. The Firm Profile Module contains only basic information about firms - measures of size, age, firm type, and location. For additional detail and measures of firm performance (gross output, productivity, capital) see Fabling and Mare (2019) and updated notes available through the Productivity Hub.
-
In contrast to individual identifiers in the IDI, Enterprise numbers are consistent across the different iterations (archives) of the LBD. That is, they can be used to link information from one archive to another.
-
The use of PENTs is important for analysis of firm dynamics (entry, exit, growth) and for analysis of changes in performance or activities within firms over time. Enterprise numbers are reasonable for use in static analyses or analysis of aggregate performance over time.
-
Plant-level data (GEO, PBN) is limited to information on industry, location and employment. See the lbd_cm_pent_geo module for information on the links between workers and plants.
-
Firms that have not been allocated an enterprise number by Statistics NZ do not appear on the LBF, even if they appear elsewhere in the data (e.g. exporting or employment data). Consistent with the principle of “keeping everything”, administrative records are held in the database where there is filed data that cannot be linked to firm ids. For instance, individuals importing goods for personal use may be present in the Customs data. As such, the rate of non-linking to the LBF does not necessarily reflect failure in the matching technology.
References & Contacts
- Fabling, R (2011). Keeping it Together: Tracking Firms in New Zealand’s Longitudinal Business Database. Motu Working Paper 11-01. March 2011.
- Fabling, R., & Mare, D. C. (2015). Addressing the absence of hours information in linked employer-employee data. Motu Working Paper 15-17. October 2015.
- Fabling, R., & Sanderson, L. (2016). A Rough Guide to New Zealand’s Longitudinal Business Database 2nd Ed. Motu Working Paper 16-03. February 2016.
- Fabling, R., & Mare, D. C. (2019). Improved productivity measurement in New Zealand Longitudinal Business Database. Motu Working Paper 19-03. April 2019.
- Productivity Hub. Longitudinal Business Database.
Community of Interest
| Who | Involvement |
|---|---|
| Lynda Sanderson | Lead Subject Matter Expert |
| Joe Bancolita | Initial draft of the code |
| Corey Allen | Contributor |
| Christopher Bell | Contributor |
| Rodney Jer | Contributor |
| Jeroen Kole | Contributor |
| Samuel Mortlock | Contributor |
| Guanyu “Fish” Zheng | Contributor |
Parameters
The following parameters should be supplied to this module to run it in the database:
- IDI_Sandpit: The SQL database on which the module dataset is to be created.
- {targetschema}: The project schema under the target database into which the module dataset is to be created.
- {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
- [IDI_Clean_202210]: Source IDI refresh, e.g. “IDI_Clean_202206”
- {lbdcleanversion}: The LBD refresh where the module will be based upon, e.g. “ibuldd_clean_archive_202112”.
- {auxdbsrc}: Auxillary database, typically the source of the labour tables, not expected to change but parameterised in case, e.g. “ibuldd_research_datalab”.
- {penttbldate}: Version date of the labour tables to be used in YYYYMMDD, e.g. 202111020
- {module}: The name of this module, e.g. “pent_profile”
- {codedb}: Database where the code will be stored, i.e. IDI_UserCode
- {codesch}: Schema where the code will be stored, i.e. community_code for the CM
Dependencies
[ibuldd_clean].[dbo].[pent_bal_ind_idi_$(penttbldate)_rfabling]
[idi_adhoc].[clean_read_ir].[pent_IDI_$(penttbldate)_rfabling]
Outputs
{codedb}.{targetschema}.pent_profile
{codedb}.{targetschema}.pent2snzuid
Variable Descriptions
Profile Output
| Aspect | Variables | Description |
|---|---|---|
| Entity | pent | The Permanent Enterprise Number of the firm. See Fabling (2011) |
| Period | dim_year_key | Financial year (aligned to closest March year, in YYYY03) |
| Event information | lbf_birth | Lbf birth year |
| first_employment | First employment year | |
| first_gst | First GST year | |
| birth_yr | The earliest birth date of a pent. If no year, this will be high end dated (9999) | |
| business_type96_code | 1996 Statistics New Zealand business type code | |
| inst_sector96_code | 1996 Statistics New Zealand institutional sector code | |
| always_pfp | Always private for profit indicator | |
| anz06_4d | Annual indicator of the firm’s industry | |
| anz06_annual_imputed | Annual indicator of imputation of the firm’s industry | |
| anz06_annual_method | Annual indicator of method of imputation of the firm’s industry | |
| anz06_perm | “Permanent” industry | |
| anz06_perm_method | “Permanent” industry method of derivation | |
| nzsioc_lvl3 | New Zealand Standard Industry Output Categories based on ANZSIC06. NZSIOC level 3 is the standard for annual statistics for coherence comparing statistics from different sources | |
| pf_ind | TBD | |
| sales_ex_gst | Difference of monthly GST on GST sales (excluding adjustments) from GST sales (GST-inclusive), aggregated to firm’s financial year | |
| purch_ex_gst | Difference of monthly GST on GST purchases (excluding adjustments) from GST purchases (GST-inclusive), aggregated to firm’s financial year | |
| main_activity_gst_exempt | Indicator = 1 if firm’s main product of primary activity is gst exempt - finance and real estate services | |
| wp | Total WP labour input in the financial year | |
| wp_unknown_trans | Total WP count in dim_year_key where the WP transition-based adjustment is not implementable | |
| wp_count | Total WP labour input count in dim_year_key (not adjusted) | |
| rme | Average monthly employee headcount in the financial year | |
| fte | Average monthly FTE employment in the financial year | |
| rme_{rc} | rme that have been spread over regional council codes | |
| fte_{rc} | fte that have been spread over regional council codes |
Linking Output
| Aspect | Variables | Description |
|---|---|---|
| Entity | snz_uid | Individual’s unique identifier (per IDI refresh) |
| pent T | he Permanent Enterprise Number of the firm | |
| pbn_nbr L | ocation (plant) of the enterprise where the individual is working during dim_month_key | |
| Period | dim_month_key | Financial year (aligned to closest March year, in YYYY03) |
Module Version & Change History
| Date | Version Comments |
|---|---|
| 1-Sep-22 | L Sanderson - v0 Base Code |
| 22-Sep-22 | J Bancolita - First review |
| 27-Sep-22 | J Bancolita - Modularise |
| 30-Sep-22 | J Bancolita - Parameterise |
| 1-Oct-22 | J Bancolita - Revise after component codes |
| 15-Oct-22 | J Bancolita - Helpers and Utilities |
| 31-Oct-22 | J Bancolita - Revisions from pent_firm |
| 19-Jan-24 | T Van Loenhout - Streamlining and documenting for modular code |
| 05 -Jun-24 | T Van Loenhout - Changing output names and updating source datasets |
Code
/* Set Parameters */
/*PARAMETERS
SQLCMD only (Activate by clicking Query->SQLCMD Mode)
*/
:setvar idicleanversion "{idicleanversion}"
:setvar codedb "{codedb}"
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar penttbldate "{penttbldate}"
:setvar module "{module}"
USE [$(codedb)];
/* For wide format, this section makes sure that all of the gathered pents and balance years are present */
DROP TABLE IF EXISTS [$(targetdb)].[$(targetschema)].[pent_dyk]
CREATE TABLE [$(targetdb)].[$(targetschema)].[pent_dyk] (
pent char(10) not null
, dim_year_key int not null
, primary key clustered(pent, dim_year_key)
);
GO
INSERT INTO [$(targetdb)].[$(targetschema)].[pent_dyk]
SELECT
pent
,dim_year_key
FROM [$(targetdb)].[$(targetschema)].[pent_industry]
UNION
SELECT
pent
,dim_year_key
FROM [$(targetdb)].[$(targetschema)].[pent_size_L]
UNION
SELECT
pent
,dim_year_key
FROM [$(targetdb)].[$(targetschema)].[pent_size_S]
UNION
SELECT
pent
,dim_year_key
FROM [$(targetdb)].[$(targetschema)].[pent_type]
UNION
SELECT
pent
,dim_year_key
FROM [$(targetdb)].[$(targetschema)].[pent_geo_year_emp]
UNION
SELECT
pent
,dim_year_key = end_date
FROM [$(targetdb)].[$(targetschema)].[pent_age]
/* Now create the roll up view */
IF OBJECT_ID('[$(targetschema)].[$(module)]','V') IS NOT NULL
DROP VIEW [$(targetschema)].[$(module)]
GO
CREATE VIEW [$(targetschema)].[$(module)] AS
SELECT
pent.pent
, pent.dim_year_key
, a.lbf_birth
, a.first_employment
, a.first_gst
, birth_yr = a.earliest_data_year
, bt.business_type96_code
, bt.inst_sector96_code
, bt.always_pfp
, anz06_4d = an_ind.anz06_4d
, anz06_annual_imputed = an_ind.imputed
, anz06_annual_method = an_ind.method
, anz06_perm = bal.anz06_4d
, anz06_perm_method = bal.anz06_method
, bal.nzsioc_lvl3
, bal.pf_ind
, s.sales_ex_gst
, s.purch_ex_gst
, s.main_activity_gst_exempt
, l.rme
, l.fte
, l.wp
, l.wp_unknown_trans
, l.wp_count
,[rme_1]
,[rme_2]
,[rme_3]
,[rme_4]
,[rme_5]
,[rme_6]
,[rme_7]
,[rme_8]
,[rme_9]
,[rme_12]
,[rme_13]
,[rme_14]
,[rme_15]
,[rme_16]
,[rme_17]
,[rme_18]
,[fte_1]
,[fte_2]
,[fte_3]
,[fte_4]
,[fte_5]
,[fte_6]
,[fte_7]
,[fte_8]
,[fte_9]
,[fte_12]
,[fte_13]
,[fte_14]
,[fte_15]
,[fte_16]
,[fte_17]
,[fte_18]
FROM [$(targetdb)].[$(targetschema)].[pent_dyk] pent
LEFT JOIN [ibuldd_clean].[dbo].[pent_bal_ind_IDI_$(penttbldate)_rfabling] bal ON pent.pent = bal.pent
LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_age] a ON pent.pent = a.pent
LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_industry] an_ind ON pent.pent = an_ind.pent AND pent.dim_year_key = an_ind.dim_year_key
LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_size_L] l ON pent.pent = l.pent AND pent.dim_year_key = l.dim_year_key
LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_size_S] s ON pent.pent = s.pent AND pent.dim_year_key = s.dim_year_key
LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_type] bt ON pent.pent = bt.pent AND pent.dim_year_key = bt.dim_year_key
LEFT JOIN [$(codedb)].[$(targetschema)].[pent_geo_year_rme_rc_wide] rme ON pent.pent = rme.pent AND pent.dim_year_key = rme.dim_year_key
LEFT JOIN [$(codedb)].[$(targetschema)].[pent_geo_year_fte_rc_wide] fte ON pent.pent = fte.pent AND pent.dim_year_key = fte.dim_year_key
GO
/* Linking to individuals */
IF OBJECT_ID('[$(targetschema)].[pent2snzuid]','V') IS NOT NULL
DROP VIEW [$(targetschema)].[pent2snzuid]
GO
CREATE VIEW [$(targetschema)].[pent2snzuid] AS
SELECT
snz_uid
, pent.pent
, [pbn_nbr] = ir_ems_pbn_nbr
, [dim_month_key] = CONCAT(YEAR(ir_ems_return_period_date), RIGHT(CONCAT('0',MONTH(ir_ems_return_period_date)),2) )
FROM [$(idicleanversion)].[ir_clean].[ird_ems] ems
INNER JOIN [IDI_Adhoc].[clean_read_IR].[pent_IDI_$(penttbldate)_rfabling] pent on ems.ir_ems_enterprise_nbr = pent.enterprise_nbr
GO