Business Profile Code Module: FIRM

tori.vanloenhout
21 March 2024

Module Output

SQL:
[IDI_Community].[pent_profile].profile_YYYYMM
[IDI_Community].[pent_pent2snzuid].pent2snzuid_YYYYMM
SAS:
libname cmbp ODBC dsn=idi_community_srvprd schema=pent_profile; proc print data = cmbp.profile_YYYYMM; run ;
libname cmbp ODBC dsn=idi_community_srvprd schema=pent_pent2snzuid; proc print data = cmbp.pent2snzuid_YYYYMM; run ;
How to access a code module in the Data Lab : Read here

Context

This module creates a table of firm and individual level data to the permanent enterprise (pent) x financial year level. PENT data can be linked to individuals through the Fabling and Mare Labour Tables. See Fabling and Mare (2015) for detailed information and [IDI_Adhoc].[clean_read_IR].[pent_emp_mth_FTE_IDI_yyyymm_RFabling] for the data. These PENT-Individual level tables can be linked to individual-level information from the relevant instance of the IDI to generate measures of personal characteristics at the firm level (eg, ethnic diversity, employment of individuals with a disability etc). Basic characteristics (age, sex) are included in the labour tables.

Operational Definition

In the LBD, firms are identified in the Longitudinal Business Frame (LBF), a longitudinal unwinding of Stats NZ’s Business Register (BR). The BR is in turn kept up to date based on administrative data from Inland Revenue’s Client Register, which covers all businesses with an active tax registration and is used to identify births, deaths, and reactivations. When a firm is added to the BR it is allocated an Enterprise number (enterprise_nbr), which is used as primary key in most LBD tables.

Firms are marked as ceased on the BR if they deregister from IR’s client register, if they deregister for GST or PAYE, if they have no GST sales or purchases for more than 12 months, or no IR10 income for more than 2 years. Firms which cease remain on the BR and can be reactivated if they re-register or become active again. Firms can also be ceased based on deregistration from the Companies Register, held by the Companies Office.

A distinction is made between firms which are “economically significant” and those which are not. The threshold for economic significance is very low - firms with more than three employees or $30,000 GST sales are counted as significant (see Fabling and Sanderson 2014 for additional triggers). This definition has been consistent over the life of the LBD. However, the threshold at which firms are required to register for GST, and hence appear on IR’s Client Register, has risen over time. This threshold currently sits at $60,000 p.a, leading to potential undercounting of economically significant firms in the 30-$60K revenue range.

Key Notes

  1. The unit of observation for the Firm Profile tables is the PENT x financial year. Around 70 percent of firms in New Zealand work to a financial year ending 31 March. Public sector organisations often work to a June 30 year end, while primary sector firms often have a balance date which reflects the seasonal nature of their industry. To create an annual dataset, firm financial year data is aligned to the March year end (dim_year) with the greatest overlap. For example, data for a firm financial year ending June 2022 would be allocated to the 202203 dim_year, while data for a firm financial year ending December 2022 would be allocated to the 202303 dim_year.

  2. Firm structure varies from single plant to multi-level hierarchy. Direct majority ownership links between enterprises are identified in the LBF through the [parent_enterprise_number] and [gte_enterprise_nbr]. The former refers to the immediate parent company and the latter refers to the Group Top Enterprise - the highest enterprise in the parent-subsidiary chain. Within firms, there may be more than one geographic unit (GEO) or Kind of Activity Units (KAUs). These structures are identified on the LBF, with the most commonly used identifier being the Permanent Business Number (PBN), which relates to the plant or establishment level, and is effectively synonymous with the GEO.

  3. Stats NZ’s Business Frame (BF) tracks and identifies legal entities, which has a subtle difference to the economic concept of a firm. For instance, an owner of an unincorporated business may decide to incorporate. In such a case, Stats NZ will assign a new identifier and this event will be recorded in the business demography statistics as the exit of an existing firm and the entry of a new firm. If this “new” business has almost the same economic characteristics, e.g., employees, location, ownership, this may be considered as the same firm from an economic perspective. To support longitudinal analysis, Fabling (2011) develops a methodology for repairing enterprise number breaks based on plant (PBN) level employment. Two enterprises are linked as a permanent enterprise (PENT) if, at a point in time, all employing plants associated with one enterprise number are simultaneously and permanently transferred to another enterprise number. PENT-Enterprise links are available in the IDI_ADHOC tables. See Fabling (2011) for further information.

  4. Measures included in the Firm Profile module have been individually developed and documented in other Modules. The Firm Profile Module contains only basic information about firms - measures of size, age, firm type, and location. For additional detail and measures of firm performance (gross output, productivity, capital) see Fabling and Mare (2019) and updated notes available through the Productivity Hub.

  5. In contrast to individual identifiers in the IDI, Enterprise numbers are consistent across the different iterations (archives) of the LBD. That is, they can be used to link information from one archive to another.

  6. The use of PENTs is important for analysis of firm dynamics (entry, exit, growth) and for analysis of changes in performance or activities within firms over time. Enterprise numbers are reasonable for use in static analyses or analysis of aggregate performance over time.

  7. Plant-level data (GEO, PBN) is limited to information on industry, location and employment. See the lbd_cm_pent_geo module for information on the links between workers and plants.

  8. Firms that have not been allocated an enterprise number by Statistics NZ do not appear on the LBF, even if they appear elsewhere in the data (e.g. exporting or employment data). Consistent with the principle of “keeping everything”, administrative records are held in the database where there is filed data that cannot be linked to firm ids. For instance, individuals importing goods for personal use may be present in the Customs data. As such, the rate of non-linking to the LBF does not necessarily reflect failure in the matching technology.

References & Contacts

  1. Fabling, R (2011). Keeping it Together: Tracking Firms in New Zealand’s Longitudinal Business Database. Motu Working Paper 11-01. March 2011.
  2. Fabling, R., & Mare, D. C. (2015). Addressing the absence of hours information in linked employer-employee data. Motu Working Paper 15-17. October 2015.
  3. Fabling, R., & Sanderson, L. (2016). A Rough Guide to New Zealand’s Longitudinal Business Database 2nd Ed. Motu Working Paper 16-03. February 2016.
  4. Fabling, R., & Mare, D. C. (2019). Improved productivity measurement in New Zealand Longitudinal Business Database. Motu Working Paper 19-03. April 2019.
  5. Productivity Hub. Longitudinal Business Database.

Community of Interest

Who Involvement
Lynda Sanderson Lead Subject Matter Expert
Joe Bancolita Initial draft of the code
Corey Allen Contributor
Christopher Bell Contributor
Rodney Jer Contributor
Jeroen Kole Contributor
Samuel Mortlock Contributor
Guanyu “Fish” Zheng Contributor

Parameters

The following parameters should be supplied to this module to run it in the database:

  1. IDI_Sandpit: The SQL database on which the module dataset is to be created.
  2. {targetschema}: The project schema under the target database into which the module dataset is to be created.
  3. {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
  4. [IDI_Clean_202210]: Source IDI refresh, e.g. “IDI_Clean_202206”
  5. {lbdcleanversion}: The LBD refresh where the module will be based upon, e.g. “ibuldd_clean_archive_202112”.
  6. {auxdbsrc}: Auxillary database, typically the source of the labour tables, not expected to change but parameterised in case, e.g. “ibuldd_research_datalab”.
  7. {penttbldate}: Version date of the labour tables to be used in YYYYMMDD, e.g. 202111020
  8. {module}: The name of this module, e.g. “pent_profile”
  9. {codedb}: Database where the code will be stored, i.e. IDI_UserCode
  10. {codesch}: Schema where the code will be stored, i.e. community_code for the CM

Dependencies

[ibuldd_clean].[dbo].[pent_bal_ind_idi_$(penttbldate)_rfabling] 
[idi_adhoc].[clean_read_ir].[pent_IDI_$(penttbldate)_rfabling]

Outputs

{codedb}.{targetschema}.pent_profile
{codedb}.{targetschema}.pent2snzuid

Variable Descriptions

Profile Output

Aspect Variables Description
Entity pent The Permanent Enterprise Number of the firm. See Fabling (2011)
Period dim_year_key Financial year (aligned to closest March year, in YYYY03)
Event information lbf_birth Lbf birth year
first_employment First employment year
first_gst First GST year
birth_yr The earliest birth date of a pent. If no year, this will be high end dated (9999)
business_type96_code 1996 Statistics New Zealand business type code
inst_sector96_code 1996 Statistics New Zealand institutional sector code
always_pfp Always private for profit indicator
anz06_4d Annual indicator of the firm’s industry
anz06_annual_imputed Annual indicator of imputation of the firm’s industry
anz06_annual_method Annual indicator of method of imputation of the firm’s industry
anz06_perm “Permanent” industry
anz06_perm_method “Permanent” industry method of derivation
nzsioc_lvl3 New Zealand Standard Industry Output Categories based on ANZSIC06. NZSIOC level 3 is the standard for annual statistics for coherence comparing statistics from different sources
pf_ind TBD
sales_ex_gst Difference of monthly GST on GST sales (excluding adjustments) from GST sales (GST-inclusive), aggregated to firm’s financial year
purch_ex_gst Difference of monthly GST on GST purchases (excluding adjustments) from GST purchases (GST-inclusive), aggregated to firm’s financial year
main_activity_gst_exempt Indicator = 1 if firm’s main product of primary activity is gst exempt - finance and real estate services
wp Total WP labour input in the financial year
wp_unknown_trans Total WP count in dim_year_key where the WP transition-based adjustment is not implementable
wp_count Total WP labour input count in dim_year_key (not adjusted)
rme Average monthly employee headcount in the financial year
fte Average monthly FTE employment in the financial year
rme_{rc} rme that have been spread over regional council codes
fte_{rc} fte that have been spread over regional council codes

Linking Output

Aspect Variables Description
Entity snz_uid Individual’s unique identifier (per IDI refresh)
pent T he Permanent Enterprise Number of the firm
pbn_nbr L ocation (plant) of the enterprise where the individual is working during dim_month_key
Period dim_month_key Financial year (aligned to closest March year, in YYYY03)

Module Version & Change History

Date Version Comments
1-Sep-22 L Sanderson - v0 Base Code
22-Sep-22 J Bancolita - First review
27-Sep-22 J Bancolita - Modularise
30-Sep-22 J Bancolita - Parameterise
1-Oct-22 J Bancolita - Revise after component codes
15-Oct-22 J Bancolita - Helpers and Utilities
31-Oct-22 J Bancolita - Revisions from pent_firm
19-Jan-24 T Van Loenhout - Streamlining and documenting for modular code
05 -Jun-24 T Van Loenhout - Changing output names and updating source datasets

Code

/* Set Parameters */

/*PARAMETERS
SQLCMD only (Activate by clicking Query->SQLCMD Mode)
*/

:setvar idicleanversion "{idicleanversion}"
:setvar codedb "{codedb}"
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar penttbldate "{penttbldate}"
:setvar module "{module}"

USE [$(codedb)];

/* For wide format, this section makes sure that all of the gathered pents and balance years are present */
DROP TABLE IF EXISTS [$(targetdb)].[$(targetschema)].[pent_dyk]

CREATE TABLE [$(targetdb)].[$(targetschema)].[pent_dyk] (
pent char(10) not null
, dim_year_key int not null
, primary key clustered(pent, dim_year_key)
);
GO

INSERT INTO [$(targetdb)].[$(targetschema)].[pent_dyk]

	SELECT
	pent
	,dim_year_key 
	FROM [$(targetdb)].[$(targetschema)].[pent_industry]

	UNION

	SELECT 
	pent
	,dim_year_key 
	FROM [$(targetdb)].[$(targetschema)].[pent_size_L]

	UNION

	SELECT 
	pent
	,dim_year_key 
	FROM [$(targetdb)].[$(targetschema)].[pent_size_S]

	UNION

	SELECT
	pent
	,dim_year_key 
	FROM [$(targetdb)].[$(targetschema)].[pent_type]

	UNION

	SELECT 
	pent
	,dim_year_key 
	FROM [$(targetdb)].[$(targetschema)].[pent_geo_year_emp]

	UNION

	SELECT 
	pent
	,dim_year_key = end_date 
	FROM [$(targetdb)].[$(targetschema)].[pent_age]

/* Now create the roll up view */
IF OBJECT_ID('[$(targetschema)].[$(module)]','V') IS NOT NULL
	DROP VIEW [$(targetschema)].[$(module)]
GO

CREATE VIEW [$(targetschema)].[$(module)] AS
SELECT 
	    pent.pent
		, pent.dim_year_key
		, a.lbf_birth
        , a.first_employment
        , a.first_gst
        , birth_yr = a.earliest_data_year
		, bt.business_type96_code
		, bt.inst_sector96_code
		, bt.always_pfp
		, anz06_4d = an_ind.anz06_4d
		, anz06_annual_imputed = an_ind.imputed
		, anz06_annual_method = an_ind.method
		, anz06_perm = bal.anz06_4d
		, anz06_perm_method = bal.anz06_method
		, bal.nzsioc_lvl3
		, bal.pf_ind	
		, s.sales_ex_gst	
		, s.purch_ex_gst
		, s.main_activity_gst_exempt
		, l.rme	
		, l.fte	
		, l.wp	
		, l.wp_unknown_trans 
		, l.wp_count
		,[rme_1]
		,[rme_2]
		,[rme_3]
		,[rme_4]
		,[rme_5]
		,[rme_6]
		,[rme_7]
		,[rme_8]
		,[rme_9]
		,[rme_12]
		,[rme_13]
		,[rme_14]
		,[rme_15]
		,[rme_16]
		,[rme_17]
		,[rme_18]
		,[fte_1]
		,[fte_2]
		,[fte_3]
		,[fte_4]
		,[fte_5]
		,[fte_6]
		,[fte_7]
		,[fte_8]
		,[fte_9]
		,[fte_12]
		,[fte_13]
		,[fte_14]
		,[fte_15]
		,[fte_16]
		,[fte_17]
		,[fte_18]
		FROM [$(targetdb)].[$(targetschema)].[pent_dyk] pent
		LEFT JOIN [ibuldd_clean].[dbo].[pent_bal_ind_IDI_$(penttbldate)_rfabling] bal ON pent.pent = bal.pent
		LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_age] a ON pent.pent = a.pent 
		LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_industry] an_ind ON pent.pent = an_ind.pent AND pent.dim_year_key = an_ind.dim_year_key
		LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_size_L] l ON pent.pent = l.pent AND pent.dim_year_key = l.dim_year_key
		LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_size_S]  s  ON pent.pent = s.pent AND pent.dim_year_key = s.dim_year_key
		LEFT JOIN [$(targetdb)].[$(targetschema)].[pent_type] bt ON pent.pent = bt.pent AND pent.dim_year_key = bt.dim_year_key
		LEFT JOIN [$(codedb)].[$(targetschema)].[pent_geo_year_rme_rc_wide] rme ON pent.pent = rme.pent AND pent.dim_year_key = rme.dim_year_key
		LEFT JOIN [$(codedb)].[$(targetschema)].[pent_geo_year_fte_rc_wide] fte ON pent.pent = fte.pent AND pent.dim_year_key = fte.dim_year_key
GO

/* Linking to individuals */
IF OBJECT_ID('[$(targetschema)].[pent2snzuid]','V') IS NOT NULL
DROP VIEW [$(targetschema)].[pent2snzuid]
GO

CREATE VIEW [$(targetschema)].[pent2snzuid]  AS

SELECT 
snz_uid
, pent.pent
, [pbn_nbr] = ir_ems_pbn_nbr
, [dim_month_key] = CONCAT(YEAR(ir_ems_return_period_date), RIGHT(CONCAT('0',MONTH(ir_ems_return_period_date)),2) )
FROM [$(idicleanversion)].[ir_clean].[ird_ems] ems
INNER JOIN [IDI_Adhoc].[clean_read_IR].[pent_IDI_$(penttbldate)_rfabling] pent on ems.ir_ems_enterprise_nbr = pent.enterprise_nbr	
GO