Module Output
SQL: [IDI_Community].[pent_size_l].size_l_YYYYMM
SAS: libname cmbp ODBC dsn=idi_community_srvprd schema=pent_size_l; proc print data = cmbp.size_l_YYYYMM ; run ;
How to access a code module in the Data Lab : Read here
Context
This module creates a table of permanent enterprises with the size of the business or firm by labour input. The size of the business or firm is essential to data collection, analysing drivers of economic growth and appropriate policy targeting. Several measures of size can be used, including labour input, revenue, and assets. A range of potential measures of labour input are available, and useful for different purposes. Relevant distinctions include:
-
Headcount in which each worker is given a weight of one, regardless of the number of hours worked vs full-time equivalent (FTE) in which workers are weighted to reflect their hours worked e.g., a half-time worker will be weighted at 0.5.
-
Employees, who work for the firm for some form of payment under a contract of service vs Persons engaged (or related terms), which includes the self-employed and working proprietors and may include unpaid workers (e.g., family members contributing to the business but not directly remunerated).
For descriptive analysis, firms are often classified into broad size classes, based on employment or labour input. While there is not a uniform classification system, the following groups are commonly used in New Zealand (Small Business Council 2019):
- sole trader - 0 employees
- micro business - 1 to 5 FTE employees
- small business - 6 to 19 FTE employees
- medium business - 20 to 49 FTE employees
- small and medium enterprise (SME) - 6 to 49 FTE employees
- large business - 50 or more FTE employees
Other countries use different thresholds depending on the size of their economy, relationships to other markets, and the size profile of businesses in that country. For example, the OECD makes the following distinctions:
- Micro enterprise - fewer than 10 employees
- Small enterprise - 10-49 employees
- Medium sized enterprises - 50-249 employees
- Large enterprises - 250+ employees
Operational Definition
The core measure of employment used in the LBD and IDI is derived from pay-as-you-earn (PAYE) tax data submitted by employers to Inland Revenue. This can be supplemented through the IDI using administrative information on shareholders and business owners from annual tax returns including the IR3 personal tax return, the IR7 partnership return, and the IR4S Company Shareholders return.
Researchers commonly use measures of labour input (employees and working proprietors) developed by Fabling and Mare (2016). These are available as [ibuldd_research_datalab].[STATSNZ_RFabling].[pent_year_L_IDI_YYYYMMDD] and [ibuldd_research_datalab].[STATSNZ_RFabling].[pent_pbn_month_L_IDI_YYYYMMDD] (commonly referred to as the Fabling and Mare Labour tables).
The tables are well documented and contain substantial information beyond the key variables listed below. Users should familiarise themselves with the definitions available in Fabling and Mare (2015).
The Labour tables are built from the source datasets above and provide the following key data:
-
Annual tables of labour input at the firm (PENT) level including:
-
RME (rolling mean employment): mean of the monthly headcount of employees over the firm’s financial year, excluding identifiable working proprietors.
-
FTE (full time equivalent): Adjusted version of RME in which headcount measures are adjusted to allow for observable deviations from full time employment (defined here as 40 hours per week). Such deviations occur if an individual’s earnings are below the level implied by the legal minimum wage or if an individual holds more than one job concurrently. Excludes identifiable working proprietors.
-
Working proprietor counts: annual count of working proprietors based on annual tax returns (individuals receiving income as an owner, partner or shareholder) and PAYE records (records where the employer and employee have the same tax identifier). Counts are adjusted for individuals who are WPs at more than one firm during a year, and for an expected labour input of 0.5 in the first and final year of a firm’s operation (eg, where a firm operates in the preceding year but not in the following year it is assumed to have ceased mid-way through the current year).
-
Gross earnings of employees: total wages and salaries paid to employees during the year. PAYE earnings of working proprietors are excluded to be consistent with the RME employment measure above.
-
-
Monthly tables of employment at the PBN level:
- Headcount and FTE-adjusted measures of employment at the plant level, based on Stats NZ’s allocation of workers to geographic locations - see the Business Profile Code Module: GEOGRAPHIC UNIT and Fabling and Mare (2020) for further information and caveats.
References & Contacts
- Fabling, R., & Mare, D. C. (2015). Addressing the absence of hours information in linked employer-employee data. Motu Working Paper 15-17. October 2015.
- Fabling, R., & Sanderson, L. (2016). A Rough Guide to New Zealand’s Longitudinal Business Database 2nd Ed. Motu Working Paper 16-03. February 2016.
- Fabling, R., & Mare, D. C. (2020). Measuring commute patterns over time: Using administrative data to identify where employees live and work. Motu Working Paper 20-05. July 2020. (https://motu-www.motu.org.nz/wpapers/20_05.pdf)
- Small Business Council (2019). Defining Small Business: Recommendations of the New Zealand Small Business Council for the Minister of Small Business.
- OECD (2015), “Enterprises by size”, in Entrepreneurship at a Glance 2015, OECD Publishing, Paris. DOI: (https://doi.org/10.1787/entrepreneur_aag-2015-7-en)
Community of Interest
| Who | Involvement |
|---|---|
| Lynda Sanderson | Lead Subject Matter Expert |
| Joe Bancolita | Initial draft of the code |
| Corey Allen | Contributor |
| Christopher Bell | Contributor |
| Rodney Jer | Contributor |
| Jeroen Kole | Contributor |
| Samuel Mortlock | Contributor |
| Guanyu “Fish” Zheng | Contributor |
Parameters
The following parameters should be supplied to this module to run it in the database:
- IDI_Sandpit: The SQL database on which the module dataset is to be created.
- {targetschema}: The project schema under the target database into which the module dataset is to be created.
- {projprefix}: A (short) prefix that enables you to identify the spell dataset easily in the schema, and prevent overwriting any existing datasets that have the same name.
- {auxdbsrc}: Auxillary database, typically the source of the labour tables, not expected to change but parameterised in case, e.g. “ibuldd_research_datalab”.
- {penttbldate}: Version date of the labour tables to be used in YYYYMMDD, e.g. 202111020
- {module}: The name of this module, e.g. “size_L”
Dependencies
[idi_adhoc].[clean_read_ir].[pent_year_l_idi_$(penttbldate)_rfabling]
[idi_adhoc].[clean_read_ir].[pent_wp_yr_idi_$(penttbldate)_rfabling]
Outputs
{targetdb}.{targetschema}.pent_size_L
Variable Descriptions
| Aspect | Variables | Description |
|---|---|---|
| Entity | pent | The Permanent Enterprise Number of the firm. See Fabling (2011). |
| Period | dim_year_key | Financial year (aligned to closest March year, in YYYY03). |
| Event information | rme | Average monthly employee headcount in dim_year_key (rolling mean employment). |
| fte | Average monthly FTE employment in dim_year_key (RME adjusted for known deviations from full-time employment). | |
| wp | Total WP labour input in dim_year_key (adjusted for observable entry/exit and multi-firm WPs, but not for potential but unobserved exit). | |
| wp_unknown_trans | Total WP count in dim_year_key where the WP transition-based adjustment is not implementable. | |
| wp_count | Total WP labour input count in dim_year_key (not adjusted). |
Module Version & Change History
| Date | Version Comments |
|---|---|
| 01 September 2022 | L Sanderson - v0 Base Code |
| 22 September 2022 | J Bancolita - First review |
| 27 September 2022 | J Bancolita - Modularise |
| 30 September 2022 | J Bancolita - Parameterise |
| 19 January 2024 | T Van Loenhout - Streamlining and documenting for modular code |
| 05 June 2024 | T Van Loenhout - Changing output names and updating source datasets |
Code
/* Set Parameters */
/*PARAMETERS
SQLCMD only (Activate by clicking Query->SQLCMD Mode)
*/
:setvar targetdb "{targetdb}"
:setvar targetschema "{targetschema}"
:setvar penttbldate "{penttbldate}"
:setvar module "{module}"
DROP TABLE IF EXISTS [$(targetdb)].[$(targetschema)].[$(module)];
CREATE TABLE [$(targetdb)].[$(targetschema)].[$(module)] (
pent char(10) NOT NULL
, dim_year_key int NOT NULL
, rme float NOT NULL
, fte float NOT NULL
, wp float NOT NULL
, wp_unknown_trans float NOT NULL
, wp_count float
PRIMARY KEY clustered(pent, dim_year_key));
INSERT INTO [$(targetdb)].[$(targetschema)].[$(module)](
pent
, dim_year_key
, rme
, fte
, wp
, wp_unknown_trans
, wp_count)
SELECT
l.pent
, l.dim_year_key
, l.rme_no_wp as rme
, l.fte
, l.wp
, l.wp_unknown_trans
, w.wp_unadjust as wp_count
FROM [IDI_Adhoc].[clean_read_IR].[pent_year_l_idi_$(penttbldate)_rfabling] l
LEFT JOIN (
SELECT
pent
,dim_year_key
,count(distinct(snz_uid)) as wp_unadjust
FROM [IDI_Adhoc].[clean_read_IR].[pent_WP_yr_IDI_$(penttbldate)_rfabling]
GROUP BY
pent
,dim_year_key
) w ON l.pent = w.pent and l.dim_year_key = w.dim_year_key
;