A provocative new paper argues that US labour market data contains several biases that systematically understate the level of unemployment in the US. The authors find that correcting for the biases results in the US unemployment rate being 2ppt higher on average since 2001.
The Biases to US Labour Data
The Current Population Survey (CPS) is a monthly survey of US households conducted by the Bureau of Labor Statistics (BLS). It is the primary source of labour market statistics for economists, policymakers and investors. According to authors H.J. Ahn and J.D. Hamilton, however, the unemployment rate, labour force participation rate and duration of unemployment calculated from the survey suffer from the following internal inconsistencies:
- Rotation biases arising from disengagement and stigmas associated with being unemployed.
- Missing observations, which are non-random and an increasing source of bias in recent years.
- Number preferences of respondents when providing information on the duration of job search.
- Mismatches between the duration of job search and the labour force status of an individual.
- Inconsistencies between the unemployment hazard rates and reported duration of unemployment.
Fundamentally, these issues imply that if one of the main reported statistics is correct, the other must be incorrect. For example, two equivalent ways of calculating the probability that a person who was unemployed last month and remains unemployed this month are:
- A duration-based measure. This calculates the ratio of individuals who are unemployed in period t with a reported duration greater than 4 weeks to the total number of individuals unemployed at t-1. Across the sample period (2001-2020), the CPS data suggests that this probability amounted to a 68.6% chance of remaining unemployed.
- A flow-based measure. This looks at the subset of individuals who are unemployed in period t-1 and either employed, unemployed or not in the labour force in period t. You can then calculate the number of unemployed-to-unemployed continuations as a fraction of the sum. The authors estimate this probability to be 53.0%.
If all magnitudes were measured accurately, the two estimates would be identical. Instead they differ by more than 15 percentage points, and the authors find similar inconsistencies affecting the unemployment and labour force participation rates. After correcting for these inconsistencies, the paper shows that the unadjusted BLS figures understate the unemployment rate and labour-force participation rate by about two percentage points on average. The authors also find that average unemployment durations are far below those reported by the BLS, and that the BLS figures misrepresent what happened to average durations during the Great Recession.
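The two calculations can be made concrete with a short sketch. The counts below are hypothetical, chosen only so that the ratios reproduce the 68.6% and 53.0% averages quoted above. The function and variable names are illustrative and do not come from the paper.

```python
def duration_based_continuation(unemployed_now_over_4_weeks, unemployed_last_month):
    """Share of last month's unemployed who appear this month as unemployed
    with a reported duration above 4 weeks."""
    return unemployed_now_over_4_weeks / unemployed_last_month


def flow_based_continuation(u_to_e, u_to_u, u_to_n):
    """Share of U->U continuations among all matched U->E, U->U and U->N
    transitions of people who were unemployed last month."""
    return u_to_u / (u_to_e + u_to_u + u_to_n)


# Hypothetical counts, picked to reproduce the reported sample averages.
duration_estimate = duration_based_continuation(686, 1000)
flow_estimate = flow_based_continuation(u_to_e=250, u_to_u=530, u_to_n=220)
gap_ppt = (duration_estimate - flow_estimate) * 100

print(f"duration-based: {duration_estimate:.1%}")
print(f"flow-based:     {flow_estimate:.1%}")
print(f"gap:            {gap_ppt:.1f}ppt")
```

With consistent data the two functions would return the same number, so any gap is a direct measure of the internal inconsistency.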
Below we see a graph highlighting the contributions of the inconsistencies to the higher estimates of the unemployment and labour force participation rates. Correcting for rotation biases only (red) would add half a percentage point to the unemployment rate and 1.1ppt to the labour-force participation rate. The green line shows the effect of also correcting for missing observations. The black line is the BLS figure.
Chart 1: Inconsistencies Contribute to Lower BLS Estimates
Source: Page 59 of “Measuring Labor-Force Participation and the Incidence and Duration of Unemployment”
How Does the CPS Work?
Since 2001, around 60,000 housing units have been contacted each month to complete the survey. An individual who is aged sixteen or over, is not in the armed forces and is not in an institution such as a prison or nursing home is categorised as:
- Employed (E): If during the reference week of the survey month the individual did any work at all for pay.
- Unemployed (U): If during the reference week of the survey month the individual is not E but was available for work and made specific efforts to find employment during the previous 4 weeks.
- Not in the Labour Force (N): If during the reference week of the survey month the individual is neither E nor U.
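The three definitions above amount to a simple decision rule, sketched below. This is a simplification based only on the wording in this article. The actual CPS definitions contain further cases that are not covered here, and the function and argument names are our own.

```python
def classify(worked_for_pay, available_for_work, searched_last_4_weeks):
    """Return 'E', 'U' or 'N' for one respondent in the reference week."""
    if worked_for_pay:
        # Any work at all for pay counts as employed.
        return "E"
    if available_for_work and searched_last_4_weeks:
        # Not employed, but available and actively searching.
        return "U"
    # Everyone else is outside the labour force.
    return "N"


print(classify(worked_for_pay=False, available_for_work=True, searched_last_4_weeks=False))
```

Note how thin the line between U and N is: a single answer about job search in the previous 4 weeks moves a respondent from one category to the other.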
Within any given month, an eighth of the 60,000 qualifying households (7,500) are interviewed for the first time (denoted rotation 1). A further eighth each are interviewed for the second, third and fourth time (rotations 2, 3 and 4). Put differently, a household contacted for the first time is interviewed in four consecutive months.
After four months, the household is not interviewed for the next 8 months. Rotation 5 thus comprises those individuals surveyed one year after their first interview. Rotations 6, 7 and 8 are the final three interviews for a household, which take place in each of the 3 months following rotation 5.
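The interview schedule described above can be written down in a few lines. This is a minimal sketch of the pattern as described in this article; the constant and function names are our own.

```python
SAMPLE_SIZE = 60_000     # approximate monthly sample, from the text above
ROTATION_GROUPS = 8


def months_since_first_interview(rotation):
    """Map a rotation number (1-8) to months elapsed since the first interview."""
    if not 1 <= rotation <= ROTATION_GROUPS:
        raise ValueError("rotation must be between 1 and 8")
    # Rotations 1-4 are consecutive months; rotations 5-8 follow an 8-month break.
    return rotation - 1 if rotation <= 4 else rotation + 7


households_per_rotation = SAMPLE_SIZE // ROTATION_GROUPS
schedule = {r: months_since_first_interview(r) for r in range(1, ROTATION_GROUPS + 1)}

print(households_per_rotation)
print(schedule)
```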
The goal of surveying the same individual over time is to determine how their employment status changes. Since 1994, an individual who is unemployed for two consecutive months has not been asked again for the duration of unemployment in the second month. Instead, the time elapsed since the previous interview is simply added to the previous answer. Thus, new unemployment duration data are only collected in rotations 1 and 5, or in the other rotations for someone who was E, N or missing from the sample the month before.
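The duration rule can be sketched as follows for a respondent who is unemployed in the current month. The function names, the use of None for "no previous interview" and the "M" label for a missing previous month are our own assumptions for illustration.

```python
def needs_fresh_duration(rotation, previous_status):
    """True if the duration question is asked again this month."""
    # Asked in rotations 1 and 5, or whenever the person was not
    # recorded as unemployed ('U') the month before.
    return rotation in (1, 5) or previous_status != "U"


def updated_duration(rotation, previous_status, previous_duration_weeks,
                     weeks_elapsed, fresh_answer_weeks):
    """Recorded unemployment duration (weeks) for someone unemployed this month."""
    if needs_fresh_duration(rotation, previous_status):
        return fresh_answer_weeks
    # Otherwise the previous answer is carried forward.
    return previous_duration_weeks + weeks_elapsed


# Rotation 3, unemployed last month with 10 weeks reported, 4 weeks since then.
print(updated_duration(3, "U", 10, 4, fresh_answer_weeks=None))
```

Because carried-forward durations are mechanical while fresh answers are self-reported, the two kinds of observation need not agree with each other.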
How to Correct the Biases
The paper is the first to:
- Construct a dataset in which all identities relating to stocks and flows are respected. This is done by adding a fourth category – Missing (M). By doing so, all individuals are accounted for, including those whose information is unavailable in a particular rotation or is inconsistent with the information reported for that individual in other rotations.
- Model statistically how people’s answers change the more times they have been interviewed. In effect, they pose a counterfactual question: ‘If a group of households in rotation j in month t were being interviewed for the first time instead of the jth time, how would their answers have been different?’
Combining the two steps allows for a fully reconciled description of stocks and flows in the CPS data. This means that the authors can adjust estimates of the unemployment rates for missing observations on a month-by-month basis, accounting for rotational bias and the other measurement issues described.
The key advantage of their approach is that their data on stocks and flows are internally consistent by construction, always satisfying the accounting identities. It also allows them to document the biases and to be the first to show that they have cyclical features: the biases introduced by missing observations have increased over time and are larger when the labour market is slack.
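To see what "satisfying the accounting identities" means, consider a flow table with the four categories E, U, N and M. The sketch below uses hypothetical counts and is not the authors' dataset. It only illustrates that, once everyone is assigned to one of the four states, last month's stocks are the row sums, this month's stocks are the column sums, and the totals must match.

```python
STATES = ("E", "U", "N", "M")

# flows[a][b] = people in state a last month and state b this month (hypothetical).
flows = {
    "E": {"E": 900, "U": 15, "N": 25, "M": 60},
    "U": {"E": 12, "U": 30, "N": 10, "M": 5},
    "N": {"E": 20, "U": 10, "N": 500, "M": 40},
    "M": {"E": 50, "U": 6, "N": 30, "M": 100},
}


def stock_last_month(flows, state):
    """Last month's stock is the sum of all flows out of that state."""
    return sum(flows[state][dest] for dest in STATES)


def stock_this_month(flows, state):
    """This month's stock is the sum of all flows into that state."""
    return sum(flows[origin][state] for origin in STATES)


stocks_prev = {s: stock_last_month(flows, s) for s in STATES}
stocks_now = {s: stock_this_month(flows, s) for s in STATES}

# With M included, nobody drops out of the accounts between months.
print(stocks_prev, stocks_now)
print(sum(stocks_prev.values()) == sum(stocks_now.values()))
```

Without the M category, people who skip an interview would vanish from the table and the row and column totals would no longer reconcile.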
How Bad is the Problem Without Their Adjustments?
Rotation Bias – The CPS data shows that the average unemployment and labour force participation rates decline sharply as a function of rotation group. Individuals in rotation 1 report, on average, a 6.6% unemployment rate and 65.8% participation rate. Those in rotation 8, however, report an average of 5.7% and 64.2%, respectively. This means that if you follow a fixed group of individuals over time, on average outflows from unemployment seem to exceed inflows. Such biases affect any inferences you draw from the CPS.
Missing observations – Households in the CPS have become less likely to answer surveys or to provide answers to all the questions. Missing observations are not drawn randomly from the overall population, which skews the statistics. The authors find that individuals recorded as missing are more likely than the general population to be unemployed (employed individuals have a 6.4% probability of being missing next month, compared with 8.8% for unemployed individuals). As a result, the unemployment rate among a fixed group of individuals falls notably across interviews. This comes hand-in-hand with an increase in the number of individuals moving from ‘unemployed’ to ‘not in the labour force’.
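A stylised calculation shows why non-random attrition pulls the measured unemployment rate down. The sketch below uses the 6.4% and 8.8% probabilities quoted above, but the labour force of 1,000 people is hypothetical. It also assumes, purely for illustration, that nobody changes status between the two months and it ignores people outside the labour force.

```python
P_MISSING_NEXT_MONTH = {"E": 0.064, "U": 0.088}  # probabilities quoted in the text


def observed_unemployment_rate(employed, unemployed):
    """Unemployment rate among people still observed next month,
    holding everyone's true status fixed (a simplifying assumption)."""
    seen_e = employed * (1 - P_MISSING_NEXT_MONTH["E"])
    seen_u = unemployed * (1 - P_MISSING_NEXT_MONTH["U"])
    return seen_u / (seen_e + seen_u)


employed, unemployed = 940, 60  # hypothetical labour force of 1,000
true_rate = unemployed / (employed + unemployed)
observed_rate = observed_unemployment_rate(employed, unemployed)

print(f"true rate:     {true_rate:.2%}")
print(f"observed rate: {observed_rate:.2%}")
```

Simply dropping missing respondents therefore understates unemployment even if every remaining answer is accurate.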
Number Preference – The authors show significant reporting errors arising from number preference. In terms of unemployment durations, respondents are more likely to report:
- Spells as an integer number of months.
- Longer spells as either 6 months, 1 year, 18 months or longer than 99 weeks.
- Shorter spells as an even number of weeks.
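One simple way to check for the last pattern is to compare the share of even-week answers among short spells with the roughly 50% we would expect without number preference. The sketch below is our own illustration, not the authors' test. The sample of reported durations is hypothetical and the 12-week cut-off for a "short" spell is an assumption.

```python
def even_week_share(durations_weeks, max_weeks=12):
    """Share of short spells (1 to max_weeks) reported as an even number of weeks."""
    short = [d for d in durations_weeks if 1 <= d <= max_weeks]
    if not short:
        raise ValueError("no short spells in the sample")
    return sum(d % 2 == 0 for d in short) / len(short)


# Hypothetical reported durations in weeks.
reported = [1, 2, 2, 3, 4, 4, 4, 6, 6, 8, 8, 8, 10, 12, 12, 26, 52]
share = even_week_share(reported)

print(f"even-week share among short spells: {share:.0%}")
```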
Mismatches between the duration of job search and the labour force status of an individual – The authors consider individuals in rotation 2 who had been counted as not in the labour force in rotation 1. They find that two-thirds of individuals reported as N in month 1 but U in month 2 say they have been looking for a job for more than 4 weeks, even though they did not report actively seeking employment in the previous month. A similar mismatch arises for those moving from E to U, 26% of whom state that they have been looking for a job for longer than 4 weeks. The authors assume, however, that this reflects individuals searching for a new job while still employed rather than misreporting.
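The consistency check behind this finding can be sketched as a rule applied to each pair of consecutive interviews. The records below are hypothetical, and the labels returned by the function are our own shorthand for the cases described above.

```python
def classify_transition(previous_status, current_status, duration_weeks):
    """Label one month-to-month record by whether duration and status agree."""
    if current_status != "U" or duration_weeks <= 4:
        return "consistent"
    if previous_status == "N":
        # Searching for over 4 weeks, yet recorded as not searching last month.
        return "mismatch"
    if previous_status == "E":
        # Treated by the authors as search while still employed.
        return "on-the-job search"
    return "consistent"


# Hypothetical records: (status last month, status this month, reported weeks of search).
records = [("N", "U", 12), ("N", "U", 2), ("N", "U", 30),
           ("E", "U", 8), ("E", "U", 1), ("U", "U", 20)]

n_to_u = [r for r in records if r[0] == "N" and r[1] == "U"]
mismatch_share = sum(classify_transition(*r) == "mismatch" for r in n_to_u) / len(n_to_u)

print(f"share of N-to-U records with a mismatch: {mismatch_share:.0%}")
```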
What Happens After the Correction
To reconcile stocks and flows in the CPS data, the authors need to correct for biases. Bias correction can be established by generating a counterfactual scenario. If we imagine that an interviewer uses a different ‘technology’ in each rotation, and that each technology obtains data that means different things for each rotation, we can pose the following questions:
- If an individual in rotation x had instead been interviewed using the technology used in rotation y, how would their answers have differed?
- Which interview technology should be used as a baseline summary of the data?
The counterfactual scenario is generated statistically, and the authors suggest using ‘the first-interview concept of labour-force status’ (i.e. the answers given in the first interview) as the baseline. They argue that this concept minimises rotation bias, which they attribute to two effects:
- Disengagement – People become less engaged the more times they are interviewed and tend towards answers that they think will end the interview more quickly. The CPS interview is more onerous if the respondent says that they have worked at more than one job. The number of people reporting more than one job drops sharply across rotations. Similar results are found for people becoming ‘retired’ or ‘disabled’ in subsequent interviews due to the abbreviated set of labour-force questions these categories must answer.
- Stigma – Some people may perceive a stigma in reporting to an official government agency that they are continually searching for a job without success. This could interact with the disengagement effect – someone who feels more stigma may become less engaged. It may result in individuals reporting in subsequent interviews that they did not actively search for work even though they did, which would show up as an increase in N and a decrease in U in later rotations.
The CPS also allows another member of the household to report the labour-force status of all adults living there. In this case, unemployment does not fall as quickly across rotations as it does for self-responders. Indeed, the authors find that self-responders account for 50% of total observations but two-thirds of rotation bias.
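The last comparison implies that self-responders contribute far more rotation bias per observation than proxy responders. The arithmetic is sketched below, assuming that proxy responses make up the remaining 50% of observations and the remaining one-third of the bias. That split is implied by the figures above but is not stated explicitly.

```python
from fractions import Fraction

share_of_observations = {"self": Fraction(1, 2), "proxy": Fraction(1, 2)}
share_of_rotation_bias = {"self": Fraction(2, 3), "proxy": Fraction(1, 3)}

# Bias contributed per unit of sample share, for each type of respondent.
relative_intensity = {
    kind: share_of_rotation_bias[kind] / share_of_observations[kind]
    for kind in share_of_observations
}
self_to_proxy_ratio = relative_intensity["self"] / relative_intensity["proxy"]

print(relative_intensity)
print(self_to_proxy_ratio)
```

Under this assumption a self-reported observation carries twice the rotation bias of a proxy-reported one, consistent with the disengagement and stigma explanations.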
Once the authors correct for biases, they no longer need to remove missing observations. They impute a labour-force status for them based on information those individuals provided in other rotations.
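To give a flavour of what imputing from other rotations involves, the sketch below fills each missing month with the status from the nearest rotation in which the individual was observed. This is a deliberately naive stand-in of our own and not the authors' statistical model, which the paper sets out in full. The function name and the tie-breaking rule are assumptions.

```python
def impute_missing(history):
    """Fill 'M' entries with the status from the nearest observed rotation.

    history: list of 'E', 'U', 'N' or 'M', one entry per rotation.
    Ties between an earlier and a later rotation go to the earlier one.
    """
    observed = [(i, s) for i, s in enumerate(history) if s != "M"]
    if not observed:
        # Nothing to borrow from, so leave the history untouched.
        return list(history)
    filled = []
    for i, status in enumerate(history):
        if status != "M":
            filled.append(status)
            continue
        _, nearest = min(observed, key=lambda pair: (abs(pair[0] - i), pair[0]))
        filled.append(nearest)
    return filled


print(impute_missing(["U", "M", "M", "N"]))
```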
Bottom Line
After correcting for biases and missing observations, the unemployment rate and labour-force participation rate are on average 2.0ppt and 2.1ppt higher than the BLS figures, respectively. These can be seen in panels C and D below, where blue represents the estimates from the paper. The main source of error, particularly for the unemployment rate, comes from the misclassification of individuals who are initially categorised as N rather than U.
Panel A is a visual description of the example provided earlier, with the blue line representing the adjusted estimates of the probability that individuals who are unemployed remain so in a subsequent rotation. Panel B, similar to panel C, shows a higher adjusted proportion of people becoming unemployed each month relative to the civilian noninstitutional population (black) and the BLS adjusted series (green).
After adjustments, unemployment durations (captured in panel E) are estimated to be only about 15 weeks, 11 weeks shorter than the BLS reports. The proportion of individuals unemployed for less than 5 weeks is 36.4%, within the range found in other academic work. The adjusted series is also less cyclically variable and shows a smaller change during the Great Recession.
Chart 2: Adjusted Unemployment Rates Are Substantially Higher Than BLS Data
Source: Page 54 of “Measuring Labor-Force Participation and the Incidence and Duration of Unemployment”
Sam van de Schootbrugge is a macro research economist taking a one-year industrial break from his Ph.D. in Economics. He has 2 years of experience working in government and has an MPhil degree in Economic Research from the University of Cambridge. His research expertise is in international finance, macroeconomics and fiscal policy.