Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning
DOI: 10.29412/res.wp.2022.11
Other versions of this item:
- John M. Abowd & Joelle Abramowitz & Margaret C. Levenstein & Kristin McCue & Dhiren Patki & Trivellore Raghunathan & Ann M. Rodgers & Matthew D. Shapiro & Nada Wasi & Dawn Zinsser, 2021. "Finding Needles in Haystacks: Multiple-Imputation Record Linkage Using Machine Learning," Working Papers 21-35, Center for Economic Studies, U.S. Census Bureau.
References listed on IDEAS
- P. Lahiri & Michael D. Larsen, 2005. "Regression Analysis With Linked Data," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 222-230, March.
- Brown, Charles & Medoff, James, 1989.
"The Employer Size-Wage Effect,"
Journal of Political Economy, University of Chicago Press, vol. 97(5), pages 1027-1059, October.
- Charles Brown & James L. Medoff, 1989. "The Employer Size-Wage Effect," NBER Working Papers 2870, National Bureau of Economic Research, Inc.
- John M. Abowd & Bryce E. Stephens & Lars Vilhuber & Fredrik Andersson & Kevin L. McKinney & Marc Roemer & Simon Woodcock, 2009.
"The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators,"
NBER Chapters, in: Producer Dynamics: New Evidence from Micro Data, pages 149-230,
National Bureau of Economic Research, Inc.
- John M. Abowd & Bryce E. Stephens & Lars Vilhuber & Fredrik Andersson & Kevin L. McKinney & Marc Roemer & Simon Woodcock, 2002. "The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators," Longitudinal Employer-Household Dynamics Technical Papers 2002-05, Center for Economic Studies, U.S. Census Bureau.
- John Abowd & Bryce Stephens & Lars Vilhuber & Fredrik Andersson & Kevin L. McKinney & Marc Roemer & Simon Woodcock, 2006. "The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators," Longitudinal Employer-Household Dynamics Technical Papers 2006-01, Center for Economic Studies, U.S. Census Bureau.
- Roee Gutman & Christopher C. Afendulis & Alan M. Zaslavsky, 2013. "A Bayesian Procedure for File Linking to Analyze End-of-Life Medical Costs," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(501), pages 34-47, March.
- John M. Abowd & Martha H. Stinson, 2013. "Estimating Measurement Error in Annual Job Earnings: A Comparison of Survey and Administrative Data," The Review of Economics and Statistics, MIT Press, vol. 95(5), pages 1451-1467, December.
- Oi, Walter Y. & Idson, Todd L., 1999. "Firm size and wages," Handbook of Labor Economics, in: O. Ashenfelter & D. Card (ed.), Handbook of Labor Economics, edition 1, volume 3, chapter 33, pages 2165-2214, Elsevier.
- Rebecca C. Steorts & Rob Hall & Stephen E. Fienberg, 2016. "A Bayesian Approach to Graphical Record Linkage and Deduplication," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1660-1672, October.
- Martha J. Bailey & Connor Cole & Morgan Henderson & Catherine Massey, 2020.
"How Well Do Automated Linking Methods Perform? Lessons from US Historical Data,"
Journal of Economic Literature, American Economic Association, vol. 58(4), pages 997-1044, December.
- Martha Bailey & Connor Cole & Morgan Henderson & Catherine Massey, 2017. "How Well Do Automated Linking Methods Perform? Lessons from U.S. Historical Data," NBER Working Papers 24019, National Bureau of Economic Research, Inc.
- Nicholas Bloom & Fatih Guvenen & Benjamin S. Smith & Jae Song & Till von Wachter, 2018. "The Disappearing Large-Firm Wage Premium," AEA Papers and Proceedings, American Economic Association, vol. 108, pages 317-322, May.
- Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
- Timothy Dunne & J. Bradford Jensen & Mark J. Roberts, 2009. "Producer Dynamics: New Evidence from Micro Data," NBER Books, National Bureau of Economic Research, Inc, number dunn05-1, February.
- Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
Citations
Cited by:
- Pablo Ottonello & Wenting Song & Sebastian Sotelo, 2024.
"An Anatomy of Firms’ Political Speech,"
Staff Working Papers
24-37, Bank of Canada.
- Pablo Ottonello & Wenting Song & Sebastian Sotelo, 2024. "An Anatomy of Firms’ Political Speech," NBER Working Papers 32923, National Bureau of Economic Research, Inc.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- John M. Abowd & Joelle Abramowitz & Margaret C. Levenstein & Kristin McCue & Dhiren Patki & Trivellore Raghunathan & Ann M. Rodgers & Matthew D. Shapiro & Nada Wasi, 2019. "Optimal Probabilistic Record Linkage: Best Practice for Linking Employers in Survey and Administrative Data," Working Papers 19-08, Center for Economic Studies, U.S. Census Bureau.
- Nicholas Bloom & Scott W. Ohlmacher & Cristina J. Tello-Trillo & Melanie Wallskog, 2021.
"Pay, Productivity and Management,"
NBER Working Papers
29377, National Bureau of Economic Research, Inc.
- Nicholas Bloom & Scott W. Ohlmacher & Cristina J. Tello-Trillo & Melanie Wallskog, 2022. "Pay, productivity and management," POID Working Papers 032, Centre for Economic Performance, LSE.
- Nicholas Bloom & Scott W. Ohlmacher & Cristina J. Tello-Trillo & Melanie Wallskog, 2022. "Pay, productivity and management," CEP Discussion Papers dp1846, Centre for Economic Performance, LSE.
- Nicholas Bloom & Scott Ohlmacher & Cristina Tello-Trillo & Melanie Wallskog, 2021. "Pay, Productivity and Management," Working Papers 21-31, Center for Economic Studies, U.S. Census Bureau.
- Bloom, Nicholas & Ohlmacher, Scott W. & Tello-Trillo, Cristina J. & Wallskog, Melanie, 2022. "Pay, productivity and management," LSE Research Online Documents on Economics 117854, London School of Economics and Political Science, LSE Library.
- Henry Hyatt & Erika McEntarfer & John Haltiwanger, 2014. "Cyclical Reallocation of Workers Across Large and Small Employers," 2014 Meeting Papers 735, Society for Economic Dynamics.
- Brianna Cardiff-Hicks & Francine Lafontaine & Kathryn Shaw, 2015.
"Do Large Modern Retailers Pay Premium Wages?,"
ILR Review, Cornell University, ILR School, vol. 68(3), pages 633-665, May.
- Brianna Cardiff-Hicks & Francine Lafontaine & Kathryn Shaw, 2014. "Do Large Modern Retailers Pay Premium Wages?," NBER Working Papers 20313, National Bureau of Economic Research, Inc.
- Babina, Tania & Ma, Wenting & Moser, Christian & Ouimet, Paige & Zarutskie, Rebecca, 2019.
"Pay, Employment, and Dynamics of Young Firms,"
MPRA Paper
95382, University Library of Munich, Germany.
- Tania Babina & Wenting Ma & Christian Moser & Paige Ouimet & Rebecca Zarutskie, 2019. "Pay, Employment, and Dynamics of Young Firms," Working Papers 19-23, Center for Economic Studies, U.S. Census Bureau.
- Tania Babina & Wenting Ma & Christian Moser & Paige P. Ouimet & Rebecca Zarutskie, 2019. "Pay, Employment, and Dynamics of Young Firms," Opportunity and Inclusive Growth Institute Working Papers 21, Federal Reserve Bank of Minneapolis.
- Emin Dinlersoz & Henry Hyatt & Hubert Janicki, 2019.
"Who Works for Whom? Worker Sorting in a Model of Entrepreneurship with Heterogeneous Labor Markets,"
Review of Economic Dynamics, Elsevier for the Society for Economic Dynamics, vol. 34, pages 244-266, October.
- Hubert Janicki & Henry Hyatt & Emin Dinlersoz, 2015. "Who Works for Whom? Worker Sorting in a Model of Entrepreneurship with Heterogeneous Labor Markets," 2015 Meeting Papers 104, Society for Economic Dynamics.
- Dinlersoz, Emin & Hyatt, Henry R. & Janicki, Hubert P., 2016. "Who Works for Whom? Worker Sorting in a Model of Entrepreneurship with Heterogeneous Labor Markets," IZA Discussion Papers 9693, Institute of Labor Economics (IZA).
- Emin M. Dinlersoz & Henry R. Hyatt & Hubert P. Janicki, 2015. "Who Works for Whom? Worker Sorting in a Model of Entrepreneurship with Heterogeneous Labor Markets," Working Papers 15-08, Center for Economic Studies, U.S. Census Bureau.
- Ouimet, Paige & Zarutskie, Rebecca, 2014.
"Who works for startups? The relation between firm age, employee age, and growth,"
Journal of Financial Economics, Elsevier, vol. 112(3), pages 386-407.
- Paige Ouimet & Rebecca Zarutskie, 2011. "Who Works for Startups? The Relation between Firm Age, Employee Age, and Growth," Working Papers 11-31, Center for Economic Studies, U.S. Census Bureau.
- Paige P. Ouimet & Rebecca Zarutskie, 2013. "Who works for startups? The relation between firm age, employee age, and growth," Finance and Economics Discussion Series 2013-75, Board of Governors of the Federal Reserve System (U.S.).
- Melanie Jones & Ezgi Kaya, 2023.
"The UK gender pay gap: Does firm size matter?,"
Economica, London School of Economics and Political Science, vol. 90(359), pages 937-952, July.
- Jones, Melanie & Kaya, Ezgi, 2022. "The UK Gender Pay Gap: Does Firm Size Matter?," GLO Discussion Paper Series 1149, Global Labor Organization (GLO).
- Egger, Hartmut & Jahn, Elke & Kornitzky, Stefan, 2022.
"How does the position in business group hierarchies affect workers’ wages?,"
Journal of Economic Behavior & Organization, Elsevier, vol. 194(C), pages 244-263.
- Egger, Hartmut & Jahn, Elke J. & Kornitzky, Stefan, 2021. "How Does the Position in Business Group Hierarchies Affect Workers' Wages?," IZA Discussion Papers 14956, Institute of Labor Economics (IZA).
- Jaime Arellano-Bover, 2024.
"Career Consequences of Firm Heterogeneity for Young Workers: First Job and Firm Size,"
Journal of Labor Economics, University of Chicago Press, vol. 42(2), pages 549-589.
- Arellano-Bover, Jaime, 2020. "Career Consequences of Firm Heterogeneity for Young Workers: First Job and Firm Size," IZA Discussion Papers 12969, Institute of Labor Economics (IZA).
- Hartmut Egger & Elke Jahn & Stefan Kornitzky, 2021. "How Does the Position in Business Group Hierarchies Affect Workers’ Wages?," Working Papers 213, Bavarian Graduate Program in Economics (BGPE).
- Jahn, Elke & Egger, Hartmut & Kornitzky, Stefan, 2021. "Does the Position in Business Group Hierarchies Affect Workers' Wages?," VfS Annual Conference 2021 (Virtual Conference): Climate Economics 242374, Verein für Socialpolitik / German Economic Association.
- Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
- Carstensen, Kai & Heinrich, Markus & Reif, Magnus & Wolters, Maik H., 2020.
"Predicting ordinary and severe recessions with a three-state Markov-switching dynamic factor model,"
International Journal of Forecasting, Elsevier, vol. 36(3), pages 829-850.
- Heinrich, Markus & Carstensen, Kai & Reif, Magnus & Wolters, Maik, 2017. "Predicting Ordinary and Severe Recessions with a Three-State Markov-Switching Dynamic Factor Model. An Application to the German Business Cycle," VfS Annual Conference 2017 (Vienna): Alternative Structures for Money and Banking 168206, Verein für Socialpolitik / German Economic Association.
- Kai Carstensen & Markus Heinrich & Magnus Reif & Maik H. Wolters, 2017. "Predicting Ordinary and Severe Recessions with a Three-State Markov-Switching Dynamic Factor Model. An Application to the German Business Cycle," CESifo Working Paper Series 6457, CESifo.
- Kai Carstensen & Markus Heinrich & Magnus Reif & Maik H. Wolters, 2019. "Predicting Ordinary and Severe Recessions with a Three-State Markov-Switching Dynamic Factor Model," Jena Economics Research Papers 2019-006, Friedrich-Schiller-University Jena.
- Hou-Tai Chang & Ping-Huai Wang & Wei-Fang Chen & Chen-Ju Lin, 2022. "Risk Assessment of Early Lung Cancer with LDCT and Health Examinations," IJERPH, MDPI, vol. 19(8), pages 1-12, April.
- Wang, Qiao & Zhou, Wei & Cheng, Yonggang & Ma, Gang & Chang, Xiaolin & Miao, Yu & Chen, E, 2018. "Regularized moving least-square method and regularized improved interpolating moving least-square method with nonsingular moment matrices," Applied Mathematics and Computation, Elsevier, vol. 325(C), pages 120-145.
- Kelly D. Edmiston, 2007. "The role of small and large businesses in economic development," Economic Review, Federal Reserve Bank of Kansas City, vol. 92(Q II), pages 73-97.
- Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
- Lucian Belascu & Alexandra Horobet & Georgiana Vrinceanu & Consuela Popescu, 2021. "Performance Dissimilarities in European Union Manufacturing: The Effect of Ownership and Technological Intensity," Sustainability, MDPI, vol. 13(18), pages 1-19, September.
- Candelon, B. & Hurlin, C. & Tokpavi, S., 2012.
"Sampling error and double shrinkage estimation of minimum variance portfolios,"
Journal of Empirical Finance, Elsevier, vol. 19(4), pages 511-527.
- Candelon, B. & Hurlin, C. & Tokpavi, S., 2011. "Sampling error and double shrinkage estimation of minimum variance portfolios," Research Memorandum 002, Maastricht University, Maastricht Research School of Economics of Technology and Organization (METEOR).
- Bertrand Candelon & Christophe Hurlin & Sessi Tokpavi, 2012. "Sampling Error and Double Shrinkage Estimation of Minimum Variance Portfolios," Post-Print hal-01385835, HAL.
More about this item
Keywords
administrative data; machine learning; multiple imputation; probabilistic record linkage; survey data
JEL classification:
- C13 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Estimation: General
- C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
- C81 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Methodology for Collecting, Estimating, and Organizing Microeconomic Data; Data Access
NEP fields
This paper has been announced in the following NEP Reports:
- NEP-BIG-2022-10-31 (Big Data)
- NEP-CMP-2022-10-31 (Computational Economics)
Corrections
All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:fip:fedbwp:94891. See general information about how to correct material in RePEc.