Document Type
Article
Publication Date
10-2022
Publisher
Cambridge University Press
Source Publication
Political Analysis
Source ISSN
1047-1987
Original Item ID
DOI: 10.1017/pan.2021.15
Abstract
Political scientists increasingly use supervised machine learning to code multiple relevant labels from a single set of texts. The current “best practice” of individually applying supervised machine learning to each label ignores information on inter-label association(s), and is likely to underperform as a result. We introduce multi-label prediction as a solution to this problem. After reviewing the multi-label prediction framework, we apply it to code multiple features of (i) access-to-information requests made to the Mexican government and (ii) country-year human rights reports. We find that multi-label prediction outperforms standard supervised learning approaches, even in instances where the correlations among one’s multiple labels are low.
Recommended Citation
Erlich, Aaron; Dantas, Stefano G.; Bagozzi, Benjamin E.; Berliner, Daniel; and Palmer-Rubin, Brian, "Multi-Label Prediction for Political Text-as-Data" (2022). Political Science Faculty Research and Publications. 133.
https://epublications.marquette.edu/polisci_fac/133
Comments
Accepted version. Political Analysis, Vol. 30, No. 4 (October 2022): 463-480. DOI: 10.1017/pan.2021.15.