The Index of Major Literary Prizes in the US
The Index of Major Literary Prizes in the US includes two related datasets:
- The first is a dataset of the winners and judges of prizes for prose, poetry, or unspecified genre between 1918 and 2020 with a purse of $10,000 and over.
- The second contains records for volumes in the HathiTrust Digital Library written by authors who won a prize in the prize winners dataset.
1. Major Literary Prize Winners and Judges
This dataset includes information about the winners and judges of literary prizes (for prose, poetry, or unspecified genres) between 1918 and 2020 with a purse of $10,000 and over. The dataset includes details about the winners of 52 unique prizes awarded by 22 institutions. For a subset of 39 prizes, it includes details about judges; not every prize has complete judge data. The dataset does not include prizes awarded specifically for children’s literature, nonfiction, drama, or translation.
Data Table
The details about winners and judges include information about their gender and education (whether and where they attended college, MFA programs, or other graduate programs). This information was collected by hand and is described in more detail below.
The dataset also includes persistent identifiers for authors, such as VIAF, LCCN, and Wikidata numbers.
Collection and Creation
The data about prizes and winner/judge demographics was collected by hand mainly from institutional websites. Gender and higher education data for individuals was collected from author biographies, interviews, and other materials. Some information about judges not listed on websites was obtained through correspondence with institutions. Claire Grossman, Juliana Spahr, and Stephanie Young are the principal investigators, did the majority of the data gathering, and are responsible for any errors. They were assisted by Jennifer Chukwu, Clare Lilliston, Jordan Pruett, Esther Vinarov, and Betty He. Richard Jean So provided significant support for this project.
Gender information was provisionally labeled by the research team based on the pronouns used by the author in biographical notes at the time the research was completed. It is possible that a judge’s or winner’s gender identity and/or pronouns have changed since then. This information is intended to enable the study of broad patterns over time, not to serve as a definitive statement on any individual’s identity. The possible gender values are “male,” “female,” “nonbinary/he,” “nonbinary/they,” “unknown,” and “No Winner”; nonbinary was used only when the term appeared in the individual’s biography.
Higher education information was labeled by the research team based on whether the individual mentioned that they attended (even if they did not graduate from) an institution. Again, this information is intended to enable the study of broad patterns over time and is not meant to be definitive. The possible MFA degree values are the name of the institution, “No Winner,” or blank. In most cases, a blank means it is unlikely that the individual attended an MFA program, either because higher education affiliations were listed in biographical notes but did not include an MFA, or because the team was unable to locate any educational information about the individual.
The possible “elite education” values are “Barnard College,” “Brown University,” “Columbia University,” “Cornell University,” “Dartmouth College,” “Harvard University,” “Princeton University,” “Radcliffe College,” “Stanford University,” “University of Pennsylvania,” “University of Chicago,” “Yale University,” “No Winner,” or blank. The possible “graduate degree” values (including masters, PhD, JD, and medical degrees) are “graduate,” “No Winner,” or blank.
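Because the education columns mix a recorded value, the “No Winner” placeholder, and blanks, it may help to separate the three cases before counting anything. The following is a minimal sketch, assuming each cell has been read as a string (or as `None` when empty); the function name and the three labels are hypothetical and are not part of the dataset.

```python
NO_WINNER = "No Winner"  # placeholder value documented for years without a winner


def education_status(value):
    """Classify one education cell.

    Returns "no_winner" for the documented placeholder, "not_recorded" for a
    blank cell, and "recorded" for anything else (for example an institution
    name or the value "graduate").
    """
    text = (value or "").strip()
    if text == NO_WINNER:
        return "no_winner"
    if text == "":
        return "not_recorded"
    return "recorded"
```

A blank cell is labeled "not_recorded" rather than "no" on purpose: as described above, a blank means attendance is unlikely or that no educational information could be located, not that it has been ruled out.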
At a later stage, persistent identifiers for winners and judges, such as VIAF, LCCN, and Wikidata identifiers, were added computationally by Matt Miller.
Please report any errors and/or corrections via this Google Form.
Description
The columns in the dataset include:
person_id: unique numeric identifier for each name; assigned alphabetically by first name
full_name: pen names were used; in case of name change, most recent name was used
given_name: first name; includes middle name, if used
last_name: last name
gender: provisionally labeled by the research team based on the pronouns used by the author in biographical notes at the time the research was completed; it is possible that a judge’s or winner’s gender identity and/or pronouns have changed since then; intended for the study of broad patterns over time, not as a definitive statement on any individual’s identity; values are “male,” “female,” “nonbinary/he,” “nonbinary/they,” “unknown,” and “No Winner”; nonbinary was used only when the term appeared in the individual’s biography
elite_institution: individual mentioned they attended (even if they did not graduate from) one of the listed institutions; values are “Barnard College,” “Brown University,” “Columbia University,” “Cornell University,” “Dartmouth College,” “Harvard University,” “Princeton University,” “Radcliffe College,” “Stanford University,” “University of Pennsylvania,” “University of Chicago,” “Yale University,” “No Winner,” or blank (blank means attendance is unlikely, either because the individual listed higher education affiliations in biographical notes that did not include an elite institution or because the team was unable to locate any educational information about the individual); intended for the study of broad patterns over time, not as definitive
graduate_degree: individual mentioned they attended (even if they did not graduate from) a graduate program (includes masters, PhD, JD, and medical degrees); values are “graduate,” “No Winner,” or blank (blank means attendance is unlikely, either because the individual listed higher education affiliations in biographical notes that did not include a graduate degree or because the team was unable to locate any educational information about the individual); intended for the study of broad patterns over time, not as definitive
mfa_degree: individual mentioned they attended (even if they did not graduate from) an MFA program; values are the name of the institution, “No Winner,” or blank (blank means attendance is unlikely, either because the individual listed higher education affiliations in biographical notes that did not include an MFA or because the team was unable to locate any educational information about the individual); intended for the study of broad patterns over time, not as definitive
iowa_mfa_person_id: values are either a number that corresponds to the Post45 Iowa Writers’ Workshop “People” table, “missing” (means that the individual’s biographical materials suggest they attended Iowa for an MFA but a corresponding entry could not be found in the Iowa dataset which ends in 2014 and does not include graduates of the MFA in playwriting), “unknown” (unable to locate any educational information about the individual), “No Winner,” or blank (means that the individual did not list University of Iowa in their biographical notes or unable to locate any educational information about the individual)
stegner: individual mentioned they were awarded a Wallace Stegner Fellowship at Stanford; the Stegner program does not award degrees but it resembles an MFA program in pedagogy except it is not unusual for those admitted to already have an MFA; we thus treat it as the equivalent of an MFA (and not a prize); values are either “Stegner,” “No Winner,” or blank (means that the individual did not mention the Stegner Fellowship in their biographical notes or unable to locate any educational information about the individual)
role: values are “winner” or “judge”
prize_institution: nonprofit organization that oversees the prize
prize_name: name of prize; for the Gold Medal Awards from the American Academy of Arts and Letters, we only included awards categorized as fiction and poetry; for the Morton Dauwen Zabel Award from the American Academy of Arts and Letters, we excluded periodic awards given specifically for “Criticism”; for the National Book Award, we only included prizes for poetry and fiction; for the Academy of American Poets, we only included the Academy of American Poets Fellowship, the Lenore Marshall Poetry Prize, and the Wallace Stevens Award; for the Poet Laureate Consultant in Poetry to the Library of Congress, we included the US Consultants in Poetry but did not include the three Special Bicentennial Consultants who served in an advisory role from 1999-2000, and we excluded William Carlos Williams (who was named as Laureate but did not serve); for the Pulitzer Prize, we only included prizes for fiction and poetry; for the MacArthur Fellowships, we included those who were categorized by the MacArthur website as “poetry” and most of those categorized as “fiction and nonfiction” (if a writer exclusively published journalistic nonfiction or essays, they were not included)
prize_year: year awarded; in the case of the Poet Laureate Consultant in Poetry to the Library of Congress, whose term begins in September and continues until May, we included entries for the Laureate under both years
prize_genre: values are “poetry,” “prose” (“prose” includes prizes for “short stories,” “essays,” “fiction,” and “novel”), and “no genre” (prize has no genre requirement, as in the MacArthur Fellowship or the Whiting Award)
prize_type: values are “career” (prize is awarded to author on basis of overall career) or “book” (prize is awarded to author for a specific book)
prize_amount: value here is the amount of money awarded in 2022; amounts change over time, which we do not track
title_of_winning_book: if “prize_type” is “book,” then the awarded book title is listed (if the jury awarded more than one book in same year, titles for both are listed); other values are “No Winner,” and blank (prize was not awarded for a specific book)
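The column descriptions above have a few practical consequences for counting: “No Winner” rows are placeholders rather than people, a Poet Laureate term appears under two years, and the Stegner Fellowship is treated as the equivalent of an MFA. The sketch below shows one way these could be handled, assuming the table has been exported as CSV with the column names listed above. The sample rows, `SAMPLE_CSV`, and both helper functions are hypothetical and for illustration only; none of the values are taken from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical rows for illustration only; not taken from the dataset.
SAMPLE_CSV = """person_id,full_name,gender,mfa_degree,stegner,role,prize_name,prize_year
1,Person A,female,,Stegner,winner,Example Prize,2001
1,Person A,female,,Stegner,winner,Example Prize,2002
2,Person B,male,Example University,,winner,Example Prize,2003
3,Person C,female,,,judge,Example Prize,2003
,No Winner,No Winner,No Winner,No Winner,winner,Example Prize,2004
"""


def has_mfa_or_stegner(row):
    """True if the row lists an MFA institution or a Stegner Fellowship."""
    mfa = row["mfa_degree"].strip()
    return mfa not in ("", "No Winner") or row["stegner"].strip() == "Stegner"


def unique_winners(rows):
    """Keep one row per (person, prize), skipping judges and 'No Winner' rows."""
    seen = {}
    for row in rows:
        if row["role"] != "winner" or row["gender"] == "No Winner":
            continue
        # A person listed under two years for the same prize is counted once.
        seen.setdefault((row["person_id"], row["prize_name"]), row)
    return list(seen.values())


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
winners = unique_winners(rows)
gender_counts = Counter(row["gender"] for row in winners)
mfa_like = sum(has_mfa_or_stegner(row) for row in winners)
```

Deduplicating on person and prize also collapses genuine repeat wins of the same prize, so it suits per-person questions better than per-award ones; for per-award counts, the year would need to stay in the key.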
This dataset also includes various persistent identifiers:
- author_lccn – Author’s LCCN from id.loc.gov
- author_viaf – Author’s viaf.org cluster number
- author_wikidata – Author’s Wikidata Q number
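The identifier columns can be turned into links for spot-checking a record. This is a minimal sketch under two assumptions that the dataset documentation does not confirm: that the stored values can be dropped directly into the public URL patterns for id.loc.gov name authorities, viaf.org, and wikidata.org, and that missing identifiers are blank. `URL_TEMPLATES`, `identifier_urls`, and the values in `example_row` are hypothetical placeholders, not real identifiers.

```python
# Assumed mapping from identifier column to a public URL pattern.
URL_TEMPLATES = {
    "author_lccn": "https://id.loc.gov/authorities/names/{}",
    "author_viaf": "https://viaf.org/viaf/{}",
    "author_wikidata": "https://www.wikidata.org/wiki/{}",
}


def identifier_urls(row):
    """Return a dict of column name -> URL for every non-blank identifier."""
    urls = {}
    for column, template in URL_TEMPLATES.items():
        value = (row.get(column) or "").strip()
        if value:
            urls[column] = template.format(value)
    return urls


# Placeholder identifiers, for illustration only.
example_row = {"author_lccn": "n00000000", "author_viaf": "", "author_wikidata": "Q0"}
links = identifier_urls(example_row)
```

Blank identifiers are skipped rather than turned into broken links, so the returned dict only contains the services for which the row has a value.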
Citation
@article{grossman2022,
author = {Grossman, Claire and Spahr, Juliana and Young, Stephanie},
editor = {Sinykin, Dan and Walsh, Melanie},
title = {The {Index} of {Major} {Literary} {Prizes} in the {US}},
journal = {Post45 Data Collective},
date = {2022-12-05},
url = {data.post45.org/the-index-of-major-literary-prizes-in-the-us/},
doi = {10.18737/CNJV1733p4520221212},
langid = {en},
abstract = {The Index of Major Literary Prizes in the US includes
datasets to the winners and judges of prizes for prose, poetry, or
unspecified genre between 1918 and 2020 with a purse of \$10,000 and
over.}
}