Published September 9, 2022 | Version v3
Conference paper Open

Text Simplification of College Admissions Instructions: A Professionally Simplified and Verified Corpus

  • 1. University of Southern Mississippi
  • 2. The University of Texas at Austin

Description

This dataset contains three unique sets of texts. The first set (ORIGINAL) consists of original college admissions instructions from the websites of colleges and universities in the United States. The second set (SIMPLIFIED) consists of simplified versions of those college admissions instructions. The third set (ORIGINAL TO SIMPLIFIED ALIGNMENTS) consists of documents that pair, line by line, the original and simplified text to explore what information appears in the original that does or does not appear in the simplified version and how the simplified versions are altered as a result of the simplification process.

The texts, written in English (from US institutions), were manually simplified by an author of this paper who is a native English speaker. The author has a doctoral degree in education and has worked professionally in US postsecondary education for over a decade, including work in undergraduate admissions. The author therefore drew on professional insight to simplify each text without losing the critical information necessary to understand it.

To determine whether the simplifications of admissions application instructions were acceptable (that is, that no critical information or accuracy was lost between the original and simplified texts), we engaged ten subject-matter experts (SMEs), who volunteered their time. Each simplified text was verified independently by two SMEs.

All ten of the SMEs had professional backgrounds in U.S. postsecondary admissions, having worked at least five years full-time in college admissions offices in the United States. These SMEs were identified through professional networks and snowball methods, as several of our SMEs knew colleagues from different institutions or educational entities who would serve as high-quality, knowledgeable SMEs.

Moreover, we engaged with a diverse group of SMEs from different institution types (i.e., community colleges, public four-year institutions, private liberal arts colleges) and with various lengths of experience to capture the potential variability of admissions and financial aid parlance, jargon, and communication style. As the first study of its kind, identifying SMEs from diverse backgrounds provided more generalizability and reliability of findings, thus informing future research and practice regarding the communication of admissions application instructions to students and their support networks. Four subject-matter experts worked at public, four-year universities, four worked at private, four-year universities, and two worked at public, two-year community colleges.

To perform the acceptability judgement, the SME was presented with both pre- and post-simplification texts in real time over a Zoom video conference meeting. Then, we asked the SME to read the pre-simplified (original) text, followed by the post-simplification (simplified) text, and determine whether the simplified text was acceptable. For example, changing the verb “submit” to “complete” is not acceptable because “submit” implies the documentation or information is being submitted by a submitter to a submittee, while “complete” only implies the documentation or information is completed and not directed to any educational stakeholder. If a simplification was deemed unacceptable by one or more SMEs, we asked the SME what simplification would be acceptable through an iterative process in real time across all texts in this study. Once the SME provided their feedback and we integrated it into the simplified text, the same SME was again asked to read both the pre- and post-simplification texts in real time and render their acceptability judgement. If at any time a lexical item (e.g., single word, acronym, initialism, compound adjective), sentence, or paragraph could not be simplified, the pre-simplified section of that text was used.

Files

coling22.pdf

Files (53.6 MB)

Checksum Size
md5:759f32779ca5ccd3258b37e5a1a7582d 733.0 kB
md5:81368e19beebf6797e0b71de8d699285 710.8 kB
md5:ac3a44607a9d95d1b3bdf77f2104e212 383.6 kB
md5:4462967c559e6bb074abe153697db052 51.8 MB
md5:d3f2d88d43b5366094de5b4a15fa2ab6 2.2 kB
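
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard library; the function names `md5_of_file` and `matches_listed_checksum` are illustrative and not part of the dataset, and the file path in the usage comment is a placeholder, since this listing does not show which file name belongs to which checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain '<hex>'."""
    # Drop an optional 'md5:' prefix and normalize case before comparing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (placeholder path; substitute the file you actually downloaded):
# ok = matches_listed_checksum("path/to/downloaded_file",
#                              "md5:759f32779ca5ccd3258b37e5a1a7582d")
# print("checksum matches" if ok else "checksum mismatch, try downloading again")
```

A mismatch usually indicates an incomplete or corrupted download rather than a problem with the dataset itself.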

Additional details

Related works

Is derived from
Conference paper: https://arxiv.org/abs/2209.04529 (URL)