Stylistics Comparison of Human and AI Writing

Christopher Sean Harris, Evan Krikorian, Tim Tran, Aria Tiscareño, Prince Musimiki, and Katelyn Houston

Conclusion

Limitations and Future Implications

We are reporting on the limitations of our study here because they were amplified during our data analysis. Initially, we intended to collect data on the items that the Show Readability Statistics feature in the Word grammar checker reveals and then augment that data with R. Haswell’s (1984) Intra-Subject Paired Comparison rubric. The initial data set was thin and did not reveal much about the complexities of style, so we opted to add measures from Corbett and Connors (1999): sentence variety and sentence openers. While we examined the distance between the longest and shortest sentences and paragraphs, we could instead have counted how many sentences fell five or ten words above or below the average sentence length, which would have lessened the impact of outliers. Corbett and Connors (1999) presume that an average paragraph length of 7.6 sentences indicates that an author writes well-developed paragraphs. However, we have noted that writers can use many words or many sentences yet not say much. For that matter, examining sentence types and types of words would be useful, as in our study we noted that Bard and GPT-4 have their own distinct styles. Does AI write in a feast-or-famine style, employing either meaty or bony words?
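To make the alternative measure concrete, the sketch below contrasts the range-based statistic we used with the band-based count described above. It is a minimal illustration rather than our actual instrument: the naive punctuation-based sentence splitter, the function names, and the sample text are all our own assumptions.

```python
import re
from statistics import mean

def sentence_lengths(text: str) -> list[int]:
    """Split text naively on end punctuation and count words per sentence."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def variety_measures(text: str, band: int = 5) -> dict:
    """Contrast the range between the longest and shortest sentences
    (sensitive to outliers) with counts of sentences falling within or
    beyond `band` words of the mean (less sensitive to outliers)."""
    lengths = sentence_lengths(text)
    avg = mean(lengths)
    return {
        "mean_length": round(avg, 1),
        "longest_shortest_range": max(lengths) - min(lengths),
        "within_band": sum(1 for n in lengths if abs(n - avg) <= band),
        "outside_band": sum(1 for n in lengths if abs(n - avg) > band),
    }

# Hypothetical sample text; any plain-text essay would do.
essay = ("Short one. This sentence is considerably longer than the one "
         "before it and pads itself out. Medium again here.")
print(variety_measures(essay, band=5))
```

A single long quotation or one fragment can stretch the longest-to-shortest range dramatically, whereas the band counts shift by only one sentence, which is why the latter would have been the more robust measure of sentence variety.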

In an era of antiracist writing assessment and of growing awareness of how computers can erase dialect, Haswell’s rubric is an important tool for assessing writing. The rubric is trend-based but comparative and binary. In a pre/post-test environment, the rubric assesses whether a draft is better, much better, or the same. With human- and machine-written texts on the same topic, we can easily rate which is better using the rubric. Assessments and evaluations do not need five rating categories and rubrics that fill pages with text. They need to tell raters and, more importantly, students whether an essay accomplishes x, y, and z or whether one essay does so better than another, which makes assessment transparent. Judging by the reactions of our student researchers, stylistically superior essays are not necessarily better than their counterparts.

Conclusion

While Google Bard can write more like a human than GPT-4 can, humans retain their edge over AI writing, and increasingly so as they progress through college, though that edge is slim. While this conclusion needs further study, we found that humans wrote more meaningful texts than AI did and that AI is less flexible in its stylistic variety. Further studies need to be conducted on human perceptions of quality in AI writing, and this small study would benefit from replication with different populations.

Evan concludes that converting each AI- and human-produced essay into quantifiable data yielded more surprises than he expected. Although his earlier reflection, written before the data collection, approached AI as a largely overblown issue, he now realizes that there may be more issues in our composition classrooms than we think. The biggest concern is how these AI models outpace our freshman composition students while being outmatched by senior-standing and graduate students. Months earlier, models like ChatGPT could barely compete with high school-level composition, but now? As AI catches up to (and, for certain students, “surpasses”) the writing expectations and typical conventions of beginning college students, he worries greatly about when and what “threshold” AI will reach before its generative capabilities finally plateau. If (or when) artificial intelligence does supersede the writing caliber of all of our students, will the problem be too big to approach? Or will humans finally “best” AI, not by winning the battle for higher-quality writing, but through their ability to continue producing (and defining) “average” writing?

Note: To compensate the planet for the energy we used and the carbon we emitted to create AI texts, we purchased five tons of carbon credits. We intend to share our AI-generated texts with the WAC Clearinghouse AI resource page curated by Anna Mills so others can avoid the detrimental environmental impact of employing AI computational power.