Reconsidering Writing Pedagogy in the Era of ChatGPT
Lee-Ann Kastman Breuch, Asmita Ghimire, Kathleen Bolander, Stuart Deets, Alison Obright, and Jessica Remcheck
Qualitative Student Comments about ChatGPT
Our usability test gathered qualitative student comments in multiple forms. For example, we collected “think aloud” comments from students as they read each ChatGPT text while it was being produced. In addition, we asked students to explain each of the four ratings they gave for each of the five ChatGPT texts included in the usability test. Each usability test also included eight debriefing interview questions, which provided opportunities for student reflection. Together, these questions yielded 37 qualitative comments per student, and each response in the usability test was regarded as one unit of analysis. Given our complete participant pool of 32 students, our study yielded 1,184 qualitative comments.
To directly address our research questions, we reduced our data set to focus specifically on qualitative comments that addressed students’ impressions of ChatGPT texts, how they would use ChatGPT texts, and how ChatGPT might impact their writing process. We selected qualitative comments from four of the five usability tasks and four debriefing questions. This yielded a data set of 17 qualitative responses per student, roughly half of the qualitative comments each student provided, for a total of 544 qualitative units of analysis. Through team discussions of this data set, we began our coding by identifying 45 individual codes. Through continued discussion of the data, we sorted these codes into eight categories, which we described as “filters.” As our team member Alison Obright observed, filters became a productive way to think about the lenses students used to evaluate ChatGPT texts. As we described these filters, we developed a codebook to guide our coding. As shown in Table 2, the “filters” in our codebook ranged from “low order concerns,” which included gut-level responses to texts and surface concerns, to “logic/organization” comments, in which students critiqued the ways ChatGPT had organized texts and arguments. Using these filters, we coded all 544 units of analysis, and each unit was coded by at least two team members to strengthen inter-rater reliability. Initial coding using the eight filters yielded strong agreement among coding pairs, so we continued to code the entire data set in this manner. Any coding disagreements were resolved through discussion within the coding pair. Data reported here reflect the final coding agreements.
Table 2. Codebook: Qualitative comment “filters” reflecting student perspectives of ChatGPT

| Filter | Description |
| --- | --- |
| Low order concerns as filter | A basic form of analysis; this theme captures how students used their understanding of sentence structure, grammar, voice, and flow to evaluate the text. Draws from the Writing Center language for "low order concerns." |
| Ethics as filter | A basic form of analysis students used. This theme category captures how students used external policies to evaluate the text, and how students expressed ethical concerns about using ChatGPT text instead of their own. Comments in this filter include worries about plagiarism, privacy, and getting "caught" using the technology. |
| Logic/organization as filter | A complex form of analysis students used. This theme captures the way students evaluated the arguments, organization, or logical inspirations of the text. This analysis theme includes both external systems for understanding information and a deeper, more personal evaluation of logic/organization that the student would expect based on their prompt. An example of this filter might occur in any kind of student comment that suggests how the text could be more thorough or detailed to provide a deeper or more meaningful statement or argument. |
| Information literacy as filter | A complex form of analysis students used. Several codes indicate a theme of drawing on information literacy skills that students have been taught or have incorporated into intuitive evaluations of text. This filter would include student comments about a lack of citations, the presence or absence of correct or accurate information, and notes about their need to present a different prompt to the technology. These evaluations demonstrate how students leverage information literacy skills to evaluate a text and evaluate their own request for information. Comments may address questions about the credibility of ChatGPT texts and trustworthiness of sources used in ChatGPT texts as a form of information literacy. |
| Self and experience as filter | This advanced form of analysis includes students thinking through their own experiences and understandings of the text, the information requested, and their own experiences as a student. Comments in this theme capture a feeling of wanting to do things themselves, commenting on the increased or decreased effort using ChatGPT would require, and worries about how this technology could put their learning at risk. This is a large category with many different codes but generally, codes were grouped in this category if the student was primarily filtering their evaluation of the text through deeply personal knowledge or experience. This filter may also apply to social justice concerns, integrity of education, and concerns around desire for rigorous education. |
| Circumstance as filter | This is a more advanced form of analysis that students used pertaining to the student's set of individual experiences, situations, or rhetorical choices. This theme speaks to the way students would filter information through their academic identity, their major, or content from a specific class or professor. The students evaluate the text for how well it matches class content, how much it would meet the requirements of the field, what information is included or missing, and how much analysis the text was performing. |
| Process as filter | This filter is a way that students might evaluate how ChatGPT can impact their writing process as they are working on a paper. Students may comment on how ChatGPT texts may help them generate ideas for a paper topic, provide sample writings, generate outlines, offer citations, organize ideas, articulate topic sentences, etc. |
| Function as filter | This filter would be applied to comments in which students express curiosity about the technological capabilities of ChatGPT. Comments might address ChatGPT's technological affordances such as speed of production, how ChatGPT is technically working, where and how it is pulling data and sources, and other technological capabilities. |
These filters reflected a range of student comments about ChatGPT as it intersected with verbal academic writing. As Figure 7 shows, in the data set we coded, the filters of “process,” “information literacy,” and “low order concerns” occurred most often. As Figure 8 shows, the filters were distributed differently across each of the 16 response items in the usability evaluation, yet these three filters were consistently among the most frequently coded. The “function” filter also stood out to us as prominent. In the sections below we describe each filter and how it appeared in our data set.
Process as filter
The most frequently coded filter in our study was “process,” a reference to the stages and activities involved in academic writing tasks. Process also alludes to the idea that writing does not happen all at once, but rather occurs across time and often in multiple iterations (Adler-Kassner & Wardle, 2016). Process can be described in simple terms such as “prewrite, write, and rewrite,” in more rhetorical terms such as “invention, arrangement, and style,” or in still other terms such as “prewriting, drafting, and editing.” In our study, comments we coded as “process” reflect ways that students described how ChatGPT could impact their writing process as they worked on a paper. As we noted in our codebook, “Students may comment on how ChatGPT texts may help them generate ideas for a paper topic, provide sample writings, generate outlines, offer citations, organize ideas, articulate topic sentences, etc.” Students used this filter when thinking about how ChatGPT texts could “jumpstart” their writing or provide ideas or even a template. Some students described ChatGPT texts as outlines for a larger paper. Examples of student comments in this filter include:
- “I like that it kind of gave me ideas so I know how to move forward with the literacy narrative.” (Participant UU, ST1 PT3)
- “It seems like it could be really useful and like helping me structure my own, like research paper or something like that.” (Participant NN, ST4 PT3)
- “I could see myself using this for inspiration for actually designing an experiment.” (Participant L, ST4 PT3)
- “So to get like a good idea, or to like, start out your paper, I feel like this is somewhere to start.” (Participant F, ST4 PT3)
This filter was important because it illustrated the many ways students noted that ChatGPT might be a helpful tool in various parts of their writing processes, whether that involved getting started, organizing content, or editing a version of student writing.
Low order concerns as filter
The second most frequently coded filter in our review of student comments was “low order concerns.” This filter language comes from writing center scholarship, where low order concerns refer to “matters related to surface appearance, correctness, and standard rules of written English” (McAndrew & Reigstad, 2001). Our codebook described comments in this category this way: “A basic form of analysis; this theme captures how students used their understanding of sentence structure, grammar, voice, and flow to evaluate the text. Draws from the Writing Center language for low order concerns.” Comments assigned this code demonstrate how students thought about writing mechanics such as sentence length and structure, the “correctness” of writing in terms of grammar, and the voice and style of ChatGPT, including transitions between ideas and topics. This code was also used to describe uncritical or surface-level comments like “its good” or “I think the text is fine.” Examples of this filter include:
- “I don't think it's a very good essay, and I think it would be clear that you use some sort of AI near the bottom when it starts to repeat itself, it doesn't really sound natural” (Participant BB, ST5 PT2)
- “Instantly, I'm like really shocked because it's super long” (Participant O, ST4 Talk Aloud)
- “I think some of what they said had potential to be compelling, or interesting in some way. But you know, based on the fact that I wouldn’t really use any of this, it’s a pretty low score” (Participant G, ST4 PT4)
- “It’s well-written.” (Participant TT, ST4 PT4)
This category was important to us because it demonstrates how students may look at text produced by ChatGPT uncritically. While many comments students made about ChatGPT texts indicate deep, critical thinking about the text, many instructors worry about students engaging uncritically with ChatGPT in their classes. Here, students show which features an uncritical glance attends to. If the bar for writing is set only at “well written” or “good grammar and sentence structure,” it may be easier for students to justify using the text unaltered, since it meets the surface-level requirements of the assignment.
Information literacy as filter
The third most frequently coded filter was “information literacy.” Comments in this category indicated a theme of drawing on information literacy skills that students have been taught or have incorporated into intuitive evaluations of text. Comments coded “information literacy” might mention a lack of citations, the presence or absence of correct or accurate information, or students’ need to present a different prompt to the technology. In short, this filter addressed critical perspectives on information, and it was important for demonstrating how students leverage information literacy skills to evaluate a text and to evaluate their own request for information. Example comments in this category included:
- “It also seems pretty accurate from what I've learned in my biochem class…the one difference would be I don't really see references again.” (Participant E, ST4 Talk Aloud)
- “This is for a research paper. And so I don’t think I could honestly use any of this because I would have to cite my sources unless it’s like common knowledge, I suppose.” (Participant G, ST2 PT2)
- “...it just comes back to like me being worried about plagiarism, because I don't know how you would cite something like this, or where the information is coming from. Because like it doesn't have citations which I want…this could totally be written somewhere else, and it just grabbed it. But didn't cite it. So that makes me nervous.” (Participant O, ST2 PT2)
We also discovered that comments in this category may address questions about the credibility of ChatGPT texts and the trustworthiness of sources used in ChatGPT texts as a form of information literacy. For example, students talked about the degree to which they “trusted” ChatGPT and whether or not information provided by ChatGPT was trustworthy. This was expressed in multiple ways using words such as “proof,” “evidence,” “misleading,” “trust,” “correct,” and “citations,” as the examples below demonstrate.
- “I think I would need more proof and evidence to support this, because I don't, this could have some misleading information that might not be checked or peer reviewed so.” (Participant AA, ST2 PT2)
- “I would not use it because it's just not a time saver. For me. It's very shallow information that I could get just as easily. And then at least I would know where I'm getting it from. And I trust myself more than I trust the AI.” (Participant QQ, Q6)
- “But I also wouldn't use it because I wouldn't trust the sources without looking them up first because I question if they exist, or if they're, you know, if the sources are correct. And also it doesn't look like it uses any in-text citations which I would probably want in an essay.” (Participant FF, ST5 PT2)
Self and Experience as Filter
The fourth most common filter was “self and experience.” Students used this filter when framing their reasoning around how ChatGPT impacted their own experience while using the tool. This reasoning came up when students were assessing the writing produced by ChatGPT and when thinking about the implications of turning in an academic assignment produced by AI. Many students said they like doing their own assignments, acknowledging that text produced by ChatGPT is not their own work. Students using this filter made clear that they were either opposed to, or hesitant about, using AI-generated texts because doing so impacted their own learning, was not their own writing, or would have other personal consequences.
Multiple students used this filter for the scenario in which they asked ChatGPT to write a literacy narrative. In response to the post-task question “How likely would you be to use this ChatGPT text unaltered as your academic homework?,” one student said, “I like to write my own essays from personal experience so I personally wouldn’t use computer generated text.” Another student said, “there’s a prompt for your homework so you put it in here and it writes something, but it’s not something you wrote, so I would not use that as my homework for a literacy narrative about myself.” These students were first and foremost judging their likelihood of using ChatGPT-produced text on the reasoning that the writing did not reflect their personal experience.
Students also used the self and experience filter in connection with the simplicity of either the text produced or the assignment in general. A student who responded to the personal literacy narrative prompt using self and experience as a filter stated, “...because it’s not very well written, I would not want to submit this and say this is the best work that I can produce.” Another student said, “because that’s something that I could write very easily on my own I don’t really see why anyone would need to use ChatGPT for that.” Students used this reasoning to assert that ChatGPT output does not reflect their own standards of work. Students thought they could produce something better on their own for an academic assignment, or that it might even be easier to do it themselves than to ask AI.
Self and experience as filter shows us that students care about ownership of their work and see ChatGPT as infringing on it in some scenarios. Students assert agency and personal responsibility over their work, as well as their experience of learning, which tells us they see value in their education.
Logic and Organization as Filter
The logic and organization filter was used to code student comments that focused on the construction and coherence of ChatGPT texts. Using this filter, students commented on the ways ChatGPT texts were constructed, the integration of details, and whether or not ideas and arguments were fully developed in the texts. For example, one student remarked that the paragraphs were rather surface level: “It doesn’t have any substance behind the words. It’s honestly like every single one of these paragraphs could be an introductory paragraph for a paper” (Participant G, ST5 PT2). Another student commented that the paragraphs lacked detail: “the way it was worded it was lacking in detail in the paragraphs” (Participant KK, ST5 PT1). Some students complimented the ways ChatGPT organized ideas: “it gives you an idea of how you could go about this chronologically” (Participant YY, ST1 PT2), but others commented that the ChatGPT writing was very basic and did not reflect college-level writing: “I think that structure is very basic” (Participant T, ST1 PT2). Overall, comments coded in this filter showed the ways students observed the development and ordering of ideas in ChatGPT texts.
Ethics as Filter
The ethics as filter code centers students’ emphasis on the moral implications of using ChatGPT for academic assignments. In our codebook, the language used to describe this form of analysis states, “A basic form of analysis students used. This theme category captures how students used external policies to evaluate the text, and how students expressed ethical concerns about using ChatGPT text instead of their own. Comments in this filter include worries about plagiarism, privacy, and getting ‘caught’ using the technology.” Ethics as filter shows how students reasoned about the academic and moral consequences of submitting an AI-generated text for an assignment. Through their responses, we can further understand how students think through ethical and academic standards with regard to using ChatGPT. As their responses make evident, in some cases students were confident in their convictions, while in other scenarios they saw the “right” decision as less straightforward.
Students used this reasoning most frequently when asked the post-task question, “How likely would you be to use this ChatGPT text unaltered as your academic homework?” Many student responses connected the ChatGPT texts to academic integrity by stating that the text was not something they [the students] wrote, and therefore they would not turn it in. These responses were especially prevalent for the first prompt scenario: “write a literacy narrative.” The literacy narrative was the most personal prompt for the individual student, and responses reflected this in how students thought about academic integrity. One student said, “...but it’s not something you wrote, so I would not use that as my homework for a literacy narrative about myself.” Another student stated, “...if this is something I turned in, I don’t think it seems very personal.” Students connected the impersonal nature of the texts with the likelihood of their audience knowing the work was not produced by them, leading to entanglements with ethics and academic integrity.
Other students called out ethics and academic integrity more directly by stating their concerns about plagiarism. One student said, “...like I said before, it reads like AI. I feel like ChatGPT has a voice almost, and it sounds to me like it’s very AI generated. Like a professor receiving this, I feel like would be extremely doubtful that it’s coming from me, and not AI, since it’s been something that professors have to look out for now.”
Function as Filter
Function as filter was used to code student responses in which they expressed curiosity about the technological capabilities of ChatGPT. This code appeared most often in the debrief interview section, when students were asked, “What questions, if any, does ChatGPT raise for you?” Here, students expressed curiosity about ChatGPT’s sources, asking, “Where is the information coming from?” “What biases does it have?” “How does it pull from the Internet?” and “What kind of database is it pulling from?” They also wondered how ChatGPT worked, asking, “What can’t it do?” “What are its limits?” “How does it work?” “Does it have different levels of writing?” “Does it generate a different response for everyone?” and “Is it accessible? Is it free?” Finally, students wondered about its functionality in the future, asking, “How much more advanced will this be in a couple of years?” and “Will it get overused?” Under function as filter, students illustrated critical thinking about ChatGPT use and concerns about its impact on their academic and non-academic futures.
Circumstance as filter
In their evaluations of ChatGPT texts, students leveraged many kinds of knowledge related to circumstance. We used the code “circumstance as filter” when students responded by filtering information through their academic identity, their major, or content from a specific class or professor. As we used the code, the team came to realize that this description matched well with models of the rhetorical situation, since students were thinking about audience, content, purpose, author, and timing.
Indications of this filter included mentions of class content or professor expectations, understanding of the genres students were working with, and considerations of how ChatGPT “understood” the assignment. Students also considered how the situation would influence whether or not they would use ChatGPT at all. For example, one student mentioned, “if this is a last resort, you know, I'd throw it in there, and I probably still get like a C minus, maybe, but it's really not that good. But for a discussion post this would be good enough.” Another participant noted that the text met the expectations of their genre but could have been more useful pedagogically with numerical examples: “It includes most of the things that you'd expect in a psychological as abstract. I feel like, maybe a bit more like numbers or specific statistics would be nice, even just generating those for an example, although it has no data to base it on overall.” Finally, another student considered their knowledge of professor expectations, saying, “Sometimes professors have their own things that they want you to remember, or they have their own examples that they want you to remember, and they might hope that you kind of personalize your paper or personalize your assignment to fit the course and fit specifically what you've been learning.”
The results from this filter give us some insight into how students understand the rhetorical situation of academic writing. Students rely on expectations set by instructors about the quality of their work and the extent to which course content should be explicitly included in their writing products. In addition, the ways courses talk about genre served as a tool students used to determine whether ChatGPT could be used. Discussion posts commonly seemed like an area where students were more comfortable using ChatGPT, as opposed to papers, where they felt their unique ideas were more highly valued or sought. A few students noted that they were more likely to use ChatGPT in required foundational courses or general education coursework: “I have used before in an academic class particularly for like any liberal education that are not related to economics for in terms of a time-saving aspect as well as to get a general understanding.” This suggests that students are less likely to use ChatGPT when they find the information valuable or directly related to their major or area of study. As we see elsewhere, students care about the value of their education, and, interestingly, when they do not find value in a class seemingly unrelated to their major, they may be more likely to use AI tools to help them save time. Overall, sixty-two student comments were coded in this category.