Speaker
Description
This study compares the ability of language teachers, AI checker apps, and large language models to distinguish human-written text from machine-generated text produced by an LLM (ChatGPT) or translation software (DeepL). This presentation will discuss how texts representative of student writing practices were produced. It will also describe the design of the study, including the use of Claude AI to code the survey in HTML and JavaScript, and the use of gamification to increase the response rate.
Short summary
This study compares the ability of language teachers, AI checker apps, and large language models to distinguish human-written text from machine-generated text produced by an LLM (ChatGPT) or translation software (DeepL). This presentation will discuss how texts representative of student writing practices were produced. It will also describe the design of the study, including the use of Claude AI to code the survey in HTML and JavaScript, and the use of gamification to increase the response rate.
Keywords
AI detection
teaching writing
Abstract
In the absence of a consensus on the benefits and drawbacks of AI in language learning, many teachers of second-language writing may wish to limit its use and to identify when it has been used. However, can teachers reliably distinguish machine-generated text from human-written sentences?
This presentation will describe the pilot for a study comparing the ability of language teachers, AI checker apps and large language models (LLMs) to distinguish human-written from machine-generated text produced by an LLM (e.g. ChatGPT) or machine-translation (MT) software (e.g. Google Translate).
Twenty-four extracts of 300–350 words were included, comprising: 1) real student writing from the pre-AI era; 2) text generated by an LLM (ChatGPT); 3) text translated from an entirely Japanese source text using MT (DeepL); 4) real student writing containing AI-generated elements; 5) real student writing containing MT elements; and 6) real student writing “polished” using Grammarly (considered mixed human/machine text).
In the survey, respondents identify which sentences in each text appear machine-generated and then give an overall evaluation of the text. In this pilot, respondents also evaluate whether the extracts are representative of student writing practices, including improper AI/MT usage. The final survey is gamified so that respondents can compare their performance against that of LLMs and AI checkers.
The presentation describes the steps taken to produce the extracts and to design the mechanics of the survey, which was itself coded in HTML and JavaScript using Claude AI. Attendees will be invited to take part in the pilot and to sign up for the full study later in the year.
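As a rough illustration of the sentence-selection and gamified scoring mechanics described above, the following is a minimal, self-contained HTML/JavaScript sketch. The sentence text, answer key, and benchmark scores are invented placeholders; the actual survey's code and data are not shown in this abstract.

```html
<!-- Minimal sketch of the sentence-selection survey mechanic.
     The answer key and benchmark scores below are hypothetical. -->
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>AI-detection survey sketch</title>
<style>
  .sentence { cursor: pointer; }
  .flagged  { background: #ffd27f; } /* sentences the respondent marks as machine-generated */
</style>
</head>
<body>
<p id="extract">
  <span class="sentence">First sentence of the extract.</span>
  <span class="sentence">Second sentence of the extract.</span>
  <span class="sentence">Third sentence of the extract.</span>
</p>
<button id="submit">Submit</button>
<p id="result"></p>
<script>
  // Hypothetical answer key: indices of machine-generated sentences.
  const answerKey = [1];
  // Illustrative benchmark scores for the gamified comparison.
  const benchmarks = { "AI checker": 67, "LLM": 75 };

  // Clicking a sentence toggles whether the respondent flags it.
  document.querySelectorAll(".sentence").forEach(span => {
    span.addEventListener("click", () => span.classList.toggle("flagged"));
  });

  document.getElementById("submit").addEventListener("click", () => {
    const sentences = [...document.querySelectorAll(".sentence")];
    // Score = percentage of sentences classified correctly.
    let correct = 0;
    sentences.forEach((span, i) => {
      const flagged = span.classList.contains("flagged");
      if (flagged === answerKey.includes(i)) correct++;
    });
    const score = Math.round(100 * correct / sentences.length);
    const comparison = Object.entries(benchmarks)
      .map(([name, s]) => `${name}: ${s}%`).join(", ");
    document.getElementById("result").textContent =
      `Your score: ${score}% (${comparison})`;
  });
</script>
</body>
</html>
```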
| Field | Value |
|---|---|
| Scheduling preference | Anytime on Saturday or Sunday |
| Title | Can teachers spot machine-generated English?: Designing a pilot study |