Generative AI sensitivity to text quality in automated feedback

Supervisor: Reza Farrokhnia

Generative AI (GenAI) tools, such as ChatGPT, Claude, and Gemini, are increasingly used to provide automated feedback on students’ written assignments (e.g., Banihashem et al., 2024; Er et al., 2024). Because these tools interact entirely through text, the quality and structure of the input text can affect the accuracy, specificity, and coherence of AI-generated responses.

Despite their widespread adoption, it remains unclear how sensitive GenAI tools are to different aspects of text quality. Specifically, do these tools generate different feedback based on variations in coherence, lexical diversity, syntactic complexity, or discourse structure? Understanding this relationship is essential for enhancing AI-driven feedback systems, ensuring fair and high-quality writing support for students, and improving best practices for human-AI communication.

This study aims to examine whether and how various textual features influence the quality, depth, and accuracy of AI-generated feedback. By analyzing AI responses to student essays with varying levels of coherence and complexity, this research will provide valuable insights into the linguistic factors shaping AI-assisted writing evaluation.

The study will be guided by the following key research questions:

  1. How do coherence and organization in a written assignment affect the quality of feedback generated by GenAI?
  2. What is the relationship between specific linguistic features (e.g., syntactic complexity, lexical diversity, discourse coherence) and the informativeness of AI-generated feedback? (A sketch of how such features could be operationalized follows this list.)
  3. Are different GenAI models differentially sensitive to variations in input text quality?
  4. Is there an optimal level of textual complexity that maximizes the usefulness and specificity of AI-generated feedback?
  5. To what extent do GenAI tools exhibit biases or inconsistencies in feedback based on surface-level text properties rather than content meaning?
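
To make the linguistic features in question 2 concrete, the sketch below shows one way they could be operationalized with simple lexical proxies, assuming plain-Python heuristics; the actual analysis would use validated instruments such as Coh-Metrix (McNamara et al., 2010a). All function names and heuristics here are illustrative, not part of the study design.

```python
import re

def sentences(text):
    # Naive split on terminal punctuation; a real pipeline would use
    # a proper sentence segmenter (e.g., spaCy) or Coh-Metrix itself.
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def tokens(text):
    # Lowercased word tokens, apostrophes kept.
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    # Lexical diversity: unique words / total words.
    toks = tokens(text)
    return len(set(toks)) / len(toks) if toks else 0.0

def mean_sentence_length(text):
    # Crude syntactic-complexity proxy: mean words per sentence.
    sents = sentences(text)
    return sum(len(tokens(s)) for s in sents) / len(sents) if sents else 0.0

def adjacent_overlap(text):
    # Crude local-coherence proxy: mean Jaccard word overlap
    # between adjacent sentences.
    sets = [set(tokens(s)) for s in sentences(text)]
    pairs = [(a, b) for a, b in zip(sets, sets[1:]) if a | b]
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs) if pairs else 0.0

essay = ("The study design matters. The design of a study shapes its results. "
         "Results, in turn, inform classroom practice.")
print(type_token_ratio(essay), mean_sentence_length(essay), adjacent_overlap(essay))
```

Scores like these could then serve as the independent variables when relating input text quality to properties of the generated feedback.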

Method

This study will employ a mixed-methods approach, integrating computational text analysis, AI-generated feedback assessment, and statistical modeling.
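
As a minimal sketch of the feedback-collection step, the snippet below queries a single GenAI model with two versions of the same essay that differ only in surface quality, using the OpenAI Python SDK. The model name, prompt wording, and essay variants are assumptions for illustration; the study would fix its own models, prompts, and manipulation procedure, and would repeat calls across models to address research question 3.

```python
from openai import OpenAI  # pip install openai; assumes OPENAI_API_KEY is set

client = OpenAI()

FEEDBACK_PROMPT = (
    "You are a writing tutor. Give specific, constructive feedback on the "
    "following student essay, commenting on argumentation and structure."
)

def get_feedback(essay_text: str, model: str = "gpt-4o") -> str:
    # Request feedback on one essay variant; temperature 0 reduces
    # run-to-run variance so differences reflect the input text.
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": FEEDBACK_PROMPT},
            {"role": "user", "content": essay_text},
        ],
    )
    return response.choices[0].message.content

# Hypothetical variants: identical content, manipulated surface quality.
variants = {
    "coherent": "Placeholder for the well-organized essay version.",
    "shuffled": "Placeholder for the same essay with shuffled sentences.",
}
feedback = {name: get_feedback(text) for name, text in variants.items()}
```

Feature scores such as those sketched after the research questions could then enter a regression or mixed-effects model with rated feedback quality as the outcome, completing the statistical-modeling component.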

References

Banihashem, S. K., Kerman, N. T., Noroozi, O., Moon, J., & Drachsler, H. (2024). Feedback sources in essay writing: Peer-generated or AI-generated feedback? International Journal of Educational Technology in Higher Education, 21(1), 23. https://doi.org/10.1186/s41239-024-00455-4

Er, E., Akçapınar, G., Bayazıt, A., Noroozi, O., & Banihashem, S. K. (2024). Assessing student perceptions and use of instructor versus AI-generated feedback. British Journal of Educational Technology. https://doi.org/10.1111/bjet.13558

McNamara, D. S., Louwerse, M. M., McCarthy, P. M., & Graesser, A. C. (2010a). Coh-Metrix: Capturing linguistic features of cohesion. Discourse Processes, 47(4), 292–330.

McNamara, D. S., Crossley, S. A., & McCarthy, P. M. (2010b). Linguistic features of writing quality. Written Communication, 27(1), 57–86. https://doi.org/10.1177/0741088309351547