Supervisors: Reza Farrokhnia
Generative AI (GenAI) tools, such as ChatGPT, Claude, and Gemini, are increasingly used for automated feedback on students’ written assignments (e.g., Banihashem et al., 2024; Er et al., 2024). Because these tools operate on textual input, the quality and structure of that input can substantially shape the accuracy, specificity, and coherence of AI-generated responses.
Despite their widespread adoption, it remains unclear how sensitive GenAI tools are to different aspects of text quality. Specifically, do these tools generate different feedback based on variations in coherence, lexical diversity, syntactic complexity, or discourse structure? Understanding this relationship is essential for enhancing AI-driven feedback systems, ensuring fair and high-quality writing support for students, and improving best practices for human-AI communication.
This study aims to examine whether and how various textual features influence the quality, depth, and accuracy of AI-generated feedback. By analyzing AI responses to student essays with varying levels of coherence and complexity, this research will provide valuable insights into the linguistic factors shaping AI-assisted writing evaluation.
The study will be guided by the following key research questions:
- How do coherence and organization in a written assignment affect the quality of feedback generated by GenAI?
- What is the relationship between specific linguistic features (e.g., syntactic complexity, lexical diversity, discourse coherence) and the informativeness of AI-generated feedback?
- Are different GPT models differentially sensitive to variations in input text quality?
- Is there an optimal level of textual complexity that maximizes the usefulness and specificity of AI-generated feedback?
- To what extent do GenAI tools exhibit biases or inconsistencies in feedback driven by surface-level text properties rather than the meaning of the content?
METHOD
This study will employ a mixed-methods approach, integrating computational text analysis, AI-generated feedback assessment, and statistical modeling.
- Data collection: A corpus of student essays will be collected, covering a range of writing proficiency levels and text qualities. These essays will be categorized based on their coherence, complexity, and linguistic richness. AI models (e.g., GPT-4, GPT-4o) will be prompted to provide feedback using theory-driven instructions; a minimal API sketch follows this list.
- Textual feature analysis: Text analysis will be conducted using either automated NLP tools or manual coding based on established theoretical frameworks.
- Automated analysis: NLP tools, such as Coh-Metrix (McNamara et al., 2010a), can be used to extract linguistic features, including coherence, lexical diversity, syntactic complexity, and discourse structure (McNamara et al., 2010b); a toy illustration of such feature extraction also follows the list.
- Manual coding: Trained raters will analyze selected linguistic features based on established theoretical frameworks in discourse analysis, syntax, and lexical semantics to provide qualitative insights.
- AI feedback evaluation: AI-generated feedback will be assessed for accuracy, specificity, coherence, and relevance using expert human ratings. AI responses will be compared to human-generated feedback, and statistical analyses, including correlation and regression models, will be conducted to examine relationships between textual features and AI feedback quality; see the regression sketch at the end of this section.
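As an illustration of the data-collection step, the sketch below queries a GPT model for essay feedback through the OpenAI Python SDK (v1.x), assuming an `OPENAI_API_KEY` in the environment. The prompt wording is a placeholder for the study's theory-driven instructions, and the model names are examples only.

```python
# Sketch: collecting feedback from a GPT model for each essay in the corpus.
# The prompt below is a placeholder, not the study's actual instructions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FEEDBACK_PROMPT = (
    "You are a writing tutor. Give specific, constructive feedback on the "
    "essay below, addressing argument quality, organization, and language use."
)

def get_feedback(essay_text: str, model: str = "gpt-4o") -> str:
    """Request feedback on one essay; temperature fixed for comparability."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # reduce sampling variance so model comparisons are fairer
        messages=[
            {"role": "system", "content": FEEDBACK_PROMPT},
            {"role": "user", "content": essay_text},
        ],
    )
    return response.choices[0].message.content

# Example: collect feedback on the same essay from two models
# feedback = {m: get_feedback(essay, model=m) for m in ("gpt-4", "gpt-4o")}
```

Holding the prompt and temperature constant across models isolates the effect of the input text, which is the variable of interest here.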
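For the automated analysis, Coh-Metrix is a standalone tool rather than a Python library, so the sketch below computes rough standard-library stand-ins for three of the features named above. These proxies are illustrative only, not validated Coh-Metrix indices.

```python
# Sketch: crude proxies for lexical diversity, syntactic complexity, and
# coherence, using only the Python standard library.
import re
from statistics import mean

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-zA-Z']+", text.lower())

def lexical_diversity(text: str) -> float:
    """Type-token ratio: unique words / total words."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mean_sentence_length(text: str) -> float:
    """Rough syntactic-complexity proxy: average words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return mean(len(tokenize(s)) for s in sentences) if sentences else 0.0

def adjacent_overlap(text: str) -> float:
    """Coherence proxy: mean Jaccard word overlap between adjacent sentences."""
    sents = [set(tokenize(s)) for s in re.split(r"[.!?]+", text) if s.strip()]
    pairs = list(zip(sents, sents[1:]))
    return mean(len(a & b) / max(len(a | b), 1) for a, b in pairs) if pairs else 0.0
```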
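Finally, a minimal correlation-and-regression sketch using pandas and statsmodels is shown below. The input file and column names are hypothetical; the final model specification (controls, interactions, or mixed effects for raters) will depend on the study design.

```python
# Sketch: relating extracted text features to expert ratings of AI feedback
# quality. File name and columns are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Each row: one essay, its extracted features, and the expert rating
# of the AI feedback it received
df = pd.read_csv("essays_with_ratings.csv")

# Pairwise correlations between text features and rated feedback quality
print(df[["lexical_diversity", "mean_sentence_length",
          "adjacent_overlap", "feedback_quality"]].corr())

# OLS regression: do text features predict feedback quality?
model = smf.ols(
    "feedback_quality ~ lexical_diversity + mean_sentence_length + adjacent_overlap",
    data=df,
).fit()
print(model.summary())
```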
References
Banihashem, S. K., Kerman, N. T., Noroozi, O., Moon, J., & Drachsler, H. (2024). Feedback sources in essay writing: peer-generated or AI-generated feedback? International Journal of Educational Technology in Higher Education, 21(1), 23. https://doi.org/10.1186/s41239-024-00455-4
Er, E., Akçapınar, G., Bayazıt, A., Noroozi, O., & Banihashem, S. K. (2024). Assessing student perceptions and use of instructor versus AI‐generated feedback. British Journal of Educational Technology. https://doi.org/10.1111/bjet.13558
McNamara, D. S., Louwerse, M. M., McCarthy, P. M., & Graesser, A. C. (2010a). Coh-Metrix: Capturing linguistic features of cohesion. Discourse Processes, 47(4), 292-330.
McNamara, D. S., Crossley, S. A., & McCarthy, P. M. (2010b). Linguistic features of writing quality. Written Communication, 27(1), 57-86. https://doi.org/10.1177/0741088309351547