Comparison of Scoring Reliability and Feedback Content between iWrite and DeepSeek

Original Chinese title: iWrite与DeepSeek的评分信度与反馈内容对比分析

DOI	10.12208/j.sdr.20250263
Journal	Scientific Development Research (科学发展研究)
Year, volume (issue)	2025, 5(7)
Author	Hu Jieyu (胡婕妤)
Affiliation	School of Foreign Languages, Wuhan Textile University, Wuhan, Hubei
Abstract	As the performance of large language models (LLMs) continues to improve, teachers are increasingly using them to score students' essays and provide feedback. How their scoring reliability and feedback quality compare with those of a dedicated automated essay scoring (AES) system such as iWrite remains unclear, as does whether they can serve as trustworthy scoring and feedback tools. To address this question, this study takes as its sample 46 IELTS essays written by two classes of second-year art majors in a Sino-foreign cooperative program at a Chinese university, and compares iWrite and DeepSeek in terms of scoring reliability and feedback content. The aim is to offer teachers and researchers some guidance in choosing automated scoring and feedback tools.
Keywords	iWrite; DeepSeek; English writing; scoring reliability; feedback
Pages	25-30

Cite this article

Hu Jieyu. Comparison of scoring reliability and feedback content between iWrite and DeepSeek [J]. Scientific Development Research, 2025, 5(7): 25-30.
