In defense of machine grading?!?! Well, no, not really. But I thought I’d start a post with a title like that. You know, provocative.
There has been a bit of a ruckus on WPA-L for a while now in support of a petition against machine grading and for human readers, hosted at humanreaders.org, and I, of course, agree with the general premise of what is being presented on that site. Machine grading software can’t recognize things like a sense of humor or irony, it tends to favor text length over conciseness, it is fairly easy to circumvent with gibberish kinds of writing, it doesn’t work in real-world settings, it fuels high-stakes testing, etc., etc., etc. I get all that.
We should keep pushing back against machine grading for all of these reasons and more. Automated testing furthers the interests of Edu-businesses selling this software and helps neither students nor teachers, at least not yet. I’m against it, I really am. And yet, a few things nag at me:
- It seems to me that we’re not really talking about grading per se but about teaching, and the problem is that writing pedagogy probably doesn’t work when the assessment/grading part of things is completely separated from the teaching part of things. This is one of the differences between assigning writing and teaching writing.
- There’s a bit of a catch-22 going on here. Part of the problem was that writing teachers complained (rightly so, I might add) about big standardized tests of various sorts not having writing components. So writing was added to a lot of these tests. However, the only way to assess thousands of texts generated through this testing is with specifically trained readers (see my next point) or with computer programs. So we can either skip the writing altogether on these tests or accept a far-from-perfect grading mechanism.
- I’ve participated in various holistic/group grading sessions before (though it’s been a long time), which is how this sort of thing used to be done before the software solutions. The way I recall it working, dozens and dozens of us were trained to assign certain ratings to essays based on a very specific rubric. We were, in effect, programmed, and there was no leeway to deviate from the guidelines. So I guess what I’m getting at is this: in these large-group assessment circumstances, what’s the difference whether it’s a machine or a person doing the rating?
- This software doesn’t work that well yet, especially in uncontrolled circumstances: that is, grading software is about as accurate as humans with standardized prompt responses written in specific testing situations, but it doesn’t work well at all as an off-the-shelf rating solution for just any chunk of writing that students write for classes or that writers write for some other reason. But the key word in that last sentence is yet, because this software has gotten (and is getting) a lot better. So what happens when it gets as good as a human reader (or at least good enough)? Will we accept the role of this evaluation software much in the same way we now all accept spell checking in word processors? (And by the way, I am old enough to remember resistance among English teacher-types to that, too, though not as strong as the resistance to machine grading.)
- As a teacher, my least favorite part of teaching is grading, and I do not think I am alone in that sentiment. So while I would not want to outsource my grading to someone else or to a machine (because again, I teach writing, I don’t just assign writing), I would not be against a machine that helps make grading easier. What if a computer program automatically provided feedback on a chunk of student writing, and then I as the teacher followed behind those machine comments, deleting the ones I thought were wrong or unnecessary and expanding on the ones I thought were useful? What if a machine printed out a report that a student writer and I could discuss in a conference? And from a WPA point of view, what if this machine helped me provide professional development support to GAs and part-timers in their commenting on students’ work?