I’m Still Dreaming of an AI Grading Agent (and a bunch of AI things about teaching and writing)

I’m in the thick of the fall semester and I’ve been too busy to think/read/write much about AI for a while. Honestly, I’m too busy to be writing this right now, but I’ve also got a bucket full of AI tabs open on my browser, so I thought I’d procrastinate a bit and do a “round up” post.

In my own classes, students seem to be either leery of or unimpressed with AI. I’ve encouraged my more advanced students to experiment with/play around with AI to help with the assignments, but absent me requiring them to do something with AI, they don’t seem too interested. I’ve talked to my first year writing students about using AI to brainstorm and to revise (and to be careful about trusting what the AI presents as “facts”), but again, they don’t seem interested. I have had at least one student (and perhaps more) who tried to use AI to cheat, but it was easy to spot. As I have said before, I think most students want to do the work themselves and to actually learn something, and the students who are inclined to cheat with AI (or just a Google search) are far from criminal geniuses.

That said, there is this report, “GenAI in Higher Education: Fall 2023 Update Time for Class Study,” which was research done by a consulting firm called Tyton Partners and sponsored by Turnitin. I haven’t had a chance to read beyond the executive summary, but they claim half of students are “regular users” of generative AI, though their use is “relatively unsophisticated.” Well, unless a lot of my students are not telling me the truth about using AIs, this isn’t my impression. Of course, they might be using AI stuff more for other classes.

Here’s a very local story about how AI is being used in at least one K-12 school district: “‘AI is here.’ Ypsilanti schools weigh integrity, ethics of new technology,” from MLive. Interestingly, a lot of what this story is about is how teachers are using AI to develop assignments, and also to do some things like helping students who don’t necessarily speak English as their native language:

Serving the roughly 30% of [Ypsilanti Community Schools] students who can speak a language other than English, the English Learner Department has found multiple ways to bring AI into the classroom, including helping teachers develop multilingual explanations of core concepts discussed in the curriculum — and save time doing it.

“A lot of that time saving allows us to focus more on giving that important feedback that allows students to grow and be aware of their progress and their learning,” [Teacher Connor] Laporte said.

Laporte uses an example of a Spanish-speaking intern who improved a vocabulary test by double-checking the translations and using ChatGPT to add more vocabulary words and exercises. Another intern then used ChatGPT to make a French version of the same worksheet.

A lot of this article is about how teachers have moved beyond being terrified that AI will ruin everything to treating it as a tool to work with in their teaching. That’s happening in lots of places and lots of ways; for example, as Inside Higher Ed noted, “Art Schools Get Creative Tackling AI.” It’s a long piece with a somewhat similar theme: not necessarily embracing AI, but also recognizing the need to work with it.

MLA apparently now has “rules” for how to cite AI. I guess maybe it isn’t the end of the essay then, huh? Of course, that doesn’t mean that a lot of writers are going to be happy about AI.  This one is from a while ago, but in The Atlantic back in September, Alex Reisner wrote about “These 183,000 Books are Fueling the Biggest Fight in Publishing and Tech.” Reisner had written earlier about how Meta’s AI systems were being trained on a collection of more than 191,000 books that were often used without permission. The article has a search feature so you can see if your book(s) were a part of that collection. For what it’s worth, my book and co-edited collection about MOOCs did not make the cut.

Several famous writers are now involved in various lawsuits suing the AI companies for using their work without permission to train (“teach?”) the AIs. There’s a part of me that is more than sympathetic to these lawsuits. After all, I never thought it was fair that companies like Turnitin could use student writing without permission as part of their databases for detecting plagiarism. Arguably, this is similar.

But on the other hand, OpenAI et al didn’t “copy” passages from Sarah Silverman or Margaret Atwood or my friend Dennis Danvers (he’s in that database!) and then try to present that work as something the AI wrote. Rather, they trained (taught?) the AI by having the program “read” these books. Isn’t that just how learning works? I mean, everything I’ve ever written has been influenced in direct and indirect ways by other texts I’ve read (or watched, listened to, seen, etc.). Other than scale (because I sure as heck have not read 183,000 books), what’s the difference between me “training” by reading the work of others and the AI doing this?

Of course, even with all of this training and the continual tweaking of the software, AIs still have the problem of making shit up. Cade Metz wrote in The New York Times “Chatbots May ‘Hallucinate’ More Often Than Many Realize.” Among other things, the article is about a new start-up called Vectara that is trying to estimate just how often AIs “hallucinate,” and (to leap ahead a bit) they estimated that different AIs hallucinate at different rates ranging from 3% to 27% of the time. But it’s a little more complicated than that.

Because these chatbots can respond to almost any request in an unlimited number of ways, there is no way of definitively determining how often they hallucinate. “You would have to look at all of the world’s information,” said Simon Hughes, the Vectara researcher who led the project.

Dr. Hughes and his team asked these systems to perform a single, straightforward task that is readily verified: Summarize news articles. Even then, the chatbots persistently invented information.

“We gave the system 10 to 20 facts and asked for a summary of those facts,” said Amr Awadallah, the chief executive of Vectara and a former Google executive. “That the system can still introduce errors is a fundamental problem.”

If I’m understanding this correctly, this means that even when you give the AI a fairly small data set to analyze (10-20 “facts”), the AI still makes shit up, adding things that are not part of that data set. That’s a problem.
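(If you want a sense of what that kind of test looks like at a tiny scale, here is a toy sketch. To be clear, this is not Vectara’s actual methodology; it’s just my own back-of-the-envelope version of “give the model a handful of facts, ask for a summary, and then ask whether the summary sticks to those facts.” It assumes the openai Python package (v1.x), an OPENAI_API_KEY set in the environment, and prompts, facts, and model names I made up.)

```python
# Toy sketch of a hallucination check -- NOT Vectara's method, just an illustration.
# Assumes the `openai` package (v1.x) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# A small, made-up "data set" of facts, like the 10-20 facts described above.
facts = [
    "The school board met on Tuesday evening.",
    "The meeting lasted two hours.",
    "Three parents spoke during the public comment period.",
]
facts_text = "\n".join(facts)

# Step 1: ask the model to summarize only the facts we gave it.
summary = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Summarize ONLY the facts provided. Do not add anything else."},
        {"role": "user", "content": facts_text},
    ],
).choices[0].message.content

# Step 2: a crude LLM-as-judge check -- is every claim in the summary supported?
verdict = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Answer YES or NO: is every claim in the summary supported by the facts?"},
        {"role": "user", "content": f"Facts:\n{facts_text}\n\nSummary:\n{summary}"},
    ],
).choices[0].message.content

print("Summary:", summary)
print("Supported by the facts?", verdict)
```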

But it still might not stop me from trying to develop some kind of ChatGPT/AI-based grading tool, and that might be about to get a lot easier. (BTW, talk about burying the lede after that headline!)  OpenAI announced something they’re calling (very confusingly) “GPTs,” which (according to this article by Devin Coldewey in TechCrunch) is “a way for anyone to build their own version of the popular conversational AI system. Not only can you make your own GPT for fun or productivity, but you’ll soon be able to publish it on a marketplace they call the GPT Store — and maybe even make a little cash in the process.”

Needless to say, my first thought was: could I use this to make an AI grading tool? And do I have the technical skills?

As far as I can tell from OpenAI’s announcement about this, GPTs require upgrading to their $20 a month package, and it’s just getting started; the GPT Store is rolling out later this month, for example. Kevin Roose of The New York Times has a thoughtful and detailed article about the dangers and potentials of these things, “Personalized A.I. Agents Are Here. Is the World Ready for Them?” User-created agents will very soon be able to automate responses to questions (that OpenAI announcement has examples like a “Creative Writing Coach,” a “Tech Advisor” for trouble-shooting things, and a “Game Time” advisor that can explain the rules of card and board games). Roose writes a fair amount about how this technology could also be used by customer service or human resource offices, and to handle things like responding to emails or updating schedules. Plus none of this requires any actual programming skills, so I am imagining something like “If This Then That” but much more powerful.

AI agents might also be made to do evil things, which has a lot of security people worried for obvious reasons. Though I don’t think these agents are going to be powerful enough to do anything too terrible; actually, I don’t think these agents will have the capabilities to make the AI grading app I want, at least not yet. Roose got early access to the OpenAI project, and his article has a couple of examples of how he played around with it:

The first custom bot I made was “Day Care Helper,” a tool for responding to questions about my son’s day care. As the sleep-deprived parent of a toddler, I’m always forgetting details — whether we can send a snack with peanuts or not, whether day care is open or closed for certain holidays — and looking everything up in the parent handbook is a pain.

So I uploaded the parent handbook to OpenAI’s GPT creator tool, and in a matter of seconds, I had a chatbot that I could use to easily look up the answers to my questions. It worked impressively well, especially after I changed its instructions to clarify that it was supposed to respond using only information from the handbook, and not make up an answer to questions the handbook didn’t address.

That sounds pretty cool, and I bet I could create an AI agent capable of writing a summative end comment on a student essay based on a detailed grading rubric I feed into the machine. But that’s a long way from doing the kind of marginal commenting on student essays that responds to particular sentences, phrases, and paragraphs. I want an AI agent/grading tool that can “read” a piece of student writing more like the way I would read and comment on it, one that isn’t just limited to a rubric.
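Just to give a sense of what I’m imagining, here is a rough sketch of that rubric-driven end comment as a little script rather than a custom GPT. This is hypothetical: it assumes the openai Python package (v1.x), an OPENAI_API_KEY in the environment, and a plain-text rubric and essay saved as files; the file names, model choice, and prompt wording are all mine, not anything OpenAI prescribes.

```python
# Hypothetical sketch of a rubric-driven end-comment generator.
# Assumes the `openai` package (v1.x) and OPENAI_API_KEY in the environment.
# File names, model choice, and prompt wording are placeholders.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

rubric = Path("rubric.txt").read_text()        # the detailed grading rubric
essay = Path("student_essay.txt").read_text()  # one student draft

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a first-year writing instructor. Using ONLY the rubric "
                "below, write a summative end comment on the student essay: "
                "note strengths, areas to revise, and a suggested rating for "
                "each rubric category. Do not invent criteria that are not in "
                "the rubric.\n\nRUBRIC:\n" + rubric
            ),
        },
        {"role": "user", "content": essay},
    ],
)

# The draft end comment, for the instructor to read, edit, and sign off on.
print(response.choices[0].message.content)
```

Even in a sketch like this, I’d still be the one reading the comment and deciding what to keep, which is about all I’d trust it to do right now.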

But this is getting a lot closer to being potentially useful: not a substitute for me actually reading and evaluating student writing, but a tool to make that easier to do. Right now, the free version of ChatGPT does a good job of revising away grammar and style errors, so maybe instead of me making marginal comments on a draft about those issues, students could first use the AI to help them do this kind of low-level revision before they turn it in. That, combined with a detailed end comment from the AI, might actually work well. I’m not quite sure this would actually save me any time, though, since setting up the AI seems like it would take a while, and I have a feeling I’d have to set up a new AI agent for every unique assignment. Plus, on top of the setup time, it would cost me $20 a month.

Maybe for next semester….
