I was at St James’ Park today. I believe the local football fans are rather fond of the place, but I was there for Turnitin’s first roundtable discussion since before the pandemic. To start this post with ‘not AI’, we had a look at Turnitin’s product roadmap, which is all about the new Feedback Studio. The new version has been redesigned from the ground up to be screen reader accessible, addressing a common complaint about the old version, and to be fully responsive, rather than Turnitin developing mobile apps for the platform. The rubric manager has also been rewritten to improve the managing and archiving of rubrics, and to add the ability to import rubrics from common file formats like Excel, rather than the proprietary format they used previously. It goes live on July 15th, but institutions can opt out, and they are expecting a long period of transition. Alas, we are switching to the Canvas framework integration, so our staff won’t benefit from this.
And that’s about it for ‘not AI’. In the opening remarks Turnitin presented the outcomes of a global staff and student survey on perceptions of generative artificial intelligence. Overall, 78% of respondents were positive about the potential of AI, while at the same time 95% believed that AI was being misused. Among students only, 59% were concerned that an over-reliance on AI would result in reduced critical thinking skills (I have thoughts on this that I’ll circle back to later). In the slightly blurry photo above (I was sat at the back) you can see the survey results broken down by region, showing that in the UK and Ireland we are the least optimistic about AI having a positive impact on education, at only 65%, while India has the most positive outlook at 93%. All regions report being overwhelmed by the availability and volume of AI, which is unsurprising when every application and website is adding spurious AI tools to their services in a desperate attempt to be The One that sticks and ends up making a profit. (Side note to remind everyone that no-one is making any money out of actual AI systems in the current boom: these large language models are horrifically expensive to train and run, and the whole thing is being sustained by investment capital in a huge gamble on future returns. What could possibly go wrong!?)
The keynote address was delivered by Stephen Gow, Leverhulme Research Fellow at Edinburgh Napier University, who discussed the StudentXGenAI research project and the ELM tool at the University of Edinburgh, which is an institutionally provided front end for accessing various language models, but with safeguards built in to prevent misuse. Stephen reported on the mixed success of this. While it seems like a good idea, and the kind of thing I believe universities should be providing to ensure equitable access for all students, uptake has been poor. Students report that they don’t like using the tool because they feel it’s ‘spying on them’, and would rather use AI models directly, which highlights issues of trust and autonomy. Stephen pointed us to C. Thi Nguyen’s paper ‘Trust as an Unquestioning Attitude’ for a more detailed discussion of trust as it pertains to complex IT systems, and how trust should be viewed not as a binary, but as a delicate and negotiated balance.
During our breakout roundtable discussions, my group discussed how AI is a divisive issue: people either love it or hate it, with few in the middle ground. There is some correlation along generational lines here, with younger staff and students being more positive, but it isn’t an exact mapping. One of my table colleagues reported having an intern, a young, recent graduate, who refuses to use any Gen AI systems on environmental ethical grounds, while another colleague won’t use it because they fear offloading their thinking skills to it. That was the second time such a sentiment had been expressed today, and it made me think of the parallels with the damage that social media has done to attention spans. But while that concept took a long time to enter the public consciousness (and we are barely starting to deal with the ramifications), there seem to be more voices raising the problem of AI’s impact on cognitive ability, and it’s happening sooner in the cycle, which gives me some limited optimism. Another colleague at my table also introduced me to the concept of ‘AI shaming’, from a paper by Louie Giray.
Finally, we were given a hands-on experience of Clarity, Turnitin’s new product which provides students with a web interface for written assessments with a built-in AI chat assistant. The idea is to provide students with an AI system that they can use safely, and which gives confidence to both them and their tutors that there has been no abuse of Gen AI to write the essay. I like the idea of this, and I have advocated for Sunderland to provide clear guidance to students on what they can and can’t use, and to provide something legitimate for students which would have safeguards of some kind to prevent misuse. Why, therefore, when presented with just such a solution, was I so sceptical and disappointed, unable to see anything but its flaws? Maybe the idea just doesn’t work in practice.
I was hoping to see and learn more about Clarity today, so I was very pleased that we were given this opportunity. Of course I immediately started to try and break it. I went straight in with the strawberry test, but the system just kept telling me it wouldn’t help with spelling, and directed me to write something addressing the essay question instead. I did get it to break, though. First I inserted the word into my essay and asked it to check my spelling and grammar, but once I had something written in the input window I found that it would actually answer the question directly, reporting that ‘strawberries’ is actually spelled with one r and two b’s. Fail. When I overheard a colleague at another table reporting that it seemed to be directing them to use US English spelling, I decided to experiment by translating my Copilot-produced ‘essay’ into Spanish with Google Translate. Clarity then informed me that the assignment required the essay to be in English, a straight-up hallucination as there was no such instruction. What there was, as Turnitin told us, was a system that has been built on US English and can’t yet properly handle other variations and languages. They were also quite transparent about the underlying technology, which is based on Anthropic’s Claude model. I appreciated this, as I have found other companies offering AI tools to be evasive, insisting that they have developed their own models based on their own training data only, which I’m highly sceptical about given the resource requirements.
Fun as it may be to try and break AI models with spelling challenges, it’s not what they are built for, and there is an old-fashioned spell checker built into the text entry box. However, that doesn’t mean that when presented with an AI chatbot in a setting like this, students aren’t going to ask it questions about spelling and grammar. This seems like a perfectly legitimate use case, and the reason I suspect that Turnitin have installed a ‘guard rail’ here is that they are well aware that large language models are no good for this kind of question, just as they are no good for mathematical operations. Or, for that matter, providing straight facts. The trend of people using these models as if they were search engines should frighten everyone. Our table chuckled when one of us reported that ChatGPT was confidently telling them that Nigel Farage was the Prime Minister (did I say chuckle? I meant shudder), but more subtle errors can be far harder to spot, and could have terrible ramifications in the fractured, post-truth world we’ve built. I’m sure I’ve said something like this before on here, and I probably will again, but calling these systems ‘intelligent’ has been a huge mistake. There is no intelligence to be found here. There is no understanding. Only very sophisticated prediction systems about what comes next after a given input.
I’m most doubtful about the assumption that students will want to use Clarity in the first place. Am I giving myself away as old when I say that I would never even contemplate writing something as important as a multi-thousand-word essay in an online web interface that requires a stable, constant internet connection? Clarity has no ability for students to upload their written work, and though you can copy and paste text into it, this would be immediately flagged by Clarity as an issue for investigation. There’s no ability for versioning, no ability to export and save offline, limited formatting options and fonts, no ability to use plugins for reference management, etc. I also can’t imagine any circumstances in which I would recommend students use Clarity. It is not an infrequent problem that academics come to us reporting that they have spent hours writing student feedback in Turnitin’s Writing Feedback tool, only to find out later that their comments haven’t saved properly and just aren’t there. It is such a big problem that we routinely train our staff to write all of their feedback offline first, and then copy and paste it into Feedback Studio. Colleagues in the room challenged Turnitin about this, and the response was that in their evaluation students reported being very happy with the system.
Nevertheless, Turnitin believe that some kind of process validation is going to be necessary to ensure the academic integrity of written work going forwards, and I do think they have a point. But the only way I can see Clarity, or something like it, working is if academics mandate its use for assessment, with students having to do everything in the browser. In that case, unless they are teaching a module on how to alienate your students and make them hate you, it isn’t going to go down well. As much as Turnitin would like it to be so, I don’t think there’s a technological solution to this problem. I increasingly think that in order to validate student knowledge and understanding we are going to have to use some level of dialogic assessment, which doesn’t scale in the highly marketised higher education system we now find ourselves in.