Press "Enter" to skip to content

Tag: Feedback

Navigating the Future: Innovation and Integrity in the Era of AI

I was at St James’ Park today, a place I believe the local football fans are rather fond of, but I was there for Turnitin’s first roundtable discussion since before the pandemic. To start this post with ‘not AI’: we had a look at Turnitin’s product roadmap, which is all about the new Feedback Studio. The new version has been redesigned from the ground up to be screen reader accessible, a common complaint about the old version, and to be fully responsive, rather than Turnitin developing mobile apps for the platform. The rubric manager has also been rewritten to improve the management and archiving of rubrics, and to add the ability to import rubrics from common file formats like Excel, rather than the proprietary format they previously used. It goes live on July 15th, but institutions can opt out, and Turnitin are expecting a long period of transition. Alas, we are switching to the Canvas framework integration, so our staff won’t benefit from this.

And that’s about it for ‘not AI’. In the opening remarks Turnitin presented the outcomes of a global staff and student survey on perceptions of generative artificial intelligence. Overall, 78% of respondents were positive about the potential of AI, while at the same time 95% believed that AI was being misused. Among students only, 59% were concerned that an over-reliance on AI would result in reduced critical thinking skills (I have thoughts on this that I’ll circle back to later). In the slightly blurry photo above (I was sat at the back) you can see the survey results broken down by region, showing that in the UK and Ireland we are the least optimistic about AI having a positive impact on education, at only 65%, while India has the most positive outlook at 93%. All regions report being overwhelmed by the availability and volume of AI, which is unsurprising when every application and website is adding spurious AI tools to their services in a desperate attempt to be The One that sticks and ends up making a profit. (Side note to remind everyone that no-one is making any money out of actual AI systems in the current boom; these large language models are horrifically expensive to train and run, and the whole thing is being sustained by investment capital in a huge gamble on future returns. What could possibly go wrong!?)

The keynote address was delivered by Stephen Gow, Leverhulme Research Fellow at Edinburgh Napier University, who discussed the StudentXGenAI research project, and the ELM tool at the University of Edinburgh, an institutionally provided front-end for accessing various language models which has safeguards built in to prevent misuse. Stephen reported on the mixed success of this. While it seems like a good idea, and the kind of thing I believe universities should be providing to ensure equitable access for all students, uptake has been poor, and students report that they don’t like using the tool because they feel it’s ‘spying on them’ and would rather use AI models directly – highlighting issues of trust and autonomy. Stephen pointed us to C. Thi Nguyen’s paper ‘Trust as an Unquestioning Attitude’ for a more detailed discussion of trust as it pertains to complex IT systems, and how trust should be viewed not as a binary, but as a delicate and negotiated balance.

During our breakout roundtable discussions, my group discussed how AI is a divisive issue: people either love it or hate it, with few in the middle ground. There is some correlation along generational lines here, with younger staff and students being more positive, but it isn’t an exact mapping. One of my table colleagues reported having an intern, a young, recent graduate, who refuses to use any Gen AI systems on environmental and ethical grounds, while another colleague won’t use it because they fear offloading their thinking skills to it. That was the second time such a sentiment had been expressed today, and it made me think of the parallels with the damage that social media has done to attention spans; but while that concept took a long time to enter the public consciousness (and we are barely starting to deal with the ramifications), there seem to be more voices raising the problem of AI’s impact on cognitive ability, and it’s happening sooner in the cycle, which gives me some limited optimism. Another colleague at my table also introduced me to the concept of ‘AI shaming’, from a paper by Louie Giray.

Finally, we were given a hands-on experience of Clarity, Turnitin’s new product which provides students with a web interface for written assessments with a built-in AI chat assistant. The idea is to provide students with an AI system that they can use safely, and which gives confidence to both them and their tutors that there has been no abuse of Gen AI to write the essay. I like the idea of this, and I have advocated for Sunderland to provide clear guidance to students on what they can and can’t use, and to provide something legitimate for students which would have safeguards of some kind to prevent misuse. Why, therefore, when presented with just such a solution, was I so sceptical and disappointed, unable to see anything but its flaws? Maybe the idea just doesn’t work in practice.

I was hoping to see and learn more about Clarity today, so I was very pleased that we were given this opportunity. Of course I immediately started to try and break it. I went straight in with the strawberry test, but the system just kept telling me it wouldn’t help with spelling, and directed me to write something addressing the essay question instead. I did get it to break though. First, by inserting the word into my essay and asking it to check my spelling and grammar; then, once I had something written in the input window, I found that it would actually answer the question directly, reporting that ‘strawberries’ is spelled with one r and two b’s. Fail. When I overheard a colleague at another table reporting that it seemed to be directing them to use US English spelling, I decided to experiment by translating my Copilot-produced ‘essay’ into Spanish with Google Translate. Clarity then informed me that the assignment required the essay to be in English, a straight-up hallucination as there was no such instruction. What there is, as Turnitin told us, is that the system has been built on US English and can’t yet properly handle other variations and languages. They were also quite transparent about the underlying technology, which is based on Anthropic’s Claude model. I appreciated this, as I have found other companies offering AI tools to be evasive, insisting that they have developed their own models based on their own training data only, which I’m highly sceptical about given the resource requirements.

Fun as it may be to try and break AI models with spelling challenges, it’s not what they are built for, and there is an old-fashioned spell checker built into the text entry box. However, that doesn’t mean that when presented with an AI chatbot in a setting like this, students aren’t going to ask it questions about spelling and grammar. This seems like a perfectly legitimate use case, and the reason I suspect Turnitin have installed a ‘guard rail’ here is that they are well aware that large language models are no good for this kind of question, just as they are no good for mathematical operations. Or, for that matter, providing straight facts. The trend of people using these models as if they were search engines should frighten everyone. Our table chuckled when one of us reported that ChatGPT was confidently telling them that Nigel Farage was the Prime Minister (did I say chuckle? I meant shudder), but more subtle errors can be far harder to spot, and could have terrible ramifications in the fractured, post-truth world we’ve built. I’m sure I’ve said something like this before on here, and I probably will again, but calling these systems ‘intelligent’ has been a huge mistake. There is no intelligence to be found here. There is no understanding. Only very sophisticated prediction systems for what comes next after a given input.
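To make that last point concrete, here is a deliberately trivial sketch of ‘predicting what comes next’: a toy bigram model in Python. Real large language models are vastly more sophisticated, but the principle of continuing an input rather than understanding it is the same; the corpus and code below are purely illustrative.

```python
from collections import Counter, defaultdict
import random

# Toy corpus: the only "knowledge" this model will ever have.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Pick the next word in proportion to how often it followed `word` in the corpus."""
    candidates = following[word]
    if not candidates:  # dead end: fall back to a random word from the corpus
        return random.choice(corpus)
    return random.choices(list(candidates), weights=list(candidates.values()))[0]

# "Generate" text: no understanding, just repeated continuation of the input.
text = ["the"]
for _ in range(6):
    text.append(predict_next(text[-1]))
print(" ".join(text))
```

The output is fluent-looking nonsense about cats and mats, which is the whole point: scale this idea up by many orders of magnitude and you get plausible text, not facts.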

I’m most doubtful about the assumption that students will want to use Clarity in the first place. Am I giving myself away as old when I say that I would never even contemplate writing something as important as a multi-thousand word essay in an online web interface that requires a stable, constant internet connection? Clarity has no ability for students to upload their written work, and though you can copy and paste text into it, this would be immediately flagged by Clarity as an issue for investigation. There’s no versioning, no ability to export and save offline, limited formatting options and fonts, no ability to use plugins for reference management, and so on. I also can’t imagine any circumstances in which I would recommend students use Clarity. It is not an infrequent problem that academics come to us reporting that they have spent hours writing student feedback in Turnitin, only to find out later that their comments haven’t saved properly and just aren’t there. It is such a big problem that we routinely train our staff to write all of their feedback offline first, and then copy and paste it into Feedback Studio. Colleagues in the room challenged Turnitin about this, and the response was that in their evaluation students reported being very happy with the system.

Nevertheless, Turnitin believe that some kind of process validation is going to be necessary to ensure the academic integrity of written work going forwards, and I do think they have a point. But the only way I can see Clarity, or something like it, working is if academics mandate its use for assessment, with students having to do everything in the browser, in which case, unless they are teaching a module on how to alienate your students and make them hate you, it isn’t going to go down well. As much as Turnitin would like it to be so, I don’t think there’s a technological solution to this problem. I increasingly think that in order to validate student knowledge and understanding we are going to have to use some level of dialogic assessment, which doesn’t scale in the highly marketised higher education system we now find ourselves in.

AI Disclaimer: There is no ethical use of generative artificial intelligence. The environmental cost is devastating and the technology is built on plagiarised content and stolen art, for the purpose of deskilling, disempowering and replacing the work of real people.

Innovation and Integrity in the Age of AI

I don’t usually attend these Turnitin product updates, not out of a lack of interest, but because they lie more with the other half of the team here at Sunderland, so I leave them to it and to cascade what’s important to the rest of us when required. This one piqued my interest though, after seeing a preview of the new user interface at NELE last week. You can see some of the planned changes to the Feedback Studio and the Similarity Report view above. I asked a question about the lack of audio feedback following NELE, and was told that this, along with new video feedback capabilities, is on the roadmap and coming soon.

I was also interested in their new Clarity tool, which will allow students to submit or write their work through a web interface and get immediate feedback, with help on how to improve their writing from Turnitin’s AI chatbot. This is very similar to how Studiosity’s Writing Feedback+ service works, so it’s going to be very interesting to see how it develops.

AI Disclaimer: There is no ethical use of generative artificial intelligence. The environmental cost is devastating and the technology is built on plagiarised content and stolen art, for the purpose of deskilling, disempowering and replacing the work of real people.

AI-Augmented Marking

Chart showing correlation of human and KEATH.ai grading
Accuracy of KEATH.ai Grading vs. Human Markers

This was a HeLF webinar facilitated by Christopher Trace at the Surrey Institute of Education, to provide us with an introduction to KEATH.ai, a new generative AI powered feedback and marking service which Surrey have been piloting.

It looked very interesting. The service was described as a small language model, meaning that it is trained on very specific data which you – the academic end user – feed into it. You provide some sample marked assignments and the rubric they were marked against, and the model can then grade new assignments with a high level of concurrence with human markers, as shown in the chart above from Surrey’s analysis of the pilot. Feedback and grading of a 3–5,000 word essay-style assignment takes less than a minute, and even with that being moderated by the academic for quality, which was highly recommended, it is easy to see how the system could save a great deal of time.
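As a rough illustration of the kind of agreement check that sits behind a chart like the one above (the marks and method here are my own invention, not Surrey’s analysis), you can compare human and model grades for the same scripts:

```python
from statistics import correlation, mean

# Illustrative marks only: the same eight scripts graded by a human and by the model.
human_marks = [62, 55, 71, 48, 80, 66, 58, 74]
model_marks = [60, 57, 69, 50, 78, 68, 55, 76]

pearson_r = correlation(human_marks, model_marks)  # how closely the two sets of marks track each other
mean_abs_diff = mean(abs(h - m) for h, m in zip(human_marks, model_marks))

print(f"Pearson r: {pearson_r:.2f}")
print(f"Mean absolute difference: {mean_abs_diff:.1f} marks")
```

A high correlation with a small average difference is what ‘concurrence with human markers’ cashes out to, though of course it says nothing about whether either marker was right.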

In our breakout rooms, questions arose around what the institution would do with this ‘extra time’, whether it would even be willing to pay the new upfront cost of such a service when the cost of marking and feedback work is already embedded into the contracts of academic and teaching staff, and how students would react to their work being AI graded. Someone in the chat shared this post by the University of Sydney discussing some of these questions.


Studiosity Partner Forum 2024

I attended my third Studiosity Partner Forum today, which in a sense began last night with a dinner and discussion about generative artificial intelligence led by Henry Ajder. Generative AI, and Studiosity’s new GAI-powered Writing Feedback+ service, was of course the main topic of conversation throughout the event. Writing Feedback+ launched in February, and they have reported that uptake is around 40% of eligible students, which compares with 15–20% for the classic Writing Feedback service. The model has been built and trained internally, using only writing feedback provided by Studiosity’s subject specialists and no student data. The output of WF+ is being closely quality assured by those specialists, and they estimate that its quality is around 95–97% as good as human-provided feedback.

David Pike, from the University of Bedfordshire, presented on their experience with the service in the afternoon. They made it available to all of their students, around 20,000, in February, and usage has already exceeded that of the classic Writing Feedback service since September last year. The average return time from WF+ is around one and a half minutes, and student feedback on the service is very positive at 88.5%. However, he did also note that a number of students who have used both versions of the service stated that they preferred the human-provided feedback.

On the flip side of AI, last year Studiosity were exploring a tool to detect submissions which had been written by generative AI. That’s gone. Nothing has come of it as they found that the reliability wasn’t good enough to roll out, especially so for students who have English as a second language. No surprises for me there, detection is a lie.

The keynote address was delivered by Nick Hillman from the Higher Education Policy Institute (HEPI), who talked about their most recent report on the benefits and costs associated with the graduate visa route. It’s overwhelmingly positive for us as a country, and it would be madness to limit this.

Other things which I picked up included learning more about Crossref, a service for checking the validity of academic references; a recommendation for a course on Generative AI in Higher Education from FutureLearn; and Integrity Matters, a new course developed by the University of Greenwich and Bloom to teach new students about academic integrity.

Finally, I was presenting myself, giving my Studiosity talk about our implementation at Sunderland and the data we now have showing a strong positive correlation between engagement with Studiosity and student outcomes and continuation.
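To give a flavour of what a correlation with continuation looks like in its simplest form (the numbers below are invented for illustration, not our actual Sunderland data), the basic comparison is between students who did and didn’t engage with the service:

```python
# Hypothetical cohort counts, for illustration only.
engaged = {"continued": 430, "withdrawn": 20}
not_engaged = {"continued": 1800, "withdrawn": 200}

def continuation_rate(group: dict) -> float:
    """Proportion of the group who continued their studies."""
    total = group["continued"] + group["withdrawn"]
    return group["continued"] / total

print(f"Engaged:     {continuation_rate(engaged):.1%}")
print(f"Not engaged: {continuation_rate(not_engaged):.1%}")
# A gap like this is correlation, not causation: engaged students may simply be
# the more motivated ones, which is why the analysis needs careful interpretation.
```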


ALT NE User Group: June 2023

A photo of Durham's lightboard in action
Durham University’s Lightboard, a very cool (but smudgy) piece of tech

Hosted by my lovely colleagues at Durham, this ALT North East meeting began with a discussion of the practice of video assessment. I talked through what we do at Sunderland using Canvas and Panopto, covering our best practice advice and talking through the things which can go wrong. The problem of a VLE having multiple tools for recording / storing video was one such headache shared by all of us, no matter what systems we are using.

We then moved on to a discussion about Turnitin, ChatGPT and AI detection, pretty much a standing item now. Dan shared with us a new tool he has come across, which I’m not going to name or share, which uses AI to autocomplete MCQs. A new front has emerged. There was some bravery from Northumbria, who must be one of the few HEIs to have opted in to Turnitin’s beta checker, and New College Durham are going all in on the benefits of generative writing to help staff manage their workload by, for example, creating lesson plans for them. A couple of interesting experiments to keep an eye on there.

After lunch we had demonstrations of various tools and toys in Durham’s Digital Playground Lab. This included a Lightboard. This is a really cool and simple piece of tech that lets presenters write on a transparent board between them and the camera using UV pens. I came across this a few years ago, before the pandemic I think, but it’s a strange beast. It’s not a commercial system, but open hardware, so anyone can build one for themselves at little cost. Unfortunately at Sunderland, and I suspect at many bureaucracies, this actually makes it a lot harder to get one than just being able to go to a supplier. So it never happened, but at least today I got to see one live.

Another bespoke system demonstrated was a strip of LED lights around the whiteboard, controlled through a web app, which allows students to discreetly indicate their level of comprehension. We had a short tour of the Playground’s media recording room, watched some video recordings of content created in VR to, for example, show the interaction of the magnetic fields of objects, saw a demonstration of Visual PDE, an open source web tool for demonstrating differential equations, and Kaptivo, a system for capturing the content of a whiteboard but not the presenter. You can see the Kaptivo camera in the background of my photo, behind the Lightboard.


Studiosity Partner Forum 2022

My first in-person conference in two years at the University of Roehampton’s gorgeous campus was a chance to learn about Studiosity’s plans for the future, to network with colleagues at other UK HEIs using Studiosity and compare notes, and pretty randomly, I was able to get a tour of Roehampton’s new library building during lunchtime (it’s lovely).

On those future plans, we’re going to see an enhanced version of the student feedback view in the next couple of months which is going to allow their subject specialists to insert short videos and infographics explaining particular grammatical concepts, issues with spelling, and so on. They are also introducing a new ‘Student Connect’ tool which will help to facilitate peer-to-peer student support. This is currently in beta testing, and two UK universities are part of this evaluation.

The keynote address was by Sir Eric Thomas, who sits on Studiosity’s Academic Advisory Board, and he made a great point that, looking at historical precedents from past plagues, people at the time always think, “this is going to change everything, we can’t go back to how things used to be”, but invariably things do go back to exactly how they were once the threat is over. He speculated that this was because plagues and pandemics leave physical infrastructure unchanged, in contrast to wars, where the physical act of rebuilding allows for societal changes to be literally built in. However, what may be different as we ‘re-build’ after Covid, is that new communication technologies such as Teams and Zoom have come into their own and already effected change in how we live and work. The permanence of these changes is something that lingers in my mind as I contemplate my future.

There were good opportunities for informal chats with colleagues at more advanced stages of Studiosity use, though no easy answers to be had in terms of managing use and expectations, or of showing causal links between use of the service and student retention and attainment, something I’m in the midst of grappling with now as we approach the end of our pilot.


PG Cert AP: Day 14

The final day of the core module began with a session delivered by a guest lecturer who talked about workplace literacy and how the non-academic writing we do on a day-to-day basis is as valuable as academic writing and teaching in forming our professional identities. This was based on a paper by Mary Lea and Barry Stierer – Lecturers’ everyday writing as professional practice in the university as workplace: new insights into academic identities.

In the afternoon there was a catch-up for a few people who missed the peer teaching session, followed by another run of the nominal group feedback exercise to gather our feedback on the module now that it has finished.


PG Cert AP: Day 8

The first day of my optional module, Assessment and Feedback for Learning, began with a discussion of how assessment can be used for learning, rather than as a tool to measure learning. The module has this concept at its core and, as such, its main assessment is to critically analyse two assessments that you have used or written previously. There is also a second assessment: a personal reflective report on how you have found the problem-based learning approach taken in this module, and how what you have learned impacts your own academic practice. Very meta.

After setting out the learning objectives and the assessments of the module, the remainder of the day was spent discussing the various factors and contexts which influence how assessments are set and marked. These included how student expectations have changed as a result of the marketisation of the sector, the university’s generic assessment criteria and how they relate to the learning outcomes of individual modules, and the cascading down of risk onto lecturers, e.g. pressures around graduate employability and how that influences the assessments which are set.

We also discussed the difference between formative and summative assessment, and how and why students often see formative assessments as optional. There was a little about Foucault’s ‘regimes of truth’ (got to love a bit of Foucault!), and the concepts of the hidden curriculum and hidden expectations – that everyone has a certain baseline IT literacy, for example.


PG Cert AP: Day 6

The final day of the first semester was a little unusual. The morning was given over to a review of the assignments for this module which are to complete the UKPSF form, critique a learning session, analyse a learning theory, and write a report on the experience of peer observation, comparing the experience of being the observer and the observee. Drafts are due at the end of semester 2, with final versions by September. All well and good, and all covered in the module guide. This session didn’t add anything, and yet we did literally spend the entire morning debating it. Strange things happen when you have academics as students.

The afternoon session was more useful. First there was a short presentation on evaluation in general, why and how to do it, followed by an introduction to nominal group technique. A definition of evaluation was given as ‘assessing the process and practice of a prior learning strategy or event by feedback and trying to make objective summaries of an often subjective interpretation.’ This was followed by a discussion on the different types of evaluation – student, staff, data, and self – and the difference between quality assurance, which is backwards looking and tends to be about accountability, and quality enhancement, which is about how to improve and develop your programme or module.

With quality enhancement in mind, nominal group technique was then introduced followed by actually using it to evaluate this first semester. As a group, and with the programme leader absent, we drew up two lists of ten to twelve points of things that are going well, and things which we think need to be improved. These were written on a board in no particular order, then individually we had ten votes, or points, with which to rank what we thought were the most important points. So for example, if you thought that ‘over-assessment’ and ‘use of VLE’ were the two most important things that needed to be improved upon, then you could give each one five votes. The programme leader was then invited back in and the votes were added up to show what we collectively ranked as the most important things for improvement, and what we felt was going well. The outcome of this evaluation will be actively used in the development of the programme for the second semester.
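For anyone who wants to replay the voting step, here is a small sketch of the tallying described above; the ballots are invented examples, borrowing the ‘over-assessment’ and ‘use of VLE’ issues mentioned in the session:

```python
from collections import Counter

# Each participant distributes exactly ten points across the issues they care most about.
ballots = [
    {"over-assessment": 5, "use of VLE": 5},
    {"over-assessment": 7, "feedback turnaround": 3},
    {"use of VLE": 4, "feedback turnaround": 4, "over-assessment": 2},
]

totals = Counter()
for ballot in ballots:
    assert sum(ballot.values()) == 10, "each participant has exactly ten points"
    totals.update(ballot)

# The summed points give the group's collective ranking of priorities.
for issue, score in totals.most_common():
    print(f"{issue}: {score}")
```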


Turnitin UK User Summit

Photo: student survey results on feedback preferences

Attended the afternoon sessions of Turnitin’s UK user summit, which focused on customer experience, with talks from colleagues at the University of Edinburgh, the University of East London, Newcastle University and the University of Huddersfield. It’s always cathartic to hear colleagues sharing tales of woe and horror which are so familiar from your own work, like the academics who insist on treating the originality score as sacrosanct when making a plagiarism decision, but more productively there were some really good ideas and pieces of best practice shared. One colleague was using Blackboard’s adaptive release function to hide the Turnitin assignment submission link until students had completed a ‘quiz’ which simply made them acknowledge in writing that the work they were about to submit was all their own. A couple of people presented their research findings on what students wanted from feedback, such as in the attached photo which shows a clear preference for electronic feedback. Someone made a product development suggestion: splitting the release of the grade and the feedback in Turnitin so that students have to engage with their feedback before they get their grade. But I think my personal highlight from the day was the very diplomatic description of difficult customers as those who have ‘higher than average expectations’.

Though I missed out on the morning session due to another commitment, I was able to get the gist from networking with colleagues in-between sessions. Improvements to the Feedback Studio include the ability to embed links, multiple file upload, and a new user portal which will show the most recent cases raised by people at your institution. The development I found most interesting was the ability to identify ghost-written assignments. This is still quite a way from being ready, but it’s an increasing problem and one Turnitin has in their sights. They couldn’t reveal too much about how this will work for obvious reasons, but the gist is that they will attempt to build up a profile of the writing style of individuals so that they can flag up papers which seem to be written differently.
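Turnitin didn’t say how the profiling will work, but to give a sense of the general idea, here is a minimal sketch of one classic stylometric approach – entirely my own illustration, not Turnitin’s method: compare the relative frequency of common function words in a student’s known writing against a new submission.

```python
import math
from collections import Counter

# Illustrative only: common function words whose relative frequency
# tends to be characteristic of an author's style.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "for"]

def style_vector(text: str) -> list[float]:
    """Relative frequency of each function word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Placeholders: in reality these would be a student's verified previous work
# and the new submission being checked.
known_work = "..."
new_submission = "..."

similarity = cosine_similarity(style_vector(known_work), style_vector(new_submission))
print(f"Style similarity: {similarity:.2f}  (a low score might flag the paper for review)")
```

Whatever Turnitin actually build will no doubt be far more elaborate, but even this toy version shows why the approach is fraught: writing style shifts with topic, genre and time, so false flags seem inevitable.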

The Twitter conversation from the summit is available from the TurnitinUKSummit hashtag, where you will see I won the Top Tweet! Yay me, but alas there were no prizes.
