‘Deploying AI agents’ sounds so hi-tech and futuristic to (non Comp-Sci) me whilst weirdly also resonating with the classic 60s and 70s TV shows I loved as a kid. I have been fiddling for a while on the blurred boundaries between LLMs and agents, notably with Claude, but what appealed when I first saw Manus was the execution of outputs seemingly beyond what Claude can manage. Funnily enough it looks quite a bit like Claude but it seems it is actually a multi-tool agent. I pretty much concur with the conclusion from MIT Technology Review:
While it occasionally lacks understanding of what it’s being asked to do, makes incorrect assumptions, or cuts corners to expedite tasks, it explains its reasoning clearly, is remarkably adaptable, and can improve substantially when provided with detailed instructions or feedback. Ultimately, it’s promising but not perfect.
Anyway, I finally got in, having been on the Manus waitlist for a while. Developed by Chinese startup Monica, it is an autonomous AI agent capable of executing complex online tasks without ongoing human input, and it has created something of a buzz. TL;DR: This is the initial output from first prompt to web-based execution. The selection and categorisation need honing but this, in my view, is an impressive output. There is also a second version, produced after the addition of a follow-up prompt.
Longer version:
I wanted to see what I could get from a single prompt so decided to see if it could build a shareable, searchable web page that curates short how-to videos (under five minutes) by higher education educators demonstrating uses of Generative AI. I began by requesting Manus to collect and cluster videos showing how AI is applied in teaching, assessment, feedback, and research (Natural Language Prompt). Manus responded immediately by creating a structured project directory and initiating web searches to identify relevant video content, starting with collections from institutions like Notre Dame and Harvard (which it didn’t get beyond in the first iteration).
Once videos were sourced, Manus automatically filtered them to ensure they were under five minutes in length (but failed to note that one wasn’t a video and others linked to the same page where they were already curated!) and produced by educators in the higher education sector (this it did get right). It then categorised them by thematic area (Teaching, Assessment & Feedback, AI Literacy, and Research, though the categorisations were not always spot on) while also tagging institutional affiliations (mostly OK). The tagging and filtering work pretty well but the actual sourcing is very limited in that first iteration. A metadata database was created to support an interactive interface, allowing users to search videos by presenter, title, or description and filter by category or institution. Manus built this interface and deployed the collection to a live web page: https://cjubrvkx.manus.space. It took about 15 minutes. Fine-tuning is then done iteratively through natural language prompting.
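For anyone curious what that filter-and-search logic might look like, here is a minimal Python sketch. It is not the code Manus produced (that isn't shown here); the `Video` fields, the sample records and the function names are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_SECONDS = 5 * 60  # the "under five minutes" rule from the prompt


@dataclass
class Video:
    # Hypothetical metadata fields; the real database schema is unknown.
    title: str
    presenter: str
    institution: str
    category: str
    description: str
    duration_seconds: int


def under_limit(videos: List[Video], max_seconds: int = MAX_SECONDS) -> List[Video]:
    """Keep only videos strictly shorter than the limit."""
    return [v for v in videos if v.duration_seconds < max_seconds]


def search(
    videos: List[Video],
    query: str = "",
    category: Optional[str] = None,
    institution: Optional[str] = None,
) -> List[Video]:
    """Free-text match on presenter, title or description, plus optional filters."""
    q = query.strip().lower()
    results = []
    for v in videos:
        haystack = " ".join([v.presenter, v.title, v.description]).lower()
        if q and q not in haystack:
            continue
        if category and v.category != category:
            continue
        if institution and v.institution != institution:
            continue
        results.append(v)
    return results


# Made-up sample records, not the videos Manus actually found.
CATALOGUE = under_limit([
    Video("Drafting rubrics with a chatbot", "A. Lecturer", "Example University",
          "Assessment & Feedback", "Using an LLM to draft a marking rubric", 240),
    Video("A full AI literacy lecture", "B. Professor", "Sample College",
          "AI Literacy", "Hour-long recorded lecture", 3600),
    Video("Prompting for literature searches", "C. Researcher", "Example University",
          "Research", "Quick demo of search prompts", 180),
])
```

Calling `search(CATALOGUE, query="rubric")` would return just the rubric video, while `search(CATALOGUE, institution="Example University")` returns both of the short ones. Note that a duration check like this cannot catch the problems mentioned above (a link that isn't a video, or several links pointing at the same page); those would need separate checks.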
The initial prompt from me and the start of the execution of the processes
After launch, I expanded the scope by asking Manus to include contributions from additional individuals and institutions. I threw in my old buddy Dr. Gerhard Kristandl (Dr GeeKay), who I know has a YouTube channel, and I added me, Martin Compton (without any additional info), and it found relevant content from each of us. The other three potential sources I threw in reaped no reward though. This is the reasoning:
I searched for content from Tadhg Blommerde, University of Sydney, and Monash University as well, but most of their videos exceeded our 5-minute duration criteria. I’ve documented all findings in the data directory for reference.
Manus resumed its asynchronous task execution, searching for and integrating further video content into the existing framework, enabling me to publish a second version. So the output works. I would need much more thoughtful prompting to get it how I want it and, tbh, there are better ways of curating YouTube content, but the principle is what interested me and the output is remarkable given the stages and processes it went through.
Modifying or changing assessment ‘because of AI’ always feels like it feeds ‘us and them’ narratives of a forthcoming apocalypse (already predicted) and couches the change as necessary only because of this insidious, awful thing that no-one wants except men in leather chairs who stroke white cats.
It is of course MUCH more complex than that and much of the desired change has been promoted by folk with a progressive, reform, equity, inclusion eye who do (or immerse themselves in) scholarship of HE pedagogy and assessment practices.
Anyway, a colleague suggested that we should have a collection of ideas about practical ways assessments could be modified to either make them more AI ‘robust’ or at least ‘AI aware’ or ‘AI inclusive’ (I’m hesitant to say ‘resistant’ of course). Whilst colleagues across King’s have been sharing and experimenting, it is probably true to say that there is not a single point of reference. We are in King’s Academy working on remedying this as part of the wider push to support TASK (transforming assessment for students at King’s) and growing AI literacy, but first I wanted to curate a few examples from elsewhere to offer a point of reference for me and to share with colleagues in the very near future. I’ve gone for diversity from things I have previously bookmarked. Other than that, they are here only to offer points of discussion, inspiration, provocation or comparison!
Before I start I should remind King’s colleagues of our own guidance and the assessment principles therein, and note that with colleagues at LSE, UCL and Southampton I am working on some guidance on the use of AI to assist with marking (forthcoming and controversial). Some of the College Teaching Fund projects looked at assessment, and this AI Assessment Scale from Perkins et al. (2024) has a lot of traction in the sector too and is not so dissimilar from the King’s 4 levels of use approach. It’s amazing how 2023 can feel a bit dated in terms of resources these days but this document from the QAA is still relevant and applicable and sets out broader, sector-level appropriate principles. In summary:
Institutions should review and reimagine assessment strategies, reducing assessment volume to create space for activities like developing AI literacy, a critical future graduate attribute.
Promote authentic and synoptic assessments, enabling students to apply integrated knowledge practically, often in workplace-related settings, potentially incorporating generative AI.
Move away from traditional, handwritten, invigilated exams towards innovative approaches like digital exams, observed discipline-specific assessments or oral examinations.
Follow guiding principles ensuring assessments are sustainable, inclusive, aligned to learning outcomes, and effectively demonstrate relevant competencies, including appropriate AI usage.
I’m also increasingly referring to the two lane approach being adopted by Sydney which leans heavily into similar principles. Context is different to UK of course but I have a feeling we will find ourselves moving much closer to the broad approach here. It feels radical but perhaps no more radical than what many, if not most, unis did in Covid.
Evaluation of coursework assessments to determine susceptibility to generative AI and potential integration of AI tools.
Redesign of assessments to explicitly incorporate evaluation of ChatGPT-generated outputs, enhancing critical evaluation skills and understanding of AI limitations.
Integration of generative AI within module curricula and teaching practices, providing formative feedback opportunities.
Collection of student perspectives and experiences through questionnaires and focus groups on AI usage in learning and assessments.
Shift towards rethinking traditional assessment formats (MCQs, SAQs, essays) due to AI’s impact, encouraging ongoing pedagogical innovation discussions.
Gamification: Complex immunology concepts taught through a Star Wars-inspired, game-based approach.
AI-driven game design: ChatGPT 4.0 used to structure game scenarios, resources, and dynamic challenges.
Visual resources with AI: DALLE-3 employed to create engaging imagery for learning materials.
Iterative AI prompting: An innovative method using progressive ChatGPT interactions to refine complex game elements.
Practical, collaborative learning: Students collaboratively trade resources to combat diseases, supported by iterative testing and refinement of the game.
Integration of AI: The original essay task was redesigned to explicitly require students to use an LLM, typically ChatGPT.
The change: Individual component of wider collaborative task. Students submit both the AI-generated output (250 words) and a critical evaluation of that output (250 words) on what is unique about a business proposal.
Critical Engagement Emphasis: The new task explicitly focuses on students’ critical analysis of AI capabilities and limitations concerning their business idea.
Reflective Skill Development: Students prompted to reflect on, critique, and consider improvements or extensions of AI-generated content, enhancing their evaluative and adaptive skills.
“Empirical studies suggest that a majority of students cheat. Longitudinal studies over the past six decades have found that about 65–87% of college students in America have admitted to at least one form of nine types of cheating at some point during their college studies”
(Yu et al., 2018)
Shocking? Yes. But also reassuring in its own way. When you are presented with something like that from 2018 (i.e. pre-ChatGPT) you realise that this is not a newly massive issue; it’s the same issue with a different aspect, lens or vehicle. Cheating in higher education has always existed, but I do acknowledge that generative AI has illuminated it with an intensity that makes me reach for the eclipse goggles. There are those that argue that essay mills and inappropriate third party support were phenomena that we had inadequately addressed as a sector for a long time. LLMs have somehow opened a fissure in the integrity debate so large that suddenly everyone wants to do something about it. It has become so much more complex because of that, but that visibility could also be seen positively (I may be reaching but I genuinely think there is mileage in this), not least because:
1. We are actually talking about it seriously.
2. It may give us leverage to effect long needed changes.
The common narratives I hear are ‘where there’s a will, there’s a way’ and that ChatGPT makes the ‘way’ easier. The problem though, in my view, is that just because the ‘way’ is easier does not mean the ‘will’ will necessarily increase. Assuming all students will cheat does nothing to build bridges, establish trust or provide an environment where the sort of essential mutual respect necessary for transparent and honest working can flourish. You might point to the stat at the top of this page and say we are WAY past the need to keep measuring will! Exams, as I’ve argued before, are no panacea, given the long-standing issues of authenticity and inclusivity they bring (as well as being the place where students have shown themselves to be most creative in their subversion techniques!).
In contrast to this, study after study is finding that students are increasingly anxious about being accused of cheating when that was never their intention. They report unclear and sometimes contradictory guidance, leaving them uncertain about what is and isn’t acceptable. A compounding issue is the lack of consistency in how cheating is defined. It varies significantly between institutions, disciplines and even individual lecturers. I often ask colleagues whether certain scenarios constitute cheating, deliberately using examples involving marginalised students to highlight the inconsistencies. Is it ok to get structural, content or proofreading suggestions from your family? How does your access to human support differ if you are a first generation, neurodivergent student studying in a new language and country? Policies usually say “no” but to fool ourselves that this sort of ‘cheating’ is not routine would be hard to achieve and even harder to evidence. The boundaries are blurred, and the lack of consensus only adds to the confusion.
To help my thinking on this I looked again at some articles on cheating over time (going back to 1941!) that I had put in a folder and badly labelled as per usual and selected a few to give me a sense of the what and how as well as the why and to provide a baseline to inform the context around the current assumptions about cheating. Yu et al. (2018) use a long established categorisation of types of cheating with a modification to acknowledge unauthorised digital assistance:
Copying sentences without citation.
Padding a bibliography with unused sources.
Using published materials without attribution.
Accessing exam questions or answers in advance.
Collaborating on homework without permission.
Submitting work done by others.
Giving answers to others during an exam.
Copying from another student in an exam.
Using unauthorised materials in an exam.
The what and how question reveals plenty of expected ways of cheating, especially in exams but it is also noted where teachers / lecturers are surprised by the extent and creativity. Four broad types:
Plagiarism in various forms from self, to peers to deliberate inappropriate practices in citation.
Homework and assignment cheating such as copying work, unauthorised collaboration, or failing to contribute fairly.
Other academic dishonesty such as falsifying bibliographies, influencing grading or contract cheating.
In exams.
The amount of exam based cheating reported should really challenge assumptions about the security of exams at the very least and remind us that they are no panacea whether we see this issue through an ongoing or a chatgpt lens. Stevens and Stevens (1987) in particular share some great pre-internet digital ingenuity and Simpkin and McLeod (2006) show how the internet broadened the scope and potential. These are some of the types reported over time:
Using unauthorised materials.
Obtaining exam information in advance.
Copying from other students.
Providing answers to other students.
Using technology to cheat (using microcassettes, pre-storing data in calculators, mobile phones. Not mentioned but now apparently a phenomenon is use of bone conduction tech in glasses and/ or smart glasses).
Using encoded materials (rolled up pieces of paper for example).
Hiring a surrogate to take an exam.
Changing answers after scoring (this one in Drake, 1941).
Collaborating during an exam without permission.
These are the main reasons for cheating across the decades I could identify (from across all sources cited at the end):
Difficulty of the work. When students are on the wrong course (I’m sure we can think of many reasons why this might occur), teaching is inadequate or insufficiently differentiated.
Pressure to succeed. ‘Success’ when seen as the principal goal can subdue the conscience.
Laziness. This is probably top of many academics’ assumptions and it is there in the research but also worth considering what else competes for attention and time and how ‘I can’t be bothered’ may also mask other issues even in self-reporting.
Perception that cheating is widespread. If students feel others are doing it and getting away with it, it increases the cheating.
Low risk of getting caught.
Sense of injustice in systemic approach, structural inequalities both real and perceived can be seen as a valid justification.
External factors such as evident cheating in wider society. A fascinating example of this was suggested to me by an academic who was trained in Soviet dominated Eastern Europe, who said cheating was (and remains) a marker of subversion and so carries its own respectability.
Lack of understanding of what is and is not allowed. Students report that they have not been taught this, and degrees of cheating are blurred by some of the other factors here: when does collaboration become collusion?
Cultural influences. Different norms and expectations can create issues and this comes back to my point about individualised (or contextualised) definitions of what is and is not appropriate.
My own experiences, over 30 years, of dealing with plagiarism cases often reveal very powerful, often traumatic, experiences that lead students to act in ways that are perceived as cheating.
For each it’s worth asking yourself:
How much is the responsibility for this on the student and how much on the teacher/ lecturer and / or institution (or even society)?
I suspect that the truly wilful, utterly cynical students are the ones least likely to self-declare and least likely to get caught. This furthers my own discomfort about the mechanisms we rely (too heavily?) on to judge integrity too.
This skim through really did make clear to me that cheating and plagiarism are not the simple concepts that many say they are. Also cheating in exams is a much bigger thing than we might imagine. The reasons for cheating are where we need to focus I think. Less so the ‘how’ as that becomes a battleground and further entrenches ‘us and them’ conceptualisations. When designing curricula and assessments the unavoidable truth is we need to do better by moving away from one size fits all approaches, by realising cultural, social and cognitive differences will impact many of the ‘whys’ and hold ourselves to account when we create or exacerbate structural factors that broaden likelihood of cheating.
I am definitely NOT saying give wilful cheaters a free pass but all the work many universities are doing on assessment reform needs to be seen through a much longer lens than the generative AI one. To focus only on that is to lose sight of the wider and longer issue. We DO have the capacity to change things for the better but that also means that many of us will be compelled (in a tense, under threat landscape) to learn more about how to challenge conventions and even invest much more time in programme level, iterative, AI cognisant teaching and assessment practices. Inevitably the conversations will start with the narrow and hyped and immediate manifestations of inappropriate AI use but let’s celebrate this as leverage; as a catalyst. We’d do well, at the very least, to reconsider how we define cheating, why we consider some incredibly common behaviours as cheating (is it collusion or is it collaboration for example or proof reading help from 3rd parties). Beyond that, we should be having serious discussions about augmentation and hybridity in writing: what counts as acceptable support? How does that differ according to context and discipline? It will raise questions about the extent to which writing is the dominant assessment medium, about authenticity in assessment and about the rationale and perceived value of anonymity.
It’s interesting to read how over 80 years ago (Drake, 1941) many of the behaviours we witness today in both students and their teachers have 21st century parallels. Strict disciplinarian responses or ignoring it because ‘they’re only harming themselves’ being common. In other words, the underlying causes were not being addressed. To finish I think this sets out the challenge confronting us well:
“Teachers in general, and college professors in particular, will not be enthusiastic about proposed changes. They are opposed to changes of any sort that may interfere with long- established routines-and examinations are a part of the hoary tradition of the academic past”
(Drake, 1941, p.420)
Drake, C. A. (1941). Why students cheat. Journal of Higher Education, 12(5)
Hutton, P. A. (2006). Understanding student cheating and what educators can do about it. College Teaching, 54(1), 171–176. https://www.jstor.org/stable/27559254
Miles, P., et al. (2022). Why students cheat. The Journal of Undergraduate Neuroscience Education (JUNE), 20(2), A150-A160.
Rettinger, D. A., & Kramer, Y. (2009). Situational and individual factors associated with academic dishonesty. Research in Higher Education, 50(3), 293-313. https://doi.org/10.1007/s11162-008-9116-5
Stevens, G. E., & Stevens, F. W. (1987). Ethical inclinations of tomorrow’s managers revisited: How and why students cheat. Journal of Education for Business, 63(1), 24-29. https://doi.org/10.1080/08832323.1987.10117269
Yu, H., Glanzer, P. L., Johnson, B. R., Sriram, R., & Moore, B. (2018). Why college students cheat: A conceptual model of five factors. The Review of Higher Education, 41(4), 549-576. https://doi.org/10.1353/rhe.2018.0025
Gallant, T. B., & Drinan, P. (2006). Organizational theory and student cheating: Explanation, responses, and strategies. The Journal of Higher Education, 77(5), 839-860. https://www.jstor.org/stable/3838789
I have been thinking a lot recently about my own and others’ positions in relation to AI in education. I’m reading a lot more from the ‘ResistAI’ lobby and share many perspectives with core arguments. I likewise read a lot from the tech communities and enthusiastic educator groups, which often get conflated but are important to distinguish given bloomin’ obvious as well as more subtle agenda and motivation differences (see world domination and profit arguments for example). I see willing adoption, pragmatic adoption, reluctant adoption and a whole bunch of ill-informed adoption/ rejection too. My reality is that staff and students are using AI (of different types) in different ways. Some of this is ground-breaking and exciting, some snag-filled and disappointing, some ill-advised and potentially risky. Existing IT infrastructure and processes are struggling to keep pace and daily conversations range from ‘I have to show you this- it’s going to change my life!’ to ‘I feel like I’m being left behind here’ and a lot more besides.
So it was that this morning I saw a post on LinkedIn (who’d have thought the place where we put our CVs would grow so much as an academic social network?) from Leon Furze, who defines his position as ‘sitting on the fence’. I initially thought ‘yeah that’s me’ but, in fact, I am not actually sitting on the fence at all in this space. I am trying as best I can to navigate a path that can be defined by the broad two word strategy we are trying to define and support at my place: Engage Responsibly. Constructive resistance and debate are central but so is engagement with fundamental ideas, technologies, principles and applications. I have for ages been arguing for more nuanced understanding. I very much appreciate evidence and experiential based arguments (counter and pro). The waters are muddied though with, on the one hand, big tech declarations of educational transformation and revolution (we’re always on the cusp, right?) and, on the other, sceptical generalisations like the one I saw gaining social media traction the other day which went something like:
“Reading is thinking
Writing is thinking
AI is anti-thinking”
If you think that then you are not thinking, in my view. Each of those statements must be contextualised and nuanced. This is exactly the kind of meme-level sound bite that sounds good initially but is not what we should be entertaining as a position in academia. Or is it? Below are some adjectives and definitions of the sorts of positions identified by Leon Furze in the collection linked above and by me and research partners in crime Shoshi, Olivia and Navyasara. Which one/s would you pick to define your position? (I am aware that many of these terms are loaded; I’m just interested in the broadest sense where people see themselves, whether they have planted a flag or if they are still looking for a spot as they wander around in the traffic wide eyed).
Cautious: Educators who are cautious might see both the potential benefits and risks of AI. They might be hesitant to fully embrace AI without a thorough understanding of its implications.
Critical: Educators who are critical might take a stance that focusses on one or more of the ethical concerns surrounding AI and its potential negative impacts, such as the risk of AI being used for surveillance or control, or ways in which data is sourced or used.
Open minded: Open minded educators might be willing to explore AI’s possibilities and experiment with its use in education, while remaining aware of potential drawbacks.
Engaged: Engaged educators actively seek to understand AI, its capabilities and its implications for education. They seek to shape the way AI is used in their field.
Resistant: Resistant educators might actively oppose the integration of AI into education due to concerns about its impact on teaching, learning or ethical considerations.
Pragmatic: Pragmatic educators might focus on the practical applications of AI in education, such as using it for administrative tasks or to support personalised learning. They might be less concerned with theoretical debates and more interested in how AI can be used to improve their practice.
Concerned: Educators who are concerned might primarily focus on the potential negative impacts of AI on students and educators. They might worry about issues like data privacy, algorithmic bias, or the deskilling of teachers.
Hopeful: Hopeful educators might see AI as a tool that can enhance education and create new opportunities for students and teachers. They might be excited about AI’s potential to personalise learning, provide feedback and support students with diverse needs.
Sceptical: Sceptical educators might question the claims made about AI’s benefits in education and demand evidence to support its effectiveness. They might be wary of the hype surrounding AI and prefer to wait for more research before adopting it.
Informed: Informed educators would stay up-to-date with the latest developments in AI and its applications in education. They would understand both the potential benefits and risks of AI and be able to make informed decisions about its use.
Fence-sitting: Educators who are fence-sitting recognise the complexity of the issue and see valid arguments on both sides. They may be delaying making a decision until more information is available or a clearer consensus emerges. This aligns with Furze’s own position of being on the fence, acknowledging both the benefits and risks of AI.
Ambivalent: Educators experiencing ambivalence might simultaneously hold positive and negative views about AI. They may, for example, appreciate its potential for personalising learning but be uneasy about its ethical implications. This reflects cognitive dissonance, where conflicting ideas create mental discomfort. Furze’s exploration of both the positive potential of AI and the reasons for resisting it illustrates this tension.
Time-poor: Educators who are time-poor may not have the capacity to fully (or even partially) research and understand the implications of AI, leading to delayed decisions or reliance on simplified viewpoints.
Inexperienced: Inexperienced educators may lack the background knowledge to confidently assess the potential benefits and risks of AI in education, contributing to hesitation or reliance on the opinions of others.
I have been interested recently in the ways in which AI is being integrated into healthcare as part of my personal goal to widen my understanding and broaden my own definition of AI. I’m seeing an increasing need to do this as part of growing awareness and literacy, as well as a need to show that AI is impacting curricula well beyond the ongoing kerfuffle around generative AI and assessment integrity. I was recommended this panel by Professor Dan Nicolau Jr, who chaired this session at the recent event. It looked at the many barriers to advances in a context where early detection, monitoring, business models and data availability impact the ways in which we do medicine and advance it in a world where ageing populations present an existential threat to global healthcare systems. It struck me when I watched this how much the potentials and barriers expressed here will likely be mirrored in other disciplines. Medicine does seem to be an effective bellwether though.
Some of the issues that stood out:
Data availability and validity: Just as healthcare AI can produce skewed results from over-represented organisms in protein design, we see similar issues of data bias emerging across AI applications. The challenges around electronic health records – inconsistent, incomplete and error-prone – mirror concerns about data quality in other domains.
Business models and willingness/ ability to use what is available: The difficulty in monetising preventative AI applications in medicine, for example, reflects broader questions about how we value different types of AI innovation. Similarly, the need to shift mindsets from reactive to proactive approaches in healthcare has parallels with cultural change required for effective AI adoption elsewhere. The comments from the panel about human propensities NOT to use devices or take medicines that will help them are quite shocking but still somehow unsurprising. Cracking that, according to the panel, would increase life expectancy more than finding a cure for cancer.
The regulatory landscape: The NHS’s procurement processes, which can stifle AI innovation, demonstrate how existing institutional frameworks may need significant adaptation. This raises important questions about how we balance innovation with appropriate oversight – something all sectors grappling with AI must address.
For me, healthcare exemplifies the complex relationship between technical capability and human behaviour. The adoption issue is obviously one that has parallels with willingness/ openness to using novel technologies, even where they can be shown to make life better or easier. The panel’s observations about patient compliance mirror wider challenges around user adoption and engagement with AI systems. We cannot separate the technology from the human context in which it operates.
This is a swift intro to character AI *(note 1)- a tool that is available to use for free currently (on a freemium model). My daughter showed it to me some months ago. It appears as a novelty app but is used (as I understand it) beyond entertainment for creative activity, gaming, role playing and even emotional support. For me it is the potential to test ideas that many have about bot potential for learning that is most interesting. By shifting focus away from ‘generating essays’ it is possible to see the appeal of natural language exchanges to augment learning in a novel medium. While I can think of dozens of use cases based on the way I currently (for example) use YouTube to help me to learn how to unblock a washing machine, I imagine that is a continuum that goes all the way up to teacher replacement.*(note 2) Character AI is built on a large language model, employs ‘reinforcement’ (learning as conversations continue) and provides an easy to learn interface (basically typing stuff in boxes) that allows you to ground the bot with ease in a wysiwyg interface.
As I see it, it offers three significant modifications in the default interface to standard (free) LLMs. 1. You can create characters and define their knowledge and ‘personality’ traits by having space to ground the bot behaviour through customisation. 2. You can have voice exchanges by ‘calling’ the character. 3. Most importantly, it shifts the technology back to interaction and away from lengthy generation (though they can still go on a bit if you don’t bake succinctness in!). What interests me most is the potential to use tools like this to augment learning, add some novelty and provide reinforcement opportunity through text or voice based exchanges. I have experimented with creating some academic archetypes for my students to converse with. This one is a compassionate pedagogue, this one is keen on AI for teaching and learning, this one a real AI sceptic, this one deeply worried about academic integrity. They each have a back story, defined university role and expertise. I tried to get people to test arguments and counter arguments and to work through difficult academic encounters. It’s had mixed reviews so far: some love it; some REALLY do not like it at all!
How do/ could you use a tool like this?
Note 1. This video in no way connotes promotion or recommendation (by me or by my employer) of this software. Never upload data you are not comfortable sharing and never upload your own or others’ personal data.
Note 2: I am not a proponent of this! There may be people who think this is the panacea to chronic educational underfunding though so beware.
In the fast evolving landscape of AI tools, two recent releases have really caught my attention: Google’s NotebookLM and the advanced conversational features in ChatGPT. Both offer intriguing possibilities for how we might interact with AI in more natural, fluid ways.
NotebookLM, still in its experimental stage and free to use, is well worth exploring. As one of my King’s colleagues pointed out recently: it’s about time Google did something impressive in this space! Its standout feature is the ability to generate surprisingly natural-sounding ‘auto podcasts’. I’ve been particularly struck by how the AI voice avatars exchange and overlap in their speech patterns, mimicking the cadence of real conversation. This authenticity is both impressive and slightly unsettling, and at least two colleagues thought they were listening to human exchanges.
I tested this feature with three distinct topics:
Language learning in the age of AI (based on three online articles):
A rather flattering exchange about my blog posts (created in fact by my former colleague Gerhard Kristandl – I’m not that egotistical):
A summary of King’s generative AI guidance:
The results were remarkably coherent and engaging. Beyond this, NotebookLM offers other useful features such as the ability to upload multiple file formats, synthesise high-level summaries, and generate questions to help interrogate the material. Perhaps most usefully, it visually represents the sources of information cited in response to your queries, making the retrieval-augmented generation process transparent.
Meanwhile, ChatGPT’s latest update, the advanced voice feature (not available in the EU, by the way), has addressed previous latency issues, resulting in a much more realistic exchange. To test this, I engaged in a brief conversation, asking it to switch accents mid-dialogue. The fluidity of the interaction was notable, feeling much closer to a natural conversation than previous iterations. Watch here:
What struck me during this exchange was how easily I slipped into treating the AI as a sentient being. At one point, I found myself saying “thank you”, while at another I felt a bit bad when I abruptly interrupted. This tendency to anthropomorphise these tools is deeply ingrained and hard to avoid, especially as the interactions become more natural. It raises interesting questions about how we relate to AI and whether this human-like interaction is beneficial or potentially problematic.
These developments challenge our conventions around writing and authorship. As these tools become more sophisticated, the line between human and AI-generated content blurs further. What constitutes a ‘valid’ tool for authorship in this new landscape? How do we navigate the ethical implications of using AI in this way?
What are your thoughts on these developments? How might you see yourself using tools like NotebookLM or the advanced ChatGPT in your work?
*That’s supposed to read AI3 but the title font refuses to allow superscript!
Yesterday I was delighted to keynote at the Universities at Medway annual teaching and learning conference. It’s a really interesting collaboration of three universities: University of Greenwich, University of Kent and Canterbury Christ Church University. Based at the Chatham campus in Medway, it is a place where you can’t help but notice the history the moment you enter. Given that I’d worked at Greenwich for five years I was familiar with the campus but, as was always the case when I went there during my time at Greenwich, I experienced a moment of awe when seeing the campus buildings again. It’s actually part of the Chatham Dockyard World Heritage site and features the remarkable Drill Hall library. The reason I’m banging on about history is because such an environment really underscores for me some of those things that are emblematic of higher education in the United Kingdom (especially for those that don’t work or study in it!)
It has echoes of cultural shorthands and memes of university life that remain popular in representations of campus life and study. It’s definitely a bit out of date (and overtly UK centric) like a lot of my cultural references, but it made me think of all the murders in the Oxford set crime drama ‘Morse’. The campus locations fossilised for a generation the idea of ornate buildings, musty libraries and deranged academics. Most universities of course don’t look like that and by and large academics tend not to be too deranged. Nevertheless we do spend a lot of time talking about the need for change and transformation whilst merrily doing things the way we’ve done them for decades if not hundreds of years. Some might call that deranged behaviour. And that, in essence, was the core argument of my keynote: For too long we have twiddled around the edges but there will be no better opportunity than now with machine-assisted leverage to do the things that give the lie to the idea that universities are seats of innovation and dynamism. Despite decades of research that have helped define broad principles for effective teaching, learning, assessment and feedback we default to lecture – seminar and essay – report – exam across large swathes of programmes. We privilege writing as the principal mechanism of evidencing learning. We think we know what learning looks like, what good writing is, what plagiarism and cheating are but a couple of quick scenarios to a room full of academics invariably reveal lack of consensus and a mass of tacit, hidden and sometimes very privileged understandings of those concepts.
Employing an undoubtedly questionable metaphor and unashamedly dated (1984) concept of ‘crossing the streams’ from the original Ghostbusters film, I argued that there are several parallels to the situation the citizens of New York first found themselves in way back when and not least the academics (initially mocked and defunded) who confront the paranormal manifestations in their Ghostbusters guises. First are the appearances of a trickle of ghosts and demons followed by a veritable deluge. Witness ChatGPT’s release, the unprecedented sign ups and the ensuing 18 months wherein everything now has AI (even my toothbrush). There’s an AI for That has logged 12,982 AIs to date to give an indication of that scale (I need to watch the film again to get an estimate on number of ghosts). Anyway, early in the film we learn that a ghost-catching device called a ‘Proton Pack’ emits energy streams but:
“The important thing to remember is that you must never under any circumstances, cross the streams.” (Dr Egon Spengler)
Inevitably, of course, the resolution to the escalating crisis is the necessity of crossing the streams to defeat and banish the ghosts and demons. I don’t think that generative AI is something that could or should be defeated and I definitely do not think that an arms race of detection and policing is the way forward either. But I do think we need to cross the streams of the three AIs: Artificial Intelligence; Academic Integrity and Assessment Innovation to help realise the long-needed changes.
Artificial Intelligence represents the catalyst not the reason for needing dramatic change.
Academic Integrity as a goal is fine but too often connotes protected knowledge, archaic practices, inflexible standards and a resistance to evolution.
Assessment innovation is the place where we can, through common language and understanding, address the concerns of perhaps more traditional or conservative voices about perceived robustness of assessments in a world where generative AI exists and is increasingly integrated into familiar tools along with what might be seen as more progressive voices who, well before ChatGPT, were arguing for more authentic, dialogic, process-focussed and, dare I say it, de-anonymised and humanly connected assessments.
Here is our opportunity. Crossing the streams may be the only way we mitigate a drift to obsolescence! My concluding slide showed a (definitely NOT called Casper) friendly ghost which, I hope, connoted the idea that what we fear is the unknown but as we come to know it we find ways to shift from engagement (sometimes aggressively) to understanding and perhaps even an ‘embrace’ as many who talk of AI encourage us to do.
Incidentally, I asked the Captain (in my custom bot ‘Teaching Trek: Captain’s Counsel’) a question about change and he came up with a similar metaphor:
“Blow Up the Enterprise: Sometimes, radical changes are necessary. I had to destroy the Enterprise to save my crew in “Star Trek III: The Search for Spock.” Academics should learn when to abandon a failing strategy and embrace new approaches, even if it means starting over.”
In a way I think I’d have had an easier time if I’d stuck with Star Trek metaphors. I was gratified to note that ‘The Search for Spock’ was also released in 1984. An auspicious year for dated cultural references from humans and bots alike.
—————–
Thanks:
The conference itself was great and I am grateful to Chloe, Emma, Julie and the team for organising it and inviting me.
Earlier in the day I was inspired by presentations by colleagues from the three universities: Emma, Jimmy, Nicole, Stuart and Laura. The student panel was great too: it started strongly with a rejection of the characterisation of students as idle and disinterested and carried on forcefully from there! And special thanks too to David Bedford (who I first worked with something like 10 years ago) who uses an analytical framework of his own devising called ‘BREAD’ as an aid to informing critical information literacy. His session adapted the framework for AI interactions and it prompted a question which led, over lunch, to me producing a (rough and ready) custom GPT based on it.
I should also acknowledge the works I referred to: 1. Sarah Eaton, whose work on the 6 tenets of post-plagiarism I heartily recommended, and 2. Cath Ellis and Kane Murdoch* for their ‘enforcement pyramid’ which also works well as one of the vehicles that will help us navigate our way from the old to the new.
*Recommendation of this text does not in any way connote acceptance of Kane’s poor choice when it comes to football team preference.
A summary of the transcript, first drafted via Google Gemini, prompted and edited by Martin Compton
The British Acupuncture Accreditation Board (BAAB) recently hosted a workshop on the implications of AI with a focus on generative AI tools like ChatGPT for teaching and assessment. With Dr Vivien Shaw from BAAB who designed and led the breakout element of the session, I was invited to share my thoughts on this rapidly evolving landscape, and it was a fantastic opportunity to engage with acupuncture/ Chinese Traditional Medicine educators and practitioners.
We started by noting the fact that the majority of attendees have had little or no experience using these tools and most were concerned:
Key Points
After a few definitions and live demos the key points I made were:
AI is Bigger Than Generative AI: While generative AI tools like ChatGPT have taken the spotlight, it’s crucial to remember that artificial intelligence encompasses a much broader spectrum of technologies.
Generative AI is a Black Box: Even the developers of these tools are often surprised by their capabilities and applications. This unpredictability presents both challenges and opportunities.
The Human Must Remain in the Loop: AI should augment, not replace, human expertise. The “poetry” and nuance of human intelligence are irreplaceable.
Scepticism is Essential: Don’t trust everything AI produces. Critical thinking and verification of information are more important than ever.
AI is Constantly Improving: The capabilities of AI tools are evolving at a breakneck pace. What seems impossible today might be commonplace tomorrow.
Embracing the Opportunities and Addressing the Threats
The workshop highlighted the need for educators to lean into AI, understand its potential, and exploit its capabilities where appropriate. We also discussed the importance of adapting our teaching and assessment methods to this new reality.
To get a hands-on feel for AI’s impact, we divided into breakout groups and tackled some standard acupuncture exam questions using ChatGPT and other AI tools. The results were both impressive and concerning.
Group 1: Case History: The AI-generated responses were generic and lacked the nuance and depth expected from a student.
Group 2: Reflective Task: The AI produced “marshmallow blurb” – responses that sounded good but lacked substance or specific details.
Group 3: PowerPoint Presentation: While the AI-generated presentation was a decent starting point, it lacked the specifics and critical analysis required by the assignment.
It was noted that these outputs should not mask the potential for labour saving, for getting something down as a start or the possibilities when multi-shot prompting (iterating).
The Road Ahead
The workshop sparked lively discussions about the future of teaching and assessment in the age of AI. Some key questions that emerged:
How can we ensure that students are truly learning and not just relying on AI to generate answers?
What are the ethical implications of using AI in education?
How can we adapt our assessments to maintain their validity and relevance?
This will all take work but, as a starting point and even if you are blown away by the tutoring demo from Sal Khan / GPT-4o this week, value human connection and interaction at all times. Neither dismiss out of hand nor unthinkingly accept change for its own sake. Transformation is possible with this new tech because these AIs are powerful tools, but it’s up to us to use them responsibly and ethically and to grow our understanding through experimentation and dialogue. We need to engage with the opportunities presented while remaining vigilant about the potential threats.
This post is an AI/me hybrid summary of the transcript of a conversation I had with Prof Oz Acar as part of the AI conversations series at KCL. This morning I found that my Copilot window now allows me to upload attachments (now disabled again! 30/4/24) but unfortunately the output with the same prompt was poor by comparison to Claude or my ‘writemystyle’ custom GPT (for now and at first attempt). I have made some edits to the post for clarity and to remove some of the wilder excesses of ‘AI cringe’.
“The beauty of PAIR is its flexibility,” Oz explained. “Educators can customise each component based on learning objectives, student cohorts, and assignments.” An instructor could opt for closed problem statements tailored to specific lessons, or challenge students to formulate their own open-ended inquiries. Guidelines may restrict AI tool choices, or enable students more autonomy to explore the ever-expanding AI ecosystem. That oversight and guidance needs to come from an informed position of course.
Crucially, by emphasising skills like problem formulation, iterative experimentation, critical evaluation, and self-reflection, PAIR aligns with long-established pedagogical models proven to deepen understanding, such as inquiry-based and active learning. “PAIR is really skill-centric, not tool-centric,” Oz clarified. “It develops capabilities that will be invaluable for working with any AI system, now or in the future.”
Over a dozen King’s modules across disciplines like business, marketing, and arts have piloted PAIR, and the early results have been overwhelmingly positive. Students have reported marked improvements in their AI literacy – confidence in understanding these technologies’ current capabilities, limitations, and ethical implications. “Over 90% felt their skills in areas like evaluating outputs, recognising bias, and grasping AI’s broader impact had significantly increased,” Oz shared.
While valid concerns around academic integrity have catalysed polarising debates, with some advocating outright bans and restrictive detection measures, Oz makes a nuanced case for an open approach centred on responsible AI adoption. “If we prohibit generative AI for assignments, the stellar students will follow the rules while others will use it covertly,” he argued. “Since even expert linguists struggle to detect AI-written text reliably (especially when it has been manipulated rather than simply churned from a single shot prompt), those circumventing the rules gain an unfair advantage.”
Instead, Oz advocates assuming AI usage as an integrated part of the learning process, creating an equitable playing field primed for recalibrating expectations and assessment criteria. “There’s less motivation to cheat if we allow appropriate AI involvement,” he explained. “We can redefine what constitutes an exceptional essay or report in an AI-augmented age.”
This stance aligns with PAIR’s human-centric philosophy of ensuring students remain firmly in the driver’s seat, leveraging AI as an enabling co-pilot to materialise and enrich their own ideas and outputs. “Throughout the PAIR process, we have mechanisms like reflective reports that reinforce students’ ownership and agency … The AI’s role is as an assistive partner, not an autonomous solution.”
Looking ahead, Oz is energised by generative AI’s potential to tackle substantial challenges plaguing education systems globally – from expanding equitable access to quality learning resources, to easing overstretched educators’ burnout through intelligent process optimisation and tailored student support. “We could make education infinitely better by leveraging these technologies thoughtfully…Imagine having the world’s most patient, accessible digital teaching assistants to achieve our pedagogical goals.”
However, Oz also acknowledges legitimate worries about the perils of inaction or institutional inertia. “My biggest concern is that we keep talking endlessly about what could go wrong, paralysed by committee after committee, while failing to prepare the next generation for their AI-infused reality,” he cautioned. Without proactive engagement, Oz fears a bifurcated future where students are either obliviously clueless about AI’s disruptive scope, or conversely, become overly dependent on it without cultivating essential critical thinking abilities.
Another risk for Oz is generative AI’s potential to propel misinformation and personalised manipulation campaigns to unprecedented scales. “We’re heading into major election cycles soon, and I’m deeply worried about deepfakes fuelling conspiracy theories and political interference,” he revealed. “But even more insidious is AI’s ability to produce highly persuasive, psychologically targeted disinformation tailored to each individual’s profile and vulnerabilities.”
Despite these significant hazards, Oz remains optimistic that responsible frameworks like PAIR can steer education towards effectively harnessing generative AI’s positive transformations while mitigating risks.
An additional point to note: The recording is of course a conversation between two humans (Oz and Martin) and is unscripted. The Q&A towards the end of the recording was facilitated by a third human (Sanjana). I then compared four AI transcription tools: Kaltura, Clipchamp, Stream and YouTube. Kaltura estimated 78% accuracy, Clipchamp crashed twice, and Stream was (in my estimation) around 90-95% accurate, but its editing/download process is less convenient than YouTube’s in my view. The final transcript is therefore the one initially auto-generated in YouTube, punctuated by ChatGPT, then re-edited for accuracy in YouTube. Whilst accuracy has improved noticeably in the last few years the faff is still there. The video itself is hosted in Kaltura.