This post is part of the AI Apprenticeship series.

By Dr Anita Lamprecht, supported by Diplo AI and Gemini

In week 3 of the AI Apprenticeship online course, each participant began to build their own functioning chatbot. For this purpose, we had to use a topic we are familiar with to ensure that we can later verify the accuracy of our chatbots’ output. We also needed to collect and upload datasets, compose system prompts, and choose a large language model (LLM) for each bot.

All these steps need to be taken with care. As creators of the chatbots, our choices serve as boundaries for our chatbots, influencing their level of unpredictability, often referred to as ‘hallucinations’, though I prefer the term ‘randomness’ as it seems more precise. Randomness more accurately reflects the way these unexpected outputs arise from the complex interplay of data and algorithms (I have explored the concept of randomness in more detail in part 2.5 of my blog series).

But let’s return to our task. Nature generously designs everything for us, but in technology, we must carefully craft our creations. Designing our bots starts with giving them a purpose. I am creating two bots, and their purpose, quite frankly, is personal. They will both serve my personal aims, but I hope the final result will also be useful to others.

So, what are their purposes? My bots are designed to help me overcome my human limitations. Rephrased more positively, their purpose is to extend my capabilities. More concretely, they will enable me to keep pace with the speed and scope of publications in my field of research. How? By enhancing my naturally limited processing capabilities and memory. One bot will be dedicated to preserving and expanding my acquired knowledge of the topic ‘child safety in the metaverse’, whilst the other will help me sift through the flood of documents recently published by ITU as part of the UN Virtual Worlds Initiative. To put it succinctly: these bots will be an extension of me.
They are, after all, artefacts.

Let us begin. The existence of our bots begins with providing system prompts. In the Geppetto metaphor I used in my first post, these prompts would be equivalent to taking a piece of wood and starting to chisel the outlines of Pinocchio. Experienced carpenters know that each piece of wood comes with its own character, expressed through its grain and other qualities. Our system prompts serve as the foundational blueprint for our chatbots, defining their core purpose and guiding their development. They are like Geppetto’s initial vision of Pinocchio as a puppet with the potential to become a boy.

LLMs are where the magic happens. Think of an LLM as the force that animates Pinocchio by enabling him to speak in a human-like way; it provides the foundational language patterns and understanding, drawn from a vast forest of information. In our case, it’s about statistics and pattern recognition, not magic.

Data and the forest of information

In Part 2 of this blog series, we explored how AI interprets data by identifying patterns and correlations, similar to how flags convey meaning through colours and symbols. Now, imagine the internet as a vast forest filled with different kinds of wood, each with its own texture and strength. This is the raw material from which our chatbot will be carved. Our carefully curated dataset is like a selection of high-quality wood blocks, chosen for their specific properties. This ensures that the chatbot learns from reliable and relevant information. The LLM, in turn, helps the chatbot ‘see the forest for the trees’, revealing hidden connections within this vast amount of data and assembling them into a meaningful structure. It guides the chatbot to navigate the forest and uncover the insights most relevant to our needs.

DiploAI as our workshop

While datasets ensure high quality and relevance for effective machine learning, data from the internet provides a broad range of language patterns.
However, this data requires significant filtering to remove noise and inaccuracies. DiploAI provides the workshop, the tools, and the environment for shaping this material. And we, as sculptors, use prompts as our chisels, actively shaping the wood (data) and carving out the desired form and features.

But what is this ‘workshop’ exactly? Well, DiploAI is an AI platform developed by Diplo’s AI and Data Lab. It focuses on exploring the possibilities of machine learning, neural networks, and natural language processing algorithms. DiploAI is actively involved in finding new datasets, testing different machine-learning models, and sharing its newly acquired knowledge through a weekly diary. It plays a significant role in enhancing the understanding of AI and data science in the context of diplomacy and global governance. (If you want to learn more about DiploAI and its activities, you can visit the following link: Diplo AI.)

Just as Pinocchio encounters various characters who influence his path – Honest John, Geppetto, or the Blue Fairy – users play a crucial role in shaping the chatbot’s development. Their interactions, feedback, and even misunderstandings provide valuable insights for refining the chatbot’s system prompts and datasets. For example, frequent misunderstandings or irrelevant responses signal the need for adjustments in the chatbot’s system prompts or datasets. Diverse feedback from users can also reveal new functionalities for the bot. In addition, users can benefit from learning more about the skills they need to acquire to communicate with and use emerging technologies.

What makes Diplo’s approach to AI so outstanding is that the workshop is independent of the individual LLM. We can make our Pinocchio from oak, mahogany, or pine at the click of a mouse. We could say it’s tailor-made, not off-the-shelf. This also means we retain ownership and control over our AI, unlike relying on pre-built solutions from big tech companies.
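
For readers who like to see an idea in code, the ‘oak, mahogany, or pine’ point can be sketched in a few lines of Python. This is purely an illustration under stated assumptions, not DiploAI’s actual implementation: the names `ChatbotBlueprint`, `oak_backend`, and `pine_backend` are hypothetical, and the two ‘backends’ are stand-in functions rather than calls to any real LLM. The only thing the sketch is meant to show is that the system prompt and the curated dataset live in one place, while the model that answers can be swapped.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical backend type: any function that takes the full prompt text
# and returns a reply. A real LLM client would be wrapped to fit this shape.
Backend = Callable[[str], str]


@dataclass
class ChatbotBlueprint:
    """The part of the bot that does not depend on any particular LLM."""
    name: str
    system_prompt: str
    documents: List[str] = field(default_factory=list)

    def build_prompt(self, question: str) -> str:
        # Combine the system prompt, the curated sources, and the question.
        sources = "\n".join(f"- {doc}" for doc in self.documents)
        return f"{self.system_prompt}\n\nSources:\n{sources}\n\nQuestion: {question}"

    def ask(self, question: str, backend: Backend) -> str:
        # The same prompt is handed to whichever backend is chosen.
        return backend(self.build_prompt(question))


# Stand-in backends (assumptions for illustration only, not real models).
def oak_backend(prompt: str) -> str:
    return f"[oak] received {len(prompt)} characters"


def pine_backend(prompt: str) -> str:
    return f"[pine] received {len(prompt)} characters"


bot = ChatbotBlueprint(
    name="child-safety-bot",
    system_prompt="Answer only from the sources provided.",
    documents=["Placeholder document text."],
)

# Swapping the 'wood' changes who answers, not what the bot is asked to be.
for backend in (oak_backend, pine_backend):
    print(bot.ask("What do the sources cover?", backend))
```

In this sketch, changing the model means passing a different function to `ask`; the blueprint itself (name, system prompt, documents) stays untouched.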
While I’ve initially chosen a particular underlying LLM to bring this blueprint to life, the bot itself is not bound to this specific model.

As the participants of the AI Apprenticeship course include those in diplomacy, global governance, and international relations, DiploAI provides a few important elements:

Overall, the specialisation and customisation of DiploAI make it a valuable tool for supporting diplomatic activities, enhancing research in global governance, and facilitating informed decision-making in the complex world of international relations.

Like Pinocchio, the essence of my chatbot lies in its system prompts, which allow it to be adapted and re-implemented with different LLMs as technology evolves. This ensures its longevity, adaptability, and the much-needed independence in the ever-changing landscape of AI.

Let us look at the various roles that shape our relationship with AI:

We can just build a chatbot to achieve a quick win, but if we – as legal professionals and experts in governance – eventually want to master the challenges of AI governance, we need to be able to comprehend our mutual roles profoundly. For me, this apprenticeship is about developing a common sense of the abilities and limitations of both the technological and human aspects within our socio-technological system. This is particularly challenging when dealing with the intangible nature of AI, where code and algorithms replace wood and chisels. Even with the helpful metaphor of Pinocchio, bridging the gap between the tangible and the intangible requires a shift in thinking.

The AI Apprenticeship online course is part of the Diplo AI Campus programme.
Purpose is personal
The relation between system prompts and LLMs
AI learns through experience
Tailor-made AI for diplomacy and global governance
Elements in designing a chatbot