Librarians’ Guide to Answering Students’ Technical Questions about AI

Questions librarians might get about AI – and how to answer them

A librarian answering a student's technical questions about AI, such as ChatGPT

What questions are library users asking about generative AI and how can we be prepared to answer them? To help with that, we’ve drafted sample questions and answers about ChatGPT and similar tools.

This is the first of a three-part series. In this part, we’ll offer some answers to technical questions, aimed at beginners. The second and third installments will examine ethical and practical questions students may have.

Of course, sometimes students aren’t yet asking the questions we would like them to know the answers to! This is why we at the University of Arizona Libraries decided to include them in our LibAnswers FAQ system. This gives us a central place where our staff can direct students for answers. We can also use them in different ways, such as in LibGuides, tutorials, and workshops.

As always, make sure these answers align with your campus policies. Sometimes, these policies come from a writing center or other groups on campus. At the University of Arizona, it’s up to individual instructors to decide on classroom policies, so we always tell students to find out the policy for each class they are in.

What is machine learning?

Machine learning is the practice of developing computer models that learn patterns from input data and improve their performance with experience. This allows a model to make predictions based on the patterns it has learned and to solve problems without being explicitly programmed for each task. This is very different from rules-based programming, where programmers lay out each step for the machine to follow.
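The difference can be sketched in a few lines of Python. In this toy example (all messages and labels are invented for illustration), the rules-based function follows a step the programmer wrote out, while the "learned" function derives its own rule from labeled examples:

```python
from collections import Counter

# Rules-based: a programmer spells out the step to follow.
def rules_based_spam_check(message: str) -> bool:
    return "free money" in message.lower()

# Machine learning (toy version): derive a rule from labeled examples.
training_data = [
    ("win free money now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting moved to noon", "not spam"),
    ("lunch at noon tomorrow", "not spam"),
]

spam_words, ham_words = Counter(), Counter()
for text, label in training_data:
    counter = spam_words if label == "spam" else ham_words
    counter.update(text.split())

def learned_spam_check(message: str) -> bool:
    # Score a message by which label's training words it resembles more.
    words = message.split()
    spam_score = sum(spam_words[w] for w in words)
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score

print(learned_spam_check("free prize money"))  # True: resembles the spam examples
print(learned_spam_check("noon meeting"))      # False: resembles the others
```

Nobody told `learned_spam_check` which words signal spam; it inferred that from the examples, which is the core idea behind machine learning (real systems use far more data and far more sophisticated statistics).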

In AIQ: How People and Machines Are Smarter Together, Nick Polson and James Scott explain it clearly: “In AI, the role of the programmer isn’t to tell the algorithm what to do. It’s to tell the algorithm how to train itself what to do, using data and the rules of probability.” 

💡 Learn more

What is generative AI?

It’s AI that can generate new content, like text, images, video, music, and speech. Some examples of models that generate text are ChatGPT, Microsoft Copilot, and Google Gemini. Models like Midjourney, Adobe Firefly, and Stable Diffusion can generate images. For generating speech, ElevenLabs is a popular tool. Suno and Udio are models for generating music.

Another type of AI is “discriminative AI.” This is an AI that can classify, predict, or recognize patterns in existing data. Some examples are Netflix’s recommendations for what to watch next and Gmail’s spam filtering.

It’s good to keep these two types in mind when you hear about AI. Is it classifying existing data (like with spam filtering), or is it generating new content (like with ChatGPT)? These are very different types of systems with different strengths and weaknesses.

💡 Learn more

What is a large language model (LLM)?

A large language model (LLM) is a type of artificial intelligence that can generate human language and perform related tasks. These models are trained on huge datasets, often containing billions of words. By analyzing all of this data, the LLM learns the patterns and rules of language, similar to how a human learns to communicate through exposure to language. LLMs can perform a variety of language tasks, such as answering questions, summarizing text, translating between languages, and writing content.

As language models become multimodal (working with media types beyond text), they are also increasingly called “foundation models.” This refers to models that are trained on vast amounts of data and can be adapted to a wide range of tasks and operations, not just working with language.

💡 Learn more

What does it mean to “train a model” when talking about generative AI like ChatGPT?

For language models like ChatGPT, “training a model” refers to the process of helping the AI system teach itself to perform specific tasks by exposing it to vast amounts of data. This process allows the model to learn patterns, relationships, and structures within the data, enabling it to generate new content that resembles the training data.

It’s important to note that the model does not simply memorize and reproduce the training data; instead, it learns to create novel content based on the patterns it has learned.
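As a toy illustration of that point (the two training sentences below are invented), even a model that has only learned which word tends to follow which can emit sentences it never saw verbatim:

```python
import random
from collections import defaultdict

training_sentences = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

# "Training" here just means counting which word follows which.
next_words = defaultdict(list)
for sentence in training_sentences:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        next_words[a].append(b)

random.seed(0)
word, output = "the", ["the"]
for _ in range(5):
    options = next_words.get(word)
    if not options:  # reached a word with no known successor
        break
    word = random.choice(options)
    output.append(word)

# The result recombines learned patterns; a sentence like
# "the dog sat on the mat" appears nowhere in the training data.
print(" ".join(output))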

💡 Learn more

Does ChatGPT contain copies of the text it was trained on? Do AI image generators contain copies of the images they were trained on?

No, these models don’t contain exact copies of the texts or images they were trained on. Instead, they create new texts or images using the statistical patterns and relationships learned from the training data. The models build mathematical representations of these patterns, which allow them to generate novel content.

In rare cases, a generated text or image is nearly identical to a text or an image from the training data; this is an unintended consequence of the model’s learning process, not a deliberate feature. Researchers are actively developing methods to prevent such verbatim copying and to ensure that the models generate original content based on their learned patterns.

💡 Learn more

What are guardrails (in tools like ChatGPT)?

Guardrails are built-in safeguards and content moderation techniques. They aim to filter out harmful, biased, or inappropriate content. Developers implement them to ensure that the AI model adheres to ethical and safety standards when responding.

Tools like ChatGPT rely on user feedback to improve, so if you find harmful or incorrect content in a response, you can click the “thumbs down” icon and provide more information. OpenAI will review these responses and add or adjust guardrails if needed.

💡 Learn more

What is hallucination (in models like ChatGPT)?

Hallucination is the term for when models like ChatGPT output false information as if it were true. Even though the AI may sound very confident, sometimes the answers it gives are just plain wrong.

Why does this happen? AI tools like ChatGPT are trained to predict which words should come next in the conversation you’re having with them. They are really good at putting together sentences that sound plausible and realistic.

However, these AI models don’t understand the meaning behind the words. They lack the logical reasoning to tell whether what they are saying actually makes sense or is factually correct. They were never designed to be search engines. Instead, they might be thought of as “wordsmiths”—tools for summarizing, outlining, brainstorming, and the like.
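A minimal sketch of why this leads to hallucination (the counts below are invented, not real training data): the model simply picks the statistically most common continuation, with no check on whether it is true.

```python
from collections import Counter

# Invented counts of words seen after "the capital of Australia is" in
# training text. Popular misconceptions can outnumber the correct answer.
continuations = Counter({"Sydney": 7, "Canberra": 3})

# The model emits whichever continuation is most frequent.
predicted = continuations.most_common(1)[0][0]
print(predicted)  # Sydney: plausible-sounding, confidently stated, and wrong
```

Nothing in this process consults a source of facts, which is why a fluent-sounding answer can still be false.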

So we can’t blindly trust that everything they say is accurate, even if it sounds convincing. It’s always a good idea to double-check important information against other reliable sources.

Here’s a tip: Models grounded in an external source of information (like web search results) hallucinate less often. That’s because the model searches for relevant web pages, summarizes the results, and links to the pages that each part of the answer came from. This makes it easier to fact-check the result.

Examples of grounded models are Microsoft Copilot, Perplexity, and ChatGPT Plus (the paid version).
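Grounding can be sketched as a simple retrieve-then-answer loop. In this toy version (the documents and URLs are made up), the system answers only from a matching source and cites it, or declines rather than inventing something:

```python
# Two invented source documents, keyed by (made-up) URL.
documents = {
    "https://library.example.edu/hours": "The library is open 8am to 10pm on weekdays.",
    "https://library.example.edu/printing": "Printing costs 10 cents per page.",
}

def grounded_answer(question: str) -> str:
    """Answer only from the best-matching document, and cite it."""
    question_words = set(question.lower().replace("?", "").split())
    best_url, best_overlap = None, 0
    for url, text in documents.items():
        # Crude relevance score: how many words the question and document share.
        overlap = len(question_words & set(text.lower().rstrip(".").split()))
        if overlap > best_overlap:
            best_url, best_overlap = url, overlap
    if best_url is None:
        return "I couldn't find a source for that."  # decline rather than guess
    return f"{documents[best_url]} [source: {best_url}]"

print(grounded_answer("When is the library open?"))
print(grounded_answer("Who wrote Hamlet?"))
```

Real grounded tools use web search and a language model to summarize the retrieved pages, but the principle is the same: the citation trail is what makes the answer checkable.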

💡 Learn more

Feel free to share and modify these questions for your own use. Since generative AI products change often, we’ll be working to keep these answers current. You can find all of our AI-related FAQs at the University of Arizona LibGuide.

🔥 Sign up for LibTech Insights (LTI) new post notifications and updates.

✍️ Interested in contributing to LTI? Send an email to Daniel P. at Choice with your topic idea.