Welcome¶
AI is everywhere
Literacy should be liberatory; learning to consume products is not liberation.
The goal of this workshop is understanding, so that you can critique AI and decide, appropriately, when to trust it and when not to.
Unplugged, Really¶
This workshop is unplugged. We have no slides. We will explain through demonstration, after a little bit of setting the stage.
This workshop is not designed to give you as much information as possible. Instead, it's designed for you to get really comfortable with a few key ideas. To really absorb them. To hold them. For them to stay with you. To sit with a few ideas long enough that you let them shape your future choices.
Literacy is liberatory, after all; we want you to feel a bit of liberation.
So, we invite you to really unplug:
silence notifications on any wearables
take a moment and jot down that thing you’re worrying about
tell the person you’re texting that you’re taking a moment to focus deeply and learn
put away your devices
What is AI?¶
Currently the biggest advance in AI is LLMs. Some are “vision” models or “audio” models, but they all build on the same basic idea.
AI, human-like machine decision making, has been the goal of computer science for a long time
Many strategies exist; the most recent success is due to machine learning, finding patterns in data instead of writing code explicitly
The current shift is to LLMs, or more broadly, generative models
LLM = Large Language Model
Let’s break down that term.
Models¶
A model is a simplification used to communicate an idea. In ML, a model encodes our assumptions about the world, which we use to find patterns in the data.
Some examples:
ex: a volcano demo
ex: diorama
ex: globe
ex: atom model with rings
In computing, our models are always mathematical models. In ML, typically statistical models.
Here we start with a familiar mathematical model: the line. It has 2 parameters, and each has meaning.
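The line model can be sketched in a few lines of Python; the function and parameter names here are illustrative, not part of the workshop materials:

```python
# A line is a model with two parameters, each of which has meaning:
# the slope (how fast y changes with x) and the intercept (y when x is 0).
def line(x, slope, intercept):
    return slope * x + intercept

# Predicting with slope 2 and intercept 1:
print(line(3, slope=2, intercept=1))  # 2*3 + 1 = 7
```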
Language¶
languages consist of words and grammar
in Natural Language Processing we speak of tokens and documents (the units of text to analyze)
language communicates ideas
Large¶
broadly, “bigger” means more complex
refers to the number of parameters
a line has two parameters: slope & intercept
GPT-3 had 175 billion parameters
Putting it together¶
Over time, ML has produced many different kinds of models, but the one that has recently taken off is the generative language model.
It starts with a simple assumption:
We can generate sequences of words by sampling from a distribution over what word comes next, given the sequence of past words.
A simpler Model¶
LLMs require a lot of parameters because spoken language is complex. There are a lot of words and meaning can be conveyed over long strings of words.
So, LLMs use a deep neural network to implement that distribution in a computer.
However, if we use a very simple language and, for now, condition on only the one previous word, then the distribution can be represented by a table.
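As a sketch, that table can be a dictionary mapping the previous word to a probability for each possible next word. The tiny vocabulary below is made up for illustration; it is not the workshop's actual language:

```python
# A one-previous-word (bigram) model over a tiny, made-up vocabulary,
# stored as a table: row = previous word, entries = probability of each
# next word. "END" marks the end of a document.
table = {
    "START": {"cat": 0.5, "dog": 0.5},
    "cat":   {"sat": 0.7, "ran": 0.2, "END": 0.1},
    "dog":   {"sat": 0.3, "ran": 0.6, "END": 0.1},
    "sat":   {"END": 1.0},
    "ran":   {"cat": 0.4, "END": 0.6},
}

# Each row is a probability distribution: its values sum to 1.
for prev, dist in table.items():
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```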
Even better, we can implement randomness with physical objects:
a coin flip
a die roll
a raffle draw
So, we will implement this distribution physically.
Sampling a Pretrained Model¶
tiny language: 4 words
implement the distribution above using buckets and ping pong balls
we have a pretrained model; let’s test it so we can really understand that model
Sampling procedure
A helper posts a sticky in the color of the prompt on the board
The facilitator starts in the bin labeled with the prompt and draws a ball
A helper adds a sticky in the color of the drawn ball
The facilitator draws from the bin of the last sticky in the document
repeat until you draw a white ball.
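The sampling steps above can be simulated in code: drawing a ball from a bin is a random choice over that bin's contents. The colors below are hypothetical stand-ins for the workshop's actual bins:

```python
import random

# Each bin holds balls; drawing a ball = sampling the next word.
# "white" ends the document, matching the physical rule.
bins = {
    "red":   ["blue", "blue", "green", "white"],
    "blue":  ["green", "red", "white"],
    "green": ["red", "white", "white"],
}

def sample_document(prompt, bins, max_len=20):
    document = [prompt]
    while len(document) < max_len:
        ball = random.choice(bins[document[-1]])  # draw from the last word's bin
        if ball == "white":                       # white = end of document
            break
        document.append(ball)
    return document

print(sample_document("red", bins))
```

Running it several times gives different documents, just as repeated physical draws do.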
Training a Model¶
author your own document using sticky notes; the only rule is white = end of document
Training varies by implementation of the generator
for our model: given a document, for each word, put a ball of that word’s color into the bin labeled with the previous word’s color, continuing through the end of the document
each person comes up to train the model on their document
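Training here is just counting: for each adjacent pair of words, a ball of the next word's color goes into the bin labeled with the previous word's color. A sketch, using made-up example documents:

```python
from collections import defaultdict

def train(documents):
    """Fill the bins: for each (previous, next) pair in a document,
    add a ball of the next word's color to the previous word's bin.
    A "white" ball marks the end of each document."""
    bins = defaultdict(list)
    for doc in documents:
        for prev, nxt in zip(doc, doc[1:] + ["white"]):
            bins[prev].append(nxt)
    return dict(bins)

bins = train([["red", "blue", "green"], ["red", "green"]])
print(bins["red"])  # balls added after "red": ['blue', 'green']
```

More documents mean more balls, so common word pairs become more likely to be drawn.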
Discussion¶
what observations do you have about the generated documents?
how are they similar or different?