
What is AI?

Goals:

AI is computer science

Building artificial intelligence has been a goal of computing research (computing is a broader field including both computer science and computer engineering) the whole time. Computing is not a very old discipline, but it can trace its roots to math and electrical engineering (and through that, physics). At some institutions, computer science and computer engineering are in the same department, and at others they may be separate. Computer Science especially has been focused on developing AI as a goal.

Figure 1: A Venn diagram showing CS as a broad field with AI as a subfield.

AI is a field of study, which means there is a community of people engaged in this work.

AI can be done in many ways and has been over time, but the current dominant paradigm is machine learning. The formal definitions of AI are broad, and in that sense almost everything in a computer can be considered to “be AI,” but at this moment AI typically refers to a few things:

Figure 2: A Venn diagram showing CS as a broad field with many subfields.

What is common across all of these, and all things in computer science, is algorithms. Algorithms have gotten a lot more attention recently, but they are not fundamentally new: mathematicians have developed and studied them formally for centuries, and informally, people have developed them everywhere.

Figure 3: Algorithms are at the center and are what is common across all of CS.

How Algorithms are Made

A familiar way to think about an algorithm is as a recipe. A recipe consists of a set of ingredients (inputs) and a set of instructions to follow (procedures) to produce a specific dish (output). Computer algorithms describe the procedure to produce an output given an input in order to solve a problem.

Mathematicians have developed algorithms for centuries by:

  1. Selecting a problem to solve in the world

  2. Representing the relevant part of the world mathematically

  3. Working to solve the problem in the mathematical space

  4. Documenting their process so that it is repeatable

Figure 4: Algorithms are developed by people, not naturally occurring things that we observe or discover (steps: Begin, Select, Decisions, Approximate, Represent, Solve, Distill).

Computer Scientists do the same, with the main change being that the solution is expressed in a programming language instead of in a spoken language, for example Python instead of English.

Figure 11: Computer Scientists might use different tools to develop algorithms or terms to document things, but the process is mostly the same (panels: Simple changes, Implementation).

The challenge is that as we try to delegate more complex problems to a computer, the approximations we make get in the way more. Writing an algorithm to add numbers together or find an exact match for an item in a list is straightforward; writing an algorithm to detect whether a set of pixels represents a person (e.g., if there is a person in front of a self-driving car) is much more complex.

The traditional way of developing algorithms works well for problems where we have a good mathematical representation of the part of the world we need to compute over, and where people can describe the steps that need to occur in terms of calculations a computer can carry out.
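For instance, an exact-match search can be fully specified by hand. Here is a minimal sketch in Python (the function name and example list are just illustrative):

```python
def find_exact_match(items, target):
    """Return the index of the first item equal to target, or -1 if it is absent."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return -1

# Example use: searching a small list of words
print(find_exact_match(["cat", "dog", "bird"], "dog"))   # 1
print(find_exact_match(["cat", "dog", "bird"], "fish"))  # -1
```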

This is where machine learning comes in.

What is ML?

In machine learning, we change the process a little.

Instead of solving the problem and figuring out precise steps to carry out, people define a generic strategy, collect a lot of examples, and write a learning algorithm to fill in the details of the generic strategy from the examples. Learning algorithms are developed the way we have always developed algorithms, but these algorithms then essentially write a prediction or inference algorithm, and that is what gets sent out into the world.

All ML consists of two parts: learning and prediction. These can also go by other names.

Learning may be called:

  • fitting

  • optimization

  • training

Prediction may be called:

  • inference

  • testing

A common assumption

All ML has some underlying assumptions; almost all ML relies on two key assumptions, which can be written in many ways:

  1. A relationship exists that we can use to determine, or predict, the outcome or target from the input.

  2. Given enough examples, a computer can find that relationship.

where:

  • outcome or target is the goal of the task

  • input is the information to be used to predict that target

This, alone, is not that different from the traditional way of developing algorithms; we have to assume a way to get from some input to the desired output exists for that to happen. However, in machine learning this is a bit more specific: we assume that there is a specific x and y available to us[2] and that from the x, we can compute a value for y.
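One common way to write these two assumptions mathematically (a sketch in standard notation, not the only way to express it) is:

```latex
% Assumption 1: some relationship f exists from the input x to the target y
y \approx f(x)

% Assumption 2: given enough example pairs, a learning algorithm can
% recover an approximation \hat{f} of that relationship
\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} \longrightarrow \hat{f} \approx f
```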

To make this concrete, this could be as simple as a linear regression.


We can predict the tip for a restaurant bill based on the total bill by multiplying by some percentage and adding a flat amount. We can determine the percentage and the amount to add from previous bills.

This generally has to be written mathematically to be solved; the solution is then translated into a programming language for a computer to execute.
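As a minimal sketch of that translation (the names predict_tip, percentage, and flat_amount are illustrative, not from any particular library), the tip model might be written in Python as:

```python
def predict_tip(bill_total, percentage, flat_amount):
    """Prediction algorithm: multiply the bill by a percentage and add a flat amount."""
    return percentage * bill_total + flat_amount

# Once the two parameters are known, prediction is a simple input-output function
print(predict_tip(50.00, 0.18, 1.00))  # 10.0
```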

A common problem to solve

The goal in creating the learning algorithm is then to find the right details; if we take the mathematical representation above, we need to find the right θ.

Learning algorithms output those values and then allow us to have a complete prediction algorithm.

A learning algorithm and a prediction algorithm are linked by a shared model. The prediction algorithm is basically the model treated as a template, so that once the parameters are set it becomes a simple input-output function. The learning algorithm is where people work out how to find the right parameters to make predictions in a specific domain.
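To illustrate how the learning side fills in that template, here is a minimal hand-written least-squares sketch for the tip model (fit_tip_model is an illustrative name, and the example bills and tips are made up):

```python
def fit_tip_model(bills, tips):
    """Learning algorithm: find the percentage and flat amount that best fit
    the example bills and tips, using a simple least-squares calculation."""
    n = len(bills)
    mean_bill = sum(bills) / n
    mean_tip = sum(tips) / n
    # slope = covariance(bill, tip) / variance(bill); intercept from the means
    covariance = sum((b - mean_bill) * (t - mean_tip) for b, t in zip(bills, tips))
    variance = sum((b - mean_bill) ** 2 for b in bills)
    percentage = covariance / variance
    flat_amount = mean_tip - percentage * mean_bill
    return percentage, flat_amount

# Learn the parameters from past bills; these would then fill in the
# prediction template (e.g., predict_tip above) to make new predictions
percentage, flat_amount = fit_tip_model([10.0, 20.0, 40.0], [2.0, 3.5, 6.5])
print(percentage, flat_amount)  # 0.15 and 0.5
```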

ML is classified in many ways

AI can be classified by how it is developed:

Most current things are ML, and the underlying assumptions come in different forms.

ML can be classified in many different ways too:

We can describe a model with each of these descriptors, for example:

What is an LLM?

While AI has been a research area since the beginning of computing, AI came into most common use when ChatGPT was released. ChatGPT is a chatbot interface to the GPT family of LLMs. This, and large-scale models of vision for image generation or of audio for audio production, all work on the same basic idea.

For LLMs specifically, they model language using a lot of examples and a statistical model. In math:

P(w_j \mid w_{j-1}, w_{j-2}, \ldots, w_{j-c})

where c is called the context window.

In English, this says that the model represents a probability distribution over possible next words (w_j) given the past sequence of c words.
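As a toy illustration of that idea (a tiny count-based sketch, nothing like the scale or architecture of a real LLM, with made-up names and text), we can estimate the next-word distribution from example text with a context window of c words:

```python
from collections import Counter, defaultdict

def build_next_word_model(words, c):
    """Count how often each word follows each length-c context in the example text."""
    counts = defaultdict(Counter)
    for j in range(c, len(words)):
        context = tuple(words[j - c:j])
        counts[context][words[j]] += 1
    return counts

def next_word_distribution(counts, context):
    """Estimate P(w_j | previous c words) from the counts."""
    total = sum(counts[context].values())
    return {word: n / total for word, n in counts[context].items()}

# Toy example with a context window of c = 2 words
text = "the cat sat on the mat and the cat sat on the rug".split()
model = build_next_word_model(text, c=2)
print(next_word_distribution(model, ("on", "the")))  # {'mat': 0.5, 'rug': 0.5}
```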

This model is implemented in a computer using a neural network. A neural network is a computational model for approximating a function, defined by a number of artificial neurons. Neural networks approximate complex functions by combining a lot of simple functions together.
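To make “combining a lot of simple functions” concrete, here is a minimal sketch of one artificial neuron and a tiny two-layer network built from them (the weights are arbitrary placeholders, not learned values):

```python
import math

def neuron(inputs, weights, bias):
    """One simple function: a weighted sum passed through a nonlinearity (sigmoid)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-total))

def tiny_network(inputs):
    """Combine simple neurons: two hidden neurons feed one output neuron."""
    hidden = [
        neuron(inputs, weights=[0.5, -0.2], bias=0.1),
        neuron(inputs, weights=[-0.3, 0.8], bias=0.0),
    ]
    return neuron(hidden, weights=[1.0, -1.0], bias=0.2)

print(tiny_network([0.6, 0.9]))  # a single number between 0 and 1
```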

Footnotes
  1. Python is a programming language specifically designed for readability.

  2. There is also unsupervised or semi-supervised learning, where the y is either unknown or only available for some samples, but they still assume that it exists and that the x can be used to compute it.

  3. quantum computers, which are not yet available for consumer use or even broad research use, represent data with probabilistic qubits instead of traditional binary