TDDA and Quality for LLMs

Posted on Mon 23 December 2024 in misc

It is December 2024 as I write, and large language models (LLMs) have been having an extended moment while I have been writing a book on test-driven data analysis. Several people have suggested that I should write about LLMs or artificial intelligence (AI), a term that for many people now means either LLMs alone or LLMs together with the other forms of generative AI.

Training Inference

Size Training Data

Inputs

Goal

First do no harm.

Strong AI.

Beliefs. Hallucinations.

Stochastic hypothesis generators.

Rhydwaith

LLMs are neural networks that (loosely) predict the next word.*
Given some text, they predict the next word.
You build sentences by appending each predicted word to the input and iterating.

Mary had a -> little
Mary had a little -> lamb,
Mary had a little lamb, -> its
Mary had a little lamb, its -> fleece

Mary had a -> seizure
Mary had a seizure -> last
Mary had a seizure last -> night
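The append-and-iterate loop can be sketched in a few lines of Python. This is only an illustration: the lookup table `NEXT_WORD` and the functions `predict_next_word` and `generate` are hypothetical stand-ins invented for this example, and a real LLM computes its prediction with a neural network rather than looking it up.

```python
# Hypothetical toy "model": a lookup table from the text so far to the next word.
# A real LLM computes this prediction with a neural network.
NEXT_WORD = {
    "Mary had a": "little",
    "Mary had a little": "lamb,",
    "Mary had a little lamb,": "its",
    "Mary had a little lamb, its": "fleece",
}


def predict_next_word(text):
    """Return the predicted next word, or None if the toy table has no entry."""
    return NEXT_WORD.get(text)


def generate(prompt, max_words=10):
    """Repeatedly append the predicted word to the text and predict again."""
    text = prompt
    for _ in range(max_words):
        word = predict_next_word(text)
        if word is None:  # the toy table has run out of predictions
            break
        text = text + " " + word
    return text


print(generate("Mary had a"))
```

A table like this always gives the same continuation. The second example above ("seizure") is a reminder that the same starting text can be continued in more than one way.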

LLMs are trained on unimaginably large corpuses of data, mainly from the web.
LLMs have billions, sometimes trillions, of parameters: knobs that can be set to different values
With any given parameter settings, the LLM will predict the next word
Some knob settings match the next-word associations better than others
Training an LLM consists of optimizing the knob settings
(Most of) the parameters (knobs) are called "weights".
During training, the current weights are used to predict the next word
- When it is "wrong" (differs from the actual next word in the training text), the weights are adjusted
- Even when it is "right", the weights are usually adjusted
- The raw prediction is not a single word, but probabilities for possible words
- There is always an error, which can always be reduced.
- The adjustments are calculated to try to reduce the errors over time.
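The training bullets above can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: a three-word vocabulary, a single context ("Mary had a"), one weight per candidate word, a softmax to turn weights into probabilities, and plain gradient descent on the cross-entropy error. Real LLM training involves vastly more weights and data, but the pattern of predicting, measuring the error, and adjusting is the same in spirit.

```python
import math

# Hypothetical three-word vocabulary of candidate next words for "Mary had a".
VOCAB = ["little", "seizure", "big"]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    biggest = max(scores)
    exps = [math.exp(s - biggest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(weights, target_index, learning_rate=0.5):
    """Make one prediction, measure the error, and adjust the weights."""
    probs = softmax(weights)
    # Cross-entropy error: zero only if the target has probability exactly 1.
    loss = -math.log(probs[target_index])
    # For softmax with cross-entropy, the gradient is (probability - target).
    new_weights = [
        w - learning_rate * (p - (1.0 if i == target_index else 0.0))
        for i, (w, p) in enumerate(zip(weights, probs))
    ]
    return new_weights, loss


weights = [0.0, 0.0, 0.0]        # start with no preference between words
target = VOCAB.index("little")   # the word that actually came next
losses = []
for _ in range(20):
    weights, loss = train_step(weights, target)
    losses.append(loss)

final_probs = softmax(weights)
print("first error:", losses[0], "last error:", losses[-1])
print("final probabilities:", dict(zip(VOCAB, final_probs)))
```

In this sketch "little" becomes the most probable word after the first adjustment, yet the weights keep changing on every later step, because the probability assigned to it is still below 1 and so the error is still above zero.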