Large Language Model (LLM)

Posted on: November 21, 2023 at 08:20 PM

Notes on Large Language Models aka LLMs.

Introduction

A large language model (LLM) is a type of language model designed to understand and generate natural language. LLMs are typically trained on large amounts of text using substantial computing resources. They work by taking a piece of text and predicting the next word. Notable LLMs include OpenAI’s GPT-4, Meta’s LLaMA, Google’s PaLM, and Anthropic’s Claude.

An LLM is like a super smart computer program that has read a lot of books, websites, and other information. It can understand and generate human-like language, helping people by answering questions, providing information, and even chatting with them. It’s a bit like having a really clever robot friend who knows a whole bunch of stuff!

Essentially, it’s a tool that can process and generate human-like text based on the patterns and information it learned during its training.

Types of LLMs

Base LLM

A base LLM predicts the next word based on its text training data.

For example, if you write “once upon a time, there was a unicorn”, it may complete the sentence by adding “that lived in a magical forest with all her unicorn friends”.

But if you prompt it with “what is the capital of France?”, it may respond with more questions, such as “what’s France’s largest city?” or “what is France’s population?”, because articles on the internet could quite possibly contain lists of such quiz questions about France.
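
The behaviour described above can be imitated with a toy next-word predictor. This is only a minimal sketch: it counts which word follows each pair of words in a tiny made-up corpus, whereas a real base LLM uses a neural network trained on vastly more text. The corpus, the `train` and `complete` helpers, and the two-word context are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

CONTEXT = 2  # number of preceding words used to predict the next one


def train(corpus):
    """Count which word follows each CONTEXT-word window in the corpus."""
    follow = defaultdict(Counter)
    for text in corpus:
        words = text.lower().split()
        for i in range(len(words) - CONTEXT):
            context = tuple(words[i:i + CONTEXT])
            follow[context][words[i + CONTEXT]] += 1
    return follow


def complete(follow, prompt, max_words):
    """Greedily append the most frequent next word, one word at a time."""
    words = prompt.lower().split()
    added = []
    for _ in range(max_words):
        candidates = follow.get(tuple((words + added)[-CONTEXT:]))
        if not candidates:
            break  # this context never appeared in the training text
        # Ties are resolved in favour of the word seen first in training.
        added.append(candidates.most_common(1)[0][0])
    return " ".join(added)


# Hypothetical "training data": a story and a quiz-style list of questions.
corpus = [
    "once upon a time there was a unicorn that lived in a magical forest",
    "what is the capital of france? what is france's largest city? "
    "what is france's population?",
]
model = train(corpus)

story = complete(model, "once upon a time there was a unicorn", 10)
quiz = complete(model, "What is the capital of France?", 5)
print(story)  # that lived in a magical forest
print(quiz)   # what is france's largest city?
```

The second prompt is continued with another question rather than an answer, because in this toy corpus a question is always followed by more questions. The model only reproduces the patterns it has seen.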

Instruction Tuned LLM

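An instruction-tuned LLM has been trained to follow instructions. If you ask it “what is the capital of France?”, it is much more likely to answer “The capital of France is Paris.”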

Model Limitations

Hallucination

The model can make statements that sound plausible but are not true.

Reducing hallucinations

Ask the model to first find the relevant information, then answer the question based on that information.
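
One way to apply this is to build the two steps into the prompt itself. The sketch below only assembles the prompt text. The function name `build_grounded_prompt`, the `<document>` delimiters, and the exact wording are assumptions rather than a prescribed format, and sending the prompt to a model is left out because it depends on the provider you use.

```python
def build_grounded_prompt(document: str, question: str) -> str:
    """Return a two-step prompt: quote relevant passages first, then answer."""
    return (
        "You will be given a document delimited by <document> tags.\n"
        "Step 1: Find the passages in the document that are relevant to the "
        "question and list them as direct quotes.\n"
        "Step 2: Answer the question using only the quotes from step 1. "
        "If the quotes do not contain the answer, say that you do not know.\n\n"
        f"<document>\n{document}\n</document>\n\n"
        f"Question: {question}"
    )


# Hypothetical source text and question, used only to show the result.
prompt = build_grounded_prompt(
    document="Paris is the capital and largest city of France.",
    question="What is the capital of France?",
)
print(prompt)
```

Asking for quotes first ties the answer to text you can check, and the explicit "say that you do not know" instruction gives the model an alternative to guessing.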