What Do Vectors Look Like in LLMs?
Large language models (LLMs) run on text, but they don’t “see” text as letters. They operate on vectors: long lists of numbers that represent meaning, context, and relationships in a form that math can handle. This article explains what those vectors look like, where they appear inside an LLM, and why they matter.
Vectors as Number Lists, Not Words
A vector in an LLM is a fixed-length array of floating-point numbers, often with sizes like 768, 1024, or 4096 dimensions. Each token—not each word—maps to exactly one vector.
A toy example (much smaller than a real model) might look like:
- “cat” → [0.12, -0.44, 0.03, 0.88, …]
- “dog” → [0.10, -0.41, 0.07, 0.81, …]
- “banana” → [-0.30, 0.22, -0.91, 0.05, …]
These vectors are dense, meaning most entries are non-zero. They are also opaque:
- No single dimension reliably means “animal,” “past tense,” or “noun.”
- Meaning is distributed across many dimensions.
A helpful analogy:
A vector is less like a labeled checklist and more like a coordinate in meaning-space.
Why So Many Dimensions?
High dimensionality gives the model room to separate concepts. With only a few dimensions, many meanings would overlap. Thousands of dimensions allow the model to represent subtle differences such as:
- literal vs metaphorical meaning
- tone (formal, sarcastic, emotional)
- syntactic role
- topic and domain
Importantly, these dimensions are not designed by humans. They emerge during training because they make prediction easier.
Token Embeddings: The First Vectors
The first vectors in an LLM are token embeddings.
The model maintains a large table called an embedding matrix:
- Each row corresponds to a token ID
- Each row is a learned vector
When text is tokenized, each token ID is replaced by its embedding, looked up by row index from this table.
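A minimal NumPy sketch of this lookup; the sizes, token IDs, and random values are made up for illustration:

```python
# Toy embedding lookup: a (vocab_size x hidden_size) matrix indexed by token IDs.
# Real models use vocabularies of roughly 50k-100k tokens and hundreds to thousands of dimensions.
import numpy as np

vocab_size, hidden_size = 1_000, 16
embedding_matrix = np.random.default_rng(0).normal(size=(vocab_size, hidden_size))

token_ids = [312, 907, 15]                     # hypothetical IDs for three tokens
token_vectors = embedding_matrix[token_ids]    # row lookup, no computation

print(token_vectors.shape)                     # (3, 16): one vector per token
```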
These vectors are:
- Looked up, not computed dynamically
- Learned during training
- Shared across all contexts initially
At this stage, the vector for “cat” is the same whether it appears in:
- “The cat slept”
- “A cat is an animal”
- “Cat videos are popular”
Context has not entered yet.
Position Information: Encoding Order
Vectors alone don’t encode order. Without position information, a model would treat:
“dog bites man” and “man bites dog”
as the same bag of tokens.
To fix this, LLMs combine each token vector with positional information before the first transformer layer.
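As a sketch, here is one common scheme, sinusoidal position encodings added to the token embeddings; the sizes are toy, and real models may use learned or relative variants instead:

```python
# Sinusoidal position encodings (toy sizes), added to token embeddings
# so each input vector carries both content and position.
import numpy as np

seq_len, hidden_size = 4, 8
token_vectors = np.random.default_rng(0).normal(size=(seq_len, hidden_size))

positions = np.arange(seq_len)[:, None]               # (seq_len, 1)
dims = np.arange(0, hidden_size, 2)[None, :]          # even dimension indices
angles = positions / (10_000 ** (dims / hidden_size))

pos_encoding = np.zeros((seq_len, hidden_size))
pos_encoding[:, 0::2] = np.sin(angles)                # even dimensions: sine
pos_encoding[:, 1::2] = np.cos(angles)                # odd dimensions: cosine

inputs_to_first_layer = token_vectors + pos_encoding  # "what it is" + "where it is"
print(inputs_to_first_layer.shape)                    # (4, 8)
```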
Position encodings can be:
- Learned (a table like embeddings)
- Computed (e.g., sinusoidal patterns)
- Relative (positions encoded via attention rules)
The key idea:
Every token vector now carries both “what it is” and “where it is.”
Contextual Vectors: Meaning Changes with Context
After the first transformer layer, vectors stop representing tokens and start representing tokens-in-context.
For example:
- “bank” in river bank
- “bank” in bank account
These start with the same embedding, but after attention:
- They become different vectors
- Their neighborhoods in vector space diverge
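As a quick illustration, assuming the Hugging Face transformers library, PyTorch, and the bert-base-uncased checkpoint are available, you can compare the two contextual “bank” vectors directly:

```python
# Compare contextual vectors for "bank" in two different sentences.
# Assumes the Hugging Face transformers library and PyTorch are installed.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def vector_for(sentence, word):
    # Return the last-layer hidden vector for the first occurrence of `word`.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]        # (seq_len, hidden_size)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

v_river = vector_for("He sat by the river bank.", "bank")
v_money = vector_for("She opened a bank account.", "bank")

cos = torch.nn.functional.cosine_similarity(v_river, v_money, dim=0)
print(f"cosine similarity: {cos.item():.3f}")  # typically well below 1.0: context pulled them apart
```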
Each layer refines this further. By deeper layers, vectors encode:
- word sense
- syntactic role
- semantic dependencies
- discourse-level information
A useful mental model:
- Early layers → lexical meaning
- Middle layers → syntax and relations
- Later layers → task-relevant meaning
Attention: How Vectors Interact
Inside each transformer block, every token vector is linearly projected into three new vectors:
- Query (Q) — what this token wants to find
- Key (K) — what this token offers
- Value (V) — the information it contributes
Attention works like this:
- Compare Q vectors to K vectors (via dot products)
- Convert similarities into weights
- Take a weighted sum of V vectors
- Produce a new vector for each token
Mathematically, vectors flow through shapes like:
- (sequence_length × hidden_size)
- → (sequence_length × head_dim)
- → mixed and recombined
- → back to (sequence_length × hidden_size)
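A minimal single-head attention sketch in NumPy (toy sizes, random weights) makes those steps concrete:

```python
# Single-head scaled dot-product attention with random weights (illustration only).
import numpy as np

rng = np.random.default_rng(0)
seq_len, hidden_size, head_dim = 5, 16, 8

X = rng.normal(size=(seq_len, hidden_size))    # token vectors in context

# Learned projection matrices (random here for illustration)
W_q = rng.normal(size=(hidden_size, head_dim))
W_k = rng.normal(size=(hidden_size, head_dim))
W_v = rng.normal(size=(hidden_size, head_dim))

Q, K, V = X @ W_q, X @ W_k, X @ W_v            # each (seq_len, head_dim)

scores = Q @ K.T / np.sqrt(head_dim)           # compare queries to keys
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True) # softmax: similarities -> weights

output = weights @ V                           # weighted sum of values
print(output.shape)                            # (5, 8): a new vector per token
```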
Conceptually:
Tokens “talk” by asking questions and listening to answers, all via vector math.
Geometry: Meaning as Distance and Direction
Vectors live in a high-dimensional geometric space. This makes certain geometric ideas fundamental:
Similarity
Vectors pointing in similar directions tend to represent similar meanings. Cosine similarity measures this directional closeness.
This is why:
- “cat” is closer to “dog” than to “banana”
- paraphrases cluster together
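Using the toy vectors from earlier (values invented for this article), cosine similarity makes the contrast visible:

```python
# Cosine similarity on the toy 4-dimensional vectors from the start of the article.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat    = np.array([0.12, -0.44, 0.03, 0.88])
dog    = np.array([0.10, -0.41, 0.07, 0.81])
banana = np.array([-0.30, 0.22, -0.91, 0.05])

print(cosine(cat, dog))     # close to 1: similar directions
print(cosine(cat, banana))  # much lower: different directions
```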
Structure, Not Labels
Some relationships appear as consistent directions:
- tense changes
- pluralization
- semantic shifts
These patterns are:
- statistical, not perfect
- emergent, not hard-coded
Still, they are strong enough to support:
- semantic search
- clustering
- retrieval-augmented generation
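Semantic search, for example, is just this geometry applied at scale. In the sketch below, the document and query vectors are hypothetical placeholders for embeddings produced by a real model:

```python
# Rank documents by cosine similarity to a query vector (toy 3-dimensional embeddings).
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

doc_vectors = {
    "doc_about_cats":    np.array([0.90, 0.10, 0.00]),
    "doc_about_banking": np.array([0.00, 0.20, 0.95]),
}
query_vector = np.array([0.85, 0.05, 0.10])    # imagine: the embedding of "pet care tips"

ranked = sorted(doc_vectors.items(),
                key=lambda kv: cosine(query_vector, kv[1]),
                reverse=True)
print([name for name, _ in ranked])            # most similar document first
```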
Intermediate Vectors Are the Model’s “Thought State”
At any moment, the model’s internal vectors represent:
- what has been read
- what matters right now
- what is likely to come next
They are not symbolic rules or explicit logic trees. They are continuous states optimized for prediction.
This is why probing vectors can reveal:
- topic awareness
- entity tracking
- syntactic structure
…but not clean, human-readable rules.
Output Vectors: Turning States into Tokens
To generate text, the final vector at the current position is mapped to vocabulary scores:
- Final hidden vector
- Linear projection → logits (one per vocabulary token)
- Softmax → probabilities
- Sampling → next token
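A toy NumPy sketch of that pipeline (random weights, small vocabulary, purely illustrative):

```python
# Final hidden vector -> logits -> softmax -> sampled next-token ID.
import numpy as np

rng = np.random.default_rng(0)
hidden_size, vocab_size = 16, 100

final_hidden = rng.normal(size=hidden_size)         # vector at the current position
W_out = rng.normal(size=(hidden_size, vocab_size))  # output ("unembedding") projection

logits = final_hidden @ W_out                       # one score per vocabulary token
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                # softmax -> probabilities

next_token_id = rng.choice(vocab_size, p=probs)     # sample the next token
print(next_token_id)
```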
So the model never “chooses a word” directly. It transforms a vector into probabilities over tokens.
What Vectors “Look Like” in Practice
In real systems, vectors are:
- Thousands of floating-point numbers
- Stored as tensors in GPU memory
- Processed in parallel as large matrices
- Mostly uninterpretable individually
- Powerful when analyzed statistically
LLMs are, at their core, vector transformation machines. They repeatedly reshape numeric spaces until the next token becomes predictable.
The vectors themselves look like numbers, but their behavior looks like language.