Modules Inside Generative AI Systems: Tokenizers, Attention, RAG, RLHF & More

Author: Manikanta Kuna
Website: www.Manikantakuna.com

What Are GenAI System Modules?

A Generative AI model like GPT isn’t just one big neural network — it is a pipeline of multiple intelligent components working together:

  • Tokenizer: converts text → numeric tokens
  • Embeddings: convert tokens → vector meaning
  • Neural Network (Transformer): thinking & reasoning
  • Attention: understands relationships between tokens
  • RAG / Vector DBs: knowledge retrieval from external memory
  • Decoding: converts predicted tokens → readable text
  • RLHF: makes outputs safe, helpful & aligned

Let’s break each part down 👇

Tokenizers — Breaking Language into Pieces

📌 AI models do not understand words directly.

Text → split into tokens (words, subwords, characters)

Example:

Sentence:
“Generative AI is powerful.”

Tokenized:
["Gener", "ative", " AI", " is", " powerful", "."]

Why important?
✔ Reduces vocabulary size
✔ Helps model understand rare words
✔ Efficient training

Popular tokenizers:

  • Byte Pair Encoding (BPE)
  • WordPiece
  • SentencePiece
  • tiktoken (OpenAI)
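As a rough intuition, subword tokenizers match known vocabulary pieces against the text. A minimal greedy longest-match sketch (the tiny vocabulary here is invented for illustration; real BPE builds its vocabulary from merge statistics learned on a corpus):

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenization."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest piece starting at i first, shrinking toward
        # a single character.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character passes through as-is
            i += 1
    return tokens

# Illustrative vocabulary, chosen to reproduce the example above
vocab = {"Gener", "ative", " AI", " is", " powerful", "."}
print(tokenize("Generative AI is powerful.", vocab))
```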

Word & Sentence Embeddings — Meaning in Numbers

Each token is mapped into a vector in multidimensional space.

Example intuition:

  • “King” and “Queen” are close vectors
  • “King” – “Man” + “Woman” ≈ “Queen”

Embedding types:

  • Static: Word2Vec, GloVe
  • Dynamic (context-aware): Transformer embeddings

This is how models understand semantic meaning.
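The king/queen intuition can be checked with cosine similarity. A sketch using tiny hand-made 3-dimensional vectors (invented purely for illustration; real embeddings have hundreds of dimensions learned from data):

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy 3-d embeddings, invented for illustration only
emb = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

# "king" - "man" + "woman" should land closer to "queen" than to "man"
analogy = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]
print(cosine(analogy, emb["queen"]))
```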

Attention & Self-Attention — The Real Intelligence

Transformers introduced Self-Attention:

Each token weighs its relevance to every other token, so the model focuses on the parts of a sentence that actually matter.

Sentence:
“She went to the bank to deposit money.”

Attention reveals:

  • “bank” → “deposit money”
  • Not “bank” → “river”

Benefits:
✔ Understands long-range context
✔ Parallel processing → faster
✔ Core logic of all LLMs today

Multi-Head Attention = multiple perspectives at once
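The mechanism behind this is scaled dot-product attention. A minimal pure-Python sketch over lists of vectors (real implementations also apply learned query/key/value projection matrices, omitted here for brevity):

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys,
    producing a weighted average of the value vectors."""
    d = len(K[0])
    output = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)  # attention weights sum to 1
        output.append([sum(w * v[j] for w, v in zip(weights, V))
                       for j in range(len(V[0]))])
    return output
```

Multi-head attention simply runs several of these in parallel with different projections and concatenates the results.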

Decoder & Generation — Creating Token-by-Token Output

LLMs are autoregressive:
each next token is predicted from all the tokens before it:

“The weather is…” → “sunny” → period → finish

Common decoding strategies:

  • Greedy search: always picks the single most likely token (deterministic, often bland)
  • Beam search: keeps multiple candidate paths (more accurate)
  • Sampling (top-p, temperature): creative, diverse
  • Mixture: balanced outputs

This module determines creativity vs. correctness.
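A minimal sketch of temperature plus top-p (nucleus) sampling over a token → logit dict; pushing top_p low enough collapses it to greedy picking (the token names are illustrative):

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Temperature + top-p sampling. Lower temperature (> 0) sharpens the
    distribution toward greedy decoding; lower top_p restricts sampling
    to the most probable tokens."""
    rng = rng or random.Random()
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    exps = [math.exp(v / temperature) for _, v in items]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest prefix of tokens whose cumulative mass reaches top_p.
    cum, cut = 0.0, len(probs)
    for i, p in enumerate(probs):
        cum += p
        if cum >= top_p:
            cut = i + 1
            break
    kept, kept_probs = items[:cut], probs[:cut]
    # Renormalize and draw one token from the truncated distribution.
    r = rng.random() * sum(kept_probs)
    for (token, _), p in zip(kept, kept_probs):
        r -= p
        if r <= 0:
            return token
    return kept[-1][0]
```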

RAG (Retrieval-Augmented Generation) — AI with Real Knowledge

Problem:
A model’s knowledge is frozen at training time, so it knows nothing about events or data that came afterward (no real-time knowledge)

Solution:

Retrieve relevant information from a Vector Database before generating an answer

Flow:
1️⃣ Convert user question → embeddings
2️⃣ Search in vector DB
3️⃣ Send retrieved knowledge + prompt to model
4️⃣ Model produces structured output
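The four steps above can be sketched as follows, assuming `embed` and `generate` are supplied by whatever embedding model and LLM you use (both are stand-ins here, not a real API):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rag_answer(question, documents, embed, generate, k=2):
    """1) embed the question  2) rank documents by similarity
       3) build an augmented prompt  4) generate the answer."""
    q_vec = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q_vec, embed(d)),
                    reverse=True)
    context = "\n".join(ranked[:k])
    prompt = f"Use this context to answer:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

In a real system the ranking happens inside the vector database rather than in application code.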

Used in:

  • Enterprise AI
  • Chatbots with company data
  • Private knowledge assistants

RAG = AI + Memory

Vector Databases — Brain Memory Storage

A vector database stores embeddings as high-dimensional vectors and retrieves them by semantic similarity, acting as the system’s long-term memory.

Popular vector DBs:

  • Pinecone
  • FAISS
  • Milvus
  • Weaviate
  • ChromaDB

Optimized for:
✔ Fast search
✔ Semantic similarity
✔ Scalability with billions of vectors
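At its core, a vector DB answers “which stored vectors are most similar to this query?”. A brute-force sketch by cosine similarity; production systems replace the linear scan with approximate nearest-neighbor indexes (e.g. HNSW or IVF) to reach billion-vector scale:

```python
import heapq
import math

def top_k(query, index, k=3):
    """Brute-force nearest-neighbor search over (id, vector) pairs."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    # Return the k entries whose vectors point most nearly the same way.
    return heapq.nlargest(k, index, key=lambda item: cos(query, item[1]))
```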

➡️ Deep dive coming in Blog #4

RLHF — Teaching AI Human Values

LLMs trained purely on raw internet data tend to be:

  • Raw
  • Unsafe
  • Biased

So humans give feedback:

RLHF = Reinforcement Learning from Human Feedback

Process:
1️⃣ Humans label good vs bad responses
2️⃣ AI learns the preferred behavior
3️⃣ Safer + aligned responses
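Step 1 typically yields preference pairs used to train a reward model with a pairwise (Bradley–Terry) loss: the human-preferred response should score higher than the rejected one. A minimal sketch, assuming scalar reward scores:

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Pairwise preference loss for a reward model: the loss shrinks as
    the reward of the human-preferred response rises above the reward
    of the rejected one."""
    margin = r_chosen - r_rejected
    # -log(sigmoid(margin)): zero margin costs log(2), large margin ~0.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The trained reward model then steers the LLM via reinforcement learning.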

This is how models like ChatGPT were tuned to avoid harmful or incorrect responses.

Optional Modules in Production Systems

  • Agent Orchestration: multi-step automation
  • Tools / API Access: browsing, calculator, code execution
  • Memory Store: personalized user experience
  • Guardrails: safety rules, filtering

These convert LLM → Autonomous AI Agents

Putting It All Together

GenAI is not just a model — it is an ecosystem of components working together to understand, reason, and generate knowledge.

Pipeline summary:

Token → Embedding → Attention → Thinking → RAG Memory → Decoding → RLHF Safety ✔
