LangChain Tutorial: From Fundamentals to Advanced RAG
This tutorial provides a comprehensive overview of LangChain, covering the essential concepts from basic setup to building a sophisticated Retrieval-Augmented Generation (RAG) system.
1. Introduction and Setup
What is LangChain?
LangChain is a framework for developing applications powered by large language models (LLMs). It simplifies every stage of the LLM application lifecycle: development, productionization, and deployment.
Core Concepts:
LangChain Expression Language (LCEL): A declarative way to compose chains, offering features like streaming, batching, and async support. The | (pipe) operator is central to LCEL, allowing you to chain components together (see the short sketch after this list).
Components: LangChain provides modular components for building applications, including:
- Models: Interfaces to various language models (e.g., OpenAI, Anthropic, Google).
- Prompts: Templates for generating prompts for LLMs.
- Output Parsers: Tools to structure the output from LLMs.
Retrieval-Augmented Generation (RAG): A powerful technique to connect LLMs to external data sources, enhancing their knowledge and providing more accurate, up-to-date responses.
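To make the pipe operator concrete before we dive in, here is a minimal sketch of LCEL composition using RunnableLambda from langchain_core (it will run once the packages in the next section are installed). The two toy functions are our own illustrations, not part of any LangChain API:
from langchain_core.runnables import RunnableLambda
# Wrap two plain Python functions as runnables (toy examples)
to_upper = RunnableLambda(lambda text: text.upper())
add_excitement = RunnableLambda(lambda text: text + "!")
# The | operator composes runnables into a single pipeline
pipeline = to_upper | add_excitement
print(pipeline.invoke("hello langchain"))  # -> "HELLO LANGCHAIN!"
Anything that implements the Runnable interface, including prompts, models, and parsers, composes the same way.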
Installation
First, let's install the necessary packages.
# Uncomment to install core LangChain and provider-specific packages
# !pip install langchain langchain-core langchain-community langchain-openai langchain-chroma faiss-cpu pypdf sentence-transformers tiktoken -q
# Uncomment to install environment management and web request libraries
# !pip install python-dotenv requests beautifulsoup4 -q
Environment Setup
Configure your API keys. It's recommended to use environment variables for security.
import os
import getpass
from dotenv import load_dotenv
# Load environment variables from a .env file if it exists
load_dotenv()
# Set up your OpenAI API key
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
print("✅ API keys configured!")
✅ API keys configured!
2. Building Your First Chain with LCEL
A "chain" in LangChain is a sequence of calls to components. We'll use the LangChain Expression Language (LCEL) to build a simple chain.
Initialize the Model
We'll start by initializing a chat model.
from langchain.chat_models import init_chat_model
# Initialize a chat model from OpenAI
model = init_chat_model("gpt-4o-mini", model_provider="openai", temperature=0.7)
print(f"β
Initialized model: {model.__class__.__name__}")
β Initialized model: ChatOpenAI
Work with Prompt Templates
Prompt templates allow you to create reusable and parameterized prompts.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# Create a prompt template
prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {expertise}."),
    ("human", "Explain {topic} in a simple and concise way.")
])
# Create a simple chain using LCEL
# The chain will:
# 1. Take user input for 'expertise' and 'topic'.
# 2. Format the prompt using the prompt_template.
# 3. Pass the formatted prompt to the model.
# 4. Parse the model's output to a string.
simple_chain = prompt_template | model | StrOutputParser()
# Invoke the chain
response = simple_chain.invoke({
    "expertise": "physics",
    "topic": "quantum entanglement"
})
print(response)
Quantum entanglement is a phenomenon in quantum physics where two or more particles become linked in such a way that the state of one particle instantly influences the state of the other, no matter how far apart they are. This means that if you measure one entangled particle and find its property (like spin or polarization), you can immediately know the corresponding property of the other particle, even if it is light-years away. It challenges our classical understanding of separateness and locality, leading to intriguing implications for information and communication.
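Because LCEL chains support batching out of the box, the same chain can process several inputs at once via .batch(). A quick sketch; the inputs here are illustrative:
# Run the chain over multiple inputs in one call
batch_responses = simple_chain.batch([
    {"expertise": "astronomy", "topic": "solar flares"},
    {"expertise": "biology", "topic": "photosynthesis"}
])
for answer in batch_responses:
    print(answer[:80], "...")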
Streaming Responses
For a better user experience, you can stream the model's response as it's generated.
print("π Streaming response:")
for chunk in simple_chain.stream({
"expertise": "culinary arts",
"topic": "the Maillard reaction"
}):
print(chunk, end="", flush=True)
π Streaming response: The Maillard reaction is a chemical process that occurs when proteins and sugars in food are heated together, resulting in browning and the development of complex flavors and aromas. This reaction typically happens at high temperatures during cooking, such as roasting or grilling. Itβs responsible for the delicious crust on bread, the golden color of seared meats, and the rich flavors in many cooked foods. Essentially, it's what makes food taste and look more appealing when cooked.
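LCEL also provides an async counterpart, .astream(), which is useful inside async web servers. A minimal sketch; the helper function name is our own:
import asyncio

async def stream_answer():
    # astream() is the async twin of stream()
    async for chunk in simple_chain.astream({
        "expertise": "physics",
        "topic": "superconductivity"
    }):
        print(chunk, end="", flush=True)

# In a script use asyncio.run(); in a notebook, `await stream_answer()` instead
asyncio.run(stream_answer())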
3. Structured Output and Advanced Chains
LangChain can parse model outputs into structured formats and build more complex, conditional chains.
Pydantic Output Parser
Use Pydantic models to define the desired output structure.
from typing import List
from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
# Define a Pydantic model for structured output
class Recipe(BaseModel):
    name: str = Field(description="The name of the recipe")
    ingredients: List[str] = Field(description="A list of ingredients")
    steps: List[str] = Field(description="The steps to prepare the recipe")
# Create a Pydantic output parser
pydantic_parser = PydanticOutputParser(pydantic_object=Recipe)
# Create a prompt that includes format instructions
structured_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world-class chef. Generate a recipe based on the user's request and format it as requested."),
    ("human", "I want a simple recipe for {dish}.\n\n{format_instructions}")
])
# Create the structured output chain
structured_chain = structured_prompt | model | pydantic_parser
# Invoke the chain
recipe_request = {
    "dish": "scrambled eggs",
    "format_instructions": pydantic_parser.get_format_instructions()
}
recipe_output = structured_chain.invoke(recipe_request)
print(f"Recipe for: {recipe_output.name}")
print("\nIngredients:")
for ingredient in recipe_output.ingredients:
    print(f"- {ingredient}")
Recipe for: Simple Scrambled Eggs

Ingredients:
- 2 large eggs
- Salt to taste
- Pepper to taste
- 1 tablespoon butter
- Fresh herbs (optional, for garnish)
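As an aside, many recent chat models (including OpenAI's) can skip the parser and format instructions entirely via .with_structured_output(), which binds the Pydantic schema directly to the model. A sketch under that assumption:
# Bind the Recipe schema to the model; no format instructions needed
structured_model = model.with_structured_output(Recipe)
recipe = structured_model.invoke("Give me a simple recipe for scrambled eggs.")
print(recipe.name)         # a Recipe instance, not a raw string
print(recipe.ingredients)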
Conditional Chains with RunnableBranch
Create dynamic chains that change their behavior based on input conditions.
from langchain_core.runnables import RunnableBranch
# Define different prompts for different levels
beginner_prompt = ChatPromptTemplate.from_template("Explain {topic} to a complete beginner within 60 words.")
expert_prompt = ChatPromptTemplate.from_template("Provide a detailed, technical explanation of {topic} within 60 words.")
# Create a conditional chain using RunnableBranch
# This chain checks the 'level' input and routes to the appropriate prompt
conditional_chain = (
    RunnableBranch(
        (lambda x: x.get("level") == "expert", expert_prompt),
        beginner_prompt  # Default prompt
    )
    | model.bind(max_tokens=100)
    | StrOutputParser()
)
# Test the beginner path
beginner_response = conditional_chain.invoke({"topic": "black holes", "level": "beginner"})
print("--- Beginner Explanation ---")
print(beginner_response)
# Test the expert path
expert_response = conditional_chain.invoke({"topic": "black holes", "level": "expert"})
print("\n--- Expert Explanation ---")
print(expert_response)
--- Beginner Explanation ---
A black hole is a region in space where gravity is so strong that nothing, not even light, can escape from it. They form when massive stars collapse under their own gravity after exhausting their fuel. Imagine a vacuum cleaner that pulls everything in; that's a black hole, pulling in matter and energy from its surroundings.

--- Expert Explanation ---
Black holes are regions in spacetime where gravity is so intense that nothing, not even light, can escape. Formed from collapsing massive stars, they are characterized by an event horizon: the boundary beyond which escape is impossible. Their mass, charge, and angular momentum define their properties, with singularities at their cores representing points of infinite density and spacetime curvature.
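An equivalent, often more readable alternative to RunnableBranch is a plain routing function wrapped in RunnableLambda; when the function returns a runnable, LangChain invokes it as the next step. A sketch reusing the prompts defined above:
from langchain_core.runnables import RunnableLambda

def route(inputs: dict):
    # Return the prompt runnable that matches the requested level
    if inputs.get("level") == "expert":
        return expert_prompt
    return beginner_prompt

function_chain = RunnableLambda(route) | model.bind(max_tokens=100) | StrOutputParser()
print(function_chain.invoke({"topic": "black holes", "level": "expert"})[:100])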
4. Retrieval-Augmented Generation (RAG)
RAG connects your LLM to external data, allowing it to answer questions about information it wasn't trained on.
Step 1: Document Loading and Splitting
Load data from a source (like a website) and split it into smaller chunks for processing.
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Load documents from a web page
# NOTE: this legacy docs URL now returns a "Page Not Found" stub (see the output
# below); swap in a live page such as
# https://python.langchain.com/docs/concepts/prompt_templates/ for meaningful content.
loader = WebBaseLoader("https://python.langchain.com/docs/modules/model_io/prompts/")
docs = loader.load()
# Initialize a text splitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
# Split the documents into chunks
splits = text_splitter.split_documents(docs)
print(f"Loaded {len(docs)} documents and split them into {len(splits)} chunks.")
# Preview the first few chunks
print("\nπ Preview of chunked text:")
for i, chunk in enumerate(splits[:3]): # Show first 3 chunks
print(f"\n--- Chunk {i+1} ---")
print(f"Content: {chunk.page_content[:200]}...") # Show first 200 characters
print(f"Metadata: {chunk.metadata}")
print(f"Full length: {len(chunk.page_content)} characters")
Loaded 1 documents and split them into 3 chunks.

Preview of chunked text:

--- Chunk 1 ---
Content: Page Not Found | 🦜️🔗 LangChain...
Metadata: {'source': 'https://python.langchain.com/docs/modules/model_io/prompts/', 'title': 'Page Not Found | 🦜️🔗 LangChain', 'language': 'en'}
Full length: 30 characters

--- Chunk 2 ---
Content: Skip to main contentOur Building Ambient Agents with LangGraph course is now available on LangChain Academy!IntegrationsAPI ReferenceMoreContributingPeopleError referenceLangSmithLangGraphLangChain Hu...
Metadata: {'source': 'https://python.langchain.com/docs/modules/model_io/prompts/', 'title': 'Page Not Found | 🦜️🔗 LangChain', 'language': 'en'}
Full length: 427 characters

--- Chunk 3 ---
Content: them know their link is broken.CommunityLangChain ForumTwitterSlackGitHubOrganizationPythonJS/TSMoreHomepageBlogYouTubeCopyright © 2025 LangChain, Inc.
Metadata: {'source': 'https://python.langchain.com/docs/modules/model_io/prompts/', 'title': 'Page Not Found | 🦜️🔗 LangChain', 'language': 'en'}
Full length: 151 characters
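The same load-then-split pipeline works for local files; for example, the pypdf package installed earlier powers PyPDFLoader. A sketch, where the file path is a hypothetical placeholder:
from langchain_community.document_loaders import PyPDFLoader

# "my_document.pdf" is a placeholder -- point this at a real PDF on disk
pdf_loader = PyPDFLoader("my_document.pdf")
pdf_docs = pdf_loader.load()  # one Document per page
pdf_splits = text_splitter.split_documents(pdf_docs)
print(f"Split {len(pdf_docs)} pages into {len(pdf_splits)} chunks.")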
Step 2: Embeddings and Vector Stores
Convert the text chunks into numerical representations (embeddings) and store them in a vector database for efficient searching.
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
import numpy as np
# Initialize OpenAI embeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
# Preview embeddings - let's see what embeddings look like
sample_text = "LangChain is a framework for developing applications powered by language models."
sample_embedding = embeddings.embed_query(sample_text)
print("π Embedding Preview:")
print(f"Sample text: '{sample_text}'")
print(f"Embedding dimension: {len(sample_embedding)}")
print(f"Embedding type: {type(sample_embedding)}")
print(f"First 10 values: {sample_embedding[:10]}")
print(f"Embedding range: [{min(sample_embedding):.4f}, {max(sample_embedding):.4f}]")
print(f"Embedding norm: {np.linalg.norm(sample_embedding):.4f}")
# Create a Chroma vector store from the document splits
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
print(f"\nβ
Vector store created with {len(splits)} document chunks.")
print("Each chunk has been converted to a {}-dimensional embedding vector.".format(len(sample_embedding)))
Embedding Preview:
Sample text: 'LangChain is a framework for developing applications powered by language models.'
Embedding dimension: 1536
Embedding type: <class 'list'>
First 10 values: [-0.03170353174209595, 0.009393854066729546, 0.045744992792606354, -0.013159518130123615, 0.040198035538196564, 0.0005537528777495027, -0.025970900431275368, 0.025228211656212807, -0.03571869060397148, 0.020191853865981102]
Embedding range: [-0.0777, 0.0940]
Embedding norm: 1.0000

✅ Vector store created with 3 document chunks.
Each chunk has been converted to a 1536-dimensional embedding vector.
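Because the embedding norm is 1.0, these vectors are unit-normalized, so a plain dot product equals cosine similarity. The following sketch (with our own sample sentences) shows how semantic relatedness surfaces numerically:
# Dot product of unit vectors = cosine similarity
related = embeddings.embed_query("LangChain helps you build LLM-powered apps.")
unrelated = embeddings.embed_query("The weather in Paris is mild in spring.")

print(f"Similar pair:   {np.dot(sample_embedding, related):.4f}")
print(f"Unrelated pair: {np.dot(sample_embedding, unrelated):.4f}")
You should see a noticeably higher score for the semantically related pair; this is exactly the comparison the vector store performs at query time.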
Step 3: Creating a Retriever
A retriever is responsible for finding the most relevant document chunks based on a user's query.
# Create a retriever from the vector store
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
# Test the retriever
query = "What are prompt templates?"
retrieved_docs = retriever.invoke(query)
print(f"Retrieved {len(retrieved_docs)} documents for the query: '{query}'")
Retrieved 3 documents for the query: 'What are prompt templates?'
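To inspect why those documents were chosen, Chroma can also return raw scores via similarity_search_with_score(); note that Chroma reports a distance, so lower means more similar:
# Retrieve documents together with their distance scores
scored = vectorstore.similarity_search_with_score(query, k=3)
for doc, score in scored:
    print(f"distance={score:.4f} | {doc.page_content[:60]}...")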
Step 4: Building the RAG Chain
Now, let's combine the retriever with a prompt and the LLM to create a complete RAG chain.
from langchain_core.runnables import RunnablePassthrough
# Define a RAG prompt template
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Use the following context to answer the user's question.\n\nContext:\n{context}"),
    ("human", "{question}")
])
# Helper function to format the retrieved documents
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
# Create the RAG chain
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | model
    | StrOutputParser()
)
# Test the RAG chain
rag_question = "How can I use few-shot examples in my prompts?"
rag_answer = rag_chain.invoke(rag_question)
print(f"\nβ Question: {rag_question}")
print(f"π― Answer: {rag_answer}")
❓ Question: How can I use few-shot examples in my prompts?
🎯 Answer: Few-shot examples can be effectively used in your prompts to guide the model's responses by providing context and specific examples of the desired output. Here's how you can do it:

1. **Define the Task Clearly**: Start by clearly stating what you want the model to do. This sets the stage for the examples you will provide.
2. **Provide Examples**: Include a few examples that illustrate the input-output pairs. Each example should show a clear relationship between the input and the expected output. Typically, 2-5 examples are sufficient for few-shot prompting.
3. **Use a Consistent Format**: Make sure the format of your examples is consistent. This helps the model understand the pattern it should follow.
4. **End with a New Input**: After providing the examples, include the new input you want the model to respond to. This should be in the same format as the examples.
5. **Keep It Concise**: While examples are important, keep your prompt concise to avoid overwhelming the model with too much information.

### Example Prompt Structure

Task: Translate the following English sentences into Spanish.

Example 1:
Input: "Hello, how are you?"
Output: "Hola, ¿cómo estás?"

Example 2:
Input: "What is your name?"
Output: "¿Cuál es tu nombre?"

Example 3:
Input: "I love programming."
Output: "Me encanta programar."

Now, translate this sentence:
Input: "Where is the nearest restaurant?"
Output:

In this structure, you clearly define the task, provide a few relevant examples, and then present a new input for translation, guiding the model towards producing the expected output.
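In practice you often want the answer together with the source documents that produced it. A documented LCEL pattern for this combines RunnableParallel with .assign(); a sketch building on the pieces defined above:
from langchain_core.runnables import RunnableParallel

# Sub-chain that formats the retrieved docs before answering
answer_chain = (
    RunnablePassthrough.assign(context=lambda x: format_docs(x["context"]))
    | rag_prompt
    | model
    | StrOutputParser()
)

# Keep the raw retrieved docs alongside the generated answer
rag_with_sources = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
).assign(answer=answer_chain)

result = rag_with_sources.invoke("What are prompt templates?")
print(result["answer"][:200])
print("Sources:", [doc.metadata.get("source") for doc in result["context"]])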
Summary: What You've Learned
🎯 Core Concepts Mastered
LangChain Expression Language (LCEL)
- Built chains using the powerful | (pipe) operator
- Composed multiple components into seamless workflows
- Learned declarative programming for LLM applications
Essential Components
- Models: Initialized and configured chat models with specific parameters
- Prompts: Created reusable templates with dynamic variables
- Output Parsers: Structured LLM outputs into usable formats (strings, Pydantic models)
- Retrievers: Built document search systems for external knowledge
🛠️ Practical Skills Developed
Chain Building Techniques
- Simple linear chains for basic text generation
- Structured output chains with Pydantic validation
- Conditional chains that adapt behavior based on input
- RAG chains that combine retrieval with generation
Document Processing Pipeline
- Web scraping and content extraction
- Text chunking strategies for optimal retrieval
- Vector embeddings and similarity search
- Knowledge base creation and querying
Remember: Every complex AI application starts with the simple patterns you've learned today. Keep building, keep learning, and keep pushing the boundaries of what's possible with LangChain! 🚀