LangChain Foundations
Models: LLMs, Chat Models & Embeddings
LangChain wraps model APIs in a uniform interface so you can swap providers without changing your pipeline logic. There are three model types: text-completion LLMs (the older string-in, string-out interface), chat models (message-in, message-out, the default for modern providers), and embedding models (text in, vector out, used for retrieval):
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_anthropic import ChatAnthropic
# Chat model — most common
gpt4o = ChatOpenAI(model="gpt-4o", temperature=0.7)
claude = ChatAnthropic(model="claude-sonnet-4-6", temperature=0)
# Embeddings — for RAG pipelines (covered in Module 02)
embedder = OpenAIEmbeddings(model="text-embedding-3-small")
# Direct invocation (without a chain)
from langchain_core.messages import HumanMessage, SystemMessage
response = gpt4o.invoke([
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?"),
])
print(response.content) # → Paris
print(response.usage_metadata) # → {'input_tokens': 22, 'output_tokens': 1, ...}
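Because every chat model implements the same interface, the same message list runs unchanged against another provider; only the model object differs:
# Same messages, different provider: no pipeline changes needed
alt = claude.invoke([
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?"),
])
print(alt.content)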
Prompt Templates
Hardcoding prompts as inline strings makes them hard to test, version, or reuse. ChatPromptTemplate gives you typed, parameterised templates.
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# Basic template with variables
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert in {domain}. Be concise."),
    ("human", "{question}"),
])
# Format and inspect before sending to the model
formatted = prompt.format_messages(
    domain="distributed systems",
    question="Explain CAP theorem in one paragraph.",
)
print(formatted[0].content) # → "You are an expert in distributed systems..."
# Template with conversation history placeholder (for multi-turn)
chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),  # injected at runtime
    ("human", "{input}"),
])
# Partial templates — pre-fill some variables, leave others open
base_prompt = prompt.partial(domain="machine learning")
# Now only {question} needs to be supplied at invocation time
result = base_prompt.invoke({"question": "What is gradient descent?"})
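To see what MessagesPlaceholder does at runtime, format chat_prompt with a hand-built history; the prior turns are spliced in between the system message and the new human message:
from langchain_core.messages import AIMessage, HumanMessage

msgs = chat_prompt.format_messages(
    history=[
        HumanMessage(content="My name is Alice."),
        AIMessage(content="Nice to meet you, Alice!"),
    ],
    input="What's my name?",
)
print([m.type for m in msgs])  # → ['system', 'human', 'ai', 'human']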
LCEL — The LangChain Expression Language
LCEL is LangChain's composition system. The | operator chains Runnables together so the output of each step becomes the input of the next. Every component — prompts, models, parsers, retrievers — implements the Runnable interface.
Input dict {"question": "...", "domain": "..."}
                │
                ▼
ChatPromptTemplate  →  ChatPromptValue (list of messages)
                │
                ▼
ChatOpenAI          →  AIMessage(content="...")
                │
                ▼
StrOutputParser     →  str
                │
                ▼
Final output: "..."
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from pydantic import BaseModel, Field
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = StrOutputParser()
# Simple text chain
text_chain = (
    ChatPromptTemplate.from_messages([
        ("system", "Summarise the following text in one sentence."),
        ("human", "{text}"),
    ])
    | model
    | parser
)
result = text_chain.invoke({"text": "LangChain is an open-source framework..."})
print(result)
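Because the composed chain is itself a Runnable, it also exposes batch and stream alongside invoke, with no extra code:
# Process several inputs concurrently
summaries = text_chain.batch([
    {"text": "First document..."},
    {"text": "Second document..."},
])

# Stream the reply token by token instead of waiting for the full string
for chunk in text_chain.stream({"text": "LangChain is an open-source framework..."}):
    print(chunk, end="", flush=True)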
# ── Structured / JSON output ──
class Sentiment(BaseModel):
    label: str = Field(description="positive, negative, or neutral")
    score: float = Field(description="confidence 0.0-1.0")
    reason: str = Field(description="one-sentence justification")
json_parser = JsonOutputParser(pydantic_object=Sentiment)
sentiment_chain = (
    ChatPromptTemplate.from_messages([
        ("system", "Classify the sentiment. Respond in JSON.\n{format_instructions}"),
        ("human", "{review}"),
    ]).partial(format_instructions=json_parser.get_format_instructions())
    | model
    | json_parser
)
out = sentiment_chain.invoke({"review": "The product is excellent, highly recommend!"})
print(out) # → {'label': 'positive', 'score': 0.98, 'reason': '...'}
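Prompt-injected format instructions are one approach; most chat models also support with_structured_output, which uses the provider's native structured-output (tool-calling) support and returns a validated Sentiment instance instead of a plain dict:
# Alternative: let the provider enforce the schema directly
structured_model = model.with_structured_output(Sentiment)
result = structured_model.invoke("The product is excellent, highly recommend!")
print(result.label, result.score)  # illustrative: positive 0.98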
The most common LCEL bug is a type mismatch between steps. A ChatPromptTemplate outputs a ChatPromptValue, which a ChatModel accepts. A ChatModel outputs an AIMessage, which StrOutputParser unwraps to a str. If you're getting AttributeError, inspect what the previous step actually returns by calling step.invoke(input) in isolation.
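For example, you can check the boundaries of sentiment_chain stage by stage (a minimal sketch reusing the model and parser defined above):
debug_prompt = ChatPromptTemplate.from_messages([("human", "{review}")])
pv = debug_prompt.invoke({"review": "Great product!"})
print(type(pv).__name__)                  # → ChatPromptValue
msg = model.invoke(pv)
print(type(msg).__name__)                 # → AIMessage
print(type(parser.invoke(msg)).__name__)  # → str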
Runnables: Parallel, Passthrough, Lambda
Beyond the basic pipe, LCEL provides composable primitives for branching, merging, and transforming data mid-chain.
from langchain_core.runnables import (
    RunnableParallel,
    RunnablePassthrough,
    RunnableLambda,
)
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
parser = StrOutputParser()
# ── RunnableParallel: run two chains simultaneously, merge results ──
summary_chain = (
    ChatPromptTemplate.from_template("Summarise: {text}") | model | parser
)
keywords_chain = (
    ChatPromptTemplate.from_template("List 5 keywords from: {text}") | model | parser
)
parallel = RunnableParallel(
    summary=summary_chain,
    keywords=keywords_chain,
)
# Both chains run concurrently; output is {"summary": "...", "keywords": "..."}
result = parallel.invoke({"text": "LangGraph is a library for building..."})
# ── RunnablePassthrough: pass input through unchanged ──
# Useful in RAG to carry the original question alongside retrieved docs
rag_prep = RunnableParallel(
    context=some_retriever,          # a retriever Runnable (see Module 02)
    question=RunnablePassthrough(),  # question passes through untouched
)
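some_retriever above is a stand-in for a real retriever (covered in Module 02). To see the output shape without a vector store, you can substitute a stub built from RunnableLambda (introduced just below); fake_retrieve here is purely illustrative:
# Hypothetical stub standing in for a real retriever
fake_retrieve = RunnableLambda(lambda q: ["doc about LCEL", "doc about Runnables"])
demo_prep = RunnableParallel(
    context=fake_retrieve,
    question=RunnablePassthrough(),
)
print(demo_prep.invoke("What is LCEL?"))
# → {'context': ['doc about LCEL', 'doc about Runnables'], 'question': 'What is LCEL?'}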
# ── RunnableLambda: wrap any Python function as a Runnable ──
def word_count(text: str) -> dict:
    return {"text": text, "word_count": len(text.split())}
counting_chain = RunnableLambda(word_count)
print(counting_chain.invoke("Hello world foo bar"))
# → {"text": "Hello world foo bar", "word_count": 4}
Memory & Message History
LLMs are stateless by default — they don't remember previous turns. You must explicitly pass conversation history. LangChain's RunnableWithMessageHistory handles this pattern cleanly.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
model = ChatOpenAI(model="gpt-4o-mini", temperature=0.5)
parser = StrOutputParser()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Answer concisely."),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])
chain = prompt | model | parser
# In-memory store keyed by session_id
store: dict = {}
def get_session_history(session_id: str) -> ChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]
# Wrap the chain so it automatically reads/writes history
chat = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)
config = {"configurable": {"session_id": "user-alice"}}
# Turn 1
r1 = chat.invoke({"input": "My name is Alice."}, config=config)
print(r1) # → "Nice to meet you, Alice!"
# Turn 2 — the model remembers the name
r2 = chat.invoke({"input": "What's my name?"}, config=config)
print(r2) # → "Your name is Alice."
The ChatMessageHistory above is in-memory and lost on restart. For production, swap it for RedisChatMessageHistory or PostgresChatMessageHistory from langchain_community — same API, durable backend.
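A minimal sketch of that swap, assuming a Redis instance reachable at redis://localhost:6379: only the history factory changes, and the RunnableWithMessageHistory wrapper stays exactly as above.
from langchain_community.chat_message_histories import RedisChatMessageHistory

def get_session_history(session_id: str) -> RedisChatMessageHistory:
    # Messages now survive restarts; the URL is an assumption for this sketch
    return RedisChatMessageHistory(session_id=session_id, url="redis://localhost:6379")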
Tools & Tool Calling
Tool calling lets the model decide at runtime to call a function you've defined and inject the result back into the conversation. This is the bridge between a passive LLM and an active agent.
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain_core.messages import HumanMessage, ToolMessage
import json
import httpx
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# ── Define tools with @tool decorator ──
@tool
def get_exchange_rate(base: str, target: str) -> dict:
    """Fetch the current exchange rate between two currencies.

    Args:
        base: ISO 4217 base currency code, e.g. 'USD'
        target: ISO 4217 target currency code, e.g. 'SGD'
    """
    url = f"https://open.er-api.com/v6/latest/{base}"
    data = httpx.get(url).json()
    rate = data["rates"].get(target)
    return {"base": base, "target": target, "rate": rate}
@tool
def calculate(expression: str) -> float:
    """Evaluate a safe mathematical expression, e.g. '42 * 1.35'."""
    # NOTE: use a proper safe-eval library (e.g. simpleeval) in production
    allowed = set("0123456789+-*/(). ")
    if not all(c in allowed for c in expression):
        raise ValueError(f"Unsafe expression: {expression}")
    return eval(expression)  # noqa: S307
# ── Bind tools to the model ──
tools = [get_exchange_rate, calculate]
model_with_tools = model.bind_tools(tools)
# ── Agentic loop: invoke → handle tool calls → re-invoke ──
messages = [HumanMessage("How many SGD is 250 USD? Show your calculation.")]
response = model_with_tools.invoke(messages)
messages.append(response)
# Process any tool calls the model requested
tool_map = {t.name: t for t in tools}
for tc in response.tool_calls:
    result = tool_map[tc["name"]].invoke(tc["args"])
    messages.append(ToolMessage(
        content=json.dumps(result),
        tool_call_id=tc["id"],
    ))
# Final answer after tool results are injected
final = model_with_tools.invoke(messages)
print(final.content)
# → "250 USD is approximately 337.50 SGD (exchange rate: 1.35)."
The model decides when to call a tool based on its docstring and argument descriptions. Write clear, specific docstrings. Include the format of expected arguments (e.g. "ISO 4217 code like 'USD'"). Vague docstrings lead to incorrect or hallucinated tool calls.
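You can inspect exactly what the model will see by printing the schema the @tool decorator derived from the function signature and docstring:
print(get_exchange_rate.name)         # → get_exchange_rate
print(get_exchange_rate.description)  # → "Fetch the current exchange rate..."
print(get_exchange_rate.args)         # → {'base': {...}, 'target': {...}}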