Spring AI & RAG Development

Learn to build production-grade Retrieval-Augmented Generation systems using Spring AI, illustrated with the Power RAG reference project.

Prerequisites

Java 17+ (course uses Java 25)
Spring Boot basics — understanding of beans, auto-configuration, and REST controllers
Docker basics — running containers and using Docker Compose
Familiarity with Maven and pom.xml

Module 1 — Foundations

#	Topic	Description	Time
01	What is RAG?	Definition, the problem RAG solves, the 3-step loop, and the 9-stage Power RAG pipeline	8 min
02	Spring AI Overview	Key abstractions, the portable AI layer, BOM pattern, and "write once, run anywhere"	6 min
03	Project Setup	BOM import, provider starters, application.yml configuration for all models	10 min

Module 2 — LLM Integration

#	Topic	Description	Time
04	ChatClient Basics	Three call styles, system prompts, defaultSystem(), and how Power RAG calls the LLM	8 min
05	Multiple LLM Providers	@Qualifier pattern, SpringAiConfig.java, all registered beans and their purpose	10 min
06	Dynamic Model Routing	Runtime model selection, resolveClient(), per-request ChatOptions override	8 min
07	System Prompts & Templates	System prompt patterns, MultilingualPromptBuilder, context-aware prompt construction	10 min

Module 3 — Document Ingestion

#	Topic	Description	Time
08	Document Parsing	DocumentParser interface, all 6 parser types, Spring auto-injection of parser list	9 min
09	Chunking Strategies	Why chunk, trade-offs, sliding window with overlap, visual diagram	9 min
10	Embeddings	Dense vector semantics, gemini-embedding-001, 768 dimensions, the embedding pipeline	7 min
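
As a preview of topic 09, the sliding-window-with-overlap idea can be sketched in a few lines of Java. This is a minimal character-based illustration, not the Power RAG chunker: the class name `SlidingWindowChunker`, the `chunk` method, and the `size` and `overlap` parameters are all hypothetical, and a real chunker would more likely split on tokens or sentence boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: character-based sliding window with overlap.
final class SlidingWindowChunker {

    // Splits text into windows of at most `size` characters, where each
    // window starts `size - overlap` characters after the previous one.
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap; // how far the window advances each iteration
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }
}
```

For example, `SlidingWindowChunker.chunk("abcdefghij", 4, 2)` returns `[abcd, cdef, efgh, ghij]`: each chunk repeats the last two characters of the previous one, so content that straddles a boundary still appears whole in at least one chunk.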

Module 4 — Retrieval

#	Topic	Description	Time
11	Vector Stores	VectorStore interface, Qdrant setup, adding documents, Docker Compose config	8 min
12	Similarity Search	Cosine similarity, SearchRequest API, topK strategy, semantic search explained	7 min
13	Keyword Search (FTS)	PostgreSQL tsvector, GIN index, Spring Data FTS query, when exact matching wins	9 min
14	Reciprocal Rank Fusion	RRF formula, k=60 rationale, full rrfMerge() implementation, worked example	12 min
15	Context Assembly	Formatting chunks as [SOURCE N] blocks, 24k char cap, source citation extraction	8 min
16	Confidence Scoring	RRF-score-based confidence, 0.1 threshold, graceful fallback to general knowledge	6 min
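
As a preview of topic 14, Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranked list in which it appears, with ranks starting at 1. The sketch below assumes documents are identified by plain string IDs and uses the k = 60 constant named in the outline. It illustrates the general formula only and is not the course's `rrfMerge()` implementation; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of Reciprocal Rank Fusion over lists of document IDs.
final class RrfSketch {

    static final int K = 60; // smoothing constant, as listed in topic 14

    // Sums 1 / (k + rank) for each document across all ranked lists (rank is 1-based).
    static Map<String, Double> scores(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                int rank = i + 1;
                scores.merge(ranked.get(i), 1.0 / (k + rank), Double::sum);
            }
        }
        return scores;
    }

    // Returns document IDs ordered by descending fused score.
    static List<String> merge(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = scores(rankedLists, k);
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return ids;
    }
}
```

With a vector-search ranking of `[a, b, c]` and a keyword ranking of `[b, d, a]`, `RrfSketch.merge(List.of(vector, keyword), RrfSketch.K)` returns `[b, a, d, c]`: `b` scores 1/62 + 1/61, which edges out `a` at 1/61 + 1/63, and both outrank the documents that appear in only one list.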

Module 5 — The Full Pipeline

#	Topic	Description	Time
17	The Full RAG Pipeline	All 9 stages of RagService.query() from guardrail to audit log	15 min
18	Semantic Caching	Redis vector search, 0.92 threshold, 24h TTL, NoOp for tests	10 min
19	Guardrails	Gemini 2.5 Flash input safety, PII regex output detection, fail-open design	11 min
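
As a preview of topic 18, a semantic cache treats a new query as a hit when its embedding is close enough to the embedding of a previously answered query. The sketch below assumes the 0.92 threshold is applied to cosine similarity and compares two in-memory vectors directly. In the course this lookup is delegated to Redis vector search, so the class and method names here are hypothetical and only the threshold logic is illustrated.

```java
// Hypothetical in-memory sketch of a semantic-cache threshold check.
final class SemanticCacheSketch {

    static final double THRESHOLD = 0.92; // assumed to be a cosine-similarity cutoff

    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // A cached answer is reused only when the two embeddings are similar enough.
    static boolean isCacheHit(double[] queryEmbedding, double[] cachedEmbedding) {
        return cosine(queryEmbedding, cachedEmbedding) >= THRESHOLD;
    }
}
```

A higher threshold makes the cache more conservative: fewer hits, but less risk of returning an answer to a question that only resembles the cached one.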

Module 6 — Advanced Features

#	Topic	Description	Time
20	Multimodal Image Input	Spring AI Media API, provider-specific conversion, Base64 image decoding	9 min
21	Image Generation	Intent detection, Imagen 3 via Google GenAI SDK, Gemini Flash fallback	10 min
22	Text-to-SQL	Schema introspection, SQL prompt engineering, JSqlParser validation, execution	12 min

Module 7 — Production Concerns

#	Topic	Description	Time
23	Security & JWT	JwtAuthFilter, SecurityConfig filter chain, role-based access, stateless tokens	10 min
24	Database Migrations	Flyway versioned migrations, naming convention, idempotent patterns, production rules	8 min

Module 8 — Quality & Ops

#	Topic	Description	Time
25	Testing Spring AI Apps	Mocking ChatClient, NoOp cache, Testcontainers for Postgres and Qdrant	12 min
26	Observability & Logging	SLF4J log points, log levels, JaCoCo coverage thresholds, Actuator endpoints	8 min

Module 9 — Local AI Infrastructure

#	Topic	Description	Time
27	Local Open-Source LLMs	Model selection for each pipeline role, hardware requirements by tier, GPU vs Apple Silicon vs CPU, recommended configurations	18 min

Module 10 — Agentic RAG & Live Data

#	Topic	Description	Time
28	Model Context Protocol	What MCP is, STDIO vs HTTP/SSE transports, Spring AI MCP client, tool discovery endpoint, and how tools attach to the RAG pipeline	20 min
29	MCP Tool Servers	Python powerrag-tools server: web fetch, time/weather, Jira, GitHub, GCP Logging; email binary server; credential management and SSRF considerations	25 min
30	Intent Routing	QueryIntentClassifier: fast LLM router vs pattern-matching fallback, QueryIntent record, per-query KB/tool activation, provider-aware options	20 min
31	Tool Observability & Audit	ObservingToolCallback decorator, McpInvocationRecorder thread-local buffer, JSONB audit column, API response surface, frontend amber badge, confidence adjustment	18 min