Spring AI & RAG Development
Learn to build production-grade Retrieval-Augmented Generation systems using Spring AI, illustrated with the Power RAG reference project.
Prerequisites
- Java 17+ (course uses Java 25)
- Spring Boot basics — understanding of beans, auto-configuration, and REST controllers
- Docker basics — running containers and using Docker Compose
- Familiarity with Maven and pom.xml
Module 1 — Foundations
| # | Topic | Description | Time |
|---|---|---|---|
| 01 | What is RAG? | Definition, the problem RAG solves, the 3-step loop, and the 9-stage Power RAG pipeline | 8 min |
| 02 | Spring AI Overview | Key abstractions, the portable AI layer, BOM pattern, and "write once, run anywhere" | 6 min |
| 03 | Project Setup | BOM import, provider starters, application.yml configuration for all models | 10 min |
Module 2 — LLM Integration
| # | Topic | Description | Time |
|---|---|---|---|
| 04 | ChatClient Basics | Three call styles, system prompts, defaultSystem(), and how Power RAG calls the LLM | 8 min |
| 05 | Multiple LLM Providers | @Qualifier pattern, SpringAiConfig.java, all registered beans and their purpose | 10 min |
| 06 | Dynamic Model Routing | Runtime model selection, resolveClient(), per-request ChatOptions override | 8 min |
| 07 | System Prompts & Templates | System prompt patterns, MultilingualPromptBuilder, context-aware prompt construction | 10 min |
Module 3 — Document Ingestion
| # | Topic | Description | Time |
|---|---|---|---|
| 08 | Document Parsing | DocumentParser interface, all 6 parser types, Spring auto-injection of parser list | 9 min |
| 09 | Chunking Strategies | Why chunk, trade-offs, sliding window with overlap, visual diagram | 9 min |
| 10 | Embeddings | Dense vector semantics, gemini-embedding-001, 768 dimensions, the embedding pipeline | 7 min |
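
As a preview of lesson 09, here is a minimal sketch of sliding-window chunking with overlap. It is an illustration under stated assumptions, not the Power RAG implementation: the class name `SlidingWindowChunker`, the character-based window, and the `size` and `overlap` parameters are all hypothetical, and a real chunker would more likely split on tokens or sentence boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Power RAG or Spring AI.
final class SlidingWindowChunker {

    private SlidingWindowChunker() {}

    /**
     * Splits text into windows of at most `size` characters.
     * Consecutive windows share `overlap` characters, so a sentence cut at a
     * window boundary still appears whole in one of the neighbouring chunks.
     */
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap; // how far the window advances each iteration
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Window of 4 characters, 2 of which repeat in the next chunk.
        System.out.println(chunk("abcdefghij", 4, 2)); // prints [abcd, cdef, efgh, ghij]
    }
}
```

Larger overlap preserves more context across boundaries at the cost of more chunks to embed and store, which is the trade-off the lesson discusses.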
Module 4 — Retrieval
| # | Topic | Description | Time |
|---|---|---|---|
| 11 | Vector Stores | VectorStore interface, Qdrant setup, adding documents, Docker Compose config | 8 min |
| 12 | Similarity Search | Cosine similarity, SearchRequest API, topK strategy, semantic search explained | 7 min |
| 13 | Keyword Search (FTS) | PostgreSQL tsvector, GIN index, Spring Data FTS query, when exact matching wins | 9 min |
| 14 | Reciprocal Rank Fusion | RRF formula, k=60 rationale, full rrfMerge() implementation, worked example | 12 min |
| 15 | Context Assembly | Formatting chunks as [SOURCE N] blocks, 24k char cap, source citation extraction | 8 min |
| 16 | Confidence Scoring | RRF-score-based confidence, 0.1 threshold, graceful fallback to general knowledge | 6 min |
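
As a preview of lesson 14, Reciprocal Rank Fusion scores each document as the sum of 1 / (k + rank) over every ranked list in which it appears, with ranks starting at 1. The sketch below applies that formula with k = 60 to two lists of document IDs. The class `RrfSketch` and its methods are hypothetical names for illustration; this is not the course's rrfMerge() implementation, which may differ in types and tie handling.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of RRF; not the Power RAG rrfMerge() code.
final class RrfSketch {

    static final int K = 60; // smoothing constant named in the lesson description

    private RrfSketch() {}

    /** Sums 1 / (k + rank) for every appearance of a document ID (rank is 1-based). */
    static Map<String, Double> scores(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                int rank = i + 1;
                scores.merge(ranked.get(i), 1.0 / (k + rank), Double::sum);
            }
        }
        return scores;
    }

    /** Returns document IDs ordered by descending fused score. */
    static List<String> fuse(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = scores(rankedLists, k);
        List<String> ids = new ArrayList<>(scores.keySet());
        // List.sort is stable, so ties keep first-seen order.
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return ids;
    }

    public static void main(String[] args) {
        List<String> vectorHits = List.of("a", "b", "c");  // e.g. similarity search results
        List<String> keywordHits = List.of("b", "d", "a"); // e.g. full-text search results
        System.out.println(fuse(List.of(vectorHits, keywordHits), K)); // prints [b, a, d, c]
    }
}
```

In this example "b" wins because it ranks second and first (1/62 + 1/61), narrowly ahead of "a" at first and third (1/61 + 1/63). Documents found by only one retriever still appear, but below those that both retrievers agree on.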
Module 5 — The Full Pipeline
| # | Topic | Description | Time |
|---|---|---|---|
| 17 | The Full RAG Pipeline | All 9 stages of RagService.query() from guardrail to audit log | 15 min |
| 18 | Semantic Caching | Redis vector search, 0.92 threshold, 24h TTL, NoOp for tests | 10 min |
| 19 | Guardrails | Gemini 2.5 Flash input safety, PII regex output detection, fail-open design | 11 min |
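
As a preview of lesson 18, the idea behind a semantic cache is to reuse a stored answer when a new query's embedding is close enough to a previously answered one. The sketch below shows only the threshold check, using cosine similarity and the 0.92 cut-off named above. It swaps the Redis vector search for an in-memory linear scan and omits the 24h TTL; `InMemorySemanticCache` and its methods are hypothetical names, not Power RAG or Spring AI APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory stand-in for a vector-search-backed cache (no TTL, no persistence).
final class InMemorySemanticCache {

    private record Entry(float[] embedding, String answer) {}

    private final List<Entry> entries = new ArrayList<>();
    private final double threshold;

    InMemorySemanticCache(double threshold) {
        this.threshold = threshold;
    }

    void put(float[] embedding, String answer) {
        entries.add(new Entry(embedding.clone(), answer));
    }

    /** Returns the answer of the most similar entry whose similarity is at least the threshold. */
    Optional<String> lookup(float[] queryEmbedding) {
        Entry best = null;
        double bestScore = threshold;
        for (Entry e : entries) {
            double score = cosine(queryEmbedding, e.embedding());
            if (score >= bestScore) {
                best = e;
                bestScore = score;
            }
        }
        return Optional.ofNullable(best).map(Entry::answer);
    }

    /** Cosine similarity: dot(a, b) / (|a| * |b|); defined as 0 for a zero vector. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        InMemorySemanticCache cache = new InMemorySemanticCache(0.92);
        cache.put(new float[] {1f, 0f}, "cached answer");
        System.out.println(cache.lookup(new float[] {1f, 0.1f}).isPresent()); // near-duplicate query: hit
        System.out.println(cache.lookup(new float[] {1f, 1f}).isPresent());   // similarity about 0.71: miss
    }
}
```

A high threshold such as 0.92 trades hit rate for safety: only near-paraphrases reuse an answer, which reduces the risk of serving a cached response to a question that merely sounds similar.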
Module 6 — Advanced Features
| # | Topic | Description | Time |
|---|---|---|---|
| 20 | Multimodal Image Input | Spring AI Media API, provider-specific conversion, Base64 image decoding | 9 min |
| 21 | Image Generation | Intent detection, Imagen 3 via Google GenAI SDK, Gemini Flash fallback | 10 min |
| 22 | Text-to-SQL | Schema introspection, SQL prompt engineering, JSqlParser validation, execution | 12 min |
Module 7 — Production Concerns
| # | Topic | Description | Time |
|---|---|---|---|
| 23 | Security & JWT | JwtAuthFilter, SecurityConfig filter chain, role-based access, stateless tokens | 10 min |
| 24 | Database Migrations | Flyway versioned migrations, naming convention, idempotent patterns, production rules | 8 min |
Module 8 — Quality & Ops
| # | Topic | Description | Time |
|---|---|---|---|
| 25 | Testing Spring AI Apps | Mocking ChatClient, NoOp cache, Testcontainers for Postgres and Qdrant | 12 min |
| 26 | Observability & Logging | SLF4J log points, log levels, JaCoCo coverage thresholds, Actuator endpoints | 8 min |
Module 9 — Local AI Infrastructure
| # | Topic | Description | Time |
|---|---|---|---|
| 27 | Local Open-Source LLMs | Model selection for each pipeline role, hardware requirements by tier, GPU vs Apple Silicon vs CPU, recommended configurations | 18 min |
Module 10 — Agentic RAG & Live Data
| # | Topic | Description | Time |
|---|---|---|---|
| 28 | Model Context Protocol | What MCP is, STDIO vs HTTP/SSE transports, Spring AI MCP client, tool discovery endpoint, and how tools attach to the RAG pipeline | 20 min |
| 29 | MCP Tool Servers | Python powerrag-tools server: web fetch, time/weather, Jira, GitHub, GCP Logging; email binary server; credential management and SSRF considerations | 25 min |
| 30 | Intent Routing | QueryIntentClassifier: fast LLM router vs pattern-matching fallback, QueryIntent record, per-query KB/tool activation, provider-aware options | 20 min |
| 31 | Tool Observability & Audit | ObservingToolCallback decorator, McpInvocationRecorder thread-local buffer, JSONB audit column, API response surface, frontend amber badge, confidence adjustment | 18 min |