Spring AI & RAG Development

Learn to build production-grade Retrieval-Augmented Generation systems using Spring AI, illustrated with the Power RAG reference project.


Prerequisites

  • Java 17+ (course uses Java 25)
  • Spring Boot basics — understanding of beans, auto-configuration, and REST controllers
  • Docker basics — running containers and using Docker Compose
  • Familiarity with Maven and pom.xml
Module 1 — Foundations
# | Topic | Description | Time
01 | What is RAG? | Definition, the problem RAG solves, the 3-step loop, and the 9-stage Power RAG pipeline | 8 min
02 | Spring AI Overview | Key abstractions, the portable AI layer, BOM pattern, and "write once, run anywhere" | 6 min
03 | Project Setup | BOM import, provider starters, application.yml configuration for all models | 10 min
Module 2 — LLM Integration
# | Topic | Description | Time
04 | ChatClient Basics | Three call styles, system prompts, defaultSystem(), and how Power RAG calls the LLM | 8 min
05 | Multiple LLM Providers | @Qualifier pattern, SpringAiConfig.java, all registered beans and their purpose | 10 min
06 | Dynamic Model Routing | Runtime model selection, resolveClient(), per-request ChatOptions override | 8 min
07 | System Prompts & Templates | System prompt patterns, MultilingualPromptBuilder, context-aware prompt construction | 10 min
Module 3 — Document Ingestion
# | Topic | Description | Time
08 | Document Parsing | DocumentParser interface, all 6 parser types, Spring auto-injection of parser list | 9 min
09 | Chunking Strategies | Why chunk, trade-offs, sliding window with overlap, visual diagram | 9 min
10 | Embeddings | Dense vector semantics, gemini-embedding-001, 768 dimensions, the embedding pipeline | 7 min
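
As a preview of the sliding-window idea from topic 09, here is a minimal sketch of character-based chunking with overlap. The class name, method signature, and the use of character counts (rather than tokens) are assumptions made for illustration; the Power RAG project may chunk differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from the Power RAG code base.
class ChunkingSketch {

    /**
     * Splits text into windows of at most `size` characters, where each
     * window starts `size - overlap` characters after the previous one,
     * so neighbouring chunks share `overlap` characters.
     */
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // prints [abcd, cdef, efgh, ghij]
        System.out.println(chunk("abcdefghij", 4, 2));
    }
}
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, at the cost of storing and embedding some text twice.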
Module 4 — Retrieval
# | Topic | Description | Time
11 | Vector Stores | VectorStore interface, Qdrant setup, adding documents, Docker Compose config | 8 min
12 | Similarity Search | Cosine similarity, SearchRequest API, topK strategy, semantic search explained | 7 min
13 | Keyword Search (FTS) | PostgreSQL tsvector, GIN index, Spring Data FTS query, when exact matching wins | 9 min
14 | Reciprocal Rank Fusion | RRF formula, k=60 rationale, full rrfMerge() implementation, worked example | 12 min
15 | Context Assembly | Formatting chunks as [SOURCE N] blocks, 24k char cap, source citation extraction | 8 min
16 | Confidence Scoring | RRF-score-based confidence, 0.1 threshold, graceful fallback to general knowledge | 6 min
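
As a preview of topic 14, the standard Reciprocal Rank Fusion score for a document is the sum of 1 / (k + rank) over every ranked list it appears in, with rank starting at 1. The sketch below applies that formula with k = 60 to merge a vector-search ranking and a keyword-search ranking. The method name rrfMerge echoes the syllabus, but the signature and body are assumptions for illustration, not the Power RAG implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; document IDs are plain strings for simplicity.
class RrfSketch {

    static final int K = 60; // smoothing constant named in the syllabus

    /** Merges several ranked ID lists into one list ordered by descending RRF score. */
    static List<String> rrfMerge(List<List<String>> rankedLists) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranking : rankedLists) {
            for (int i = 0; i < ranking.size(); i++) {
                int rank = i + 1; // ranks are 1-based
                scores.merge(ranking.get(i), 1.0 / (K + rank), Double::sum);
            }
        }
        List<String> merged = new ArrayList<>(scores.keySet());
        // List.sort is stable, so tied scores keep first-seen order.
        merged.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return merged;
    }

    public static void main(String[] args) {
        List<String> vectorHits = List.of("a", "b", "c");
        List<String> keywordHits = List.of("b", "d", "a");
        // b scores 1/62 + 1/61, a scores 1/61 + 1/63, d scores 1/62, c scores 1/63
        // prints [b, a, d, c]
        System.out.println(rrfMerge(List.of(vectorHits, keywordHits)));
    }
}
```

Because RRF uses only ranks, it can combine cosine-similarity results and full-text results without having to normalise their incompatible raw scores.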
Module 5 — The Full Pipeline
# | Topic | Description | Time
17 | The Full RAG Pipeline | All 9 stages of RagService.query() from guardrail to audit log | 15 min
18 | Semantic Caching | Redis vector search, 0.92 threshold, 24h TTL, NoOp for tests | 10 min
19 | Guardrails | Gemini 2.5 Flash input safety, PII regex output detection, fail-open design | 11 min
Module 6 — Advanced Features
# | Topic | Description | Time
20 | Multimodal Image Input | Spring AI Media API, provider-specific conversion, Base64 image decoding | 9 min
21 | Image Generation | Intent detection, Imagen 3 via Google GenAI SDK, Gemini Flash fallback | 10 min
22 | Text-to-SQL | Schema introspection, SQL prompt engineering, JSqlParser validation, execution | 12 min
Module 7 — Production Concerns
# | Topic | Description | Time
23 | Security & JWT | JwtAuthFilter, SecurityConfig filter chain, role-based access, stateless tokens | 10 min
24 | Database Migrations | Flyway versioned migrations, naming convention, idempotent patterns, production rules | 8 min
Module 8 — Quality & Ops
# | Topic | Description | Time
25 | Testing Spring AI Apps | Mocking ChatClient, NoOp cache, Testcontainers for Postgres and Qdrant | 12 min
26 | Observability & Logging | SLF4J log points, log levels, JaCoCo coverage thresholds, Actuator endpoints | 8 min
Module 9 — Local AI Infrastructure
# | Topic | Description | Time
27 | Local Open-Source LLMs | Model selection for each pipeline role, hardware requirements by tier, GPU vs Apple Silicon vs CPU, recommended configurations | 18 min
Module 10 — Agentic RAG & Live Data
# | Topic | Description | Time
28 | Model Context Protocol | What MCP is, STDIO vs HTTP/SSE transports, Spring AI MCP client, tool discovery endpoint, and how tools attach to the RAG pipeline | 20 min
29 | MCP Tool Servers | Python powerrag-tools server: web fetch, time/weather, Jira, GitHub, GCP Logging; email binary server; credential management and SSRF considerations | 25 min
30 | Intent Routing | QueryIntentClassifier: fast LLM router vs pattern-matching fallback, QueryIntent record, per-query KB/tool activation, provider-aware options | 20 min
31 | Tool Observability & Audit | ObservingToolCallback decorator, McpInvocationRecorder thread-local buffer, JSONB audit column, API response surface, frontend amber badge, confidence adjustment | 18 min