Chunking Strategies

Module 3 · ~9 min read
Chunking splits parsed document text into smaller pieces before embedding. The chunk size directly impacts retrieval quality: too large and the embedding captures too many topics; too small and the answer might be split across chunk boundaries. Power RAG uses a sliding window with overlap to balance precision and context continuity.

Why Chunking?

Embedding models have a maximum input length (typically 512–8192 tokens), so a 50-page PDF cannot be embedded as a single unit; it must be split. Beyond the hard limit, smaller chunks generally produce better retrieval: a focused passage yields a semantically sharper embedding that matches queries more precisely, and each retrieved chunk carries less irrelevant text into the prompt.

Size vs Context Trade-offs

Chunk size                Retrieval precision   Answer completeness                     Risk
Very small (~50 words)    High                  Low — answer may span multiple chunks   Fragmented answers; missing context
Medium (~256–512 words)   Good                  Good                                    Balanced — recommended range
Large (~1000+ words)      Low                   High                                    Embedding dilution; less focused retrieval
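To make the granularity difference concrete, here is a small self-contained sketch (class name, helper name, and word counts are illustrative, not from Power RAG) that counts how many chunks a sliding window produces at each of the sizes above:

```java
public class ChunkCountDemo {
    // Number of chunks a sliding window produces over `totalWords` words.
    static int chunkCount(int totalWords, int chunkSize, int overlap) {
        int step = Math.max(1, chunkSize - overlap); // distance between chunk starts
        int count = 0;
        for (int start = 0; start < totalWords; start += step) count++;
        return count;
    }

    public static void main(String[] args) {
        int words = 10_000; // roughly a 20-page document
        System.out.println(chunkCount(words, 50, 0));    // prints 200
        System.out.println(chunkCount(words, 512, 64));  // prints 23
        System.out.println(chunkCount(words, 1000, 64)); // prints 11
    }
}
```

The very small setting fragments the same document into roughly ten times as many chunks as the recommended range, which is exactly the precision-versus-completeness trade-off in the table.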

Sliding Window with Overlap

SlidingWindowChunkingStrategy.java
@Component
public class SlidingWindowChunkingStrategy implements ChunkingStrategy {

    private final int chunkSize = 512;   // words per chunk
    private final int chunkOverlap = 64; // words shared with the previous chunk

    @Override
    public List<Chunk> chunk(List<ParsedSection> sections) {
        List<Chunk> chunks = new ArrayList<>();
        for (ParsedSection section : sections) {
            String[] words = section.getText().split("\\s+");
            int step = Math.max(1, chunkSize - chunkOverlap); // 512 - 64 = 448

            for (int start = 0; start < words.length; start += step) {
                int end = Math.min(start + chunkSize, words.length);
                String text = String.join(" ",
                    Arrays.copyOfRange(words, start, end));
                // build a Chunk from `text` and store it with metadata:
                // chunk_index, start_line, ... (construction elided in this excerpt)
            }
        }
        return chunks;
    }
}
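The component above depends on project types (ChunkingStrategy, ParsedSection, Chunk), so it will not run on its own. Here is a minimal standalone sketch of the same sliding-window loop; the class name and the tiny parameter values (chunk size 5, overlap 2) are illustrative only:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SlidingWindowDemo {
    // Split `text` into chunks of `chunkSize` words, with `overlap`
    // words shared between consecutive chunks.
    static List<String> chunk(String text, int chunkSize, int overlap) {
        String[] words = text.trim().split("\\s+");
        int step = Math.max(1, chunkSize - overlap);
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 12 words, chunkSize 5, overlap 2 → step 3
        String text = "w0 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10 w11";
        for (String c : chunk(text, 5, 2)) {
            System.out.println(c);
        }
    }
}
```

Note that the final chunk ("w9 w10 w11") is shorter than `chunkSize` because the window is clipped at the end of the document, just as `Math.min` does in the component above.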

Overlap Visualized

With chunkSize=512 and chunkOverlap=64, the step is 448 words. Each consecutive chunk shares 64 words with the previous one:

Word positions in document:

Chunk 1: [word 0    ... word 511]
Chunk 2: [word 448  ... word 959]    ← 64-word overlap with Chunk 1
Chunk 3: [word 896  ... word 1407]   ← 64-word overlap with Chunk 2
Chunk 4: [word 1344 ... word 1855]   ← 64-word overlap with Chunk 3

Overlap zone (chunk 1 / chunk 2):

|-- chunk 1 exclusive --|-- overlap --|-- chunk 2 exclusive --|
word 0                  word 448      word 511                 word 959

The overlap ensures that text near a chunk boundary appears in two chunks. A passage that straddles the boundary appears whole in at least one of them (provided it is shorter than the overlap), so a question about it still matches a chunk containing the full answer.
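The positions in the diagram follow directly from start = k × step. A quick sketch to verify the arithmetic (class name is illustrative):

```java
public class OverlapMath {
    // Inclusive [firstWord, lastWord] indices of chunk k (0-based)
    // for the given chunk size and overlap.
    static int[] bounds(int k, int chunkSize, int overlap) {
        int step = chunkSize - overlap; // distance between chunk starts
        int start = k * step;
        return new int[]{start, start + chunkSize - 1};
    }

    public static void main(String[] args) {
        for (int k = 0; k < 4; k++) {
            int[] b = bounds(k, 512, 64);
            System.out.println("Chunk " + (k + 1)
                + ": [word " + b[0] + " ... word " + b[1] + "]");
        }
        // prints the same ranges as the diagram:
        // [0, 511], [448, 959], [896, 1407], [1344, 1855]
    }
}
```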

Configuration

application.yml — chunking config
powerrag:
  ingestion:
    chunk-size: 512    # words per chunk
    chunk-overlap: 64  # words of overlap between consecutive chunks

The 512-word / 64-word overlap configuration works well as a starting point. If your documents are highly dense (legal contracts, technical specs), consider reducing chunk size to 256. If they are conversational (transcripts, emails), 512–768 works better.
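These keys are typically bound to a properties class. This module does not show Power RAG's actual binding class, but with Spring Boot's relaxed binding (which maps chunk-size to chunkSize) a sketch might look like:

```java
// A sketch, not Power RAG's actual class: how the powerrag.ingestion.*
// keys could bind via Spring Boot relaxed binding.
import org.springframework.boot.context.properties.ConfigurationProperties;

@ConfigurationProperties(prefix = "powerrag.ingestion")
public class IngestionProperties {
    private int chunkSize = 512;   // binds powerrag.ingestion.chunk-size
    private int chunkOverlap = 64; // binds powerrag.ingestion.chunk-overlap

    public int getChunkSize() { return chunkSize; }
    public void setChunkSize(int chunkSize) { this.chunkSize = chunkSize; }
    public int getChunkOverlap() { return chunkOverlap; }
    public void setChunkOverlap(int chunkOverlap) { this.chunkOverlap = chunkOverlap; }
}
```

The field initializers double as defaults when a key is absent from application.yml.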