Created by Shaunak Ghosh
Run high-quality local LLMs with realistic hardware expectations, then build a private RAG workflow over your own documents with grounding and citations. You’ll add operational hygiene for incremental indexing, evaluate RAG quality with the right metrics, and connect your local-first agent to MCP tools with least-privilege defaults.
7 modules • Each builds on the previous one
Map latency, throughput, context length, and quality expectations to your actual CPU/GPU, VRAM/RAM, and storage constraints, including how quantization affects speed and accuracy.
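A taste of the kind of sizing this module covers: a back-of-the-envelope memory estimate for quantized weights. The 7B parameter count and bits-per-weight figures below are illustrative assumptions, not measurements of any specific model.

```python
# Back-of-the-envelope memory estimate for a quantized LLM's weights.
# All figures are illustrative assumptions, not measurements.

def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed just for the model weights, in GiB."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1024**3

# A 7B-parameter model at different quantization levels:
for label, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4.5)]:
    print(f"7B @ {label}: ~{weight_memory_gib(7, bits):.1f} GiB for weights")
# FP16 ≈ 13.0 GiB, Q8 ≈ 6.5 GiB, Q4 ≈ 3.7 GiB — and you still need
# headroom for the KV cache and activations, which grow with context length.
```

The gap between 13 GiB and under 4 GiB is why quantization decides whether a model fits your GPU at all; the module pairs this arithmetic with its cost in output quality.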
Compare setup friction, UX, model lifecycle management, and update strategies across Ollama- and LM Studio-style toolchains to maximize “works on my machine” reliability.
Select local models by task category (reasoning, writing, code, small-footprint) using lightweight benchmarks and real prompts, balancing quality, speed, and context needs.
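A minimal benchmark-harness sketch in the spirit of this module, assuming an Ollama-style local server on its default port; the model tags and prompt are placeholders you would swap for your own candidates and real workload.

```python
# Minimal benchmark sketch against an Ollama-style local API.
# Assumes a server on localhost:11434; response fields per Ollama's docs.
import json
import urllib.request

def run_prompt(model: str, prompt: str) -> dict:
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Compare candidate models on a real prompt from your own workload.
for model in ["llama3.1:8b", "qwen2.5:7b"]:  # example tags, not recommendations
    out = run_prompt(model, "Summarize the tradeoffs of 4-bit quantization.")
    tok_s = out["eval_count"] / out["eval_duration"] * 1e9  # durations are in ns
    print(f"{model}: {tok_s:.1f} tok/s, {len(out['response'])} chars")
```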
Design a local-first RAG pipeline that covers document parsing, chunking strategy, embeddings, indexing, and retrieval, with privacy-preserving defaults.
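The skeleton below previews that pipeline end to end: fixed-size chunking, local embeddings, an in-memory index, and cosine-similarity retrieval. It is a sketch, assuming an Ollama-style embeddings endpoint and an embedding model tag (`nomic-embed-text`) and input file (`notes.md`) chosen purely for illustration.

```python
# Skeleton of a local-first RAG retrieval step: chunk, embed, index, retrieve.
# Assumes an Ollama-style /api/embeddings endpoint; swap in your own stack.
import json
import math
import urllib.request

def chunk(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Fixed-size character chunks with overlap; a simple baseline strategy."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(passage: str) -> list[float]:
    req = urllib.request.Request(
        "http://localhost:11434/api/embeddings",
        data=json.dumps({"model": "nomic-embed-text", "prompt": passage}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Index: embed every chunk once. Retrieve: rank chunks by similarity to the query.
chunks = chunk(open("notes.md").read())          # "notes.md" is a placeholder
index = [(c, embed(c)) for c in chunks]
query_vec = embed("What did I decide about backups?")
top = sorted(index, key=lambda ce: cosine(query_vec, ce[1]), reverse=True)[:3]
for c, _ in top:
    print(c[:80], "…")
```

Everything here runs on your machine, which is the privacy-preserving default the module builds on before layering in better chunking and a persistent vector store.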
Apply repeatable patterns for building a local personal knowledge base from notes, PDFs, project folders, meeting transcripts, email exports, and codebases with minimal manual curation.
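One such pattern is change detection, so re-runs touch only new or edited files instead of re-embedding the whole corpus. A minimal sketch, assuming a content-hash manifest; the manifest filename and glob patterns are illustrative choices.

```python
# Incremental re-indexing sketch: only re-embed files whose content changed.
# The manifest path and file patterns are illustrative assumptions.
import hashlib
import json
from pathlib import Path

MANIFEST = Path(".rag_manifest.json")

def file_digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(root: str, patterns=("*.md", "*.txt", "*.pdf")) -> list[Path]:
    seen = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    stale = []
    for pattern in patterns:
        for path in Path(root).expanduser().rglob(pattern):
            digest = file_digest(path)
            if seen.get(str(path)) != digest:  # new file or changed content
                stale.append(path)
                seen[str(path)] = digest
    MANIFEST.write_text(json.dumps(seen, indent=2))
    return stale

# Re-embed only what changed since the last run:
for path in changed_files("~/notes"):
    print("reindex:", path)  # hand off to your chunk/embed pipeline here
```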
Verify RAG answers with citation/grounding checks, measure retrieval quality with targeted tests, and reduce hallucinations through disciplined prompting and evaluation loops.
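Targeted tests can be as small as a handful of labeled query/chunk pairs. The sketch below computes hit-rate@k (does any gold chunk appear in the top-k results?); the `retrieve()` stub and test queries are hypothetical stand-ins for your pipeline and your own labels.

```python
# Measuring retrieval quality with a tiny labeled test set: hit-rate@k.
# retrieve() is assumed to be your pipeline's ranked-retrieval step.

def hit_rate_at_k(test_set, retrieve, k: int = 5) -> float:
    """test_set: list of (query, set-of-gold-chunk-ids) pairs."""
    hits = 0
    for query, gold_ids in test_set:
        top_ids = {chunk_id for chunk_id, _ in retrieve(query)[:k]}
        if top_ids & gold_ids:  # at least one gold chunk retrieved
            hits += 1
    return hits / len(test_set)

# Example with a stub retriever; replace with your real one.
def retrieve(query):
    return [("doc1#3", 0.91), ("doc2#0", 0.72), ("doc1#7", 0.55)]

tests = [("what is our backup policy?", {"doc1#3"}),
         ("mcp server config", {"doc9#2"})]
print(f"hit-rate@5: {hit_rate_at_k(tests, retrieve):.2f}")  # 0.50 for this stub
```

If retrieval never surfaces the right chunk, no amount of prompting will ground the answer, which is why the module measures retrieval before generation.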
Integrate a local-first agent with MCP tools using least-privilege design, read-only defaults, and strict directory scoping so automation remains private and predictable.
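The core of that design is a guard like the sketch below: every tool call resolves its path and refuses anything outside one allowed directory, and only read operations are exposed at all. The `ALLOWED_ROOT` path and the function name are illustrative assumptions, not a prescribed MCP API.

```python
# Least-privilege guard sketch for a local MCP-style file tool:
# read-only access, strictly scoped to a single allowed directory.
# ALLOWED_ROOT is an illustrative assumption; point it at your own scope.
from pathlib import Path

ALLOWED_ROOT = Path("~/projects/notes").expanduser().resolve()

def safe_read(requested: str) -> str:
    """Resolve the requested path and refuse anything outside the scope."""
    path = Path(requested).expanduser().resolve()
    if not path.is_relative_to(ALLOWED_ROOT):  # Python 3.9+
        raise PermissionError(f"{path} is outside the allowed scope")
    return path.read_text()  # read-only: no write or delete tools exposed

print(safe_read("~/projects/notes/todo.md"))          # served
# safe_read("~/.ssh/id_rsa")  -> raises PermissionError  # refused
```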
Begin your learning journey
In-video quizzes and scaffolded content to maximize retention.