An LLM call is one line of code. An LLM app is hundreds. The gap between 'call OpenAI' and 'shipped product' is filled with plumbing: prompt templating, memory, retrieval, tool use, parsers, retries, and observability. LangChain, like similar libraries such as LlamaIndex, Haystack, and Semantic Kernel, is a framework that provides those primitives as composable parts.
The core abstractions
LangChain models everything as runnable components that you pipe together: prompt templates, LLMs, retrievers, tools, output parsers, and memory. A chain is just a function that takes input, threads it through these components, and returns output. Modern LangChain uses LCEL (LangChain Expression Language) so chains look like `prompt | llm | parser`.
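To make the pipe syntax less magical, here is a minimal sketch of how `prompt | llm | parser` composition can work in plain Python. It is not LangChain's implementation: the `Step` class, `fake_llm`, and the canned reply are hypothetical stand-ins chosen so the example runs without an API key.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a runnable component (not LangChain's class)."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)


# Prompt: fill a template from a dict of variables.
prompt = Step(lambda v: "Translate {text} to {lang}".format(**v))

# LLM: a fake model that echoes the prompt, so the example runs offline.
fake_llm = Step(lambda p: f"  [model reply to: {p}]  ")

# Parser: tidy the raw text returned by the model.
parser = Step(lambda raw: raw.strip())

chain = prompt | fake_llm | parser
result = chain.invoke({"text": "hello", "lang": "French"})
print(result)
```

The point of the design is that every component exposes the same small interface, so a chain is itself a component and can be piped into a larger chain.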
- Prompt — a template with variables: 'Translate {text} to {lang}'
- LLM — model wrapper (OpenAI, Anthropic, local Llama, etc.)
- Retriever — pulls relevant docs from a vector DB (the R in RAG)
- Tool — a function the LLM can call (calculator, web search, code runner)
- Memory — conversation history persisted across turns
- Output parser — turn raw LLM text into structured data (JSON, Pydantic models)
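Two of the components above, the prompt and the output parser, are easy to illustrate without any framework. The sketch below assumes the model replies with a small JSON object; `Translation`, `render_prompt`, `fake_llm`, and `parse_translation` are hypothetical names, and the canned reply stands in for a real model call.

```python
import json
from dataclasses import dataclass


@dataclass
class Translation:
    text: str
    lang: str


def render_prompt(template: str, **variables: str) -> str:
    """Prompt: fill the template's {placeholders} with concrete values."""
    return template.format(**variables)


def fake_llm(prompt: str) -> str:
    """LLM: canned reply so the example runs offline (an assumption;
    a real model may not return clean JSON)."""
    return '{"text": "bonjour", "lang": "French"}'


def parse_translation(raw: str) -> Translation:
    """Output parser: raw text in, validated object out."""
    try:
        data = json.loads(raw)
        return Translation(text=data["text"], lang=data["lang"])
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        raise ValueError(f"model output was not the expected JSON: {raw!r}") from exc


prompt_text = render_prompt("Translate {text} to {lang}", text="hello", lang="French")
parsed = parse_translation(fake_llm(prompt_text))
print(prompt_text)
print(parsed)
```

Raising a clear error on malformed output matters in practice, because it gives retry logic something concrete to catch.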
Common chain patterns
- Simple QA: prompt → LLM → parser
- RAG bot: retriever → prompt → LLM → parser (the retrieved documents are inserted into the prompt before the model sees it)
- Tool agent: prompt → LLM (decides tool) → tool → LLM (formats result) → parser
- Conversational: memory → prompt → LLM → memory update
Most production apps combine two to four of these patterns, plus some routing logic.
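As a rough illustration of two of these patterns, the sketch below wires up a toy RAG chain and a toy conversational loop in plain Python. Everything here is an assumption made for the example: the keyword-overlap retriever stands in for a vector search, both fake models return canned text, and in this sketch the retriever runs first so its results can be placed into the prompt.

```python
import re
from typing import List

DOCS = [
    "LCEL lets you compose chains with the pipe operator.",
    "A retriever returns documents relevant to a query.",
    "Output parsers convert raw model text into structured data.",
]


def words(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, k: int = 1) -> List[str]:
    # Toy retriever: rank documents by shared words (real systems use embeddings).
    q = words(query)
    ranked = sorted(DOCS, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:k]


def fake_llm(prompt: str) -> str:
    # Stand-in model: "answers" by quoting the first context line it was given.
    return prompt.split("\n")[1]


def rag_chain(question: str) -> str:
    # retrieve -> build prompt -> model -> parse (here, just strip whitespace)
    context = "\n".join(retrieve(question))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return fake_llm(prompt).strip()


# Conversational pattern: memory is a list of past turns replayed into each prompt.
history: List[str] = []


def fake_chat_llm(prompt: str) -> str:
    # Stand-in model: reports how many user turns it can see in the prompt.
    return f"I have seen {prompt.count('User:')} user message(s)."


def chat(message: str) -> str:
    turn = f"User: {message}"
    prompt = "\n".join(history + [turn])                 # memory -> prompt
    reply = fake_chat_llm(prompt)                        # prompt -> LLM
    history.extend([turn, f"Assistant: {reply}"])        # memory update
    return reply


answer = rag_chain("What does a retriever return?")
first = chat("hi")
second = chat("still there?")
print(answer)
print(first)
print(second)
```

The second chat reply differs from the first only because the stored history was replayed into the prompt, which is all that "memory" means in this pattern.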