
Vector database


In one sentence

A vector database is a specialized database designed to store millions of embeddings and answer the question “which of these are most similar to this one?” very quickly.

Why vector databases exist

Once embeddings became central to how AI systems retrieve information (see embedding.md), a new operational problem appeared: how do you search across a million vectors fast enough for a real-time chatbot?

A naive approach — comparing the query vector against every stored vector one by one — works for a thousand documents but breaks down at a million. The math is unforgiving: a million 1024-dimensional dot products per query is too slow.
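To make the naive approach concrete, here is a toy brute-force top-K search in plain Python (an illustrative sketch, not a real vector database; all names are invented for the example). Note the cost: every query touches all N vectors, at d multiply-adds each.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def brute_force_top_k(query, vectors, k):
    """Score all N stored vectors against the query: N * d multiply-adds per query."""
    scored = sorted(enumerate(vectors), key=lambda iv: dot(query, iv[1]), reverse=True)
    return [i for i, _ in scored[:k]]

vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.5, 0.5]]
query = [1.0, 0.0]
print(brute_force_top_k(query, vectors, 2))  # -> [0, 2]: the two most aligned vectors

# The arithmetic from the text: one query against a million 1024-dimensional
# vectors costs about a billion multiply-adds -- too slow per chat message.
print(1_000_000 * 1024)  # 1_024_000_000 multiply-adds per query
```

At four vectors this is instant; at a million it is the bottleneck the rest of this entry is about.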

Vector databases solve this with approximate nearest-neighbor (ANN) algorithms: index structures that can return the top 10 most similar vectors out of a million in tens of milliseconds, with very high but not perfect recall. The “approximate” part is a deliberate trade — exact-nearest-neighbor search is too slow at scale, and 99% accuracy on the top-10 is good enough for almost every real use case.
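One way to get intuition for the "approximate" part is locality-sensitive hashing with random hyperplanes, sketched below in plain Python. Vectors landing in the same hash bucket are *likely* similar, so a query scores only its own bucket instead of the whole collection. Production ANN indexes (HNSW, IVF, and friends) are far more sophisticated; this toy is illustrative only, and every name in it is invented.

```python
import random

random.seed(0)
DIM, NUM_PLANES = 8, 4
# Random hyperplanes through the origin; each contributes one hash bit.
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_PLANES)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bucket(v):
    # One bit per hyperplane: which side of the plane does v fall on?
    return tuple(dot(v, p) > 0 for p in planes)

buckets = {}

def insert(vec_id, v):
    buckets.setdefault(bucket(v), []).append((vec_id, v))

def ann_search(q, k):
    # Only score candidates sharing the query's bucket -- fast, but a true
    # neighbor hashed into another bucket is simply missed (the recall trade-off).
    candidates = buckets.get(bucket(q), [])
    candidates.sort(key=lambda item: dot(q, item[1]), reverse=True)
    return [vec_id for vec_id, _ in candidates[:k]]

insert("a", [1.0] * DIM)
insert("b", [-1.0] * DIM)
print(ann_search([1.0] * DIM, 1))  # -> ['a']
```

Real systems use many hash tables (or graph/cluster indexes) precisely to push recall from "likely" to 99%+ while keeping queries in the tens of milliseconds.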

What they actually do — concretely

A vector database typically offers four operations:

| Operation | What it does |
|-----------|--------------|
| insert | Add a new vector with metadata (source document, timestamp, tags, etc.) |
| search | Given a query vector, return the top-K most similar vectors |
| update | Replace an existing vector |
| delete | Remove a vector |
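The four operations in the table can be sketched as a minimal in-memory store (for intuition only: it uses exhaustive search where a real engine would consult an ANN index, and all names here are invented for the example).

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class ToyVectorStore:
    def __init__(self):
        self.rows = {}  # id -> (vector, metadata)

    def insert(self, row_id, vector, metadata=None):
        self.rows[row_id] = (vector, metadata or {})

    def search(self, query, k=10):
        # Rank every stored vector by dot-product similarity, return top-K ids.
        ranked = sorted(self.rows.items(),
                        key=lambda item: dot(query, item[1][0]), reverse=True)
        return [row_id for row_id, _ in ranked[:k]]

    def update(self, row_id, vector, metadata=None):
        self.insert(row_id, vector, metadata)  # replace in place

    def delete(self, row_id):
        self.rows.pop(row_id, None)

store = ToyVectorStore()
store.insert("doc1", [1.0, 0.0], {"source": "syllabus.pdf"})
store.insert("doc2", [0.0, 1.0], {"source": "lecture.txt"})
print(store.search([0.9, 0.1], k=1))  # -> ['doc1']
store.delete("doc2")
```

The API shape is the point here: real products expose essentially this interface, plus metadata alongside each vector so results can be traced back to source documents.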

On top of these, modern vector databases add features like metadata filtering (restricting a search to vectors whose tags match), hybrid search that combines vector similarity with keyword matching, and namespaces or collections for keeping separate corpora or tenants apart.

The major players (mid-2026)

The names that come up most often fall into three camps: managed services like Pinecone, dedicated open-source engines like Milvus and Qdrant, and lightweight options that live inside infrastructure you already have, like Chroma or the Postgres extension pgvector.

The right choice is almost always to start small and scale later. A research project or pilot can run perfectly well in pgvector or Chroma on a laptop. The need for Pinecone or Milvus shows up at multi-million-vector scale.

Working example — what scaled-up RAG would look like for a teaching project

Suppose the Isenberg Management Department wanted to build a RAG-based teaching assistant that knows the department’s syllabi, lecture transcripts, AACSB documents, and faculty research from the past five years. A reasonable architecture:

  1. Corpus — roughly 10,000 documents (PDFs, transcripts, web pages).
  2. Chunking — split each document into ~500-token chunks. Estimated ~200,000 chunks total.
  3. Embedding model — nomic-embed-text or similar, run locally to keep student data on-premises.
  4. Vector database — pgvector (because the department probably already runs Postgres), or Qdrant if a dedicated service is preferred.
  5. Storage — 200,000 vectors at 1024 dimensions × 4 bytes ≈ 800 MB. Trivial.
  6. Query path — a question goes in, gets embedded, top-15 chunks are retrieved, an LLM generates an answer with citations.
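The query path in step 6 can be sketched end to end in a few lines. Everything below is a hypothetical stand-in: `embed()` fakes an embedding model (the architecture above would use nomic-embed-text), the `chunks` list fakes the vector database, and the returned string is the prompt that would go to an LLM. The comment also checks the storage arithmetic from step 5.

```python
# Storage check from step 5: 200_000 vectors * 1024 dims * 4 bytes
# = 819_200_000 bytes, roughly 800 MB.

def embed(text, dim=8):
    # Stand-in embedding: bag of character codes, NOT a real model.
    v = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        v[i % dim] += ord(ch)
    return v

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

chunks = [  # stand-in for the vector database of ~200,000 chunks
    {"text": "MGMT 301 covers organizational behavior.", "source": "syllabus.pdf"},
    {"text": "Office hours are Tuesdays 2-4pm.", "source": "syllabus.pdf"},
]
for c in chunks:
    c["vector"] = embed(c["text"])

def answer(question, k=15):
    q = embed(question)                                    # embed the question
    ranked = sorted(chunks, key=lambda c: dot(q, c["vector"]), reverse=True)
    top = ranked[:k]                                       # retrieve top-K chunks
    context = "\n".join(f'[{c["source"]}] {c["text"]}' for c in top)
    return f"Context:\n{context}\n\nQuestion: {question}"  # prompt for the LLM

print(answer("When are office hours?"))
```

The shape is what matters: embed, retrieve, assemble a cited context, generate. Swapping the stand-ins for a real embedding model, pgvector, and an LLM changes the components but not the flow.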

The vector database is one piece of an operational system that also requires document ingestion pipelines, periodic re-indexing as new documents arrive, monitoring, and access control. The vector database itself is rarely the hard part. Data hygiene around it is.

Why this matters in a teaching context

For BBA and MBA students, vector databases are interesting because they sit at the intersection of three trends:

  1. The “AI-on-private-data” boom — every organization wants to do this.
  2. The data infrastructure renaissance — Postgres, ClickHouse, DuckDB, Snowflake, etc. are getting AI-native features.
  3. The embedded-AI pattern — vector search is now showing up inside CRMs, knowledge bases, IDEs, email clients, and document editors as a standard feature, not an exotic add-on.

The strategic question worth raising in a class discussion: for a given organization, is a vector database a strategic asset, an operational utility, or a commodity component? The answer differs by industry. A law firm whose competitive edge depends on its archive of past matters has a strategic asset. A retail brand using vector search to recommend products has an operational utility. A small business adding semantic search to its internal handbook has a commodity component.

Trade-offs

  1. Recall vs. latency — ANN indexes give up a small amount of accuracy (99% recall on the top-10 rather than 100%) in exchange for queries in tens of milliseconds instead of seconds.
  2. Dedicated service vs. existing infrastructure — a standalone engine like Qdrant or Pinecone adds capability but also another system to operate; pgvector keeps everything inside a Postgres the organization already runs.
  3. Simplicity now vs. scale later — laptop-scale tools (Chroma, pgvector) cover pilots and research projects; multi-million-vector workloads are what justify Milvus or a managed service.

Related entries: embedding.md, rag.md (planned).
