Article · Wikipedia archive · Last revised May 27, 2026

Vector database

A vector database, vector store or vector search engine is a database that stores and retrieves embeddings of data in vector space.[1] Vector databases typically implement approximate nearest neighbor algorithms so that users can search for records semantically similar to a given input, unlike traditional databases, which primarily look up records by exact match.[2][3] Use cases for vector databases include similarity search, semantic search, multi-modal search, recommendation engines, object detection, and retrieval-augmented generation (RAG).[1]

Vector embeddings are mathematical representations of data in a high-dimensional space. In this space, each dimension corresponds to a feature of the data, with the number of dimensions ranging from a few hundred to tens of thousands, depending on the complexity of the data being represented. Each data item is represented by one vector in this space. Words, phrases, or entire documents, as well as images, audio, and other types of data, can all be vectorized.[1]

These feature vectors may be computed from the raw data using machine learning methods such as feature extraction algorithms, word embeddings[4] or deep learning networks. The goal is that semantically similar data items receive feature vectors close to each other.
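
The idea that similar items receive nearby vectors can be illustrated with a minimal sketch. The three-dimensional vectors, the item names, and the helper functions below are hypothetical, and cosine similarity is only one of several measures a system might use; real embeddings come from a trained model and have far more dimensions. The search here is exact (brute force), not the approximate search that vector databases typically implement.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, items, k=2):
    """Exact search: score every stored vector, return the k most similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical embeddings, made up for illustration only.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

query = [0.85, 0.15, 0.05]
top = nearest_neighbors(query, EMBEDDINGS, k=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

Because the query vector points in nearly the same direction as the "cat" and "kitten" vectors, those two items are returned and "truck" is not.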

Vector retrieval can be combined with metadata filtering or lexical search to support filtered and hybrid retrieval workflows.[5][6]
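
A filtered query can be sketched as follows. This is an illustration under stated assumptions, not the behavior of any particular product: the `Record` type, the `filtered_search` function, and the `lang` metadata key are hypothetical, and the sketch applies the filter before ranking (pre-filtering), which is only one of several possible strategies.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Record:
    doc_id: str
    vector: list
    metadata: dict = field(default_factory=dict)

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def filtered_search(query, records, where, k=3):
    """Keep records whose metadata matches every key in `where`,
    then rank the survivors by similarity to the query vector."""
    candidates = [
        r for r in records
        if all(r.metadata.get(key) == value for key, value in where.items())
    ]
    candidates.sort(key=lambda r: cosine(query, r.vector), reverse=True)
    return candidates[:k]

# Hypothetical records, made up for illustration only.
RECORDS = [
    Record("a", [1.0, 0.0], {"lang": "en"}),
    Record("b", [0.9, 0.1], {"lang": "de"}),
    Record("c", [0.0, 1.0], {"lang": "en"}),
]

hits = filtered_search([1.0, 0.0], RECORDS, {"lang": "en"}, k=2)
print([r.doc_id for r in hits])
```

Record "b" is close to the query but is excluded by the metadata filter, so only records tagged `en` are ranked.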

Techniques

Common techniques for similarity search on high-dimensional vectors include:

These techniques may also be combined in vector search systems.[7]

In recent benchmarks, HNSW-based implementations have been among the best performers.[8][9] Conferences such as the International Conference on Similarity Search and Applications (SISAP)[10] and the Conference on Neural Information Processing Systems (NeurIPS)[11] have hosted competitions on vector search in large databases.

Applications

Vector databases are used in a wide range of machine learning applications, including similarity search, semantic search, multi-modal search, recommendation engines, object detection, and retrieval-augmented generation.[1]

Retrieval-augmented generation

An especially common use case for vector databases is retrieval-augmented generation (RAG), a method for improving the domain-specific responses of large language models. The retrieval component of a RAG system can be any search system, but it is most often implemented as a vector database. Text documents describing the domain of interest are collected, and for each document or document section, a feature vector (known as an "embedding") is computed, typically using a deep learning network, and stored in a vector database along with a link to the document. Given a user prompt, the feature vector of the prompt is computed, and the database is queried to retrieve the most relevant documents. These are then automatically added to the context window of the large language model, which generates a response to the prompt given this context.[12]
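
The retrieval steps described above can be sketched in a few lines. Everything here is a toy stand-in chosen for illustration: `embed` uses word counts in place of a deep learning model, `ToyVectorStore` and `build_augmented_prompt` are hypothetical names, the documents and links are made up, and the sketch stops at assembling the augmented prompt without calling a language model.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a sparse bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    """Stores (embedding, text, link) rows and ranks them by brute force."""

    def __init__(self):
        self.rows = []

    def add(self, text, link):
        self.rows.append((embed(text), text, link))

    def query(self, prompt, k=1):
        q = embed(prompt)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [(text, link) for _, text, link in ranked[:k]]

def build_augmented_prompt(prompt, store, k=1):
    """Prepend the k most similar documents to the user prompt as context."""
    context = "\n".join(text for text, _ in store.query(prompt, k))
    return f"Context:\n{context}\n\nQuestion: {prompt}"

store = ToyVectorStore()
store.add("the warranty covers battery replacement for two years", "docs/warranty.txt")
store.add("shipping takes five business days within europe", "docs/shipping.txt")

augmented = build_augmented_prompt("how long is the battery warranty", store)
print(augmented)
```

In a full system the string held in `augmented` would be passed to the language model, and the stored link would let the answer point back to its source document.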

Implementations

Name | License
Aerospike[13][14] | Proprietary
AllegroGraph[15][16] | Proprietary (Managed Service)
AlloyDB AI[17] | Proprietary (Managed Service)
Apache Cassandra[18][19] | Apache License 2.0
Azure Cosmos DB[20] | Proprietary (Managed Service)
Chroma[21][22] | Apache License 2.0[23]
ClickHouse[24] | Apache License 2.0
Couchbase[25][26] | BSL 1.1[27]
CrateDB[28] | Apache License 2.0
DataStax[29] | Proprietary (Managed Service)
Elasticsearch[30] | Server Side Public License, Elastic License[31]
JaguarDB[32][33] | Proprietary
LanceDB[34][35] | Apache License 2.0[36]
LlamaIndex[37] | MIT License[38]
MariaDB[39][40] | GPL v2[41]
Marqo[42] | Apache License 2.0[43]
Meilisearch[44][45] | MIT License[46]
Milvus[47] | Apache License 2.0[48]
MongoDB Atlas[49] | Server Side Public License (Managed Service)
MyScaleDB[50] | Apache License 2.0
Neo4j[51][52] | GPL v3 (Community Edition)[53]
ObjectBox[54] | Apache License 2.0[55]
OpenSearch[56][57] | Apache License 2.0[58]
Oracle Database[59] | Proprietary (Managed Service or License)
Pinecone[60] | Proprietary (Managed Service)
Pixeltable (Incremental Embedding)[61] | Apache License 2.0[58]
Postgres with pgvector[62] | PostgreSQL License[63]
Qdrant[64] | Apache License 2.0[65]
Redis Stack[66][67] | Redis Source Available License[68]
ScyllaDB[69][70] | Proprietary Source Available
Snowflake[71] | Proprietary (Managed Service)
SurrealDB[72] | BSL 1.1[73]
TiDB[74] | Apache License 2.0[75]
Typesense[76] | GPL v3 (Community Edition)[77]
Vespa[78] | Apache License 2.0[79]
Weaviate | BSD 3-Clause[80]
YDB[81][82] | Apache License 2.0[83]

See also

References

  1. "Vector database". learn.microsoft.com. 2023-12-26. Retrieved 2024-01-11.
  2. Roie Schwaber-Cohen. "What is a Vector Database & How Does it Work". Pinecone. Retrieved 18 November 2023.
  3. "What is a vector database". Elastic. Retrieved 18 November 2023.
  4. Evan Chaki (2023-07-31). "What is a vector database?". Microsoft. A vector database is a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes.
  5. "Hybrid search using vectors and full text in Azure AI Search". Microsoft Learn. Retrieved 2026-05-05.
  6. "Add a filter to a vector query in Azure AI Search". Microsoft Learn. Retrieved 2026-05-05.
  7. Pan, James Jie; Wang, Jianguo; Li, Guoliang (2023-10-21). "Survey of Vector Database Management Systems". arXiv:2310.14021 [cs.DB].
  8. Aumüller, Martin; Bernhardsson, Erik; Faithfull, Alexander (2017), Beecks, Christian; Borutta, Felix; Kröger, Peer; Seidl, Thomas (eds.), "ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms", Similarity Search and Applications, vol. 10609, Cham: Springer International Publishing, pp. 34–49, arXiv:1807.05614, doi:10.1007/978-3-319-68474-1_3, ISBN 978-3-319-68473-4, retrieved 2024-03-19
  9. Aumüller, Martin; Bernhardsson, Erik; Faithfull, Alexander (2017). "ANN-Benchmarks: A Benchmarking Tool for Approximate Nearest Neighbor Algorithms". In Beecks, Christian; Borutta, Felix; Kröger, Peer; Seidl, Thomas (eds.). Similarity Search and Applications. Lecture Notes in Computer Science. Vol. 10609. Cham: Springer International Publishing. pp. 34–49. arXiv:1807.05614. doi:10.1007/978-3-319-68474-1_3. ISBN 978-3-319-68474-1.
  10. "Task description and call for participation SISAP 2025 Indexing Challenge". sisap-challenges.github.io. Retrieved 2026-01-01.
  11. "NeurIPS Competition Practical Vector Search (Big ANN) Challenge 2023". neurips.cc. Retrieved 2026-01-01.
  12. Lewis, Patrick; Perez, Ethan; Piktus, Aleksandra; Petroni, Fabio; Karpukhin, Vladimir; Goyal, Naman; Küttler, Heinrich (2020). "Retrieval-augmented generation for knowledge-intensive NLP tasks". Advances in Neural Information Processing Systems 33: 9459–9474. arXiv:2005.11401.
  13. "Aerospike Recognized by Independent Research Firm Among Notable Vendors in Vector Databases Report". Morningstar. 2024-05-07. Retrieved 2024-08-01.
  14. "Aerospike raises $109M for its real-time database platform to capitalize on the AI boom". TechCrunch. 2024-04-04. Retrieved 2024-08-01.
  15. "AllegroGraph 8.0 Incorporates Neuro-Symbolic AI, a Pathway to AGI". TheNewStack. 2023-12-29. Retrieved 2024-06-06.
  16. "Franz Inc. Introduces AllegroGraph Cloud: A Managed Service for Neuro-Symbolic AI Knowledge Graphs". Datanami. 2024-01-18. Retrieved 2024-06-06.
  17. Wiggers, Kyle (2023-08-29). "Google's AlloyDB AI transforms databases to power generative AI apps". TechCrunch. Retrieved 2026-01-01.
  18. "5 Hard Problems in Vector Search, and How Cassandra Solves Them". TheNewStack. 2023-09-22. Retrieved 2023-09-22.
  19. "Vector Search quickstart". Retrieved 2023-11-21.
  20. "Vector database". learn.microsoft.com. 26 December 2023. Retrieved 2024-01-10.
  21. Palazzolo, Stephanie. "Vector database Chroma scored $18 million in seed funding at a $75 million valuation. Here's why its technology is key to helping generative AI startups". Business Insider. Retrieved 2023-11-16.
  22. MSV, Janakiram (2023-07-28). "Exploring Chroma: The Open Source Vector Database for LLMs". The New Stack. Retrieved 2023-11-16.
  23. "chroma/LICENSE at main · chroma-core/chroma". GitHub.
  24. "Can you use ClickHouse for vector search? | ClickHouse Docs". 2023-10-26. Archived from the original on 2025-06-22. Retrieved 2025-07-02.
  25. "Couchbase aims to boost developer database productivity with Capella IQ AI tool". VentureBeat. 2023-08-30.
  26. "Investor Presentation Third Quarter Fiscal 2024". Couchbase Investor Relations. 2023-12-06.
  27. Anderson, Scott (2021-03-26). "Couchbase Adopts BSL License". The Couchbase Blog. Retrieved 2024-02-14.
  28. "Open Source Vector Database". CrateDB Blog. 16 November 2023. Retrieved 2024-11-06.
  29. Sean Michael Kerner (18 July 2023). "DataStax brings vector database search to multicloud with Astra DB". Venture Beat.
  30. Kerner, Sean (23 May 2023). "Elasticsearch Relevance Engine brings new vectors to generative AI". VentureBeat. Retrieved 18 November 2023.
  31. "elasticsearch/LICENSE.txt at main · elastic/elasticsearch". GitHub.
  32. "JaguarDB Homepage". JaguarDB. Retrieved 2025-04-12.
  33. "Vector DBMS". db-engines.com. 2023-07-03. Retrieved 2025-04-12.
  34. "LanceDB Homepage". LanceDB. 2024-12-17. Retrieved 2024-12-17.
  35. "A scalable, elastic database and search solution for 1B+ vectors built on LanceDB and Amazon S3 | AWS Architecture Blog". aws.amazon.com. 2025-09-22. Retrieved 2026-01-01.
  36. "lancedb/LICENSE at main · lancedb/lancedb". GitHub. Retrieved 2024-12-17.
  37. Wiggers, Kyle (2023-06-06). "LlamaIndex adds private data to large language models". TechCrunch. Retrieved 2023-10-29.
  38. "llama_index/LICENSE at main · run-llama/llama_index". GitHub. Retrieved 2023-10-29.
  39. "MariaDB Vector". MariaDB.org. Retrieved 2024-07-30.
  40. "Vector search in old and modern databases". manticoresearch.com. Retrieved 2024-07-30.
  41. "Licensing FAQ". MariaDB KnowledgeBase. Retrieved 2024-07-30.
  42. Sawers, Paul (2023-08-16). "Meet Marqo, an open source vector search engine for AI applications". TechCrunch. Retrieved 2024-08-20.
  43. marqo-ai/marqo, Marqo, 2024-08-20, retrieved 2024-08-20
  44. "Meilisearch Homepage". Meilisearch. 2024-10-08. Retrieved 2023-10-29.
  45. "Compare Algolia vs ElasticSearch vs Meilisearch vs Typesense". typesense.org. Retrieved 2026-01-01.
  46. "meilisearch/LICENSE at main · meilisearch/meilisearch". GitHub. Retrieved 2024-10-08.
  47. Lunden, Ingrid; Liao, Rita (2022-08-24). "Zilliz raises $60M, relocates to SF". TechCrunch. Retrieved 2023-10-29.
  48. "Milvus license". GitHub.
  49. "Introducing Atlas Vector Search: Build Intelligent Applications with Semantic Search and AI Over Any Type of Data". MongoDB. 2023-06-22.
  50. Jamil, Usama (2024-05-20). "Build an Advanced RAG Application Using MyScaleDB and LlamaIndex". The New Stack. Retrieved 2026-01-01.
  51. "Neo4j enhances its graph database with vector search". itbrief. 2023-08-22.
  52. "Vector search indexes". neo4j.
  53. "Neo4j Licensing".
  54. "Top Fifteen Vector Databases". db-engines.com. 2024-07-03. Retrieved 2024-07-03.
  55. "ObjectBox Java license". github.
  56. "Using OpenSearch as a Vector Database". OpenSearch.org. 2023-08-02. Retrieved 2024-02-07.
  57. Pan, James Jie; Wang, Jianguo; Li, Guoliang (2023-10-21), Survey of Vector Database Management Systems, arXiv:2310.14021
  58. "OpenSearch license". github.
  59. Hook, Doug; Priyadarshi, Ranjan (May 2, 2024). "Oracle Announces General Availability of AI Vector Search in Oracle Database 23ai". oracle. Retrieved July 9, 2024.
  60. "Pinecone leads 'explosion' in vector databases for generative AI". VentureBeat. 2023-07-14. Retrieved 2023-10-29.
  61. "Automatic incremental embedding index". Pixeltable. 24 April 2025. Retrieved 2025-07-04.
  62. "pgvector". GitHub. Retrieved 2023-11-27.
  63. "pgvector/License". GitHub. Retrieved 2023-11-27.
  64. Sawers, Paul (2023-04-19). "Qdrant, an open-source vector database startup, wants to help AI developers leverage unstructured data". TechCrunch. Retrieved 2023-10-29.
  65. "qdrant/LICENSE at master · qdrant/qdrant". GitHub. Retrieved 2023-10-29.
  66. "Using Redis as a Vector Database with OpenAI | OpenAI Cookbook". cookbook.openai.com. Retrieved 2024-02-10.
  67. "Redis as a vector database quick start guide". Redis. Archived from the original on 2024-01-31. Retrieved 2024-01-31.
  68. "Search and query". Redis. Retrieved 2024-02-10.
  69. "Open source USearch library jumpstarts ScyllaDB vector search". TheNewStack. 2026-02-05. Retrieved 2026-02-23.
  70. "ScyllaDB adds vector search to managed database platform". TechTarget. 2026-01-20. Retrieved 2026-02-23.
  71. "Vector data type and vector similarity functions — General Availability". Snowflake. 2024-05-17. Retrieved 2024-05-17.
  72. Wiggers, Kyle (2023-01-04). "SurrealDB raises $6M for its database-as-a-service offering". TechCrunch. Retrieved 2024-01-19.
  73. "SurrealDB | License FAQs | The ultimate multi-model database". SurrealDB. Retrieved 2024-02-14.
  74. "TiDB Vector Search". docs.pingcap.com.
  75. tidb/LICENSE at master · pingcap/tidb
  76. Martinez, Miguel (2024-06-20). "Typesense Homepage". Typesense. Retrieved 2024-06-20.
  77. "Typesense licensing". GitHub.
  78. "Creating a Vespa Vector Database". Capital One. Retrieved 2026-01-01.
  79. "vespa/LICENSE at master · vespa-engine/vespa". GitHub.
  80. "weaviate/LICENSE at master · weaviate/weaviate". GitHub. Retrieved 2023-10-29.
  81. "Langchain YDB". Langchain. Retrieved 2025-07-26.
  82. "YDB - Vector Search". ydb.tech. Retrieved 2025-07-26.
  83. "ydb/LICENSE at master · ydb-platform/ydb". GitHub. Retrieved 2025-07-26.
External links