Enterprise Postgres 18 Knowledge Data Management Feature User's Guide

3.1 Overview of the Semantic Text Search and Automatic Vectorization Feature

Semantic text search

Semantic text search is a feature that finds highly relevant text based on semantic similarity. It works by using vector representations that preserve the semantic similarity of the text data. Hybrid search, which combines semantic text search using vector representations with full-text search based on string matching, is also available.
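
As an illustration of how the results of a vector-based search and a string-matching search can be combined, the following minimal sketch fuses two ranked result lists with reciprocal rank fusion. This is an assumption made for illustration only: this guide does not state which fusion method the product uses, and the function name, the constant k=60, and the document ids are all hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    so documents ranked highly by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
semantic_ranking = ["doc3", "doc1", "doc2"]  # ordered by vector similarity
fulltext_ranking = ["doc1", "doc4", "doc3"]  # ordered by string matching

fused = reciprocal_rank_fusion([semantic_ranking, fulltext_ranking])
print(fused)
```

Rank-based fusion is used in this sketch because vector similarity scores and full-text scores are on different scales and cannot be added directly.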

Automatic vectorization feature

The automatic vectorization feature automatically generates and stores vector representations of the text data inserted into a table. The vector representations are generated by an external service called an embedded provider.

Once vectorization is defined for the table to be searched, automatic vectorization runs in the background. The vectorization definition includes the embedded model that determines the vector representation used for semantic text search, the text segmentation method, and the index definition.

When vectorization is defined for a table, a corresponding "embedded table" is created. The embedded table stores text chunks, which are the segments into which the original table's text data is divided, together with the vector data for each chunk. When defining the vectorization, you can also define a vector index on the vector data and a full-text search index on the text chunks. An "embedded view" that combines the original table and the embedded table is created as well.
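
The relationship between the original table and the embedded table can be pictured with the following minimal sketch. It assumes a simple fixed-size word split and a toy embedding function in place of a real embedded provider. The function names, the column names (source_id, chunk_no, chunk, vector), and the chunk size are hypothetical and do not reflect the product's actual schema or segmentation method.

```python
def split_into_chunks(text, max_words=5):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def toy_embed(chunk, dims=4):
    """Hypothetical stand-in for an embedded provider.

    Buckets words by their length; a real provider would call an
    embedding model and return a much higher-dimensional vector.
    """
    vector = [0.0] * dims
    for word in chunk.split():
        vector[len(word) % dims] += 1.0
    return vector


def build_embedded_rows(source_rows):
    """Produce one embedded-table row per text chunk of each source row."""
    rows = []
    for source_id, text in source_rows:
        for chunk_no, chunk in enumerate(split_into_chunks(text)):
            rows.append({
                "source_id": source_id,   # links back to the original row
                "chunk_no": chunk_no,     # position of the chunk in the text
                "chunk": chunk,           # target of a full-text index
                "vector": toy_embed(chunk),  # target of a vector index
            })
    return rows


source_rows = [
    (1, "PostgreSQL stores rows in heap files and uses indexes to find them quickly"),
]
embedded_rows = build_embedded_rows(source_rows)
for row in embedded_rows:
    print(row)
```

Because every chunk row keeps a reference to its source row, a view can join the chunks back to the original table, which is the role the embedded view plays.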

Semantic text search and embedded view

Semantic text search and hybrid search are executed using the embedded view as the search target.

When a semantic text search or hybrid search is performed against the embedded view, the text given as the query is internally converted into a vector representation, and a similarity search is run between that query vector and the vector representations of the stored knowledge data.
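
The query flow described above can be sketched as follows. This is a simplified illustration under stated assumptions: a bag-of-words count over a small fixed vocabulary stands in for the embedded model, and cosine similarity is used as the similarity measure. A real embedding model captures meaning rather than exact word overlap, and this guide does not specify which distance function the product applies. All names in the sketch are hypothetical.

```python
import math


def toy_embed(text, vocabulary):
    """Hypothetical stand-in for the embedded model: word counts per term."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query, stored, vocabulary, top_k=2):
    """Convert the query to a vector, then rank stored chunks by similarity."""
    query_vector = toy_embed(query, vocabulary)
    scored = [(cosine_similarity(query_vector, vector), chunk)
              for chunk, vector in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


vocabulary = ["index", "vector", "backup", "restore", "search"]
chunks = [
    "vector index speeds up vector search",
    "backup and restore the database",
    "full-text search uses an index",
]
# Vectors are computed once, when the chunks are stored.
stored = [(chunk, toy_embed(chunk, vocabulary)) for chunk in chunks]

results = semantic_search("vector search", stored, vocabulary)
print(results)
```

Only the query has to be vectorized at search time, because the vectors for the stored chunks were generated in advance by automatic vectorization.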