This feature allows you to import ONNX pipeline models. An ONNX pipeline model is an embedding model that takes text as input and generates embeddings as output. The core of a text embedding model is a neural network expressed in ONNX or a similar format (hereafter referred to as the core model). However, text embedding (vectorization) requires not only the core model but also preprocessing such as tokenization and post-processing such as pooling. There are various methods for tokenization and pooling, but they are not independent of the core model and must be used in the combination intended by the model's developer.

Fujitsu Enterprise Postgres therefore uses a model called the ONNX pipeline model, which consolidates the entire text-embedding pipeline into a single ONNX format file. The specifications for ONNX pipeline models that can be imported with this feature are defined in "5.2.6.1 Database-side Model Specifications".
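To illustrate what the pipeline model bundles, the following is a minimal pure-Python sketch of the three stages (tokenize, core model, pool). Every function and value here is hypothetical, chosen only to show the data flow; in a real pipeline model all three stages run inside ONNX Runtime as operators in one ONNX file (with onnxruntime-extensions supplying the tokenizer operators).

```python
# Hypothetical stand-ins for the three stages an ONNX pipeline model fuses:
# preprocessing (tokenization), the core model, and post-processing (pooling).

def tokenize(text):
    # Preprocessing: map text to token ids (real models use e.g. BPE/WordPiece).
    vocab = {"hello": 1, "world": 2}
    return [vocab.get(word, 0) for word in text.lower().split()]

def core_model(token_ids):
    # Core model: one vector per token. A real core model is a neural network;
    # this toy formula just produces deterministic numbers of the right shape.
    dim = 4
    return [[float((t * (i + 1)) % 7) for i in range(dim)] for t in token_ids]

def mean_pool(token_vectors):
    # Post-processing: average the per-token vectors into a single embedding.
    n = len(token_vectors)
    return [sum(v[i] for v in token_vectors) / n for i in range(len(token_vectors[0]))]

def embed(text):
    # The whole pipeline: a string goes in, a fixed-length float vector comes out.
    return mean_pool(core_model(tokenize(text)))

print(embed("hello world"))  # a 4-dimensional embedding vector
```

The point of consolidating these stages into one file is that the database only has to hand a string tensor to the model and receive a float tensor back; no tokenizer configuration travels separately from the weights.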
You can use a model with the following specifications as an embedding model.
| Item | Constraint | Remarks |
|---|---|---|
| Model format | ONNX format | |
| Model type (semantics) | Inference model (stateless model) | It is not a training model (not a stateful model). |
| Input tensor | A 1-dimensional tensor of STRING type. It may optionally have an additional dimension for batch processing. The model must not require any other inputs (inputs without default values). | |
| Output tensor | A 1-dimensional tensor of FLOAT type (single precision). It may optionally have an additional dimension for batch processing. | The vector type in Fujitsu Enterprise Postgres is a single-precision floating-point type. |
| Operator set and operators | Only operators included in the following domain/version operator sets may be used: ai.onnx (Version X to Y), ai.onnx.ml (Version X to Y), and operators provided by onnxruntime-extensions. Ensure that the ai.onnx and ai.onnx.ml versions are supported by the ONNX Runtime that the Triton server uses as its backend. | Even if these conditions are met, operators not provided by ONNX Runtime's CPUProvider cannot be used. |
| Model size | 4 TB or less | |
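As a rough illustration of how the table's constraints could be checked programmatically, here is a sketch that validates a model description against them. This is not an official tool: the `meta` dict and its field names are assumptions standing in for metadata that would be read from an actual ONNX file.

```python
# Sketch only: validate a hypothetical model-metadata dict against the
# constraints in the table above. Field names are assumptions for illustration.

MAX_MODEL_SIZE = 4 * 1024 ** 4  # 4 TB limit from the table

def check_pipeline_model(meta):
    """Return a list of constraint violations (empty list means acceptable)."""
    problems = []
    if meta.get("format") != "ONNX":
        problems.append("model must be in ONNX format")
    # Exactly one required input: a STRING tensor, 1-D plus an optional batch dim.
    inputs = meta.get("inputs", [])
    if len(inputs) != 1:
        problems.append("model must take exactly one required input")
    else:
        inp = inputs[0]
        if inp.get("type") != "STRING" or inp.get("rank") not in (1, 2):
            problems.append("input must be a 1-D STRING tensor (optional batch dim)")
    # Output: a single-precision FLOAT tensor, 1-D plus an optional batch dim.
    out = meta.get("output", {})
    if out.get("type") != "FLOAT" or out.get("rank") not in (1, 2):
        problems.append("output must be a 1-D FLOAT tensor (optional batch dim)")
    if meta.get("size_bytes", 0) > MAX_MODEL_SIZE:
        problems.append("model exceeds the 4 TB size limit")
    return problems

ok_model = {
    "format": "ONNX",
    "inputs": [{"type": "STRING", "rank": 1}],
    "output": {"type": "FLOAT", "rank": 2},
    "size_bytes": 500 * 1024 ** 2,
}
print(check_pipeline_model(ok_model))  # -> []
```

Note that the operator-set constraint is not checked here; verifying opset versions and CPUProvider support requires inspecting the model with the ONNX and ONNX Runtime libraries themselves.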