Enterprise Postgres 18 Knowledge Data Management Feature User's Guide

5.2.1 Setting Up Inference Server

The Triton Inference Server used by this feature must meet the following requirements.

Before using this feature, start the Triton Inference Server on the same machine as the database.

Fujitsu Enterprise Postgres provides a sample Dockerfile that meets the above requirements. To build a container image from this sample file, execute the following commands. Specify an appropriate host path for the volume option (-v option) so that the model repository directory on the host side is visible from inside the container. "<x>" indicates the product version.

$ cp /opt/fsepv<x>server64/share/triton_dockerfile.sample ./triton_dockerfile
$ podman build -f ./triton_dockerfile -t triton_image
$ mkdir -p /path/to/model/repository
$ podman run -d --name triton_container -p8001:8001 -p8002:8002 -v /path/to/model/repository:/models \
triton_image tritonserver --model-repository=/models --http-port=0 --grpc-port=8001 \
--metrics-port=8002 --backend-config=onnxruntime,device=cpu --model-control-mode=explicit \
--log-info=true --log-warning=true --log-error=true --log-verbose=0
$ podman container ls # Confirm that the container is running.
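
Because the server above starts with --model-repository=/models and --model-control-mode=explicit, the mounted host directory must follow Triton's model repository layout before any model can be loaded. The sketch below builds that layout for a single hypothetical ONNX model named "embedder"; the model name, file, and config contents are illustrative, and a scratch directory stands in for /path/to/model/repository:

```shell
# Create a minimal Triton model repository layout in a scratch directory.
# In practice, build this under the directory mounted with -v instead.
REPO=$(mktemp -d)
mkdir -p "$REPO/embedder/1"          # one numbered version subdirectory
: > "$REPO/embedder/1/model.onnx"    # the exported ONNX model file goes here
cat > "$REPO/embedder/config.pbtxt" <<'EOF'
name: "embedder"
backend: "onnxruntime"
EOF
ls -R "$REPO"
```

Note that with --model-control-mode=explicit, models placed in the repository are not loaded at startup; they must be loaded on request through Triton's model control interface.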

By using systemd, you can start the created container automatically. The following example registers the inference server as a service, using the sample file provided for automatic startup of the container.

$ cp /opt/fsepv<x>server64/share/triton.container.sample \
~/.config/containers/systemd/triton.container
$ systemctl --user daemon-reload
$ systemctl --user start triton.service
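
The sample file is in podman's Quadlet (.container) format, from which systemd generates the triton.service unit. A minimal sketch of what such a unit might contain is shown below; the shipped triton.container.sample should be used as-is, and all values here are illustrative:

```
[Unit]
Description=Triton Inference Server container

[Container]
Image=localhost/triton_image
ContainerName=triton_container
PublishPort=8001:8001
PublishPort=8002:8002
Volume=/path/to/model/repository:/models
Exec=tritonserver --model-repository=/models --http-port=0 --grpc-port=8001 --metrics-port=8002

[Install]
WantedBy=default.target
```

With WantedBy=default.target, the generated service starts automatically at user login. If the service must also run without an interactive login session, user lingering can be enabled with "loginctl enable-linger".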

To achieve model-level access control, configure mutual TLS authentication (mTLS) on the inference server. For details, refer to "5.4.3 Security".

Enable metrics on the inference server so that the cause of problems can be identified when they occur. Also enable timestamp formatting for log output.
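
As a quick health check, the metrics endpoint enabled above (--metrics-port=8002) serves Prometheus-format text that can be scraped and filtered. The sketch below parses a canned sample; the metric names are illustrative and vary by Triton version, and in production the here-document would be replaced by `curl -s http://localhost:8002/metrics` against the running container:

```shell
# Canned sample in the Prometheus text format returned by the
# metrics endpoint; metric names and labels are illustrative.
METRICS=$(cat <<'EOF'
nv_inference_request_success{model="embedder",version="1"} 42
nv_inference_request_failure{model="embedder",version="1"} 0
EOF
)
# Extract the value of the success counter.
echo "$METRICS" | awk '$1 ~ /request_success/ {print $2}'
```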