When using this feature in a streaming replication or database multiplexing configuration, set up and start the Triton Inference Server on the standby server as well.
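As a quick check that Triton is running on both servers, the readiness endpoint can be probed over HTTP. The following is a minimal sketch, not part of the product: it assumes Triton's HTTP endpoint is exposed on the default port 8000, and the helper names (`triton_ready_url`, `is_triton_ready`, `check_hosts`) are hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def triton_ready_url(host: str, port: int = 8000) -> str:
    """Build the readiness URL for a Triton HTTP endpoint (port 8000 is assumed)."""
    return f"http://{host}:{port}/v2/health/ready"


def is_triton_ready(host: str, port: int = 8000, timeout: float = 3.0) -> bool:
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(triton_ready_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


def check_hosts(hosts, port: int = 8000) -> dict:
    """Probe every host and return a mapping of host name to readiness."""
    return {host: is_triton_ready(host, port) for host in hosts}
```

For example, `check_hosts(["primary-host", "standby-host"])` (placeholder host names) returns a dictionary that should contain `True` for both entries before this feature is used.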
In a streaming replication or database multiplexing system, the following operations can be performed only on the primary server:
Model import
Model load request/unload request
Deletion of imported model
Granting or revoking access privileges on a model
All of these operations, once performed on the primary server, are replicated to the standby server.
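A client script can guard against running these operations on a standby by checking the server role first. The sketch below is an assumption-laden illustration: it relies on the standard PostgreSQL function `pg_is_in_recovery()`, which returns true on a server that is in recovery (such as a standby), and it accepts any DB-API style cursor. The helper names `is_primary` and `require_primary` are hypothetical.

```python
def is_primary(cursor) -> bool:
    """Return True when the connected server is not in recovery."""
    cursor.execute("SELECT pg_is_in_recovery()")
    (in_recovery,) = cursor.fetchone()
    return not in_recovery


def require_primary(cursor) -> None:
    """Raise an error if the connection points at a standby server."""
    if not is_primary(cursor):
        raise RuntimeError(
            "Connected to a standby server; model operations must be run on the primary."
        )
```

Calling `require_primary(cursor)` before a model import, load or unload request, deletion, or privilege change stops the script early instead of letting the operation fail on the standby.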