In database multiplexing mode, Mirroring Controller performs the monitoring below.
Operating system or server failures, and no-response state
By generating a heartbeat between Mirroring Controller on each server, operating system or server errors are detected and acknowledged between the relevant servers.
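As an illustration of heartbeat-based detection in general, the following is a minimal Python sketch. It is not Mirroring Controller's implementation: the class name HeartbeatMonitor, the timeout value, and the injectable clock are all hypothetical and chosen only to show the idea that a server is treated as suspect when no heartbeat arrives within a set time.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per peer and flags peers that go silent."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the logic can be exercised without sleeping
        self.last_seen = {}

    def record_heartbeat(self, peer):
        # Called whenever a heartbeat arrives from the peer server.
        self.last_seen[peer] = self.clock()

    def unresponsive_peers(self):
        # A peer is suspect when no heartbeat arrived within the timeout.
        now = self.clock()
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

A caller would invoke record_heartbeat() on each received heartbeat and poll unresponsive_peers() at a fixed interval.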
The optimal operating method for environments where database multiplexing mode is performed can be selected from the following:
Use the arbitration server to perform automatic degradation (switch/disconnect)
This is the default method.
The arbitration server objectively determines the status of the database servers, then isolates any server with an unstable status from the cluster system and degrades it.
Refer to "Database degradation using the arbitration server" for details.
Call the user exit (user command) that will perform the degradation decision, and perform automatic degradation
If the arbitration server cannot be installed, select this method if arbitration processing can be performed by the user instead.
Mirroring Controller queries the user exit on whether to degrade. The user exit determines the status of the database server, and notifies Mirroring Controller whether to perform degradation.
Refer to "Database degradation using the arbitration command" for details.
Notification messages
Use this method if using a two-database server configuration.
Mirroring Controller outputs messages to the system log when an abnormality is detected. This ensures that a split brain will not occur due to a heartbeat abnormality. However, automatic switching will not be performed if the primary server operating system or server fails or becomes unresponsive.
Perform automatic degradation unconditionally after a heartbeat abnormality
This method behaves in the same way as FUJITSU Enterprise Postgres 9.6 or earlier versions.
This method is not recommended, because Mirroring Controller will unconditionally perform automatic degradation after heartbeat abnormalities.
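For the user exit method described above, the decision logic might look like the following Python sketch. It is an assumption-laden illustration only: the function name should_degrade, the probe callable, and the retry count are hypothetical, and the way a real user exit receives its arguments and reports its decision to Mirroring Controller is defined in "Database degradation using the arbitration command", not here.

```python
def should_degrade(probe, attempts=3):
    """Return True only if every probe of the target server fails.

    `probe` is a hypothetical callable that returns True when the target
    database server responds (for example over a second network path).
    """
    for _ in range(attempts):
        try:
            if probe():
                return False  # server answered, so do not degrade it
        except OSError:
            pass  # treat a connection error as one failed attempt
    return True
```

Requiring several consecutive failures before answering "degrade" is one way to avoid reacting to a single transient network error.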
Database process failures, and no-response state
Mirroring Controller periodically accesses the database processes and checks the status. A process error is detected by monitoring whether an access timeout occurs.
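The timeout-based check described above can be sketched in Python as follows. This is a generic illustration, not the product's code: check_database and the access callable (standing in for whatever query is issued against the database) are hypothetical names, and the timeout value is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def check_database(access, timeout_seconds):
    """Run `access` and classify the result as "ok", "error", or "timeout"."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(access)
    try:
        future.result(timeout=timeout_seconds)
        return "ok"
    except FutureTimeout:
        return "timeout"  # no answer within the limit: treat as unresponsive
    except Exception:
        return "error"  # the access itself failed
    finally:
        pool.shutdown(wait=False)  # do not block on a hung access
```

Separating "timeout" from "error" mirrors the distinction in this section between a no-response state and a process failure.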
Disk failure
Mirroring Controller periodically creates files on the disks below. A disk error is detected when an I/O error occurs.
Data storage destination disk
Transaction log storage destination disk
Tablespace storage destination disk
Failures that can be detected are those that physically affect the entire system, such as disk header or device power failures.
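A file-creation check of the kind described above can be sketched in Python as follows. The function names, the probe file name, and the directory labels are hypothetical; the sketch only shows the general technique of writing a small file and treating any I/O error as a disk failure.

```python
import os
import uuid


def disk_is_writable(directory):
    """Create, sync, and delete a small probe file; False means an I/O error."""
    probe_path = os.path.join(directory, f".disk_probe_{uuid.uuid4().hex}")
    try:
        with open(probe_path, "wb") as probe:
            probe.write(b"probe")
            probe.flush()
            os.fsync(probe.fileno())  # force the write through to the device
        os.remove(probe_path)
        return True
    except OSError:
        return False


def check_disks(directories):
    # Map each monitored directory label to its check result.
    return {name: disk_is_writable(path) for name, path in directories.items()}
```

As the preceding paragraph notes, a check like this catches failures that make the whole disk unusable, not localized damage to individual blocks.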
Streaming replication issue
Mirroring Controller detects streaming replication issues (log transfer network and WAL send/receive processes) by periodically accessing the PostgreSQL system views.
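As a rough illustration of view-based detection, the Python sketch below evaluates rows as they might be fetched from the PostgreSQL pg_stat_replication view on the primary server. The source does not state which system views Mirroring Controller queries, so the choice of view is an assumption, and the function name and standby names are hypothetical.

```python
def replication_issues(rows, expected_standbys):
    """Return the expected standbys that are not actively streaming.

    `rows` are dicts holding the application_name and state columns as
    they would be fetched from pg_stat_replication on the primary server.
    """
    streaming = {
        row["application_name"] for row in rows if row["state"] == "streaming"
    }
    return sorted(set(expected_standbys) - streaming)
```

A standby that is missing from the view, or present in a state other than "streaming", is reported as an issue.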
Mirroring Controller process failure and no response
So that monitoring by Mirroring Controller can continue, Mirroring Controller process failures and no-response states are also monitored.
The Mirroring Controller monitoring process detects Mirroring Controller process failures and no responses by periodically querying the Mirroring Controller process. If an issue is detected, Mirroring Controller is automatically restarted by the Mirroring Controller monitoring process.
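The query-and-restart behavior described above follows a common watchdog pattern, sketched here in Python. The names supervise_once, is_responsive, and restart are hypothetical placeholders, and the sketch makes no claim about how the Mirroring Controller monitoring process is actually implemented.

```python
def supervise_once(is_responsive, restart):
    """Query the monitored process once and restart it if it does not answer."""
    try:
        healthy = bool(is_responsive())
    except Exception:
        healthy = False  # a failed query counts as no response
    if not healthy:
        restart()
    return healthy
```

A monitoring loop would call supervise_once() at a fixed interval.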
Point
If output of messages is selected as the operation to be performed when a heartbeat abnormality is detected during heartbeat monitoring of the operating system or server, automatic degradation will not be performed.
However, if an issue in the WAL send process is detected on the primary server, the standby server will be disconnected. As a result, automatic disconnection may still be performed when the standby server operating system or server fails or becomes unresponsive.
You can use parameters to select whether the primary server will be switched when a database process is unresponsive or when a tablespace storage destination disk failure is detected. Refer to "Appendix A Parameters" for details.
If the standby server was disconnected, Mirroring Controller will automatically comment out the synchronous_standby_names parameter and synchronized_standby_slots parameter in the postgresql.conf file of the primary server. This prevents application processing on the primary server from being stopped.
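To make the effect of commenting out these parameters concrete, the Python sketch below applies the same kind of edit to the text of a configuration file. It is an illustration only: Mirroring Controller performs this step itself, and the function name comment_out and the sample values are hypothetical.

```python
import re

PARAMETERS = ("synchronous_standby_names", "synchronized_standby_slots")


def comment_out(conf_text, parameters=PARAMETERS):
    """Prefix active settings of the given parameters with '#'."""
    pattern = re.compile(r"^\s*(%s)\b" % "|".join(map(re.escape, parameters)))
    lines = []
    for line in conf_text.splitlines():
        # Lines that are already comments do not match and stay unchanged.
        lines.append("#" + line if pattern.match(line) else line)
    return "\n".join(lines)
```

With synchronous_standby_names commented out, the primary server no longer waits for acknowledgement from a synchronous standby, which is why transactions can continue after the disconnection.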
Note
If the role of primary server was switched to another server and degraded operation then starts, the original primary server will not become the standby server automatically. Remove the cause of the error, and then change the role of the original primary server to standby server. Refer to "4.1 Action Required when Server Degradation Occurs" for details.