6.9 Analyzing Audit Logs in SQL

In general, you can quickly and easily analyze the audit log by creating a table for analysis and loading the audit log into that table.
Maybe the analysis table is not updated, so we recommend creating that table as an UNLOGGED table. This is because when loading, the write-ahead log (WAL) is not output and it is not replicated, so it can be loaded faster without interfering with other tasks.

How to Identify the Audit Log Files to Load

To analyze audit logs for a specific time period, load all files that contain audit logs for that time period.
However, do not access dedicated log files with the latest update date. This is because it contains incomplete audit log data that is being written.

Loading Method

The audit log is output in csv format, and can be loaded using the COPY FROM statement or the pgx_loader command.
The following example shows how to load Session Audit Logging output to a dedicated log file.

Creating Tables

Define a table with the columns required to reference the audit log.

$ psql
=# CREATE UNLOGGED TABLE auditlog (
header text,
class text,
sql_start_time timestamp with time zone,
remote_host_name text,
backend_process_id integer,
application_name text,
session_user_name text,
database_name text,
virtual_transaction_id text,
statement_id integer,
substatement_id integer,
command_tag text,
sqlstate text,
object_type text,
object_name text,
error_message text,
sql text,
parameter text
);

Creating an Index
For example, creating an index on a timestamp is useful for time-series analysis.
```
$ psql
=# CREATE INDEX auditlog_time_index ON auditlog (sql_start_time);
```
Loading the Audit Log
Specify a dedicated log file in the FROM clause of COPY FROM.
```
$ psql
=# COPY auditlog FROM
'pgaudit-2024-02-02_150000.log' WITH CSV DELIMITER ',';
```
You can also use the PROGRAM clause to load multiple dedicated log files together.
The following example loads one day of audit logs by specifying wildcard for day part of file name:
```
$ psql
=# COPY auditlog FROM
PROGRAM 'cat pgaudit-2024-02-02_*' WITH CSV DELIMITER ',';
```
```
$ psql
=# COPY auditlog FROM
PROGRAM 'type pgaudit-2024-02-02_*' WITH CSV DELIMITER ',';
```

Audit Log Analysis

Access the table that loaded the audit log.
Loaded records are not in chronological order, but can be sorted by timestamp

$ psql
=# SELECT * FROM auditlog ORDER BY sql_start_time;
 header        |  class |        sql_start_time     | remote_host_name  |  ...
---------------+--------+---------------------------+-------------------+  ...
AUDIT: SESSION | DDL    | 2024-02-02 15:00:49+09    | ::1               |  ...
AUDIT: SESSION | SYSTEM | 2024-02-02 15:00:58+09    |                   |  ...

Considerations when Using the Scalable Audit Log Feature

When you enable the scalable audit log feature, there are many dedicated log files for the target period as you specify for parallel_loggers, so make sure to load all the dedicated log files.
When you disable log_rotation_age, do not analyze the time except target period because not all audit logs may have been loaded.
For example, suppose that you are analyzing the audit log at 11:00-12:00, and you load the file output by logger process 0 at 10:00-13:00 and the file output by logger process 1 at 10:30-12:30, as shown in the following figure. Audit log records up to 10:00-13:00 appears in the table loaded with log files, but some audit logs (Audit log output by logger process 1 at 10:00-10:30 and 12:30-13:00) have not been loaded.
When log_rotation_age is enabled, the timing of the dedicated log file changeover is consistent across all logger processes to prevent such erroneous analysis.