Name
pgx_loader --Loads data from a file into a PostgreSQL table.
Overview
pgx_loader load -c command [options...]
pgx_loader recovery -t table
Description
The pgx_loader command loads data from an external file into PostgreSQL tables, and commits or rolls back transactions prepared during data load.
In load mode, data is loaded at high speed by executing the COPY FROM command specified in command at a certain degree of parallelism. If load is completed successfully, the message below is output to the standard output.
LOAD count
Note: count is the number of rows loaded.
Refer the pgx_stat_progress_loader view to see the progress of the load process.
In recovery mode, commit or rollback of transactions prepared during data load is performed.
See
Refer to "pgx_stat_progress_loader" in the Operation Guide for pgx_stat_progress_loader view.
Options
Display the executed command in the standard output.
Specify the COPY FROM command to be executed. If STDIN is specified for the FROM clause, data will be loaded from the standard input. In this case, SQL superuser privileges (or having one of the roles pg_read_server_files or pg_execute_server_program) are not required, because local user access privileges will be used for external files and external programs, instead of database server access privileges.
"binary" cannot be specified for the FORMAT option of the COPY FROM command specified in this option. Therefore, input files in binary format cannot be specified.
The FREEZE option cannot be specified for the COPY FROM command specified in this option.
See
Refer to "COPY" in the PostgreSQL Documentation for information on the COPY FROM command.
Specify the number of background workers (parallel workers) that the COPY COMMAND should use to simultaneously perform data conversion, table creation, and index creation. This option can dramatically reduce the time for loading a large amount of data on instances that runs on multiple processor machines.
The optimal value depends on the server, client, and network configurations. The number of CPU cores and disk configuration also affect the optimal value. The number of CPU cores of the server is recommended as the initial value to try. Naturally, if a value that is too large is used, performance degradation will occur due to thrashing and context switching.
Specify a value from 2 to 128. The default is 2.
Complete the prepared transactions only for the specified table.
Show how to use pgx_loader command line arguments, and exit.
The command line options below control the database connection parameters.
Specify the database name to connect to.
If this option is not specified, the PGDATABASE environment variable will be used. If the environment variable is not set, your operating-system user name will be used.
Specify the host name of the machine the database server runs on. If the specified value starts with a slash, it will be used as the directory for a Unix domain socket.
If this option is not specified, the PGHOST environment variable will be used. If the environment variable is not set, it will be considered a Unix domain socket connection.
Specify the TCP port to be used by the server to monitor the connection, or extension of the local Unix domain socket file.
If this option is not specified, the PGPORT environment variable will be use. If the environment variables is not set, 27500 will be used.
User name for connection to the database.
Never prompt for the password. If the server requires password authentication but other means (such as a .pgpass file) are not available, the connection attempt will fail. This option can be useful in batch jobs, scripts, and so on, where no user is present to enter a password.
Force pgx_loader to prompt for the password before connecting to the database. This option is never essential, since pgx_loader will automatically prompt for the password if the server demands password authentication. However, pgx_loader will waste a connection attempt finding out if the server requires a password. In some cases it is worth specifying this option to avoid the extra connection attempt.
Diagnostics
0: Normal exit
Other than 0: Abnormal exit
0: There are no prepared transactions that must be completed
3: A prepared transaction was committed
4: A prepared transaction was rolled back
Other than the above: Abnormal exit
Note
The order of the table rows loaded by pgx_loader may not match the order of the file lines. This is because the file lines will have been inserted into the table in parallel, by multiple parallel workers.
Example
The example below loads the file /path/to/data.csv (2000 records) into table tbl using a degree of parallelism of 3.
$ pgx_loader load -j 3 -c "COPY tbl FROM '/path/to/data.csv' WITH CSV" LOAD 2000
The example below reads the file /path/to/data.csv (2000 records) from the standard input and loads into table tbl using a degree of parallelism of 3.
$ pgx_loader load -j 3 -c "COPY tbl FROM STDIN WITH CSV" < /path/to/data.csv LOAD 2000
The example below completes the transactions prepared for table tbl.
$ pgx_loader recovery -t tbl