Enterprise Postgres 17 SP1 Knowledge Data Management Feature User's Guide

3.6.1 Checking the Vectorization Queue

The ai.vectorizer_status view shows the number of texts waiting to be vectorized for each vectorization definition. To check the number for a specific vectorization definition, use the ai.vectorizer_queue_pending function.

Example) Check the ai.vectorizer_status view

rag_database=> SELECT * FROM ai.vectorizer_status;
-[ RECORD 1 ]-+--------------------------------------
id            | 1
source_table  | public.sample_table
target_table  | public.sample_embeddings_store
view          | public.sample_embeddings
pending_items | 1000
disabled      | f
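
When many vectorization definitions exist, it can help to list only those that currently have a backlog. The following is a minimal sketch that uses only the columns shown in the output above. It assumes that disabled is a boolean column and that pending_items is numeric, which the sample output suggests but does not confirm.

```sql
-- List enabled vectorization definitions that have texts waiting,
-- largest backlog first. Column names are taken from the view output above;
-- their data types are an assumption.
SELECT id, source_table, pending_items
FROM ai.vectorizer_status
WHERE NOT disabled
  AND pending_items > 0
ORDER BY pending_items DESC;
```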

Example) Check with the ai.vectorizer_queue_pending function

rag_database=> SELECT ai.vectorizer_queue_pending( pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings') );
-[ RECORD 1 ]------------+--
vectorizer_queue_pending | 1000

If this value stays at 0 or remains low, the vectorization process is keeping up. If the value keeps increasing over a period longer than the execution interval specified in the schedule or the interval at which data is added, refer to "3.6.2 Checking the Status of Vectorization Processing" to check whether an error has occurred in the vectorization process, and refer to "3.6.3 Checking the Scheduler for Vectorization Processing" to check whether the vectorize scheduler is running.

If no errors have occurred, vectorization is probably slower than the rate at which data is being added. If the load is only temporarily high, start a temporary vectorization process with the pgx_vectorizer.run_vectorize_worker function. Otherwise, change the schedule with the pgx_vectorizer.alter_vectorizer_schedule function, or use the pgx_vectorizer.alter_vectorizer_processing function to change the parallelism or the upper limit on the amount of data processed in one startup.

Example) Changing the execution interval to 5 minutes

SELECT pgx_vectorizer.alter_vectorizer_schedule(pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings'), interval '5 m');

Example) Changing the parallelism to 2 and the amount of data processed at one time to 200

SELECT pgx_vectorizer.alter_vectorizer_processing(pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings'), batch_size => 200, concurrency => 2);

Use a monitoring tool to track how the number of texts waiting for vectorization changes over time. If a parameter value set by set_worker_setting is invalid, the vectorization process is not executed. In that case a message is output to the server log, so check the server log together with the ai.vectorizer_status view.
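
If no dedicated monitoring tool is available, the time trend can be approximated in SQL by sampling the ai.vectorizer_status view at regular intervals. The following is a rough sketch rather than a feature of the product. The table vectorizer_queue_history and its columns are hypothetical names introduced for illustration, and the INSERT statement is assumed to be run periodically by an external scheduler of your choice.

```sql
-- Hypothetical history table (name and columns are illustrative only).
CREATE TABLE IF NOT EXISTS vectorizer_queue_history (
    sampled_at    timestamptz NOT NULL DEFAULT now(),
    vectorizer_id bigint      NOT NULL,
    pending_items bigint
);

-- Record one sample per vectorization definition.
-- Run this periodically, for example once per execution interval.
INSERT INTO vectorizer_queue_history (vectorizer_id, pending_items)
SELECT id, pending_items
FROM ai.vectorizer_status;

-- Show the change in backlog between consecutive samples.
-- The first sample of each definition has no predecessor, so its delta is NULL.
SELECT vectorizer_id,
       sampled_at,
       pending_items,
       pending_items
         - lag(pending_items) OVER (PARTITION BY vectorizer_id
                                    ORDER BY sampled_at) AS delta
FROM vectorizer_queue_history
ORDER BY vectorizer_id, sampled_at;
```

A delta that stays positive across several consecutive samples suggests that vectorization is falling behind the rate of data addition, whereas values that fluctuate around 0 suggest that it is keeping up.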