To check the number of texts waiting to be vectorized for every vectorization definition, refer to the ai.vectorizer_status view. To check the number for a specific vectorization definition, use the ai.vectorizer_queue_pending function.
Example) Check the ai.vectorizer_status view
rag_database=> SELECT * FROM ai.vectorizer_status;
-[ RECORD 1 ]-+--------------------------------------
id            | 1
source_table  | public.sample_table
target_table  | public.sample_embeddings_store
view          | public.sample_embeddings
pending_items | 1000
disabled      | f
Example) Check with the ai.vectorizer_queue_pending function
SELECT ai.vectorizer_queue_pending(
    pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings')
);
-[ RECORD 1 ]------------+--
vectorizer_queue_pending | 1000
If this value stays at 0 or at a low value, the vectorization process is keeping up with the data being added. If the value keeps increasing over a period longer than the execution interval specified in the schedule or the interval at which data is added, refer to "3.6.2 Checking the Status of Vectorization Processing" to check whether an error has occurred in the vectorization process, and refer to "3.6.3 Checking the Scheduler for Vectorization Processing" to check whether the vectorize scheduler is running.
If no errors have occurred, the vectorization processing speed is likely slow compared to the data addition speed. If the load is only temporarily high, start a temporary vectorization process with the pgx_vectorizer.run_vectorize_worker function. If the load is consistently high, change the schedule with the pgx_vectorizer.alter_vectorizer_schedule function, or change the parallelism or the upper limit of the amount of data processed in one startup with the pgx_vectorizer.alter_vectorizer_processing function.
Example) Changing the execution interval to 5 minutes
SELECT pgx_vectorizer.alter_vectorizer_schedule(pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings'), interval '5 m');
Example) Changing the parallelism to 2 and the amount of data processed at one time to 200
SELECT pgx_vectorizer.alter_vectorizer_processing(pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings'), batch_size => 200, concurrency => 2);
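As a rough sizing aid when choosing these values, the following Python sketch estimates the longest execution interval that would still keep up with a given data addition rate. It is only an illustration under a simplifying assumption: each scheduled run processes at most a fixed number of items and finishes before the next run starts. The function name and its parameters are hypothetical and are not part of pgx_vectorizer.

```python
import math


def max_interval_seconds(items_added_per_hour: float, max_items_per_run: int) -> float:
    """Longest schedule interval (in seconds) at which runs keep up with arrivals.

    Assumption (not taken from the product documentation): every run
    processes up to max_items_per_run items and completes before the
    next run starts.
    """
    if max_items_per_run <= 0:
        raise ValueError("max_items_per_run must be positive")
    if items_added_per_hour <= 0:
        # Nothing is being added, so any interval keeps up.
        return math.inf
    runs_needed_per_hour = items_added_per_hour / max_items_per_run
    return 3600.0 / runs_needed_per_hour
```

For example, with 2,400 items added per hour and an upper limit of 200 items per run, the sketch returns 300.0 seconds, that is, a 5-minute interval. Actual throughput also depends on the embedding processing time and the parallelism, so treat the result as a starting point and confirm it against the observed number of pending items.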
Use a monitoring tool to track how the number of texts waiting for vectorization changes over time. If a parameter value set by set_worker_setting is invalid, the vectorization process is not executed. In that case, a message is output to the server log, so check the server log together with the ai.vectorizer_status view.
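The trend check described in this section could be automated along the following lines. This is a minimal Python sketch, assuming that the pending_items value of the ai.vectorizer_status view has already been sampled at regular intervals (for example, once per execution interval) and collected into a list, oldest first. How the samples are collected is outside the sketch, and the function name, the min_samples parameter, and the rule used to define "growing" are illustrative assumptions.

```python
from typing import Sequence


def backlog_is_growing(pending_samples: Sequence[int], min_samples: int = 3) -> bool:
    """Return True when the pending count never fell and ended higher than it began.

    pending_samples holds pending_items values ordered from oldest to newest.
    """
    if len(pending_samples) < min_samples:
        # Too few samples to call it a trend.
        return False
    never_decreased = all(
        later >= earlier
        for earlier, later in zip(pending_samples, pending_samples[1:])
    )
    return never_decreased and pending_samples[-1] > pending_samples[0]
```

A series such as 100, 250, 400, 900 is reported as growing, whereas a series that drains to 0 or stays at 0 is not. A production monitor would probably use a more tolerant rule (for example, comparing averages over two time windows), because a single successful run makes the count drop even while the backlog grows overall.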