Query the ai.vectorizer_errors view to check the details of errors that occurred during vectorization, when they occurred, and how many times they occurred.
Example) Check the details of the most recent errors
rag_database=> SELECT * FROM ai.vectorizer_errors ORDER BY recorded DESC LIMIT 50;
-[ RECORD 1 ]------------------------------------------------------------------
id       | 1
message  | embedding provider failed
details  | {"provider": "ollama", "error_reason": "model \"all-minilm\" not found, try pulling it first"}
recorded | 2025-02-03 06:47:35.958882+00
-[ RECORD 2 ]------------------------------------------------------------------
id       | 1
message  | embedding provider failed
details  | {"provider": "ollama", "error_reason": "model \"all-minilm\" not found, try pulling it first"}
recorded | 2025-02-03 06:47:41.250279+00
Example) Check the number of errors for a vectorization definition
rag_database=> SELECT COUNT(*) FROM ai.vectorizer_errors WHERE id = pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings');
 count
-------
    20
(1 row)
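When the same error repeats many times, a grouped summary can be easier to read than the raw rows. The following query is a minimal sketch that uses only the columns shown in the output above (id, message, details, recorded). It assumes that details is a json or jsonb column and that the error_reason key is present, as in the sample output; other providers may store different keys.

```sql
-- Summarize errors per vectorizer and cause, most recently seen first.
-- Assumption: "details" is json/jsonb and contains an "error_reason" key,
-- as in the sample output; rows without that key are grouped under NULL.
SELECT id,
       message,
       details ->> 'error_reason' AS error_reason,
       COUNT(*)                   AS occurrences,
       MAX(recorded)              AS last_recorded
FROM ai.vectorizer_errors
GROUP BY id, message, details ->> 'error_reason'
ORDER BY last_recorded DESC;
```

Each result row shows one distinct cause, how often it occurred, and when it last occurred, which helps decide which cause to remove first.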
Check the details of the error and remove the cause. Some embedding providers limit the load they accept per period. If an error occurs because the load exceeds such a limit, use the pgx_vectorizer.alter_vectorizer_schedule or pgx_vectorizer.alter_vectorizer_processing function to adjust the worker schedule or the degree of parallelism.
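After removing the cause or adjusting the schedule, it can be useful to confirm that no new errors are being recorded. The query below is a sketch that reuses the view and the get_vectorizer_id call from the examples above; the view name 'sample_embeddings' and the one-hour window are illustrative values, not recommendations.

```sql
-- Count errors recorded in the last hour for one vectorization definition.
-- The one-hour window is an arbitrary value chosen for illustration.
SELECT COUNT(*)      AS recent_errors,
       MAX(recorded) AS last_recorded
FROM ai.vectorizer_errors
WHERE id = pgx_vectorizer.get_vectorizer_id(view_name => 'sample_embeddings')
  AND recorded > now() - interval '1 hour';
```

If recent_errors stays at 0 over several worker runs, the cause has probably been removed; if it keeps growing, check the latest details value again.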