Every Retenza response gets a 0-100% confidence score. Understand exactly what drives it, what happens when it's low, and how to tune it for your store.
What is a confidence score?
When Retenza generates a reply, it simultaneously evaluates how confident it is that the reply is correct, complete, and appropriate. This evaluation produces a score from 0 to 100%. A score of 95% means the AI is highly confident the reply is accurate. A score of 45% means significant uncertainty: perhaps the intent was ambiguous, or the required Shopify data wasn't available.
What drives the score up
High confidence scores correlate with: clear, single-intent emails (e.g. 'Where is my order #1042?'); successful Shopify order lookup; the intent matching a category your knowledge base covers; the customer's language being one the model handles fluently; and a response length appropriate to the query.
What drives the score down
Low confidence scores correlate with: ambiguous or multi-intent emails; a failed Shopify order lookup (for example, a cancelled account or a typo in the order number); a query category not covered in your knowledge base; extremely short or extremely long customer emails; and unusual languages or heavy dialect use.
How to tune your threshold
The default auto-send threshold is 75%. Stores with strict brand voice requirements often raise this to 85-90%; stores prioritising speed often lower it to 65%. You can also set different thresholds per intent category: higher for refund requests (more sensitive), lower for order status (lower risk).
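To make the per-intent idea concrete, here is a minimal sketch of how a default threshold with category overrides could be modelled. The intent names, the override values, and the function `threshold_for` are hypothetical illustrations, not Retenza's actual settings or API; only the 75% default comes from the text above.

```python
# Hypothetical sketch: names and values are illustrative, not Retenza's API.
DEFAULT_THRESHOLD = 75  # default auto-send threshold, in percent

# Per-intent overrides (example values chosen for illustration only).
INTENT_THRESHOLDS = {
    "refund_request": 90,  # more sensitive, so require higher confidence
    "order_status": 65,    # lower risk, so accept lower confidence
}


def threshold_for(intent: str) -> int:
    """Return the auto-send threshold for an intent, falling back to the default."""
    return INTENT_THRESHOLDS.get(intent, DEFAULT_THRESHOLD)
```

In this sketch, any intent without an override (say, a hypothetical "shipping_address_change") simply uses the 75% default.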
What happens with low-confidence replies
Replies below your threshold are placed in the Review Queue with the confidence score prominently displayed. Agents can review, edit, and approve them with a single click. Every approved low-confidence reply feeds back into the system, gradually improving accuracy on similar queries.
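The routing rule described above can be sketched in a few lines. This is an assumption-laden illustration: the `Reply` class, the `route` function, and the returned labels are hypothetical names, and the only behaviour taken from the text is that replies scoring below the threshold go to the Review Queue.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    """Hypothetical container for a generated reply and its score."""
    intent: str
    confidence: float  # confidence score in percent, 0 to 100


def route(reply: Reply, threshold: float = 75.0) -> str:
    """Decide where a reply goes based on its confidence score.

    Replies strictly below the threshold are held for human review;
    everything at or above it is sent automatically.
    """
    if not 0 <= reply.confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    return "review_queue" if reply.confidence < threshold else "auto_send"
```

With the default threshold, a reply scored at 45% would be held for review, while one scored at 95% would be sent automatically.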