A/B testing has a dirty secret: while the experiment runs, half your users get the worse experience.
If your variant is genuinely 20% better, every day you run the test is a day where 50% of your users are stuck on the inferior version. For a 14-day experiment, that's 7 days of lost value across half your user base.
For some decisions — pricing page layout, onboarding flow — that's fine. You need a rigorous answer and the cost of running the experiment is low.
For other decisions — which AI model to use for each request, which prompt template to serve, which UI suggestion to show — you want to optimize continuously. You want to learn and act at the same time.
That's what multi-armed bandits do.