A/B Testing
Scenario: A/B testing and progressive traffic shift for KFServing models
A/B testing enables you to compare two versions of an ML model, and select a winner based on a (business) reward metric. In this tutorial, you will:
- Perform A/B testing.
- Specify user-engagement as the reward metric. This metric will be mocked by Iter8 in this tutorial.
- Combine A/B testing with progressive traffic shifting. Iter8 will progressively shift traffic towards the winner and promote it at the end of the experiment.
Platform setup
Follow these steps to install Iter8, KFServing and Prometheus in your K8s cluster.
1. Create ML model versions
Deploy two KFServing inference services corresponding to two versions of a TensorFlow classification model, along with an Istio virtual service to split traffic between them.
kubectl apply -f $ITER8/samples/kfserving/quickstart/baseline.yaml
kubectl apply -f $ITER8/samples/kfserving/quickstart/candidate.yaml
kubectl apply -f $ITER8/samples/kfserving/quickstart/routing-rule.yaml
kubectl wait --for=condition=Ready isvc/flowers -n ns-baseline
kubectl wait --for=condition=Ready isvc/flowers -n ns-candidate
Look inside baseline.yaml
Look inside candidate.yaml
Look inside routing-rule.yaml
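The exact manifests live in the Iter8 repo under samples/kfserving/quickstart. As a rough illustration only (not the verbatim file contents), a KFServing v1beta1 InferenceService for the baseline version might look like the sketch below; the name and namespace follow the kubectl wait commands above, while the storage URI shown is the public TensorFlow flowers sample and is an assumption you would replace with your own model.

apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: flowers
  namespace: ns-baseline
spec:
  predictor:
    tensorflow:
      # public sample model; point this at your own model's storage location
      storageUri: "gs://kfserving-samples/models/tensorflow/flowers"

The candidate manifest is analogous (namespace ns-candidate, typically referencing a newer model version), and routing-rule.yaml defines an Istio VirtualService for the example.com host that splits traffic between the two namespaces.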
2. Generate requests
Generate requests for your model as follows. Run the port-forward command in a separate terminal, since it blocks; the request loop below reaches the ingress gateway through the forwarded local port.
INGRESS_GATEWAY_SERVICE=$(kubectl get svc -n istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward -n istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80
curl -o /tmp/input.json https://raw.githubusercontent.com/kubeflow/kfserving/master/docs/samples/v1beta1/rollout/input.json
watch --interval 0.2 -x curl -v -H "Host: example.com" localhost:8080/v1/models/flowers:predict -d @/tmp/input.json
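Before starting the load loop, you can send a single request to sanity-check the setup; the model returns a JSON response whose exact contents depend on the model.

# send one request through the port-forwarded ingress gateway and inspect the response
curl -s -H "Host: example.com" localhost:8080/v1/models/flowers:predict -d @/tmp/input.json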
3. Define metrics
Iter8 defines a custom K8s resource called Metric that makes it easy to use metrics from RESTful metric providers like Prometheus, New Relic, Sysdig and Elastic during experiments.
For the purpose of this tutorial, you will mock the user-engagement metric as follows.
kubectl apply -f $ITER8/samples/kfserving/quickstart/metrics.yaml
Look inside metrics.yaml
Metrics in your environment
You can define and use custom metrics from any database in Iter8 experiments.
For your application, replace the mocked metric used in this tutorial with any custom metric you wish to optimize in the A/B test. See the Iter8 documentation for details on defining custom metrics.
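As an illustration of what such a Metric resource can look like, the sketch below defines a hypothetical Prometheus-backed metric. The field names follow the iter8.tools/v2alpha2 API, but the provider URL, query, and label names are assumptions you would adapt to your own Prometheus setup; the mocked metric actually used in this tutorial is the one defined in metrics.yaml above.

apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: user-engagement
spec:
  # human-readable description of what is being measured
  description: Hypothetical engagement score per model version
  # Gauge: a value that can rise or fall between experiment iterations
  type: Gauge
  # REST endpoint of the metrics backend (assumed Prometheus service and namespace)
  provider: prometheus
  urlTemplate: http://prometheus-operated.iter8-system:9090/api/v1/query
  # jq expression that extracts the numeric value from the provider's response
  jqExpression: .data.result[0].value[1] | tonumber
  # query template; the metric name and labels are placeholders for illustration
  params:
  - name: query
    value: |
      sum(increase(engagement_total{revision_name='$name'}[${elapsedTime}s]))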
4. Launch experiment
Launch the A/B testing & progressive traffic shift experiment as follows. This experiment also promotes the winning version of the model at the end.
kubectl apply -f $ITER8/samples/kfserving/quickstart/experiment.yaml
Look inside experiment.yaml
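The experiment manifest ties everything together: which versions compete, which metric serves as the reward, how traffic is shifted, and how the winner is promoted when the experiment finishes. The sketch below shows the general shape of an iter8.tools/v2alpha2 Experiment for this scenario; the version names, VirtualService reference, promotion action, and durations are assumptions for illustration, and the file in the Iter8 repo is the source of truth.

apiVersion: iter8.tools/v2alpha2
kind: Experiment
metadata:
  name: quickstart-exp
spec:
  # versions of the same target compete against each other
  target: flowers
  strategy:
    # A/B testing: a single reward metric determines the winner
    testingPattern: A/B
    # Progressive: traffic shifts towards the winner over successive iterations
    deploymentPattern: Progressive
    actions:
      # hypothetical finish action that applies the winning version's manifest
      finish:
      - task: common/exec
        with:
          cmd: /bin/sh
          args: ["-c", "kubectl apply -f {{ .promote }}"]
  criteria:
    rewards:
    # reward metric; the reference may need a <namespace>/ prefix depending on
    # where the Metric resource is defined
    - metric: user-engagement
      preferredDirection: High
  duration:
    intervalSeconds: 10
    iterationsPerLoop: 10
  versionInfo:
    # baseline and candidate, each pointing at the Istio VirtualService field
    # whose weight Iter8 updates to shift traffic
    baseline:
      name: flowers-v1
      variables:
      # hypothetical variable consumed by the finish task above
      - name: promote
        value: <URL of the manifest that promotes this version>
      weightObjRef:
        apiVersion: networking.istio.io/v1alpha3
        kind: VirtualService
        name: routing-rule
        namespace: default
        fieldPath: .spec.http[0].route[0].weight
    candidates:
    - name: flowers-v2
      variables:
      - name: promote
        value: <URL of the manifest that promotes this version>
      weightObjRef:
        apiVersion: networking.istio.io/v1alpha3
        kind: VirtualService
        name: routing-rule
        namespace: default
        fieldPath: .spec.http[0].route[1].weight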
5. Observe experiment
Follow these steps to observe your experiment.
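In addition to those steps, the experiment is a regular custom resource, so kubectl can report its progress from the command line; the exact printer columns vary across Iter8 releases.

# list experiments and watch their progress
kubectl get experiments.iter8.tools --watch

# inspect the full status, including the winner recommendation and analysis
kubectl get experiments.iter8.tools -o yaml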
6. Cleanup
kubectl delete -f $ITER8/samples/kfserving/quickstart/experiment.yaml
kubectl delete -f $ITER8/samples/kfserving/quickstart/baseline.yaml
kubectl delete -f $ITER8/samples/kfserving/quickstart/candidate.yaml