Quick Start
Scenario: A/B testing
A/B testing enables you to compare two versions of an app/ML model, and select a winner based on a (business) reward metric and objectives (SLOs). In this tutorial, you will:
- Perform A/B testing.
- Specify user-engagement as the reward metric, and latency- and error-rate-based objectives. Iter8 will find a winner by comparing the two versions in terms of the reward, and by validating the versions in terms of the objectives.
- Use New Relic as the provider for user-engagement metric, and Prometheus as the provider for latency and error-rate metrics.
- Combine A/B testing with progressive deployment.
Iter8 will progressively shift traffic toward the winner and promote it at the end of the experiment.
Before you begin, you will need...
- The kubectl CLI.
- Kustomize 3+.
- Go 1.13+.
- (Seldon Only) Helm 3+
This tutorial is available for the following K8s stacks.
- Istio
- KFServing
- Knative
- Seldon
Please choose the same K8s stack consistently throughout this tutorial. If you wish to switch K8s stacks between tutorials, start from a clean K8s cluster so that your cluster is set up correctly.
1. Create Kubernetes cluster
Create a local cluster using Kind or Minikube as follows, or use a managed Kubernetes cluster. Ensure that the cluster has sufficient resources, for example, 8 CPUs and 12GB of memory.
Kind:
kind create cluster --wait 5m
kubectl cluster-info --context kind-kind
Note: your Kind cluster inherits the CPU and memory resources of its host. If you are using Docker Desktop, you can increase these resources in its settings.
Minikube:
minikube start --cpus 8 --memory 12288
2. Clone Iter8 repo
git clone https://github.com/iter8-tools/iter8.git
cd iter8
export ITER8=$(pwd)
3. Install K8s stack and Iter8
Choose the K8s stack on which you are performing the A/B testing experiment.
Set up Istio, Iter8, a mock New Relic service, and the Prometheus add-on within your cluster.
$ITER8/samples/istio/quickstart/platformsetup.sh
Set up KFServing, Iter8, a mock New Relic service, and the Prometheus add-on within your cluster.
$ITER8/samples/kfserving/quickstart/platformsetup.sh
Set up Knative, Iter8, a mock New Relic service, and the Prometheus add-on within your cluster. Knative can work with multiple networking layers, and so can Iter8's Knative extension.
Choose the networking layer for Knative.
$ITER8/samples/knative/quickstart/platformsetup.sh contour
$ITER8/samples/knative/quickstart/platformsetup.sh kourier
This step requires Python. It will install the glooctl binary under the $HOME/.gloo folder.
$ITER8/samples/knative/quickstart/platformsetup.sh gloo
$ITER8/samples/knative/quickstart/platformsetup.sh istio
Set up Seldon Core, Seldon Analytics, and Iter8 within your cluster.
$ITER8/samples/seldon/quickstart/platformsetup.sh
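Regardless of the stack you chose, you can sanity-check the installation before proceeding. The command below assumes Iter8 is installed into its default iter8-system namespace; adjust if your install differs.
kubectl wait --for=condition=Ready pods --all -n iter8-system --timeout=300s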
4. Create app/ML model versions
Deploy the bookinfo microservice application, including two versions of the productpage microservice.
kubectl apply -n bookinfo-iter8 -f $ITER8/samples/istio/quickstart/bookinfo-app.yaml
kubectl apply -n bookinfo-iter8 -f $ITER8/samples/istio/quickstart/productpage-v2.yaml
kubectl wait -n bookinfo-iter8 --for=condition=Ready pods --all
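For reference, productpage-v2.yaml (v1 is similar) is essentially a Deployment labeled with the version, which Istio destination-rule subsets match on. The sketch below is illustrative, not the exact sample file; in particular, the image shown is the upstream bookinfo productpage build, while the Iter8 sample ships its own v2 variant.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: productpage-v2
  namespace: bookinfo-iter8
spec:
  replicas: 1
  selector:
    matchLabels:
      app: productpage
      version: v2
  template:
    metadata:
      labels:
        app: productpage   # the productpage Service selects on app
        version: v2        # destination-rule subsets select on version
    spec:
      containers:
      - name: productpage
        image: docker.io/istio/examples-bookinfo-productpage-v1:1.16.2  # illustrative image
        ports:
        - containerPort: 9080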
Deploy two KFServing inference services corresponding to two versions of a TensorFlow classification model, along with an Istio virtual service to split traffic between them.
kubectl apply -f $ITER8/samples/kfserving/quickstart/baseline.yaml
kubectl apply -f $ITER8/samples/kfserving/quickstart/candidate.yaml
kubectl apply -f $ITER8/samples/kfserving/quickstart/routing-rule.yaml
kubectl wait --for=condition=Ready isvc/flowers -n ns-baseline
kubectl wait --for=condition=Ready isvc/flowers -n ns-candidate
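For reference, baseline.yaml is essentially a KFServing v1beta1 InferenceService. A sketch, assuming it mirrors the standard TensorFlow flowers sample; candidate.yaml differs in its namespace (ns-candidate) and the storageUri of the model it serves.
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: flowers
  namespace: ns-baseline   # the candidate lives in ns-candidate
spec:
  predictor:
    tensorflow:
      # standard KFServing flowers sample model; the candidate points
      # at a second copy of the model
      storageUri: "gs://kfserving-samples/models/tensorflow/flowers"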
Deploy two versions of a Knative app.
kubectl apply -f $ITER8/samples/knative/quickstart/baseline.yaml
kubectl apply -f $ITER8/samples/knative/quickstart/experimentalservice.yaml
kubectl wait --for=condition=Ready ksvc/sample-app
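For reference, experimentalservice.yaml updates the Knative Service with a second revision and pins traffic so the baseline keeps 100% until Iter8 starts shifting it. A sketch, assuming the knative-route-demo sample image; treat the image and env values as illustrative.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: sample-app
  namespace: default
spec:
  template:
    metadata:
      name: sample-app-v2   # the candidate revision
    spec:
      containers:
      - image: gcr.io/knative-samples/knative-route-demo:green  # assumed sample image
        env:
        - name: T_VERSION
          value: "green"
  traffic:                  # Iter8 adjusts these percentages during the experiment
  - tag: current
    revisionName: sample-app-v1
    percent: 100
  - tag: candidate
    latestRevision: true
    percent: 0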
Deploy two Seldon Deployments corresponding to two versions of an Iris classification model, along with an Istio virtual service to split traffic between them.
kubectl apply -f $ITER8/samples/seldon/quickstart/baseline.yaml
kubectl apply -f $ITER8/samples/seldon/quickstart/candidate.yaml
kubectl apply -f $ITER8/samples/seldon/quickstart/routing-rule.yaml
kubectl wait --for condition=ready --timeout=600s pods --all -n ns-baseline
kubectl wait --for condition=ready --timeout=600s pods --all -n ns-candidate
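For reference, baseline.yaml wraps an iris model in a SeldonDeployment. A sketch assuming the stock Seldon SKLearn iris server; candidate.yaml differs in its namespace (ns-candidate) and the model it serves, and the modelUri below is an assumption.
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris
  namespace: ns-baseline      # the candidate lives in ns-candidate
spec:
  predictors:
  - name: default
    graph:
      name: classifier
      implementation: SKLEARN_SERVER             # prepackaged Seldon model server
      modelUri: gs://seldon-models/sklearn/iris  # assumed model location
    replicas: 1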
5. Generate requests
Generate requests to your app using Fortio as follows.
# URL_VALUE is the URL of the `bookinfo` application
URL_VALUE="http://$(kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.spec.clusterIP}'):80/productpage"
sed "s+URL_VALUE+${URL_VALUE}+g" $ITER8/samples/istio/quickstart/fortio.yaml | kubectl apply -f -
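The fortio.yaml manifest runs Fortio as a Kubernetes Job; URL_VALUE is the placeholder that the sed command above substitutes. A minimal sketch, assuming the shipped file does little more than drive load (it may also handle experiment duration and result collection):
apiVersion: batch/v1
kind: Job
metadata:
  name: fortio
spec:
  template:
    spec:
      containers:
      - name: fortio
        image: fortio/fortio
        # send 8 queries/sec for 6000s to the placeholder URL;
        # sed replaces URL_VALUE before this manifest is applied
        args: ["load", "-t", "6000s", "-qps", "8", "URL_VALUE"]
      restartPolicy: Never
  backoffLimit: 0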
Generate requests to your model as follows.
INGRESS_GATEWAY_SERVICE=$(kubectl get svc -n istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
# Run the port-forward in a separate terminal; it blocks.
kubectl port-forward -n istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80
# Download a sample input and send prediction requests in a loop.
curl -o /tmp/input.json https://raw.githubusercontent.com/kubeflow/kfserving/master/docs/samples/v1beta1/rollout/input.json
while true; do
curl -v -H "Host: example.com" localhost:8080/v1/models/flowers:predict -d @/tmp/input.json
sleep 0.2
done
Generate requests to your Knative app using Fortio as follows.
# URL_VALUE is the URL where your Knative application serves requests
URL_VALUE=$(kubectl get ksvc sample-app -o json | jq .status.address.url)
sed "s+URL_VALUE+${URL_VALUE}+g" $ITER8/samples/knative/quickstart/fortio.yaml | kubectl apply -f -
Generate requests to your Seldon model using Fortio as follows; the second sed shortens the Fortio run from 6000s to 600s.
URL_VALUE="http://$(kubectl -n istio-system get svc istio-ingressgateway -o jsonpath='{.spec.clusterIP}'):80"
sed "s+URL_VALUE+${URL_VALUE}+g" $ITER8/samples/seldon/quickstart/fortio.yaml | sed "s/6000s/600s/g" | kubectl apply -f -
6. Define metrics
Iter8 introduces a Kubernetes CRD called Metric that makes it easy to use metrics from RESTful metric providers like Prometheus, New Relic, Sysdig and Elastic during experiments. Define the Iter8 metrics used in this experiment as follows.
Istio:
kubectl apply -f $ITER8/samples/istio/quickstart/metrics.yaml
KFServing:
kubectl apply -f $ITER8/samples/kfserving/quickstart/metrics.yaml
Knative:
kubectl apply -f $ITER8/samples/knative/quickstart/metrics.yaml
Seldon:
kubectl apply -f $ITER8/samples/seldon/quickstart/metrics.yaml
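Each metrics.yaml defines one Metric resource per metric referenced by the experiment. As a sketch of the shape, here is a hedged example of a Prometheus-backed latency metric; the namespace, query, and URL below are illustrative, and the real values live in the sample files.
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: mean-latency
  namespace: iter8-knative   # referenced as iter8-knative/mean-latency
spec:
  description: Mean latency
  units: milliseconds
  type: Gauge
  sampleSize: request-count  # counter metric used to weigh this metric
  provider: prometheus
  params:
  - name: query
    value: |
      sum(increase(revision_app_request_latencies_sum{revision_name='$name'}[${elapsedTime}s]))
      /
      sum(increase(revision_app_request_latencies_count{revision_name='$name'}[${elapsedTime}s]))
  jqExpression: ".data.result[0].value[1] | tonumber"
  urlTemplate: http://prometheus-operated.iter8-system:9090/api/v1/query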
Metrics in your environment
You can use metrics from any RESTful provider in Iter8 experiments.
In this tutorial, the metrics related to the latency and error-rate objectives are collected by the Prometheus instance created in Step 3. The urlTemplate field in these metrics points to this Prometheus instance. If you wish to use these latency and error-rate metrics with your own application, change the urlTemplate values to match the URL of your Prometheus instance.
In this tutorial, the user-engagement metric is synthetically generated by a mock New Relic service/Prometheus service. For your application, replace this metric with any business metric you wish to optimize.
7. Launch experiment
Iter8 defines a Kubernetes resource called Experiment that automates A/B, A/B/n, Canary, and Conformance experiments. During an experiment, Iter8 can compare multiple versions, identify the winning version (winner) based on business metrics and SLOs, and safely promote it.
Launch the Iter8 experiment that orchestrates A/B testing for the app/ML model in this tutorial.
Istio:
kubectl apply -f $ITER8/samples/istio/quickstart/experiment.yaml
KFServing:
kubectl apply -f $ITER8/samples/kfserving/quickstart/experiment.yaml
Knative:
kubectl apply -f $ITER8/samples/knative/quickstart/experiment.yaml
Seldon:
kubectl apply -f $ITER8/samples/seldon/quickstart/experiment.yaml
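The four experiment.yaml files share a common shape. Below is a sketch modeled on the Knative variant; treat the metric names, limits, durations, and weightObjRef paths as illustrative rather than the exact sample.
apiVersion: iter8.tools/v2alpha2
kind: Experiment
metadata:
  name: quickstart-exp
spec:
  # target identifies the app under experimentation
  target: default/sample-app
  strategy:
    testingPattern: A/B            # compare baseline vs candidate using a reward
    deploymentPattern: Progressive # shift traffic gradually toward the winner
  criteria:
    requestCount: iter8-knative/request-count
    rewards:                       # business metric to maximize
    - metric: iter8-knative/user-engagement
      preferredDirection: High
    objectives:                    # SLOs every version must satisfy
    - metric: iter8-knative/mean-latency
      upperLimit: 50
    - metric: iter8-knative/95th-percentile-tail-latency
      upperLimit: 100
    - metric: iter8-knative/error-rate
      upperLimit: "0.01"
  duration:
    intervalSeconds: 20
    iterationsPerLoop: 12
  versionInfo:
    baseline:
      name: sample-app-v1
      weightObjRef:                # where Iter8 writes this version's traffic weight
        apiVersion: serving.knative.dev/v1
        kind: Service
        name: sample-app
        namespace: default
        fieldPath: .spec.traffic[0].percent
    candidates:
    - name: sample-app-v2
      weightObjRef:
        apiVersion: serving.knative.dev/v1
        kind: Service
        name: sample-app
        namespace: default
        fieldPath: .spec.traffic[1].percent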
Iter8 now orchestrates the experiment: in each iteration, it observes metrics, assesses the versions, and shifts traffic toward the winner.
8. Observe experiment
Observe the experiment in real time.
a) Observe metrics
Install iter8ctl. You can change the directory where the iter8ctl binary is installed by changing GOBIN below.
GO111MODULE=on GOBIN=/usr/local/bin go get github.com/iter8-tools/iter8ctl@v0.1.3
Periodically describe the experiment.
while clear; do
kubectl get experiment quickstart-exp -o yaml | iter8ctl describe -f -
sleep 8
done
Look inside metrics summary
The iter8ctl output will be similar to the following.
****** Overview ******
Experiment name: quickstart-exp
Experiment namespace: default
Target: default/sample-app
Testing pattern: A/B
Deployment pattern: Progressive
****** Progress Summary ******
Experiment stage: Running
Number of completed iterations: 8
****** Winner Assessment ******
App versions in this experiment: [sample-app-v1 sample-app-v2]
Winning version: sample-app-v2
Version recommended for promotion: sample-app-v2
****** Objective Assessment ******
> Identifies whether or not the experiment objectives are satisfied by the most recently observed metrics values for each version.
+--------------------------------------------+---------------+---------------+
| OBJECTIVE | SAMPLE-APP-V1 | SAMPLE-APP-V2 |
+--------------------------------------------+---------------+---------------+
| iter8-knative/mean-latency <= | true | true |
| 50.000 | | |
+--------------------------------------------+---------------+---------------+
| iter8-knative/95th-percentile-tail-latency | true | true |
| <= 100.000 | | |
+--------------------------------------------+---------------+---------------+
| iter8-knative/error-rate <= | true | true |
| 0.010 | | |
+--------------------------------------------+---------------+---------------+
****** Metrics Assessment ******
> Most recently read values of experiment metrics for each version.
+--------------------------------------------+---------------+---------------+
| METRIC | SAMPLE-APP-V1 | SAMPLE-APP-V2 |
+--------------------------------------------+---------------+---------------+
| iter8-knative/request-count | 1213.625 | 361.962 |
+--------------------------------------------+---------------+---------------+
| iter8-knative/user-engagement | 10.023 | 14.737 |
+--------------------------------------------+---------------+---------------+
| iter8-knative/mean-latency | 1.133 | 1.175 |
| (milliseconds) | | |
+--------------------------------------------+---------------+---------------+
| iter8-knative/95th-percentile-tail-latency | 4.768 | 4.824 |
| (milliseconds) | | |
+--------------------------------------------+---------------+---------------+
| iter8-knative/error-rate | 0.000 | 0.000 |
+--------------------------------------------+---------------+---------------+
As the experiment progresses, you should eventually see that all of the objectives are satisfied by both versions, and that the candidate improves over the baseline in terms of the reward metric. The candidate is identified as the winner and is recommended for promotion.
b) Observe traffic
Istio:
kubectl -n bookinfo-iter8 get vs bookinfo -o json --watch | jq .spec.http[0].route
Look inside traffic summary
The kubectl output will be similar to the following.
[
{
"destination": {
"host": "productpage",
"port": {
"number": 9080
},
"subset": "productpage-v1"
},
"weight": 35
},
{
"destination": {
"host": "productpage",
"port": {
"number": 9080
},
"subset": "productpage-v2"
},
"weight": 65
}
]
KFServing:
kubectl get vs routing-rule -o json --watch | jq .spec.http[0].route
Look inside traffic summary
[
{
"destination": {
"host": "flowers-predictor-default.ns-baseline.svc.cluster.local"
},
"headers": {
"request": {
"set": {
"Host": "flowers-predictor-default.ns-baseline"
}
},
"response": {
"set": {
"version": "flowers-v1"
}
}
},
"weight": 5
},
{
"destination": {
"host": "flowers-predictor-default.ns-candidate.svc.cluster.local"
},
"headers": {
"request": {
"set": {
"Host": "flowers-predictor-default.ns-candidate"
}
},
"response": {
"set": {
"version": "flowers-v2"
}
}
},
"weight": 95
}
]
Knative:
kubectl get ksvc sample-app -o json --watch | jq .status.traffic
Look inside traffic summary
The kubectl output will be similar to the following.
[
{
"latestRevision": false,
"percent": 45,
"revisionName": "sample-app-v1",
"tag": "current",
"url": "http://current-sample-app.default.example.com"
},
{
"latestRevision": true,
"percent": 55,
"revisionName": "sample-app-v2",
"tag": "candidate",
"url": "http://candidate-sample-app.default.example.com"
}
]
Seldon:
kubectl get vs routing-rule -o json --watch | jq .spec.http[0].route
Look inside traffic summary
[
{
"destination": {
"host": "iris-default.ns-baseline.svc.cluster.local",
"port": {
"number": 8000
}
},
"headers": {
"response": {
"set": {
"version": "iris-v1"
}
}
},
"weight": 25
},
{
"destination": {
"host": "iris-default.ns-candidate.svc.cluster.local",
"port": {
"number": 8000
}
},
"headers": {
"response": {
"set": {
"version": "iris-v2"
}
}
},
"weight": 75
}
]
As the experiment progresses, you should see traffic progressively shift from the baseline version to the candidate version.
c) Observe progress
kubectl get experiment quickstart-exp --watch
Look inside progress summary
The kubectl output will be similar to the following.
NAME TYPE TARGET STAGE COMPLETED ITERATIONS MESSAGE
quickstart-exp A/B default/sample-app Running 1 IterationUpdate: Completed Iteration 1
quickstart-exp A/B default/sample-app Running 2 IterationUpdate: Completed Iteration 2
quickstart-exp A/B default/sample-app Running 3 IterationUpdate: Completed Iteration 3
quickstart-exp A/B default/sample-app Running 4 IterationUpdate: Completed Iteration 4
quickstart-exp A/B default/sample-app Running 5 IterationUpdate: Completed Iteration 5
quickstart-exp A/B default/sample-app Running 6 IterationUpdate: Completed Iteration 6
quickstart-exp A/B default/sample-app Running 7 IterationUpdate: Completed Iteration 7
quickstart-exp A/B default/sample-app Running 8 IterationUpdate: Completed Iteration 8
quickstart-exp A/B default/sample-app Running 9 IterationUpdate: Completed Iteration 9
When the experiment completes, you will see the experiment stage change from Running to Completed.
Understanding what happened
- You created two versions of your app/ML model.
- You generated requests for your app/ML model versions. At the start of the experiment, 100% of the requests are sent to the baseline and 0% to the candidate.
- You created an Iter8 experiment with the A/B testing pattern and the progressive deployment pattern. In each iteration, Iter8 observed the latency and error-rate metrics collected by Prometheus, and the user-engagement metric from New Relic/Prometheus. It verified that the candidate satisfied all objectives, verified that the candidate improved over the baseline in terms of user-engagement, identified the candidate as the winner, and progressively shifted traffic from the baseline to the candidate. At the end, it promoted the candidate.
9. Cleanup
Istio:
kubectl delete -f $ITER8/samples/istio/quickstart/fortio.yaml
kubectl delete -f $ITER8/samples/istio/quickstart/experiment.yaml
kubectl delete namespace bookinfo-iter8
KFServing:
kubectl delete -f $ITER8/samples/kfserving/quickstart/experiment.yaml
kubectl delete -f $ITER8/samples/kfserving/quickstart/routing-rule.yaml
kubectl delete -f $ITER8/samples/kfserving/quickstart/baseline.yaml
kubectl delete -f $ITER8/samples/kfserving/quickstart/candidate.yaml
Knative:
kubectl delete -f $ITER8/samples/knative/quickstart/fortio.yaml
kubectl delete -f $ITER8/samples/knative/quickstart/experiment.yaml
kubectl delete -f $ITER8/samples/knative/quickstart/experimentalservice.yaml
Seldon:
kubectl delete -f $ITER8/samples/seldon/quickstart/fortio.yaml
kubectl delete -f $ITER8/samples/seldon/quickstart/experiment.yaml
kubectl delete -f $ITER8/samples/seldon/quickstart/routing-rule.yaml
kubectl delete -f $ITER8/samples/seldon/quickstart/baseline.yaml
kubectl delete -f $ITER8/samples/seldon/quickstart/candidate.yaml