Session Affinity
Scenario: Canary rollout with session affinity
Session affinity ensures that a particular user's requests are routed to the same version for the duration of the experiment. In this tutorial, you will use an experiment involving two user groups, 1 and 2. Requests from user group 1 will have a userhash header value prefixed with 111 and will be routed to the baseline version. Requests from user group 2 will have a userhash header value prefixed with 101 and will be routed to the candidate version. The experiment is shown below.
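To make the prefix rule concrete, here is a minimal shell sketch that restates it as a function. The function name `version_for_userhash` and the `unmatched` label are hypothetical and exist only for illustration; the function does not talk to Istio, and the tutorial does not say what happens to requests whose userhash matches neither prefix.

```shell
# Hypothetical helper: restates the prefix rule from the scenario description.
# It only mirrors the expected routing; it does not query the cluster.
version_for_userhash() {
  case "$1" in
    111*) echo "baseline" ;;   # user group 1
    101*) echo "candidate" ;;  # user group 2
    *)    echo "unmatched" ;;  # behavior not specified in this tutorial
  esac
}

version_for_userhash 1111100000   # userhash used for user group 1 in Step 2
version_for_userhash 1010101010   # userhash used for user group 2 in Step 2
```

Because the rule depends only on the header value, a user who keeps sending the same userhash keeps reaching the same version.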
Platform setup
Follow these steps to install Iter8, KFServing and Prometheus in your K8s cluster.
1. Create ML model versions
Deploy two KFServing inference services corresponding to two versions of a TensorFlow classification model, along with an Istio virtual service to split traffic between them.
kubectl apply -f $ITER8/samples/kfserving/quickstart/baseline.yaml
kubectl apply -f $ITER8/samples/kfserving/quickstart/candidate.yaml
kubectl apply -f $ITER8/samples/kfserving/session-affinity/routing-rule.yaml
kubectl wait --for=condition=Ready isvc/flowers -n ns-baseline
kubectl wait --for=condition=Ready isvc/flowers -n ns-candidate
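The commands in this step read sample files from under `$ITER8`. As a pre-flight check, the sketch below verifies that the files used in this tutorial exist before you run `kubectl apply`. It assumes that `ITER8` points to the root of a local copy of the Iter8 repository; the function name `check_iter8_samples` is hypothetical.

```shell
# Hypothetical pre-flight check. Assumes ITER8 is the root of a local Iter8 copy.
# The body runs in a subshell, so `exit` only ends the function, not your shell.
check_iter8_samples() (
  root="${1:-${ITER8:-}}"
  if [ -z "$root" ]; then
    echo "ITER8 is not set" >&2
    exit 1
  fi
  status=0
  # Sample files referenced by the commands in this tutorial.
  for f in \
    samples/kfserving/quickstart/baseline.yaml \
    samples/kfserving/quickstart/candidate.yaml \
    samples/kfserving/session-affinity/routing-rule.yaml \
    samples/kfserving/session-affinity/experiment.yaml
  do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $root/$f" >&2
      status=1
    fi
  done
  exit "$status"
)

if check_iter8_samples; then
  echo "all sample files found"
else
  echo "fix the problems above before running kubectl apply"
fi
```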
Istio virtual service: defined in $ITER8/samples/kfserving/session-affinity/routing-rule.yaml
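2. Generate requests
Generate requests to your model as follows. The port-forward command runs in the foreground, so run it in one terminal and leave it running. In a second terminal, send requests from user group 1 (userhash prefixed with 111); in a third terminal, send requests from user group 2 (userhash prefixed with 101).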
INGRESS_GATEWAY_SERVICE=$(kubectl get svc -n istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward -n istio-system svc/${INGRESS_GATEWAY_SERVICE} 8080:80
curl -o /tmp/input.json https://raw.githubusercontent.com/kubeflow/kfserving/master/docs/samples/v1beta1/rollout/input.json
while true; do
curl -v -H "Host: example.com" -H "userhash: 1111100000" localhost:8080/v1/models/flowers:predict -d @/tmp/input.json
sleep 0.29
done
curl -o /tmp/input.json https://raw.githubusercontent.com/kubeflow/kfserving/master/docs/samples/v1beta1/rollout/input.json
while true; do
curl -v -H "Host: example.com" -H "userhash: 1010101010" localhost:8080/v1/models/flowers:predict -d @/tmp/input.json
sleep 2.0
done
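The two loops above each use one fixed userhash. If you want to simulate more users per group, a sketch like the following can generate extra values. It assumes a userhash is a 10-digit binary string, which is inferred only from the two sample values above; the tutorial itself relies only on the prefix. The function name `make_userhash` is hypothetical.

```shell
# Hypothetical helper: build a userhash that starts with the given prefix and is
# padded with random binary digits up to a total length (default 10, an assumption).
# If the prefix is already at least that long, it is returned unchanged.
make_userhash() (
  prefix="$1"
  length="${2:-10}"
  hash="$prefix"
  while [ "${#hash}" -lt "$length" ]; do
    # Read one random byte as an unsigned decimal and keep only its digits.
    byte=$(od -An -N1 -tu1 /dev/urandom | tr -dc '0-9')
    hash="${hash}$(( byte % 2 ))"
  done
  echo "$hash"
)

GROUP1_HASH=$(make_userhash 111)   # would be routed to the baseline version
GROUP2_HASH=$(make_userhash 101)   # would be routed to the candidate version
echo "user group 1 userhash: ${GROUP1_HASH}"
echo "user group 2 userhash: ${GROUP2_HASH}"

# With the port-forward from this step running, a value can be used like this:
# curl -v -H "Host: example.com" -H "userhash: ${GROUP1_HASH}" \
#   localhost:8080/v1/models/flowers:predict -d @/tmp/input.json
```

Generate a value once per simulated user and reuse it for all of that user's requests, since session affinity is keyed on the header value staying the same.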
3. Define metrics
Please follow Step 3 of the quick start tutorial.
4. Launch experiment
kubectl apply -f $ITER8/samples/kfserving/session-affinity/experiment.yaml
To look inside experiment.yaml, open $ITER8/samples/kfserving/session-affinity/experiment.yaml.
5. Observe experiment
Follow these steps to observe your experiment.
6. Cleanup
kubectl delete -f $ITER8/samples/kfserving/session-affinity/experiment.yaml
kubectl delete -f $ITER8/samples/kfserving/session-affinity/routing-rule.yaml
kubectl delete -f $ITER8/samples/kfserving/quickstart/candidate.yaml
kubectl delete -f $ITER8/samples/kfserving/quickstart/baseline.yaml