Scenario: Hybrid (A/B + SLOs) testing and progressive traffic shift of Seldon models
Hybrid (A/B + SLOs) testing combines A/B or A/B/n testing, which compares versions using a reward metric, with SLO validation, which checks each version against objectives. Among the versions that satisfy the objectives, the one that performs best on the reward metric is the winner. In this tutorial, you will:
Perform hybrid (A/B + SLOs) testing.
Specify user-engagement as the reward metric; data for this metric will be provided by Prometheus.
Specify latency and error-rate based objectives; data for these metrics will be provided by Prometheus.
Combine hybrid (A/B + SLOs) testing with progressive traffic shift. Iter8 will progressively shift traffic towards the winner and promote it at the end as depicted below.
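The winner-selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Iter8's implementation: the function names, the dictionary layout, and the sample metric values are assumptions made for this example, while the metric names and upper limits mirror the ones used in this tutorial's experiment.

```python
from typing import Dict, Optional

# Upper limits for the SLO metrics, mirroring the objectives used in this tutorial.
OBJECTIVES: Dict[str, float] = {
    "mean-latency": 2000,
    "95th-percentile-tail-latency": 5000,
    "error-rate": 0.01,
}
REWARD = "user-engagement"  # reward metric; higher is better


def satisfies_objectives(metrics: Dict[str, float]) -> bool:
    """Return True if every objective metric is at or below its upper limit."""
    return all(metrics[name] <= limit for name, limit in OBJECTIVES.items())


def pick_winner(versions: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Among versions that satisfy all objectives, pick the one with the highest reward."""
    valid = {name: m for name, m in versions.items() if satisfies_objectives(m)}
    if not valid:
        return None  # no version satisfies the objectives, so there is no winner
    return max(valid, key=lambda name: valid[name][REWARD])


# Hypothetical metric values, for illustration only.
observed = {
    "iris-v1": {"mean-latency": 120, "95th-percentile-tail-latency": 300,
                "error-rate": 0.0, "user-engagement": 40},
    "iris-v2": {"mean-latency": 150, "95th-percentile-tail-latency": 350,
                "error-rate": 0.0, "user-engagement": 55},
}
print(pick_winner(observed))  # prints iris-v2
```

Both hypothetical versions satisfy the objectives here, so the reward metric alone decides the outcome. If a version violated any objective, it would be excluded before rewards are compared.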
Platform setup
Follow these steps to install Seldon and Iter8 in your K8s cluster.
Deploy two Seldon Deployments corresponding to two versions of an Iris classification model, along with an Istio virtual service to split traffic between them.
Iter8 defines a custom K8s resource called Metric that makes it easy to use metrics from RESTful metric providers like Prometheus, New Relic, Sysdig and Elastic during experiments. Define the Iter8 metrics used in this experiment as follows.
apiVersion: v1
kind: Namespace
metadata:
  name: iter8-seldon
---
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: 95th-percentile-tail-latency
  namespace: iter8-seldon
spec:
  description: 95th percentile tail latency
  jqExpression: .data.result[0].value[1] | tonumber
  params:
  - name: query
    value: |
      histogram_quantile(0.95, sum(rate(seldon_api_executor_client_requests_seconds_bucket{seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) by (le))
  provider: prometheus
  sampleSize: iter8-seldon/request-count
  type: Gauge
  units: milliseconds
  urlTemplate: http://seldon-core-analytics-prometheus-seldon.seldon-system/api/v1/query
---
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: error-count
  namespace: iter8-seldon
spec:
  description: Number of error responses
  jqExpression: .data.result[0].value[1] | tonumber
  params:
  - name: query
    value: |
      sum(increase(seldon_api_executor_server_requests_seconds_count{code!='200',seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) or on() vector(0)
  provider: prometheus
  type: Counter
  urlTemplate: http://seldon-core-analytics-prometheus-seldon.seldon-system/api/v1/query
---
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: error-rate
  namespace: iter8-seldon
spec:
  description: Fraction of requests with error responses
  jqExpression: .data.result[0].value[1] | tonumber
  params:
  - name: query
    value: |
      (sum(increase(seldon_api_executor_server_requests_seconds_count{code!='200',seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) or on() vector(0)) / (sum(increase(seldon_api_executor_server_requests_seconds_count{seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) or on() vector(0))
  provider: prometheus
  sampleSize: iter8-seldon/request-count
  type: Gauge
  urlTemplate: http://seldon-core-analytics-prometheus-seldon.seldon-system/api/v1/query
---
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: mean-latency
  namespace: iter8-seldon
spec:
  description: Mean latency
  jqExpression: .data.result[0].value[1] | tonumber
  params:
  - name: query
    value: |
      (sum(increase(seldon_api_executor_client_requests_seconds_sum{seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) or on() vector(0)) / (sum(increase(seldon_api_executor_client_requests_seconds_count{seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) or on() vector(0))
  provider: prometheus
  sampleSize: iter8-seldon/request-count
  type: Gauge
  units: milliseconds
  urlTemplate: http://seldon-core-analytics-prometheus-seldon.seldon-system/api/v1/query
---
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: request-count
  namespace: iter8-seldon
spec:
  description: Number of requests
  jqExpression: .data.result[0].value[1] | tonumber
  params:
  - name: query
    value: |
      sum(increase(seldon_api_executor_client_requests_seconds_count{seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) or on() vector(0)
  provider: prometheus
  type: Counter
  urlTemplate: http://seldon-core-analytics-prometheus-seldon.seldon-system/api/v1/query
---
apiVersion: iter8.tools/v2alpha2
kind: Metric
metadata:
  name: user-engagement
  namespace: iter8-seldon
spec:
  description: Number of feedback requests
  jqExpression: .data.result[0].value[1] | tonumber
  params:
  - name: query
    value: |
      sum(increase(seldon_api_executor_server_requests_seconds_count{service='feedback',seldon_deployment_id='$sid',kubernetes_namespace='$ns'}[${elapsedTime}s])) or on() vector(0)
  provider: prometheus
  type: Gauge
  urlTemplate: http://seldon-core-analytics-prometheus-seldon.seldon-system/api/v1/query
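To make the roles of the query template and the jqExpression concrete, here is a minimal Python sketch of the two steps: filling the placeholders in a query, and pulling a number out of the Prometheus response. It is not how Iter8 itself is implemented. The helper names `build_query` and `extract_value`, the value chosen for `elapsedTime`, and the sample response are assumptions for illustration; the query text is the error-count query from the metric definitions, and the response follows the shape of a Prometheus instant-query result.

```python
import json
from string import Template

# Query template of the error-count metric defined in this tutorial.
QUERY_TEMPLATE = (
    "sum(increase(seldon_api_executor_server_requests_seconds_count"
    "{code!='200',seldon_deployment_id='$sid',kubernetes_namespace='$ns'}"
    "[${elapsedTime}s])) or on() vector(0)"
)


def build_query(sid: str, ns: str, elapsed_seconds: int) -> str:
    """Fill the $sid, $ns and ${elapsedTime} placeholders for one model version."""
    return Template(QUERY_TEMPLATE).substitute(
        sid=sid, ns=ns, elapsedTime=elapsed_seconds
    )


def extract_value(response_body: str) -> float:
    """Python equivalent of the jq expression `.data.result[0].value[1] | tonumber`."""
    payload = json.loads(response_body)
    return float(payload["data"]["result"][0]["value"][1])


# Hypothetical response body, shaped like a Prometheus instant-query result.
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {}, "value": [1620000000.0, "3"]}],
    },
})

print(build_query(sid="iris", ns="ns-candidate", elapsed_seconds=50))
print(extract_value(sample_response))  # prints 3.0
```

Prometheus returns sample values as strings, which is why the jq expression ends in `tonumber` and the sketch converts with `float`.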
Metrics in your environment
You can define and use custom metrics from any database in Iter8 experiments.
For your application, replace the mocked user-engagement metric used in this tutorial with any custom metric you wish to optimize in the hybrid (A/B + SLOs) test. Documentation on defining custom metrics is here.
Launch the hybrid (A/B + SLOs) testing & progressive traffic shift experiment as follows. This experiment also promotes the winning version of the model at the end.
apiVersion: iter8.tools/v2alpha2
kind: Experiment
metadata:
  name: quickstart-exp
spec:
  target: iris
  strategy:
    testingPattern: A/B
    deploymentPattern: Progressive
    actions:
      # when the experiment completes, promote the winning version using kubectl apply
      finish:
      - if: CandidateWon()
        run: "kubectl apply -f https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/seldon/quickstart/promote-v2.yaml"
      - if: not CandidateWon()
        run: "kubectl apply -f https://raw.githubusercontent.com/iter8-tools/iter8/master/samples/seldon/quickstart/promote-v1.yaml"
  criteria:
    requestCount: iter8-seldon/request-count
    rewards: # Business rewards
    - metric: iter8-seldon/user-engagement
      preferredDirection: High # maximize user engagement
    objectives:
    - metric: iter8-seldon/mean-latency
      upperLimit: 2000
    - metric: iter8-seldon/95th-percentile-tail-latency
      upperLimit: 5000
    - metric: iter8-seldon/error-rate
      upperLimit: "0.01"
  duration:
    intervalSeconds: 10
    iterationsPerLoop: 5
  versionInfo:
    # information about model versions used in this experiment
    baseline:
      name: iris-v1
      weightObjRef:
        apiVersion: networking.istio.io/v1alpha3
        kind: VirtualService
        name: routing-rule
        namespace: default
        fieldPath: .spec.http[0].route[0].weight
      variables:
      - name: ns
        value: ns-baseline
      - name: sid
        value: iris
    candidates:
    - name: iris-v2
      weightObjRef:
        apiVersion: networking.istio.io/v1alpha3
        kind: VirtualService
        name: routing-rule
        namespace: default
        fieldPath: .spec.http[0].route[1].weight
      variables:
      - name: ns
        value: ns-candidate
      - name: sid
        value: iris
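The idea behind the Progressive deployment pattern can be illustrated with a small Python simulation. This is a simplified sketch under stated assumptions, not Iter8's weight-computation algorithm: the function `shift_traffic`, the fixed step of 20 percentage points per iteration, and the assumption that `iris-v2` is the winner in every iteration are all hypothetical. Only the version names, the initial split implied by a fresh rollout, and the five iterations (`iterationsPerLoop: 5`) come from this tutorial.

```python
from typing import Dict, Optional


def shift_traffic(weights: Dict[str, int], winner: Optional[str],
                  step: int = 20) -> Dict[str, int]:
    """Move up to `step` percentage points of traffic from other versions to the winner."""
    if winner is None:
        return dict(weights)  # no winner yet: leave the split unchanged
    new_weights = dict(weights)
    remaining = step
    for name in weights:
        if name == winner or remaining == 0:
            continue
        moved = min(new_weights[name], remaining)  # never take more than a version has
        new_weights[name] -= moved
        new_weights[winner] += moved
        remaining -= moved
    return new_weights


# Start with all traffic on the baseline and assume iris-v2 wins every iteration.
weights = {"iris-v1": 100, "iris-v2": 0}
for iteration in range(1, 6):  # five iterations, as in iterationsPerLoop: 5
    weights = shift_traffic(weights, winner="iris-v2")
    print(iteration, weights)
```

In the experiment itself, the two weights correspond to the VirtualService fields referenced by `weightObjRef`, namely `.spec.http[0].route[0].weight` for the baseline and `.spec.http[0].route[1].weight` for the candidate.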