Building Blocks¶
We introduce the building blocks of an Iter8 experiment below.
Apps and Versions¶
Iter8 defines an app broadly as any entity that can be deployed (run), versioned, and for which metrics can be collected.
Examples
- A stateless K8s application whose versions correspond to `deployments`.
- A stateful K8s application whose versions correspond to `statefulsets`.
- A Knative application whose versions correspond to `revisions`.
- A KFServing inference service whose versions correspond to model `revisions`.
- A distributed application whose versions correspond to Helm `releases`.
Objectives (SLOs)¶
Objectives correspond to service-level objectives or SLOs. In Iter8 experiments, objectives are specified as metrics along with acceptable limits on their values. Iter8 will report how versions are performing with respect to these metrics and whether or not they satisfy the objectives.
Examples
- The 99th-percentile tail latency of the application should be under 50 msec.
- The precision of the ML model version should be over 92%.
- The (average) number of GPU cores consumed by a model should be under 5.0.
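For concreteness, here is a minimal Python sketch of what "metrics with acceptable limits" means; the metric names, limit values, and the `satisfies_objectives` helper are illustrative and are not part of Iter8's API.

```python
# Illustrative only: objectives expressed as metrics with acceptable limits.
# The metric names and limit values below are hypothetical examples.
objectives = [
    {"metric": "p99-latency-msec", "upper_limit": 50.0},
    {"metric": "precision", "lower_limit": 0.92},
    {"metric": "gpu-cores", "upper_limit": 5.0},
]

def satisfies_objectives(version_metrics: dict, objectives: list) -> bool:
    """Return True if this version respects every objective's limit."""
    for obj in objectives:
        value = version_metrics.get(obj["metric"])
        if value is None:
            return False  # a missing metric counts as not satisfying the objective
        if "upper_limit" in obj and value > obj["upper_limit"]:
            return False
        if "lower_limit" in obj and value < obj["lower_limit"]:
            return False
    return True

# A version with these observed values satisfies all three objectives.
print(satisfies_objectives(
    {"p99-latency-msec": 43.0, "precision": 0.94, "gpu-cores": 4.0}, objectives))
```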
Reward¶
Reward typically corresponds to a business metric which you wish to optimize during an A/B testing experiment. In Iter8 experiments, reward is specified as a metric along with a preferred direction, which could be `high` or `low`.
Examples
- User-engagement
- Conversion rate
- Click-through rate
- Revenue
- Precision, recall, or accuracy (for ML models)
- Number of GPU cores consumed by an ML model
All but the last example above have a preferred direction of `high`; the last example is that of a reward with a preferred direction of `low`.
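The following small sketch illustrates how the preferred direction changes which version is "best" for a reward metric; the version names and reward values are made up for illustration.

```python
# Illustrative only: pick the version with the best reward value,
# given the reward's preferred direction ("high" or "low").
def best_by_reward(reward_values: dict, preferred_direction: str) -> str:
    pick = max if preferred_direction == "high" else min
    return pick(reward_values, key=reward_values.get)

# Click-through rate: higher is better.
print(best_by_reward({"baseline": 0.031, "candidate": 0.037}, "high"))  # candidate
# GPU cores consumed: lower is better.
print(best_by_reward({"baseline": 4.0, "candidate": 5.5}, "low"))       # baseline
```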
Baseline and candidate versions¶
Every Iter8 experiment involves a baseline version and zero, one, or more candidate versions. Experiments often involve two versions, a baseline and a candidate, with the baseline corresponding to the stable version of your app and the candidate corresponding to a canary.
Testing strategy¶
Testing strategy determines how the winning version (winner) in an experiment is identified.
SLO validation¶
SLO validation experiments may involve a single version or two versions.
- SLO validation experiment with a baseline version and no candidate (conformance testing): if the baseline satisfies the objectives, it is the winner; otherwise, there is no winner.
- SLO validation experiment with baseline and candidate versions: if the candidate satisfies the objectives, it is the winner. Else, if the baseline satisfies the objectives, it is the winner. Else, there is no winner.
A/B testing¶
A/B testing experiments involve a baseline version, a candidate version, and a reward metric. The version which performs best in terms of the reward metric is the winner.
A/B/n testing¶
A/B/n testing experiments involve a baseline version, two or more candidate versions, and a reward metric. The version which performs best in terms of the reward metric is the winner.
Hybrid (A/B + SLOs) testing¶
Hybrid (A/B + SLOs) testing experiments combine A/B or A/B/n testing on the one hand with SLO validation on the other. Among the versions that satisfy objectives, the version which performs best in terms of the reward metric is the winner. If no version satisfies objectives, then there is no winner.
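This decision rule can be stated compactly in code. The sketch below is a conceptual restatement, not Iter8's implementation; the version names, reward values, and the precomputed SLO results are hypothetical inputs.

```python
# Illustrative only: hybrid (A/B + SLOs) winner selection.
# `satisfies_slos` maps version -> whether it satisfies all objectives;
# `rewards` maps version -> its reward value. Both inputs are hypothetical.
def hybrid_winner(satisfies_slos: dict, rewards: dict, preferred_direction: str = "high"):
    # Keep only the versions that satisfy the objectives (SLO validation).
    valid = [v for v in satisfies_slos if satisfies_slos[v]]
    if not valid:
        return None  # no version satisfies the objectives, so there is no winner
    # Among the remaining versions, the best reward wins (A/B or A/B/n testing).
    pick = max if preferred_direction == "high" else min
    return pick(valid, key=lambda v: rewards[v])

print(hybrid_winner(
    {"baseline": True, "candidate-1": True, "candidate-2": False},
    {"baseline": 0.031, "candidate-1": 0.037, "candidate-2": 0.045},
))  # candidate-1: candidate-2 has the highest reward but does not satisfy its SLOs
```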
Rollout strategy¶
Rollout strategy defines how traffic is split between versions during the experiment.
Iter8 makes it easy for you to take full advantage of all the traffic engineering features available in your K8s environment (i.e., those supported by the ingress or service mesh technology in your K8s cluster).
A few common rollout strategies used in Iter8 experiments are described below. In the following descriptions, `v1` and `v2` refer to the current and new versions of the application, respectively.
Simple rollout & rollback¶
This pattern is modeled after the rolling update of a Kubernetes deployment.
- After `v2` is deployed, it replaces `v1`.
- If `v2` is the winner of the experiment, it is retained.
- Else, `v2` is rolled back and `v1` is retained.

All traffic flows to `v2` during the experiment.
BlueGreen¶
- After `v2` is deployed, both `v1` and `v2` are available.
- All traffic is routed to `v2`.
- If `v2` is the winner of the experiment, all traffic continues to flow to `v2`.
- Else, all traffic is routed back to `v1`.
Dark launch¶
- After `v2` is deployed, it is hidden from end-users. `v2` is not used to serve end-user requests but can still be experimented with.
Built-in load/metrics¶
During the experiment, Iter8 generates load for `v2` and collects built-in metrics.
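To make the idea concrete, here is a rough sketch of load generation and metrics collection against a version's endpoint; the URL, request count, and the specific metrics computed are assumptions for illustration and do not describe Iter8's built-in metrics implementation.

```python
# Illustrative only: generate synthetic load against a (hypothetical) v2
# endpoint and compute simple latency/error metrics from the responses.
import statistics
import time
import urllib.request

def load_test(url: str, num_requests: int = 100) -> dict:
    latencies, errors = [], 0
    for _ in range(num_requests):
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                resp.read()
        except Exception:
            errors += 1
        latencies.append((time.monotonic() - start) * 1000.0)  # msec
    return {
        "mean-latency-msec": statistics.mean(latencies),
        "p99-latency-msec": statistics.quantiles(latencies, n=100)[98],
        "error-rate": errors / num_requests,
    }

# e.g. load_test("http://v2.default.svc.cluster.local:8080")  # hypothetical service URL
```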
Traffic mirroring (shadowing)¶
Mirrored traffic is a replica of the real user requests[^1] that is routed to `v2` and used to collect metrics for `v2`.
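The sketch below illustrates the mirroring idea at the application level: the user's response always comes from `v1`, while a copy of (a fraction of) requests is sent to `v2` only so that its metrics can be observed. The endpoints and the `handle` function are hypothetical; in practice, mirroring is typically configured in the service mesh or ingress rather than written in application code.

```python
# Illustrative only: serve every request from v1, and asynchronously send a
# copy ("shadow") of a fraction of requests to v2 so metrics can be collected.
import random
import threading
import urllib.request

V1_URL = "http://v1.example.internal"  # hypothetical endpoints
V2_URL = "http://v2.example.internal"
MIRROR_PERCENT = 100  # it is possible to mirror only a percentage of requests

def handle(path: str) -> bytes:
    if random.uniform(0, 100) < MIRROR_PERCENT:
        # Fire-and-forget: the mirrored response is never returned to the user.
        threading.Thread(
            target=lambda: urllib.request.urlopen(V2_URL + path, timeout=5).read(),
            daemon=True,
        ).start()
    # The end-user response always comes from v1.
    return urllib.request.urlopen(V1_URL + path, timeout=5).read()
```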
Canary¶
Canary deployment involves exposing `v2` to a small fraction of end-user requests during the experiment, before exposing it to a larger fraction of requests or to all requests.
Fixed-%-split¶
A fixed % of end-user requests is sent to `v2` and the rest is sent to `v1`.
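A tiny sketch of this routing decision, assuming a hypothetical 20% split; in a real experiment the split is enforced by the ingress or service mesh rather than by application code.

```python
# Illustrative only: route a fixed percentage of requests to v2, the rest to v1.
import random

V2_PERCENT = 20  # hypothetical split: 20% of requests go to v2

def choose_version() -> str:
    return "v2" if random.uniform(0, 100) < V2_PERCENT else "v1"

# Roughly 20% of choices land on v2.
print(sum(choose_version() == "v2" for _ in range(10_000)) / 10_000)
```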
Fixed-%-split with user segmentation¶
- Only a specific segment of the users participates in the experiment.
- A fixed % of requests from the participating segment is sent to `v2`; the rest is sent to `v1`.
- All requests from end-users in the non-participating segment are sent to `v1`.
Progressive traffic shift¶
Traffic is incrementally shifted to the winner over multiple iterations.
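The sketch below illustrates one possible weight schedule for a progressive shift; the step size, cap, and number of iterations are illustrative assumptions, not Iter8 defaults.

```python
# Illustrative only: shift traffic toward the current winner a little more
# in each experiment iteration, up to a maximum share.
def progressive_weights(iterations: int, step: int = 20, max_share: int = 100):
    winner_share = 0
    for i in range(1, iterations + 1):
        winner_share = min(winner_share + step, max_share)
        yield i, {"winner": winner_share, "other": 100 - winner_share}

for iteration, weights in progressive_weights(5):
    print(iteration, weights)
# 1 {'winner': 20, 'other': 80} ... 5 {'winner': 100, 'other': 0}
```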
Progressive traffic shift with user segmentation¶
- Only a specific segment of the users participates in the experiment.
- Within this segment, traffic is incrementally shifted to the winner over multiple iterations.
- All requests from end-users in the non-participating segment are sent to `v1`.
Session affinity¶
Session affinity, sometimes referred to as sticky sessions, routes all requests coming from an end-user to the same version consistently throughout the experiment.
User grouping for affinity can be configured based on a number of different attributes of the request, including request headers, cookies, query parameters, geolocation, user agent (browser version, screen size, operating system), and language.
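A small sketch of one way sticky routing can be keyed on such an attribute: hashing a user identifier (here, an assumed user-ID value such as one taken from a header or cookie) deterministically pins each user to a version. The split percentage and hashing scheme are illustrative.

```python
# Illustrative only: sticky routing keyed on a user identifier taken from the
# request (e.g. a header or cookie). The same user always sees the same version.
import hashlib

V2_PERCENT = 20  # hypothetical: 20% of users are pinned to v2

def version_for_user(user_id: str) -> str:
    # Stable hash of the user identifier -> a bucket in [0, 100).
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "v2" if bucket < V2_PERCENT else "v1"

# Repeated requests from the same user are routed consistently.
print(version_for_user("user-42"), version_for_user("user-42"))
```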
Version promotion¶
Iter8 can automatically promote the winning version at the end of an experiment.
[^1]: It is possible to mirror only a certain percentage of the requests instead of all requests.