Kotlin Notebook

Try this guide as a Kotlin Notebook with Kandy visualizations: run the cells to see charts and explore the data interactively.

This guide walks through an A/B test comparing two checkout flow variants in a mobile app. The primary metric is session duration (seconds); the secondary metric is the number of completed checkout steps. Each group contains 20 sessions.

Experiment Data

// Variant A (control): original checkout flow
val controlDurationSec = doubleArrayOf(
    34.2, 41.5, 38.7, 45.1, 36.9, 42.3, 39.8, 44.6, 37.4, 40.1,
    43.2, 35.8, 41.9, 38.3, 46.0, 39.5, 42.7, 37.1, 40.8, 44.3
)

// Variant B (treatment): simplified checkout flow
val treatmentDurationSec = doubleArrayOf(
    29.1, 33.8, 31.5, 35.2, 28.7, 32.4, 30.9, 34.6, 29.8, 33.1,
    31.2, 27.5, 34.0, 30.3, 36.1, 31.8, 33.5, 28.9, 32.7, 35.8
)

Step 1: Summarize Both Groups

val controlSummary = controlDurationSec.describe()
val treatmentSummary = treatmentDurationSec.describe()

controlSummary.mean                // control average
treatmentSummary.mean              // treatment average
controlSummary.standardDeviation   // control spread
treatmentSummary.standardDeviation // treatment spread
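
To make the summary fields concrete, here is a minimal plain-Kotlin sketch of the two quantities read above. It assumes the reported standard deviation is the sample version with an n - 1 denominator, which the guide does not state. `mean` and `sampleStdDev` are illustrative helpers, not library functions.

```kotlin
import kotlin.math.sqrt

// Arithmetic mean of a sample.
fun mean(xs: DoubleArray): Double = xs.sum() / xs.size

// Sample standard deviation (n - 1 denominator). Assumption: this matches
// what the summary reports; check the library documentation to confirm.
fun sampleStdDev(xs: DoubleArray): Double {
    val m = mean(xs)
    val sumSq = xs.sumOf { (it - m) * (it - m) }
    return sqrt(sumSq / (xs.size - 1))
}

fun main() {
    val control = doubleArrayOf(
        34.2, 41.5, 38.7, 45.1, 36.9, 42.3, 39.8, 44.6, 37.4, 40.1,
        43.2, 35.8, 41.9, 38.3, 46.0, 39.5, 42.7, 37.1, 40.8, 44.3
    )
    val treatment = doubleArrayOf(
        29.1, 33.8, 31.5, 35.2, 28.7, 32.4, 30.9, 34.6, 29.8, 33.1,
        31.2, 27.5, 34.0, 30.3, 36.1, 31.8, 33.5, 28.9, 32.7, 35.8
    )
    println("control:   mean=${mean(control)}, sd=${sampleStdDev(control)}")
    println("treatment: mean=${mean(treatment)}, sd=${sampleStdDev(treatment)}")
    println("difference in means = ${mean(control) - mean(treatment)}")
}
```

On the experiment data the two means come out at about 40.5 s for control and 32.0 s for treatment, which is the gap the later tests evaluate.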

Step 2: Check Assumptions

Normality

val controlNormality = shapiroWilkTest(controlDurationSec)
val treatmentNormality = shapiroWilkTest(treatmentDurationSec)

controlNormality.pValue   // p < 0.05 suggests the control data departs from normality
treatmentNormality.pValue // same check for the treatment group

Variance homogeneity

val variances = leveneTest(controlDurationSec, treatmentDurationSec)
variances.pValue // p < 0.05 suggests the two groups have different variances
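
One way to turn the assumption checks into a test choice is sketched below. This is an illustrative decision rule that mirrors the flow of this guide, not a library feature: `TestChoice` and `chooseTest` are hypothetical names, and the 0.05 threshold is an assumed default. Many practitioners simply use Welch's t-test whenever normality is plausible, regardless of the variance check.

```kotlin
// Hypothetical labels for the tests used in this guide.
enum class TestChoice { STUDENT_T, WELCH_T, MANN_WHITNEY_U }

// Picks a test from the assumption-check p-values.
// - Evidence against normality in either group: rank-based test.
// - Evidence of unequal variances: Welch's t-test.
// - Otherwise: pooled-variance Student's t-test.
fun chooseTest(
    normalityPControl: Double,
    normalityPTreatment: Double,
    leveneP: Double,
    alpha: Double = 0.05
): TestChoice = when {
    normalityPControl < alpha || normalityPTreatment < alpha -> TestChoice.MANN_WHITNEY_U
    leveneP < alpha -> TestChoice.WELCH_T
    else -> TestChoice.STUDENT_T
}

fun main() {
    // Hypothetical p-values for illustration, not results from the data above.
    println(chooseTest(normalityPControl = 0.40, normalityPTreatment = 0.55, leveneP = 0.02))
}
```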

Step 3: Choose and Run the Test

// Welch's t-test (default: equalVariances = false)
val result = tTest(controlDurationSec, treatmentDurationSec)

result.statistic          // t statistic
result.pValue             // two-sided p-value
result.confidenceInterval // 95% CI for the difference in means
result.isSignificant()    // true if p < 0.05

If Levene's test does not reject equal variances (p ≥ 0.05), you can use the pooled-variance Student's t-test instead:
val equalVar = tTest(
    controlDurationSec,
    treatmentDurationSec,
    equalVariances = true
)
equalVar.pValue
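
For readers who want to see what the Welch variant computes, the following plain-Kotlin sketch derives the t statistic and the Welch-Satterthwaite degrees of freedom by hand. `WelchStat` and `welchStatistic` are illustrative names, not part of the library, and the p-value step (which needs the t distribution) is left out.

```kotlin
import kotlin.math.sqrt

fun mean(xs: DoubleArray): Double = xs.sum() / xs.size

// Unbiased sample variance (n - 1 denominator).
fun sampleVariance(xs: DoubleArray): Double {
    val m = mean(xs)
    return xs.sumOf { (it - m) * (it - m) } / (xs.size - 1)
}

// Hypothetical result holder for this sketch.
data class WelchStat(val t: Double, val df: Double)

fun welchStatistic(a: DoubleArray, b: DoubleArray): WelchStat {
    val va = sampleVariance(a) / a.size // squared standard error of mean(a)
    val vb = sampleVariance(b) / b.size // squared standard error of mean(b)
    val t = (mean(a) - mean(b)) / sqrt(va + vb)
    // Welch-Satterthwaite approximation to the degrees of freedom.
    val df = (va + vb) * (va + vb) /
        (va * va / (a.size - 1) + vb * vb / (b.size - 1))
    return WelchStat(t, df)
}

fun main() {
    val control = doubleArrayOf(
        34.2, 41.5, 38.7, 45.1, 36.9, 42.3, 39.8, 44.6, 37.4, 40.1,
        43.2, 35.8, 41.9, 38.3, 46.0, 39.5, 42.7, 37.1, 40.8, 44.3
    )
    val treatment = doubleArrayOf(
        29.1, 33.8, 31.5, 35.2, 28.7, 32.4, 30.9, 34.6, 29.8, 33.1,
        31.2, 27.5, 34.0, 30.3, 36.1, 31.8, 33.5, 28.9, 32.7, 35.8
    )
    println(welchStatistic(control, treatment))
}
```

Unlike the pooled test, the degrees of freedom here are generally not an integer and shrink as the two variances diverge.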

One-sided tests

When the hypothesis, fixed before looking at the data, is that the treatment reduces session duration, use a one-sided alternative:
val oneSided = tTest(
    controlDurationSec,
    treatmentDurationSec,
    alternative = Alternative.GREATER // control > treatment
)
oneSided.pValue
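
For a symmetric test statistic such as t, the one-sided p-value follows from the two-sided one and the sign of the statistic. The sketch below shows that relation for the "greater" alternative. `oneSidedGreaterP` is a hypothetical helper for illustration, not part of the library.

```kotlin
// One-sided p-value for H1: mean(first) > mean(second), derived from the
// two-sided p-value of a symmetric statistic (for example t).
fun oneSidedGreaterP(twoSidedP: Double, statistic: Double): Double {
    require(twoSidedP in 0.0..1.0) { "p-value must be in [0, 1]" }
    // If the observed difference points in the hypothesized direction, the
    // one-sided p-value is half the two-sided one; otherwise it is the rest.
    return if (statistic > 0) twoSidedP / 2 else 1 - twoSidedP / 2
}

fun main() {
    // Hypothetical inputs for illustration, not results from the data above.
    println(oneSidedGreaterP(twoSidedP = 0.04, statistic = 2.1))
    println(oneSidedGreaterP(twoSidedP = 0.04, statistic = -2.1))
}
```

The second call shows why the direction must be chosen in advance: a large effect in the unexpected direction yields a p-value close to 1.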

Step 4: Test a Second Metric

Apply the same workflow to the secondary metric.
// Number of completed checkout steps per session
val controlSteps = doubleArrayOf(
    3.0, 4.0, 3.0, 5.0, 3.0, 4.0, 4.0, 5.0, 3.0, 4.0,
    4.0, 3.0, 4.0, 3.0, 5.0, 4.0, 4.0, 3.0, 4.0, 5.0
)
val treatmentSteps = doubleArrayOf(
    5.0, 5.0, 4.0, 5.0, 5.0, 5.0, 4.0, 5.0, 5.0, 5.0,
    4.0, 5.0, 5.0, 4.0, 5.0, 5.0, 5.0, 4.0, 5.0, 5.0
)

// Discrete step counts are typically non-normal
shapiroWilkTest(controlSteps).pValue

val stepsResult = mannWhitneyUTest(controlSteps, treatmentSteps)
stepsResult.pValue
stepsResult.isSignificant()
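
The Mann-Whitney U test compares ranks rather than raw values, which is why it tolerates discrete, tied data like step counts. A minimal plain-Kotlin sketch of the U statistic with average ranks for ties follows. `averageRanks` and `mannWhitneyU` are illustrative helpers; the library may report U for the other sample or the smaller of the two, so treat the convention here as an assumption.

```kotlin
// Assigns 1-based ranks; tied values share the average of their ranks.
fun averageRanks(values: DoubleArray): DoubleArray {
    val order = values.indices.sortedBy { values[it] }
    val ranks = DoubleArray(values.size)
    var i = 0
    while (i < order.size) {
        var j = i
        // Extend j to the end of the run of values equal to values[order[i]].
        while (j + 1 < order.size && values[order[j + 1]] == values[order[i]]) j++
        val avg = (i + j) / 2.0 + 1.0 // mean of ranks i+1 .. j+1
        for (k in i..j) ranks[order[k]] = avg
        i = j + 1
    }
    return ranks
}

// U statistic for the first sample: its rank sum minus the minimum possible.
fun mannWhitneyU(a: DoubleArray, b: DoubleArray): Double {
    val ranks = averageRanks(a + b)
    val rankSumA = ranks.take(a.size).sum()
    return rankSumA - a.size * (a.size + 1) / 2.0
}

fun main() {
    // Small toy samples for illustration.
    val a = doubleArrayOf(3.0, 4.0, 3.0, 5.0)
    val b = doubleArrayOf(5.0, 5.0, 4.0, 5.0)
    println("U(a) = ${mannWhitneyU(a, b)}, U(b) = ${mannWhitneyU(b, a)}")
}
```

The two U values always sum to the product of the sample sizes, so either one determines the other.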

Step 5: Correlation Between Metrics

Check whether the two metrics move together within a group. The example uses the treatment group and assumes that element i of both arrays comes from the same session.
// Within the treatment group: do shorter sessions go with more completed steps?
val correlation = spearmanCorrelation(treatmentDurationSec, treatmentSteps)

correlation.coefficient // negative means shorter sessions correlate with more steps
correlation.pValue

Spearman correlation is preferred here because the step count is a discrete variable with many ties; a rank-based measure assumes neither normality nor a linear relationship.
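
Spearman's coefficient can be understood as a Pearson correlation computed on ranks. The plain-Kotlin sketch below shows that construction with average ranks for ties. `averageRanks`, `pearson`, and `spearman` are illustrative helpers, not the library's implementation.

```kotlin
import kotlin.math.sqrt

// Assigns 1-based ranks; tied values share the average of their ranks.
fun averageRanks(values: DoubleArray): DoubleArray {
    val order = values.indices.sortedBy { values[it] }
    val ranks = DoubleArray(values.size)
    var i = 0
    while (i < order.size) {
        var j = i
        while (j + 1 < order.size && values[order[j + 1]] == values[order[i]]) j++
        val avg = (i + j) / 2.0 + 1.0 // mean of ranks i+1 .. j+1
        for (k in i..j) ranks[order[k]] = avg
        i = j + 1
    }
    return ranks
}

// Pearson product-moment correlation of two equally long samples.
fun pearson(x: DoubleArray, y: DoubleArray): Double {
    require(x.size == y.size) { "samples must have the same length" }
    val mx = x.average()
    val my = y.average()
    var sxy = 0.0
    var sxx = 0.0
    var syy = 0.0
    for (i in x.indices) {
        val dx = x[i] - mx
        val dy = y[i] - my
        sxy += dx * dy
        sxx += dx * dx
        syy += dy * dy
    }
    return sxy / sqrt(sxx * syy)
}

// Spearman correlation: Pearson correlation of the rank-transformed data.
fun spearman(x: DoubleArray, y: DoubleArray): Double =
    pearson(averageRanks(x), averageRanks(y))

fun main() {
    // Toy paired data for illustration: longer durations, fewer steps.
    val duration = doubleArrayOf(29.1, 33.8, 31.5, 35.2)
    val steps = doubleArrayOf(5.0, 4.0, 5.0, 4.0)
    println(spearman(duration, steps))
}
```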

Paired Before/After Comparison

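When the same users are measured before and after a change, use paired tests. Both arrays must list the users in the same order, so that element i of each array belongs to the same user.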
val beforeMs = doubleArrayOf(
    340.2, 415.0, 387.1, 451.3, 369.5, 423.8, 398.0, 446.2, 374.1, 401.5
)
val afterMs = doubleArrayOf(
    310.5, 380.2, 355.8, 410.7, 335.1, 392.4, 365.3, 405.9, 340.8, 371.6
)

val paired = pairedTTest(beforeMs, afterMs)
paired.pValue
paired.confidenceInterval

// Non-parametric alternative
val wilcoxon = wilcoxonSignedRankTest(beforeMs, afterMs)
wilcoxon.pValue
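
A paired t-test is a one-sample t-test on the per-user differences, which the following plain-Kotlin sketch makes explicit. `pairedTStatistic` is an illustrative helper, and the sign convention (before minus after) is an assumption; the library may use the opposite order.

```kotlin
import kotlin.math.sqrt

// Paired t statistic: mean difference divided by its standard error.
fun pairedTStatistic(before: DoubleArray, after: DoubleArray): Double {
    require(before.size == after.size) { "paired samples need equal length" }
    // Per-user differences; positive values mean the measurement decreased.
    val diffs = DoubleArray(before.size) { before[it] - after[it] }
    val n = diffs.size
    val meanDiff = diffs.sum() / n
    val sd = sqrt(diffs.sumOf { (it - meanDiff) * (it - meanDiff) } / (n - 1))
    return meanDiff / (sd / sqrt(n.toDouble()))
}

fun main() {
    val beforeMs = doubleArrayOf(
        340.2, 415.0, 387.1, 451.3, 369.5, 423.8, 398.0, 446.2, 374.1, 401.5
    )
    val afterMs = doubleArrayOf(
        310.5, 380.2, 355.8, 410.7, 335.1, 392.4, 365.3, 405.9, 340.8, 371.6
    )
    println("t = ${pairedTStatistic(beforeMs, afterMs)} with ${beforeMs.size - 1} degrees of freedom")
}
```

Because between-user variation cancels out in the differences, a paired test is usually far more sensitive than an unpaired test on the same numbers.
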
Last modified on March 22, 2026