Testing Assumptions

Kotlin Notebook

Try this guide as a Kotlin Notebook with Kandy visualizations — run the cells to see charts and explore the data interactively.

Parametric methods (t-tests, ANOVA, Pearson correlation) assume specific properties of the data. This guide covers three common checks (normality, variance homogeneity, and goodness-of-fit) and the kstats functions for each.

Normality

Available tests

| Test | Function | Best for |
| --- | --- | --- |
| Shapiro-Wilk | `shapiroWilkTest(sample)` | General-purpose default, samples up to ~5000 |
| Anderson-Darling | `andersonDarlingTest(sample)` | Sensitive to tail deviations |
| D’Agostino-Pearson | `dagostinoPearsonTest(sample)` | Omnibus check via skewness and kurtosis, n > 20 |
| Jarque-Bera | `jarqueBeraTest(sample)` | Large samples, checks skewness and kurtosis jointly |

Run all four on the same dataset

val sensorReadings = doubleArrayOf(
    150.2, 151.8, 149.6, 152.1, 150.9, 151.3, 149.8, 152.5, 150.4, 151.1,
    150.7, 149.5, 151.6, 150.0, 152.3, 151.0, 149.9, 150.8, 151.5, 150.3,
    151.2, 149.7, 150.6, 152.0, 150.1, 151.4, 149.4, 151.9, 150.5, 151.7
)

val shapiro = shapiroWilkTest(sensorReadings)
val anderson = andersonDarlingTest(sensorReadings)
val dagostino = dagostinoPearsonTest(sensorReadings)
val jarqueBera = jarqueBeraTest(sensorReadings)

shapiro.pValue    // Shapiro-Wilk
anderson.pValue   // Anderson-Darling
dagostino.pValue  // D'Agostino-Pearson
jarqueBera.pValue // Jarque-Bera

When tests disagree, prefer Shapiro-Wilk for small to medium samples and Anderson-Darling when tail behavior matters.

Combine with descriptive statistics

val summary = sensorReadings.describe()
summary.skewness // close to 0 for symmetric data
summary.kurtosis // close to 0 (excess) for Normal-like tails
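To make these two summary values concrete, here is a minimal sketch that computes moment-based skewness and excess kurtosis by hand and combines them into the Jarque-Bera statistic, JB = n/6 * (S^2 + K^2/4). The names `ShapeSummary` and `shapeSummary` are hypothetical and not part of kstats. The sketch uses the biased (divide by n) moment estimators, so its values may differ slightly from what `describe()` reports if kstats applies bias corrections, and the p-value relies on the asymptotic chi-squared approximation with 2 degrees of freedom, which can be rough for small samples.

```kotlin
import kotlin.math.exp
import kotlin.math.pow

// Hypothetical helper types, not part of kstats.
data class ShapeSummary(
    val skewness: Double,
    val excessKurtosis: Double,
    val jarqueBera: Double,
    val approxPValue: Double
)

fun shapeSummary(sample: DoubleArray): ShapeSummary {
    require(sample.size >= 3) { "need at least 3 observations" }
    val n = sample.size.toDouble()
    val mean = sample.average()

    // Central moments, dividing by n (biased estimators).
    val m2 = sample.sumOf { (it - mean).pow(2) } / n
    val m3 = sample.sumOf { (it - mean).pow(3) } / n
    val m4 = sample.sumOf { (it - mean).pow(4) } / n
    require(m2 > 0.0) { "sample has zero variance" }

    val skewness = m3 / m2.pow(1.5)
    val excessKurtosis = m4 / (m2 * m2) - 3.0 // 0 for a Normal distribution

    // JB = n/6 * (S^2 + K^2/4); asymptotically chi-squared with 2 df under normality.
    val jb = n / 6.0 * (skewness * skewness + excessKurtosis * excessKurtosis / 4.0)

    // The chi-squared(2) upper tail has the closed form exp(-x / 2).
    val approxPValue = exp(-jb / 2.0)

    return ShapeSummary(skewness, excessKurtosis, jb, approxPValue)
}

// A symmetric toy sample: skewness is 0, excess kurtosis is about -1.3.
val shapeExample = shapeSummary(doubleArrayOf(1.0, 2.0, 3.0, 4.0, 5.0))
```

Both terms in JB are squared, so the statistic grows with asymmetry in either direction and with tails that are either heavier or lighter than Normal.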

Variance Homogeneity

When comparing groups (t-test, ANOVA), equal variances are often assumed.

| Test | Function | Assumes normality? |
| --- | --- | --- |
| Levene | `leveneTest(group1, group2, ...)` | No, robust to non-normality |
| Bartlett | `bartlettTest(group1, group2, ...)` | Yes, more powerful when data is normal |
| Fligner-Killeen | `flignerKilleenTest(group1, group2, ...)` | No, non-parametric and median-based |

Check variances before ANOVA

val batchA = doubleArrayOf(48.2, 47.8, 49.1, 48.5, 47.9, 48.7, 48.3, 49.0, 48.1, 48.6)
val batchB = doubleArrayOf(51.3, 50.8, 52.1, 51.0, 51.7, 50.5, 51.9, 51.2, 50.9, 51.5)
val batchC = doubleArrayOf(49.5, 50.2, 49.8, 50.0, 49.3, 50.4, 49.7, 50.1, 49.6, 50.3)

val levene = leveneTest(batchA, batchB, batchC)
val bartlett = bartlettTest(batchA, batchB, batchC)
val fligner = flignerKilleenTest(batchA, batchB, batchC)

levene.pValue   // Levene
bartlett.pValue // Bartlett
fligner.pValue  // Fligner-Killeen

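A p-value above your chosen significance level (for example 0.05) from all three tests means there is no evidence against equal variances, which supports proceeding with ANOVA or an equal-variance t-test.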
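To show what a variance homogeneity test measures, the following sketch computes Levene's W statistic by hand, using the median-centered (Brown-Forsythe) variant. The functions `medianOf` and `leveneW` are hypothetical names for illustration, not kstats APIs, and kstats may center on the mean or the median by default, so treat this as a sketch of the idea and not a reproduction of `leveneTest`.

```kotlin
import kotlin.math.abs

// Hypothetical helpers, not part of kstats.
fun medianOf(values: DoubleArray): Double {
    require(values.isNotEmpty()) { "empty sample" }
    val sorted = values.sortedArray()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[mid] else (sorted[mid - 1] + sorted[mid]) / 2.0
}

fun leveneW(vararg groups: DoubleArray): Double {
    require(groups.size >= 2) { "need at least two groups" }
    require(groups.all { it.size >= 2 }) { "each group needs at least two observations" }
    val k = groups.size
    val totalN = groups.sumOf { it.size }

    // Z_ij = |Y_ij - median_i|: absolute deviation from the group's own median.
    val deviations = groups.map { g ->
        val med = medianOf(g)
        DoubleArray(g.size) { i -> abs(g[i] - med) }
    }
    val groupMeans = deviations.map { it.average() }
    val grandMean = deviations.sumOf { it.sum() } / totalN

    // One-way ANOVA decomposition applied to the deviations.
    val between = deviations.indices.sumOf { i ->
        val d = groupMeans[i] - grandMean
        deviations[i].size * d * d
    }
    val within = deviations.indices.sumOf { i ->
        deviations[i].sumOf { z -> (z - groupMeans[i]) * (z - groupMeans[i]) }
    }
    require(within > 0.0) { "deviations have no within-group spread" }

    return (totalN - k).toDouble() / (k - 1) * between / within
}

// Two toy groups with the same center but different spread: W = 0.8.
val leveneExample = leveneW(doubleArrayOf(1.0, 2.0, 3.0), doubleArrayOf(0.0, 2.0, 4.0))
```

Under the null hypothesis, W is compared against an F distribution with k - 1 and N - k degrees of freedom. Shifting a whole group up or down leaves W unchanged, because only the spread around each group's median enters the statistic.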

Then run ANOVA

val anova = oneWayAnova(batchA, batchB, batchC)
anova.fStatistic
anova.pValue
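For readers who want to see where the F statistic comes from, here is a small sketch of the one-way ANOVA ratio: the between-group mean square divided by the within-group mean square. The function name `oneWayF` is hypothetical and not part of kstats; it returns only the statistic and does not compute a p-value.

```kotlin
// Hypothetical helper, not part of kstats.
fun oneWayF(vararg groups: DoubleArray): Double {
    val k = groups.size
    val totalN = groups.sumOf { it.size }
    require(k >= 2) { "need at least two groups" }
    require(groups.all { it.isNotEmpty() }) { "groups must be non-empty" }
    require(totalN > k) { "need more observations than groups" }

    val grandMean = groups.sumOf { it.sum() } / totalN

    // Between-group sum of squares: how far each group mean sits from the grand mean.
    val ssBetween = groups.sumOf { g ->
        val d = g.average() - grandMean
        g.size * d * d
    }
    // Within-group sum of squares: spread of observations around their own group mean.
    val ssWithin = groups.sumOf { g ->
        val m = g.average()
        g.sumOf { (it - m) * (it - m) }
    }
    require(ssWithin > 0.0) { "no within-group variation" }

    val msBetween = ssBetween / (k - 1)
    val msWithin = ssWithin / (totalN - k)
    return msBetween / msWithin
}

// Toy groups with means 2, 3, 4 and identical spread: F = 3.0.
val fExample = oneWayF(
    doubleArrayOf(1.0, 2.0, 3.0),
    doubleArrayOf(2.0, 3.0, 4.0),
    doubleArrayOf(3.0, 4.0, 5.0)
)
```

The within-group mean square pools the variances of all groups into one estimate, which is why the equal-variance check comes before this step.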

Goodness-of-Fit

Kolmogorov-Smirnov test

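Compare observed data against a theoretical distribution. Note that when the distribution's parameters are estimated from the same sample, as in the snippet that follows, the standard KS p-value tends to be too large (conservative); the Lilliefors variant of the test corrects for this.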

val temperatureReadings = doubleArrayOf(
    155.2, 154.8, 156.1, 155.5, 154.3, 155.9, 155.0, 156.3, 154.7, 155.4,
    155.8, 154.5, 156.0, 155.3, 154.9, 155.7, 155.1, 156.2, 154.6, 155.6
)

// Fit Normal from sample
val fitted = NormalDistribution(
    mu = temperatureReadings.mean(),
    sigma = temperatureReadings.standardDeviation()
)

val ks = kolmogorovSmirnovTest(temperatureReadings, fitted)
ks.statistic // smaller means better fit
ks.pValue
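The KS statistic itself is the largest vertical gap between the empirical CDF of the sample and the reference CDF. The sketch below computes it by hand for any CDF passed in as a function. The names `ksStatistic` and `uniformCdf` are hypothetical and not part of kstats, and a Uniform(0, 1) reference is used only because its CDF needs no special functions.

```kotlin
import kotlin.math.max

// Hypothetical helper, not part of kstats.
fun ksStatistic(sample: DoubleArray, cdf: (Double) -> Double): Double {
    require(sample.isNotEmpty()) { "empty sample" }
    val sorted = sample.sortedArray()
    val n = sorted.size
    var d = 0.0
    for (i in sorted.indices) {
        val f = cdf(sorted[i])
        // The empirical CDF jumps from i/n to (i+1)/n at the i-th sorted point,
        // so the gap has to be checked on both sides of the jump.
        val gapAbove = (i + 1).toDouble() / n - f
        val gapBelow = f - i.toDouble() / n
        d = max(d, max(gapAbove, gapBelow))
    }
    return d
}

// CDF of the Uniform(0, 1) distribution.
val uniformCdf: (Double) -> Double = { x -> x.coerceIn(0.0, 1.0) }

// The largest gap is at x = 0.7, where the empirical CDF reaches 1.0: D is about 0.3.
val ksExample = ksStatistic(doubleArrayOf(0.1, 0.4, 0.7), uniformCdf)
```

Because D is a maximum over the whole range, a single region of poor fit is enough to produce a large statistic, even when the rest of the sample follows the reference closely.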

Chi-squared goodness-of-fit

Test whether observed category counts match expected proportions.

// Defect counts across 5 product categories
val observedDefects = intArrayOf(12, 18, 25, 15, 30)

// Test against uniform expectation (null = equal probability per category)
val uniform = chiSquaredTest(observedDefects)
uniform.pValue

// Test against specific expected counts
val expectedCounts = doubleArrayOf(20.0, 20.0, 20.0, 20.0, 20.0)
val specific = chiSquaredTest(observedDefects, expectedCounts)
specific.pValue
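As a cross-check on what the test computes, the statistic is the sum of (observed - expected)^2 / expected over all categories, compared against a chi-squared distribution with (categories - 1) degrees of freedom. The sketch below reproduces that arithmetic for the defect counts above. Both function names are hypothetical and not part of kstats, and the tail-probability helper only handles even degrees of freedom, where a closed form exists.

```kotlin
import kotlin.math.exp

// Hypothetical helpers, not part of kstats.
fun chiSquaredStatistic(observed: IntArray, expected: DoubleArray): Double {
    require(observed.size == expected.size) { "observed and expected must align" }
    require(expected.all { it > 0.0 }) { "expected counts must be positive" }
    return observed.indices.sumOf { i ->
        val diff = observed[i] - expected[i]
        diff * diff / expected[i]
    }
}

// Upper-tail probability of a chi-squared distribution, even df only:
// P(X > x) = exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
fun chiSquaredUpperTailEvenDf(x: Double, df: Int): Double {
    require(x >= 0.0) { "statistic must be non-negative" }
    require(df > 0 && df % 2 == 0) { "this sketch supports even df only" }
    val half = x / 2.0
    var term = 1.0
    var sum = 1.0
    for (j in 1 until df / 2) {
        term *= half / j
        sum += term
    }
    return exp(-half) * sum
}

val defectCounts = intArrayOf(12, 18, 25, 15, 30)
val uniformExpected = DoubleArray(defectCounts.size) { 20.0 } // 100 defects over 5 categories

// (64 + 4 + 25 + 25 + 100) / 20 = 10.9
val chiSquaredExample = chiSquaredStatistic(defectCounts, uniformExpected)

// 5 categories give 4 degrees of freedom; the p-value is about 0.028.
val chiSquaredPValueExample = chiSquaredUpperTailEvenDf(chiSquaredExample, df = 4)
```

The expected counts must sum to the same total as the observed counts (100 here), otherwise the statistic does not measure a difference in proportions.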

Two-sample KS test

Compare two samples without assuming a specific distribution.

val morningReadings = doubleArrayOf(
    155.2, 154.8, 156.1, 155.5, 154.3, 155.9, 155.0, 156.3, 154.7, 155.4
)
val nightReadings = doubleArrayOf(
    156.1, 155.3, 157.0, 156.5, 155.8, 156.8, 155.5, 157.2, 155.9, 156.3
)

val twoSampleKs = kolmogorovSmirnovTest(morningReadings, nightReadings)
twoSampleKs.pValue // low p-value suggests different underlying distributions
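The two-sample statistic is the largest gap between the two empirical CDFs. A minimal sketch follows, assuming the usual definition D = max |F_a(x) - F_b(x)|; the function name `twoSampleKsStatistic` is hypothetical and not part of kstats. The quadratic-time loop is chosen for clarity, and a production version would sort and merge the samples instead.

```kotlin
import kotlin.math.abs
import kotlin.math.max

// Hypothetical helper, not part of kstats.
fun twoSampleKsStatistic(a: DoubleArray, b: DoubleArray): Double {
    require(a.isNotEmpty() && b.isNotEmpty()) { "both samples must be non-empty" }
    var d = 0.0
    // The empirical CDFs only change at observed values, so it is enough
    // to evaluate the gap at every point of the pooled sample.
    for (v in a + b) {
        val fa = a.count { it <= v }.toDouble() / a.size
        val fb = b.count { it <= v }.toDouble() / b.size
        d = max(d, abs(fa - fb))
    }
    return d
}

// Completely separated samples give the maximum possible value, D = 1.0.
val separatedExample = twoSampleKsStatistic(
    doubleArrayOf(1.0, 2.0, 3.0),
    doubleArrayOf(4.0, 5.0, 6.0)
)
```

D ranges from 0 (identical empirical distributions) to 1 (no overlap), and it responds to differences in location, spread, or shape alike.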
