Kotlin Notebook

Try this guide as a Kotlin Notebook with Kandy visualizations — run the cells to see charts and explore the data interactively.
Every distribution encodes assumptions about what values are possible and how likely they are. This guide groups distributions by the kind of data they model and shows how to verify the fit.

Decision Table

| Data type | Example domain | Distribution | Constructor |
| --- | --- | --- | --- |
| Symmetric measurements around a center | User session duration | NormalDistribution | NormalDistribution(mu, sigma) |
| Heavier tails than Normal, small samples | Estimated means with few observations | StudentTDistribution | StudentTDistribution(degreesOfFreedom) |
| Positive waiting times or durations | Time between server errors | ExponentialDistribution | ExponentialDistribution(rate) |
| Positive durations with wear-out pattern | Hardware component lifetime | WeibullDistribution | WeibullDistribution(shape, scale) |
| Right-skewed positive values | API response times | LogNormalDistribution | LogNormalDistribution(mu, sigma) |
| Proportions or rates in [0, 1] | Click-through rate | BetaDistribution | BetaDistribution(alpha, beta) |
| Event counts per interval | Errors per hour | PoissonDistribution | PoissonDistribution(rate) |
| Successes in fixed trials | Conversions out of page views | BinomialDistribution | BinomialDistribution(trials, probability) |
| Overdispersed counts | Support tickets per day | NegativeBinomialDistribution | NegativeBinomialDistribution(successes, probability) |
| Heavy right tail on positive data | Income distribution, file sizes | ParetoDistribution | ParetoDistribution(shape, scale) |

Durations and Waiting Times

// API response times — often right-skewed with a long tail
val responseTime = LogNormalDistribution(mu = 4.5, sigma = 0.8)

responseTime.mean              // expected average in ms
responseTime.quantile(0.95)    // P95 latency
responseTime.quantile(0.99)    // P99 latency
responseTime.cdf(200.0)        // probability of responding under 200ms
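
Assuming the usual log-normal parameterization, `mu` and `sigma` describe the logarithm of the response time, not the time in milliseconds. The sketch below shows the closed-form relationships with only the Kotlin standard library. The helper names are illustrative and are not part of the library used above, and the z-score is a hard-coded standard-normal constant.

```kotlin
import kotlin.math.exp

// Mean of a log-normal whose logarithm has mean mu and standard deviation sigma.
fun logNormalMean(mu: Double, sigma: Double): Double = exp(mu + sigma * sigma / 2.0)

// Median of the same distribution; it depends on mu only.
fun logNormalMedian(mu: Double): Double = exp(mu)

// Quantile expressed through the matching standard-normal z-score.
fun logNormalQuantileFromZ(mu: Double, sigma: Double, z: Double): Double = exp(mu + sigma * z)

fun main() {
    val mu = 4.5
    val sigma = 0.8
    val z95 = 1.6448536269514722 // standard-normal quantile for p = 0.95

    println("median: ${logNormalMedian(mu)}")                  // roughly 90 ms
    println("mean:   ${logNormalMean(mu, sigma)}")             // roughly 124 ms
    println("P95:    ${logNormalQuantileFromZ(mu, sigma, z95)}") // roughly 336 ms
}
```

The mean sits well above the median because the long right tail pulls it up, which is why tail quantiles such as P95 and P99 are usually more informative for latency than the average.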

Counts and Events

// Errors per hour on a production server
val errorsPerHour = PoissonDistribution(rate = 3.2)

errorsPerHour.pmf(0)            // probability of zero errors
errorsPerHour.cdf(5)            // probability of at most 5 errors
errorsPerHour.quantileInt(0.99) // error count exceeded only 1% of the time
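
To make the three calls above concrete, here is a minimal sketch of the Poisson formulas written against the Kotlin standard library only. The function names are hypothetical and this is not how the library implements them; it is just the textbook definition, adequate for small rates.

```kotlin
import kotlin.math.exp

// P(X = k) for a Poisson variable, built up iteratively to avoid factorials.
fun poissonPmf(k: Int, rate: Double): Double {
    require(k >= 0 && rate > 0.0)
    var term = exp(-rate) // P(X = 0)
    for (i in 1..k) term *= rate / i
    return term
}

// P(X <= k): sum of the probability mass from 0 through k.
fun poissonCdf(k: Int, rate: Double): Double = (0..k).sumOf { poissonPmf(it, rate) }

// Smallest k whose cumulative probability reaches p.
fun poissonQuantile(p: Double, rate: Double): Int {
    require(p > 0.0 && p < 1.0)
    var k = 0
    var cumulative = poissonPmf(0, rate)
    while (cumulative < p) {
        k++
        cumulative += poissonPmf(k, rate)
    }
    return k
}

fun main() {
    val rate = 3.2
    println(poissonPmf(0, rate))         // about 0.041
    println(poissonCdf(5, rate))         // about 0.895
    println(poissonQuantile(0.99, rate)) // 8
}
```

With a rate of 3.2 errors per hour, seeing more than 8 errors in an hour happens in under 1% of hours, which makes 8 a reasonable alerting threshold under the Poisson assumption.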

Proportions and Rates

// Click-through rate estimated from 120 clicks in 4000 impressions
val ctr = BetaDistribution(alpha = 120.0, beta = 3880.0)

ctr.mean            // point estimate of CTR
ctr.quantile(0.025) // lower bound of 95% credible interval
ctr.quantile(0.975) // upper bound
ctr.cdf(0.035)      // probability that true CTR is below 3.5%
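
Using the raw counts as `alpha` and `beta` treats the data as the whole posterior; with a uniform Beta(1, 1) prior the parameters would instead be 121 and 3881, which makes little difference at this sample size. As a rough sanity check on the library's output, the mean and standard deviation of a Beta distribution have closed forms. The sketch below uses hypothetical helper names and only the standard library.

```kotlin
import kotlin.math.sqrt

data class BetaSummary(val mean: Double, val sd: Double)

// Closed-form mean and standard deviation of Beta(alpha, beta).
fun betaSummary(alpha: Double, beta: Double): BetaSummary {
    require(alpha > 0.0 && beta > 0.0)
    val total = alpha + beta
    val mean = alpha / total
    val variance = alpha * beta / (total * total * (total + 1.0))
    return BetaSummary(mean, sqrt(variance))
}

fun main() {
    val clicks = 120.0
    val impressions = 4000.0
    val summary = betaSummary(clicks, impressions - clicks)

    println("mean: ${summary.mean}") // 0.03
    println("sd:   ${summary.sd}")   // about 0.0027

    // Normal approximation to the 95% interval; the exact bounds come from quantile().
    println("approx interval: ${summary.mean - 1.96 * summary.sd} .. ${summary.mean + 1.96 * summary.sd}")
}
```

The approximation should land close to the `quantile(0.025)` and `quantile(0.975)` values here because the posterior is narrow and nearly symmetric; for small counts the exact quantiles are the ones to trust.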

General-Purpose Symmetric

// User session duration in minutes (roughly symmetric)
val sessionDuration = NormalDistribution(mu = 12.5, sigma = 3.2)

sessionDuration.cdf(15.0)      // probability of session under 15 min
sessionDuration.quantile(0.95) // 95th percentile

// Small-sample estimate — heavier tails give more conservative intervals
val smallSampleEstimate = StudentTDistribution(degreesOfFreedom = 8.0)
smallSampleEstimate.quantile(0.975) // critical value for 95% CI
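
The critical value is typically plugged into a confidence interval for a mean: mean ± t × s / √n, with n − 1 degrees of freedom. A minimal sketch under stated assumptions: the sample values are hypothetical, the helper name is illustrative, and the constant 2.306 is the tabulated t critical value for 8 degrees of freedom, standing in for the `quantile(0.975)` call above.

```kotlin
import kotlin.math.sqrt

data class Interval(val lower: Double, val upper: Double)

// Two-sided confidence interval for the mean, given a precomputed t critical value.
fun meanConfidenceInterval(sample: DoubleArray, tCritical: Double): Interval {
    require(sample.size >= 2) { "need at least two observations" }
    val n = sample.size
    val mean = sample.average()
    // Unbiased sample variance (divides by n - 1).
    val variance = sample.sumOf { (it - mean) * (it - mean) } / (n - 1)
    val halfWidth = tCritical * sqrt(variance / n)
    return Interval(mean - halfWidth, mean + halfWidth)
}

fun main() {
    // Hypothetical session durations in minutes: 9 observations, so 8 degrees of freedom.
    val sessions = doubleArrayOf(11.2, 14.8, 9.5, 13.1, 12.7, 15.3, 10.4, 12.0, 13.5)
    val tCritical = 2.306 // tabulated value for df = 8, p = 0.975
    println(meanConfidenceInterval(sessions, tCritical))
}
```

Swapping in the Normal critical value of 1.96 would give a narrower interval, which is the overconfidence the Student's t distribution guards against in small samples.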

Verifying the Fit

After choosing a distribution, compare it against observed data using the Kolmogorov-Smirnov test.

val processingTimesMs = doubleArrayOf(
    45.2, 51.8, 48.1, 52.3, 47.6, 49.9, 53.1, 46.5, 50.7, 48.8,
    51.2, 47.3, 49.1, 52.8, 46.9, 50.3, 48.5, 51.6, 47.8, 49.4
)

// Fit a Normal from sample statistics
val fitted = NormalDistribution(
    mu = processingTimesMs.mean(),
    sigma = processingTimesMs.standardDeviation()
)

val ks = kolmogorovSmirnovTest(processingTimesMs, fitted)
ks.statistic // KS statistic — smaller means better fit
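ks.pValue    // high p-value means data does not contradict the distribution

A non-significant KS test does not prove the distribution is correct; it only means the data do not strongly contradict that choice. Because the parameters here were estimated from the same sample being tested, the standard KS p-value tends to be optimistic. Treat it as a rough check, or use a variant that accounts for estimated parameters, such as the Lilliefors test.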
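
For intuition about what the statistic measures, the sketch below computes the one-sample KS statistic by hand: the largest vertical gap between the empirical CDF of the sample and the candidate CDF. It uses only the standard library, the function name is hypothetical, and it is not the library's implementation; it also omits the p-value calculation.

```kotlin
import kotlin.math.max

// One-sample KS statistic: maximum distance between the empirical CDF and a model CDF.
fun ksStatistic(sample: DoubleArray, cdf: (Double) -> Double): Double {
    require(sample.isNotEmpty()) { "sample must not be empty" }
    val sorted = sample.sortedArray()
    val n = sorted.size
    var d = 0.0
    for (i in sorted.indices) {
        val modelValue = cdf(sorted[i])
        // The empirical CDF jumps from i/n to (i+1)/n at this point; check both sides.
        val gapAbove = (i + 1).toDouble() / n - modelValue
        val gapBelow = modelValue - i.toDouble() / n
        d = max(d, max(gapAbove, gapBelow))
    }
    return d
}

fun main() {
    // Toy check against the Uniform(0, 1) CDF, where the answer is easy to verify by hand.
    val uniformCdf = { x: Double -> x.coerceIn(0.0, 1.0) }
    println(ksStatistic(doubleArrayOf(0.7, 0.1, 0.4), uniformCdf)) // about 0.3
}
```

To compare against a fitted distribution instead, a lambda such as `{ x -> fitted.cdf(x) }` could be passed as the second argument, assuming `cdf` accepts a `Double` as in the earlier examples.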
Last modified on April 18, 2026