18 continuous and 10 discrete probability distributions with a shared API for density, CDF, quantiles, and sampling in kstats-distributions.
kstats-distributions provides a unified API for continuous and discrete probability models. Every distribution supports the same workflow: construct with parameters, inspect statistical properties, evaluate probabilities, compute quantiles, and draw random samples.
Every distribution implements common statistical properties: mean, variance, standardDeviation, skewness, kurtosis, and entropy.
The evaluation methods differ between continuous and discrete distributions:
| Operation | Continuous | Discrete |
| --- | --- | --- |
| Point value | pdf(x) — density | pmf(k) — exact probability |
| Log point value | logPdf(x) | logPmf(k) |
| Cumulative | cdf(x) — P(X ≤ x) | cdf(k) — P(X ≤ k) |
| Upper tail | sf(x) — P(X > x) | sf(k) — P(X > k) |
| Quantile | quantile(p) → Double | quantileInt(p) → Int |
| Sample | sample(random) → Double | sample(random) → Int |
| Batch sample | sample(n, random) → DoubleArray | sample(n, random) → IntArray |
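The relationships between these operations can be sketched with a hand-written exponential distribution. This is a minimal illustration using only the Kotlin standard library; `ExponentialSketch` is a hypothetical stand-in and not the kstats class, although its method names mirror the continuous column of the table.

```kotlin
import kotlin.math.exp
import kotlin.math.ln
import kotlin.random.Random

// Hypothetical stand-in, not the kstats implementation.
class ExponentialSketch(private val rate: Double) {
    fun pdf(x: Double): Double = if (x < 0.0) 0.0 else rate * exp(-rate * x)
    fun logPdf(x: Double): Double = if (x < 0.0) Double.NEGATIVE_INFINITY else ln(rate) - rate * x
    fun cdf(x: Double): Double = if (x < 0.0) 0.0 else 1.0 - exp(-rate * x)
    fun sf(x: Double): Double = if (x < 0.0) 1.0 else exp(-rate * x)

    // Inverse of the cdf for p in [0, 1).
    fun quantile(p: Double): Double = -ln(1.0 - p) / rate

    // Inverse-transform sampling: push a uniform draw through the quantile function.
    fun sample(random: Random): Double = quantile(random.nextDouble())
    fun sample(n: Int, random: Random): DoubleArray = DoubleArray(n) { sample(random) }
}

fun main() {
    val d = ExponentialSketch(rate = 2.0)
    println(d.cdf(1.0) + d.sf(1.0))   // cdf and sf are complementary
    println(d.cdf(d.quantile(0.5)))   // quantile inverts the cdf
    println(d.sample(5, Random(42)).toList())
}
```

Passing an explicit `Random` keeps sampling reproducible when a fixed seed is used.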
Constructors validate parameters eagerly. Invalid values (non-positive standard deviation, probability outside [0, 1], non-positive degrees of freedom) throw InvalidParameterException at construction time, not at evaluation.
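Eager validation can be sketched as a check inside `init`. Everything below is an assumption for illustration: `NormalSketch` and `tryBuild` are hypothetical, and the local `InvalidParameterException` only stands in for the library's exception type, whose actual definition is not shown here.

```kotlin
// Local stand-in for the library's exception type (assumed shape).
class InvalidParameterException(message: String) : IllegalArgumentException(message)

// Hypothetical distribution that rejects a bad parameter at construction time,
// so the failure surfaces before any pdf/cdf call.
class NormalSketch(val mu: Double, val sigma: Double) {
    init {
        // `!(sigma > 0.0)` also rejects NaN, which `sigma <= 0.0` would let through.
        if (!(sigma > 0.0)) throw InvalidParameterException("sigma must be positive, got $sigma")
    }
}

fun tryBuild(sigma: Double): String =
    try {
        NormalSketch(0.0, sigma)
        "ok"
    } catch (e: InvalidParameterException) {
        "rejected: ${e.message}"
    }

fun main() {
    println(tryBuild(15.0))
    println(tryBuild(-1.0))
    println(tryBuild(Double.NaN))
}
```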
NormalDistribution(mu, sigma)
The Gaussian distribution. Models data that clusters symmetrically around a mean with a characteristic bell-shaped curve.
Parameters: mu — mean, sigma — standard deviation (must be positive)
val d = NormalDistribution(mu = 100.0, sigma = 15.0)
d.mean // 100.0
d.cdf(115.0) // 0.8413
d.quantile(0.975) // 129.3995
Use when data is approximately symmetric and unbounded.
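The value cdf(115.0) ≈ 0.8413 follows from standardization: 115 is exactly one standard deviation above the mean, so the result is Φ(1). The sketch below reproduces it with the Abramowitz-Stegun 7.1.26 approximation to erf, since the Kotlin standard library has no erf function. The helper names are hypothetical and this is not how the library necessarily computes the CDF.

```kotlin
import kotlin.math.abs
import kotlin.math.exp
import kotlin.math.sqrt

// Abramowitz-Stegun formula 7.1.26 for erf; absolute error is about 1.5e-7.
fun erfApprox(x: Double): Double {
    val t = 1.0 / (1.0 + 0.3275911 * abs(x))
    val poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
            t * (-1.453152027 + t * 1.061405429))))
    val y = 1.0 - poly * exp(-x * x)
    return if (x >= 0.0) y else -y
}

// Normal CDF via standardization: z = (x - mu) / sigma, then Phi(z).
fun normalCdf(x: Double, mu: Double, sigma: Double): Double {
    val z = (x - mu) / sigma
    return 0.5 * (1.0 + erfApprox(z / sqrt(2.0)))
}

fun main() {
    println(normalCdf(115.0, mu = 100.0, sigma = 15.0)) // about 0.8413
}
```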
StudentTDistribution(degreesOfFreedom)
Heavier tails than the normal distribution. Approaches the normal as degrees of freedom increase.
Parameters: degreesOfFreedom — degrees of freedom (must be positive)
val d = StudentTDistribution(degreesOfFreedom = 10.0)
d.mean // 0.0
d.cdf(2.228) // ≈ 0.975
d.quantile(0.975) // 2.2281
Use for confidence intervals and t-tests when the sample size is small.
LogisticDistribution(mu, scale)
Similar shape to the normal but with heavier tails. The CDF is the closed-form logistic function.
Parameters: mu — location (center), scale — spread parameter (must be positive)
val d = LogisticDistribution(mu = 0.0, scale = 1.0)
d.mean // 0.0
d.cdf(0.0) // 0.5
d.pdf(0.0) // 0.25
Use when a closed-form CDF is needed or data has slightly heavier tails than normal.
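The closed forms behind these values are short enough to write out. The following is a standalone sketch of the standard logistic formulas with hypothetical function names; it does not call kstats.

```kotlin
import kotlin.math.exp
import kotlin.math.ln

// F(x) = 1 / (1 + exp(-(x - mu) / s))
fun logisticCdf(x: Double, mu: Double, s: Double): Double = 1.0 / (1.0 + exp(-(x - mu) / s))

// The density can be written in terms of the CDF: f(x) = F(x) * (1 - F(x)) / s
fun logisticPdf(x: Double, mu: Double, s: Double): Double {
    val c = logisticCdf(x, mu, s)
    return c * (1.0 - c) / s
}

// Inverting the CDF gives the logit: Q(p) = mu + s * ln(p / (1 - p))
fun logisticQuantile(p: Double, mu: Double, s: Double): Double = mu + s * ln(p / (1.0 - p))

fun main() {
    println(logisticCdf(0.0, mu = 0.0, s = 1.0))      // 0.5
    println(logisticPdf(0.0, mu = 0.0, s = 1.0))      // 0.25
    println(logisticQuantile(0.5, mu = 0.0, s = 1.0)) // 0.0
}
```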
CauchyDistribution(location, scale)
Extremely heavy tails. The mean and variance are undefined.
Parameters: location — center (median), scale — half-width at half-maximum (must be positive)
val d = CauchyDistribution(location = 0.0, scale = 1.0)
d.pdf(0.0) // 0.3183
d.cdf(0.0) // 0.5
d.quantile(0.75) // 1.0
Use for data with extreme outliers where the mean is not a meaningful summary.
LaplaceDistribution(mu, scale)
Double-exponential distribution. Sharper peak and heavier tails than the normal.
Parameters: mu — location (mean and median), scale — spread parameter (must be positive)
val d = LaplaceDistribution(mu = 0.0, scale = 1.0)
d.mean // 0.0
d.variance // 2.0
d.pdf(0.0) // 0.5
Use for data with a sharp peak at the center and exponential tails.
ExponentialDistribution(rate)
Models the time between events in a Poisson process. It is memoryless: the remaining waiting time does not depend on how long you have already waited.
Parameters: rate — event rate, the reciprocal of the mean (must be positive)
val d = ExponentialDistribution(rate = 2.0)
d.mean // 0.5
d.cdf(1.0) // 0.8647
d.quantile(0.5) // 0.3466
Use for waiting times, durations, and inter-arrival times.
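The memoryless property says P(X > s + t | X > s) = P(X > t). A small numeric check, written directly from the exponential survival function rather than through the library (the function names are hypothetical):

```kotlin
import kotlin.math.exp

// Survival function of an exponential distribution: P(X > x) = exp(-rate * x), x >= 0.
fun expSf(x: Double, rate: Double): Double = exp(-rate * x)

// P(X > s + t | X > s), computed from the definition of conditional probability.
fun conditionalSf(s: Double, t: Double, rate: Double): Double =
    expSf(s + t, rate) / expSf(s, rate)

fun main() {
    val rate = 2.0
    // The two lines agree up to floating-point rounding.
    println(conditionalSf(s = 0.5, t = 1.0, rate = rate))
    println(expSf(1.0, rate))
}
```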
GammaDistribution(shape, rate)
Generalizes the exponential distribution. For integer shape, models the sum of that many independent exponential random variables with a common rate.
Parameters: shape — shape parameter k (must be positive), rate — rate parameter, the reciprocal of the scale θ (must be positive)
val d = GammaDistribution(shape = 2.0, rate = 0.5)
d.mean // 4.0
d.variance // 8.0
d.cdf(4.0) // 0.5940
Use for positive continuous data with right skew, such as aggregate waiting times or rainfall amounts.
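The sum-of-exponentials view can be checked by simulation. This sketch draws exponentials by inverse transform and sums them; it only covers integer shape and is not the library's sampler. All names are hypothetical.

```kotlin
import kotlin.math.ln
import kotlin.random.Random

// Inverse-transform draw from Exponential(rate); nextDouble() is in [0, 1), so 1 - u > 0.
fun drawExponential(rate: Double, random: Random): Double =
    -ln(1.0 - random.nextDouble()) / rate

// Sum of `shape` independent exponentials; valid only for integer shape.
fun drawGammaIntShape(shape: Int, rate: Double, random: Random): Double =
    (1..shape).sumOf { drawExponential(rate, random) }

fun sampleMean(n: Int, shape: Int, rate: Double, seed: Int): Double {
    val random = Random(seed)
    return (1..n).sumOf { drawGammaIntShape(shape, rate, random) } / n
}

fun main() {
    // Theory: mean = shape / rate = 4.0 for shape = 2, rate = 0.5.
    println(sampleMean(n = 100_000, shape = 2, rate = 0.5, seed = 42))
}
```

With 100,000 draws the standard error of the sample mean is about 0.009, so the printed value should land close to 4.0.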
WeibullDistribution(shape, scale)
Flexible distribution for reliability and survival analysis.
Parameters: shape — shape k (must be positive), scale — scale λ (must be positive)
val d = WeibullDistribution(shape = 1.5, scale = 1.0)
d.mean // 0.9027
d.cdf(1.0) // 0.6321
Use for time-to-failure data, wind speed modeling, and survival analysis.
LogNormalDistribution(mu, sigma)
The logarithm of the variable is normally distributed. Always positive, right-skewed.
Parameters: mu — mean of the log, sigma — standard deviation of the log (must be positive)
val d = LogNormalDistribution(mu = 0.0, sigma = 1.0)
d.mean // 1.6487
d.quantile(0.5) // 1.0
d.cdf(1.0) // 0.5
Use for multiplicative processes: stock prices, biological measurements, file sizes.
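The gap between the mean (1.6487) and the median (1.0) in the example is the right skew in action. A standalone sketch of the two standard formulas, with hypothetical function names:

```kotlin
import kotlin.math.exp

// Mean of a log-normal: exp(mu + sigma^2 / 2).
fun logNormalMean(mu: Double, sigma: Double): Double = exp(mu + sigma * sigma / 2.0)

// Median of a log-normal: exp(mu), independent of sigma.
fun logNormalMedian(mu: Double): Double = exp(mu)

fun main() {
    println(logNormalMean(mu = 0.0, sigma = 1.0)) // about 1.6487
    println(logNormalMedian(mu = 0.0))            // 1.0
}
```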
NakagamiDistribution(mu, omega)
Models the magnitude of multi-path fading in wireless communications.
Parameters: mu — shape m (must be at least 0.5), omega — spread Ω (must be positive)
val d = NakagamiDistribution(mu = 1.0, omega = 1.0)
d.mean // 0.8862
d.variance // 0.2146
Use for signal envelope modeling in fading channels.
LevyDistribution(mu, c)
Heavy-tailed distribution with infinite mean. A special case of the stable distribution.
Parameters: mu — location (shift parameter), c — scale parameter (must be positive)
val d = LevyDistribution(mu = 0.0, c = 1.0)
d.cdf(1.0) // 0.3173
d.sf(1.0) // 0.6827
Use for extreme-value modeling with very heavy right tails.
BetaDistribution(alpha, beta)
Defined on [0, 1]. Flexible shape controlled by two parameters.
Parameters: alpha — shape α (must be positive), beta — shape β (must be positive)
val d = BetaDistribution(alpha = 2.0, beta = 5.0)
d.mean // 0.2857
d.cdf(0.3) // 0.5798
d.pdf(0.2) // 2.4576
Use for proportions, probabilities, and bounded data on [0, 1].
UniformDistribution(min, max)
Equal probability density across the interval [min, max].
Parameters: min — lower bound, max — upper bound (must satisfy min < max)
val d = UniformDistribution(min = 0.0, max = 10.0)
d.mean // 5.0
d.variance // 8.3333
d.cdf(3.0) // 0.3
Use when all values in a range are equally likely.
TriangularDistribution(a, b, c)
Bounded distribution with a peak at the mode. Defined by minimum, maximum, and mode.
Parameters: a — minimum, b — maximum, c — mode (must satisfy a ≤ c ≤ b)
val d = TriangularDistribution(a = 0.0, b = 10.0, c = 3.0)
d.mean // 4.3333
d.cdf(3.0) // 0.3
Use for rough estimates when only the minimum, maximum, and most likely value are known.
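The CDF is piecewise quadratic, with the two pieces meeting at the mode. A standalone sketch of the standard formulas, using the same a, b, c naming as the constructor (the function names are hypothetical):

```kotlin
// Piecewise CDF for minimum a, maximum b, and mode c, assuming a < b and a <= c <= b.
fun triangularCdf(x: Double, a: Double, b: Double, c: Double): Double = when {
    x <= a -> 0.0
    x >= b -> 1.0
    x <= c -> (x - a) * (x - a) / ((b - a) * (c - a))
    else -> 1.0 - (b - x) * (b - x) / ((b - a) * (b - c))
}

// The mean is the average of the three defining points.
fun triangularMean(a: Double, b: Double, c: Double): Double = (a + b + c) / 3.0

fun main() {
    println(triangularCdf(3.0, a = 0.0, b = 10.0, c = 3.0)) // about 0.3, i.e. (c - a) / (b - a) at the mode
    println(triangularMean(a = 0.0, b = 10.0, c = 3.0))     // about 4.3333
}
```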
ParetoDistribution(shape, scale)
Power-law distribution. Models phenomena where a small fraction of values account for a large share of the total.
Parameters: shape — tail index α (must be positive), scale — minimum value x_m (must be positive)
val d = ParetoDistribution(shape = 2.0, scale = 1.0)
d.mean // 2.0
d.cdf(2.0) // 0.75
Use for wealth distributions, city populations, and file sizes.
GumbelDistribution(mu, beta)
Extreme value distribution (Type I). Models the distribution of the maximum of a large sample for many underlying distributions, such as the normal and the exponential.
Parameters: mu — location (mode), beta — scale (must be positive)
val d = GumbelDistribution(mu = 0.0, beta = 1.0)
d.mean // 0.5772
d.cdf(0.0) // 0.3679
Use for extreme-value analysis: maximum temperatures, flood levels, structural loads.
ChiSquaredDistribution(degreesOfFreedom)
The distribution of the sum of squares of independent standard normal random variables. Central to chi-squared tests and confidence intervals for variance.
Parameters: degreesOfFreedom — degrees of freedom (must be positive)
val d = ChiSquaredDistribution(degreesOfFreedom = 5.0)
d.mean // 5.0
d.variance // 10.0
d.cdf(11.07) // ≈ 0.95
Used internally by chi-squared tests; use it directly for inference about a variance.
FDistribution(dfNumerator, dfDenominator)
Ratio of two independent chi-squared variables, each divided by its degrees of freedom. Central to F-tests and ANOVA.
Parameters: dfNumerator — numerator degrees of freedom (must be positive), dfDenominator — denominator degrees of freedom (must be positive)
val d = FDistribution(dfNumerator = 5.0, dfDenominator = 10.0)
d.mean // 1.25
d.cdf(3.33) // ≈ 0.95
PoissonDistribution(rate)
Models the number of events in a fixed interval when events occur independently at a constant rate.
Parameters: rate — expected number of events λ (must be positive)
val d = PoissonDistribution(rate = 3.0)
d.mean // 3.0
d.pmf(5) // 0.1008
d.cdf(5) // 0.9161
d.quantileInt(0.95) // 6
Use for count data: defects per batch, arrivals per hour, events per day.
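For a discrete distribution, cdf is a running sum of pmf and quantileInt is the smallest k whose cdf reaches p. The sketch below reproduces the example values from first principles. It is an illustration with hypothetical function names, not the library's algorithm, and it is not robust for p extremely close to 1.

```kotlin
import kotlin.math.exp

// pmf(k) = e^(-rate) * rate^k / k!, built with the recurrence pmf(k) = pmf(k - 1) * rate / k.
fun poissonPmf(k: Int, rate: Double): Double {
    if (k < 0) return 0.0
    var p = exp(-rate)
    for (i in 1..k) p *= rate / i
    return p
}

// cdf(k) = sum of pmf(0..k); an empty range (k < 0) sums to 0.0.
fun poissonCdf(k: Int, rate: Double): Double = (0..k).sumOf { poissonPmf(it, rate) }

// Smallest k with cdf(k) >= p; assumes 0 <= p < 1 so the loop terminates.
fun poissonQuantileInt(p: Double, rate: Double): Int {
    var k = 0
    var cumulative = poissonPmf(0, rate)
    while (cumulative < p) {
        k += 1
        cumulative += poissonPmf(k, rate)
    }
    return k
}

fun main() {
    println(poissonPmf(5, 3.0))            // about 0.1008
    println(poissonCdf(5, 3.0))            // about 0.9161
    println(poissonQuantileInt(0.95, 3.0)) // 6
}
```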
BinomialDistribution(trials, probability)
Models the number of successes in a fixed number of independent Bernoulli trials.
Parameters: trials — number of trials (must be non-negative), probability — success probability per trial (must be in [0, 1])
val d = BinomialDistribution(trials = 10, probability = 0.3)
d.mean // 3.0
d.pmf(3) // 0.2668
d.cdf(3) // 0.6496
d.quantileInt(0.5) // 3
Use for yes/no experiments repeated a known number of times.
NegativeBinomialDistribution(successes, probability)
Models the number of failures before achieving a specified number of successes.
Parameters: successes — number of successes r (must be positive), probability — success probability p (must be in (0, 1])
val d = NegativeBinomialDistribution(successes = 5, probability = 0.5)
d.mean // 5.0
d.variance // 10.0
d.pmf(3) // probability of exactly 3 failures before 5 successes
Use for over-dispersed count data or for modeling the number of failures before a target number of successes is reached.
GeometricDistribution(probability)
Models the number of trials until the first success. A special case of the negative binomial with one success, shifted by one because it counts trials rather than failures.
Parameters: probability — success probability per trial (must be in (0, 1])
val d = GeometricDistribution(probability = 0.3)
d.mean // 3.3333
d.pmf(1) // 0.3
d.cdf(3) // 0.657
Use for “how many tries until it works” questions.
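The example values imply the trials convention, where the support starts at 1 and pmf(1) equals the success probability. A standalone sketch of the formulas under that convention, with hypothetical function names:

```kotlin
import kotlin.math.pow

// Trials convention: k - 1 failures followed by one success, for k >= 1.
fun geometricPmf(k: Int, p: Double): Double = if (k < 1) 0.0 else (1.0 - p).pow(k - 1) * p

// P(X <= k) = 1 - P(first k trials all fail).
fun geometricCdf(k: Int, p: Double): Double = if (k < 1) 0.0 else 1.0 - (1.0 - p).pow(k)

// Mean number of trials until the first success.
fun geometricMean(p: Double): Double = 1.0 / p

fun main() {
    println(geometricPmf(1, 0.3)) // 0.3
    println(geometricCdf(3, 0.3)) // about 0.657
    println(geometricMean(0.3))   // about 3.3333
}
```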
HypergeometricDistribution(population, successes, draws)
Models the number of successes in draws without replacement from a finite population.
Parameters: population — total population size, successes — number of success items in the population, draws — number of draws
val d = HypergeometricDistribution(population = 50, successes = 10, draws = 5)
d.mean // 1.0
d.pmf(2) // probability of exactly 2 successes in 5 draws
Use when sampling without replacement matters: quality control, card games, lottery problems.
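To see how much the lack of replacement matters, the sketch below computes the hypergeometric pmf for the example above and compares it with a binomial that uses the same success fraction (10 / 50 = 0.2). It is written from the textbook formulas with hypothetical function names and does not call kstats.

```kotlin
import kotlin.math.pow

// Binomial coefficient as a Double; 0.0 when k is out of range.
fun choose(n: Int, k: Int): Double {
    if (k < 0 || k > n) return 0.0
    var result = 1.0
    for (i in 1..k) result = result * (n - k + i) / i
    return result
}

// P(exactly k successes in `draws` draws without replacement).
fun hypergeometricPmf(k: Int, population: Int, successes: Int, draws: Int): Double =
    choose(successes, k) * choose(population - successes, draws - k) / choose(population, draws)

// Same question with replacement, for comparison.
fun binomialPmf(k: Int, trials: Int, p: Double): Double =
    choose(trials, k) * p.pow(k) * (1.0 - p).pow(trials - k)

fun main() {
    println(hypergeometricPmf(2, population = 50, successes = 10, draws = 5)) // about 0.2098
    println(binomialPmf(2, trials = 5, p = 0.2))                              // about 0.2048
}
```

The two values differ only in the third decimal place here because 5 draws is a small fraction of a population of 50; the gap grows as the draw count approaches the population size.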
BetaBinomialDistribution(trials, alpha, beta)
Compound distribution where the success probability follows a Beta distribution. Has more variance than a binomial with the same mean.
Parameters: trials — number of trials, alpha — Beta shape parameter, beta — Beta shape parameter
val d = BetaBinomialDistribution(trials = 10, alpha = 2.0, beta = 3.0)
d.mean // 4.0
d.pmf(4) // probability of exactly 4 successes
Use for over-dispersed binomial data where the success probability varies.
BernoulliDistribution(probability)
A single trial with two outcomes: success (1) or failure (0). The simplest discrete distribution.
Parameters: probability — success probability (must be in [0, 1])
val d = BernoulliDistribution(probability = 0.7)
d.mean // 0.7
d.pmf(1) // 0.7
d.pmf(0) // 0.3
Use for coin-flip–like binary outcomes.
UniformDiscreteDistribution(min, max)
Equal probability for each integer in [min, max].
Parameters: min — lower bound, max — upper bound (must satisfy min ≤ max)
val d = UniformDiscreteDistribution(min = 1, max = 6)
d.mean // 3.5
d.pmf(3) // 0.1667
d.cdf(3) // 0.5
Use for fair dice and uniform random integer selection.
ZipfDistribution(numberOfElements, exponent)
Power-law distribution over ranks 1 through n. Frequency is inversely proportional to a power of the rank.
Parameters: numberOfElements — number of elements n (must be positive), exponent — exponent s (must be positive)
val d = ZipfDistribution(numberOfElements = 100, exponent = 1.0)
d.pmf(1) // probability of rank 1 (the most common)
d.pmf(100) // probability of rank 100 (the least common)
Use for word frequencies, city sizes, and website traffic distributions.
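The Zipf pmf is the rank raised to the negative exponent, normalized so the probabilities over ranks 1 through n sum to one. A standalone sketch with a hypothetical function name; it recomputes the normalizer on every call for clarity rather than speed.

```kotlin
import kotlin.math.pow

// pmf(k) = k^(-s) / sum_{i = 1..n} i^(-s) for ranks 1..n, and 0 elsewhere.
fun zipfPmf(k: Int, n: Int, s: Double): Double {
    if (k < 1 || k > n) return 0.0
    val normalizer = (1..n).sumOf { it.toDouble().pow(-s) }
    return k.toDouble().pow(-s) / normalizer
}

fun main() {
    val top = zipfPmf(1, n = 100, s = 1.0)
    val bottom = zipfPmf(100, n = 100, s = 1.0)
    println(top)          // about 0.1928
    println(top / bottom) // about 100: with s = 1, rank 1 is 100 times as likely as rank 100
}
```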
LogarithmicDistribution(probability)
Also known as the log-series distribution. Models the number of species with a given abundance.
Parameters: probability — parameter p in (0, 1)
val d = LogarithmicDistribution(probability = 0.5)
d.mean // 1.4427
d.pmf(1) // 0.7213
d.pmf(2) // 0.1803
Use for species abundance data and similar long-tailed count distributions.
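The example values follow from the log-series formulas pmf(k) = -p^k / (k ln(1 - p)) and mean = -p / ((1 - p) ln(1 - p)). A standalone sketch with hypothetical function names:

```kotlin
import kotlin.math.ln
import kotlin.math.pow

// pmf(k) = -p^k / (k * ln(1 - p)) for k >= 1, with p in (0, 1).
fun logSeriesPmf(k: Int, p: Double): Double =
    if (k < 1) 0.0 else -p.pow(k) / (k * ln(1.0 - p))

// mean = -p / ((1 - p) * ln(1 - p))
fun logSeriesMean(p: Double): Double = -p / ((1.0 - p) * ln(1.0 - p))

fun main() {
    println(logSeriesMean(0.5))   // about 1.4427
    println(logSeriesPmf(1, 0.5)) // about 0.7213
    println(logSeriesPmf(2, 0.5)) // about 0.1803
}
```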
When unsure, start with NormalDistribution for continuous data and PoissonDistribution for counts. These are the most common defaults and serve as reasonable baselines.