kstats-sampling provides preprocessing and resampling utilities that sit at the edges of an analysis workflow. The module covers two distinct areas: transforming numeric data and drawing random samples.
Data Transformation
Ranking
rank() replaces numeric values with their ordered positions. Tie handling is controlled by the TieMethod parameter.
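To make tie handling concrete, here is a minimal standalone sketch of one common strategy, average ranking, where tied values share the mean of the positions they occupy. It does not call rank() or TieMethod; the function name averageRanks is hypothetical, and the strategies kstats actually offers may differ.

```kotlin
// Illustrative only: not the kstats implementation of rank().
// Ties receive the average of the 1-based positions they occupy.
fun averageRanks(values: List<Double>): List<Double> {
    // Indices of the input, ordered by ascending value (stable sort).
    val order = values.indices.sortedBy { values[it] }
    val ranks = DoubleArray(values.size)
    var i = 0
    while (i < order.size) {
        // Extend j to cover the run of equal values starting at position i.
        var j = i
        while (j + 1 < order.size && values[order[j + 1]] == values[order[i]]) j++
        // Positions i+1..j+1 are shared; every tied element gets their mean.
        val shared = (i + 1 + j + 1) / 2.0
        for (k in i..j) ranks[order[k]] = shared
        i = j + 1
    }
    return ranks.toList()
}
```

For the input 10.0, 20.0, 20.0, 5.0 this sketch yields the ranks 2.0, 3.5, 3.5, 1.0: the two tied values occupy positions 3 and 4 and share their average.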
Normalization
Two standard scaling methods are provided: z-score standardization (mean 0, standard deviation 1) and min-max scaling. zScore() is appropriate when the downstream method assumes standardized input. minMaxNormalize() scales to [0, 1] by default, or to a custom range.
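The arithmetic behind both methods is short enough to sketch directly. The functions below are standalone illustrations, not the library's zScore() or minMaxNormalize(); their names, parameters, and the choice of the sample standard deviation (n - 1 denominator) are assumptions of this sketch.

```kotlin
import kotlin.math.sqrt

// Illustrative only: not the kstats implementation.
// z = (x - mean) / sd, assuming the sample standard deviation (n - 1 denominator).
fun zScoreSketch(xs: List<Double>): List<Double> {
    require(xs.size >= 2) { "need at least two values" }
    val mean = xs.average()
    val sd = sqrt(xs.sumOf { (it - mean) * (it - mean) } / (xs.size - 1))
    require(sd > 0.0) { "values must not all be equal" }
    return xs.map { (it - mean) / sd }
}

// Linearly maps the data range [min, max] onto the target range [lo, hi].
fun minMaxSketch(xs: List<Double>, lo: Double = 0.0, hi: Double = 1.0): List<Double> {
    val min = requireNotNull(xs.minOrNull()) { "input must not be empty" }
    val max = requireNotNull(xs.maxOrNull()) { "input must not be empty" }
    require(max > min) { "values must not all be equal" }
    return xs.map { lo + (it - min) / (max - min) * (hi - lo) }
}
```

With the input 2.0, 4.0, 6.0, zScoreSketch gives -1.0, 0.0, 1.0 and minMaxSketch gives 0.0, 0.5, 1.0. Both sketches reject constant input because the denominator would be zero; how the library treats that case is not stated here.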
Binning
bin() groups values into equal-width intervals and returns the items in each bin. frequencyTable() returns interval boundaries, counts, relative frequencies, and cumulative frequencies.
bin() returns the actual items that fall into each interval — useful for further processing. frequencyTable() returns summary statistics per bin — useful for histogram-like reports.
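The difference between the two outputs can be sketched with a small standalone example. It does not call bin() or frequencyTable(); the names equalWidthBins and BinSummary are hypothetical, and the boundary rule (last interval closed on the right) is an assumption of this sketch rather than documented library behavior.

```kotlin
// Illustrative only: not the kstats implementation of bin() or frequencyTable().
// Splits [min, max] into binCount equal-width intervals and returns the items per bin.
fun equalWidthBins(xs: List<Double>, binCount: Int): List<List<Double>> {
    require(binCount > 0 && xs.isNotEmpty())
    val min = requireNotNull(xs.minOrNull())
    val max = requireNotNull(xs.maxOrNull())
    val width = (max - min) / binCount
    val bins = List(binCount) { mutableListOf<Double>() }
    for (x in xs) {
        // Guard against zero width and clamp the maximum into the last bin.
        val idx = if (width == 0.0) 0 else ((x - min) / width).toInt().coerceAtMost(binCount - 1)
        bins[idx].add(x)
    }
    return bins
}

// Per-bin summary in the spirit of a frequency table.
data class BinSummary(val count: Int, val relative: Double, val cumulative: Double)

fun summarize(bins: List<List<Double>>): List<BinSummary> {
    val total = bins.sumOf { it.size }
    var running = 0
    return bins.map { bin ->
        running += bin.size
        BinSummary(bin.size, bin.size.toDouble() / total, running.toDouble() / total)
    }
}
```

For the values 0.0, 1.0, 2.0, 5.0, 9.0, 10.0 and two bins, equalWidthBins returns the items grouped as [0.0, 1.0, 2.0] and [5.0, 9.0, 10.0], while summarize reduces them to counts of 3 and 3, relative frequencies of 0.5 each, and cumulative frequencies of 0.5 and 1.0.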
Both bin() and frequencyTable() accept either a bin count (number of bins) or a bin width (size of each interval). The binByDouble() variant accepts a valueSelector function, allowing binning of non-numeric collections by a numeric property.
Sampling and Randomness
Random and Bootstrap Sampling
randomSample() draws without replacement. bootstrapSample() draws with replacement — the same element can appear multiple times.
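The two drawing modes can be sketched with the Kotlin standard library alone. The functions below are hypothetical stand-ins, not randomSample() or bootstrapSample(); their names and signatures are assumptions. The third function shows one typical use of drawing with replacement: a bootstrap estimate of the standard error of the mean.

```kotlin
import kotlin.math.sqrt
import kotlin.random.Random

// Illustrative only: not the kstats implementation.
// Without replacement: shuffle a copy and keep the first n elements.
fun <T> sampleWithoutReplacement(items: List<T>, n: Int, rng: Random): List<T> {
    require(n in 0..items.size) { "sample size must not exceed collection size" }
    return items.shuffled(rng).take(n)
}

// With replacement: each draw picks an index independently, so repeats are possible.
fun <T> sampleWithReplacement(items: List<T>, n: Int, rng: Random): List<T> {
    require(items.isNotEmpty() && n >= 0)
    return List(n) { items[rng.nextInt(items.size)] }
}

// Bootstrap standard error of the mean: resample, recompute the mean each time,
// then take the standard deviation of those replicate means.
fun bootstrapStandardErrorOfMean(xs: List<Double>, replicates: Int, rng: Random): Double {
    require(replicates >= 2)
    val means = List(replicates) { sampleWithReplacement(xs, xs.size, rng).average() }
    val grand = means.average()
    return sqrt(means.sumOf { (it - grand) * (it - grand) } / (replicates - 1))
}
```

Passing a seeded generator such as Random(42) makes the draws reproducible, which is a design choice of this sketch; whether and how the library exposes a seed is not covered here.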
randomSample() draws without replacement, so each element appears at most once. The sample size must not exceed the collection size.
bootstrapSample() draws with replacement, so the sample size can be larger than the collection. This is the basis of bootstrap resampling for estimating confidence intervals and standard errors.
Weighted Random Outcomes
WeightedCoin simulates a biased coin flip. WeightedDice simulates a weighted random selection from a set of outcomes.
WeightedDice works with any type as the outcome.
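As a rough illustration of weighted selection over an arbitrary outcome type, the sketch below walks the cumulative weights until a uniform draw is exhausted. It is not the WeightedCoin or WeightedDice API; the function names and the map-of-weights input are assumptions made for this example.

```kotlin
import kotlin.random.Random

// Illustrative only: not the kstats WeightedCoin or WeightedDice classes.
// Picks an outcome with probability proportional to its weight.
fun <T> weightedPick(weights: Map<T, Double>, rng: Random): T {
    require(weights.values.all { it >= 0.0 }) { "weights must be non-negative" }
    // Zero-weight outcomes can never be chosen, so drop them up front.
    val positive = weights.filterValues { it > 0.0 }.entries.toList()
    require(positive.isNotEmpty()) { "at least one weight must be positive" }
    val total = positive.sumOf { it.value }
    var remaining = rng.nextDouble() * total
    for (entry in positive) {
        remaining -= entry.value
        if (remaining < 0.0) return entry.key
    }
    // Floating-point rounding can leave a tiny remainder; fall back to the last outcome.
    return positive.last().key
}

// A biased coin is the two-outcome special case.
fun biasedFlip(probabilityOfTrue: Double, rng: Random): Boolean {
    require(probabilityOfTrue in 0.0..1.0)
    return rng.nextDouble() < probabilityOfTrue
}
```

Because the outcome type is generic, the same function works for strings, enum values, or domain objects. Weights need not sum to 1 in this sketch, since the draw is scaled by their total.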
API Reference
Full API Reference
Browse all sampling functions, transformation utilities, and parameter overloads in the Dokka-generated reference.