Example
As an example, let's say we want to create a set of test data that closely matches a real-life distribution, in this case population by continent. Here's the population distribution (m = millions):
Asia: 4,140 m
Africa: 994 m
Europe: 738 m
North America: 528 m
South America: 385 m
Oceania: 36 m
Antarctica: 0.004 m
Result
Based on this distribution, we want to create some random test data that looks like this. The pie chart shows the generated distribution for 1000 rows.
Code and formulas
First of all we set up a couple of lists: one with the list of possibilities, and the other with the weights for each of the possibilities. You can use any relative values; they will be normalized to a percentage of the total. In this case I simply used the actual populations. Here are my comma-separated lists, placed in cells D3 and D4.
|Antarctica,Asia,Africa,Europe,North America,South America,Oceania|
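To make the normalization concrete, here is the arithmetic worked through for the population figures quoted above (in millions). The symbols w_i, T and s_i are introduced here only for illustration and the shares are rounded.

```latex
\[
T = \sum_i w_i = 4140 + 994 + 738 + 528 + 385 + 36 + 0.004 = 6821.004
\]
\[
s_i = \frac{w_i}{T}:\qquad
s_{\text{Asia}} = \frac{4140}{6821.004} \approx 60.7\%,\quad
s_{\text{Africa}} \approx 14.6\%,\quad
s_{\text{Europe}} \approx 10.8\%,
\]
\[
s_{\text{North America}} \approx 7.7\%,\quad
s_{\text{South America}} \approx 5.6\%,\quad
s_{\text{Oceania}} \approx 0.5\%,\quad
s_{\text{Antarctica}} \approx 0.00006\%
\]
```

So in 1000 generated rows we would expect roughly 607 for Asia, 146 for Africa, 108 for Europe, 77 for North America, 56 for South America and 5 for Oceania. Antarctica will almost never appear.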
Function biasedRandom(possibilities, weights) As String
    Dim w As Variant, a As Variant, p As Variant, _
        r As Double, i As Long

    ' comes in as 2 lists
    a = Split(weights, ",")
    p = Split(possibilities, ",")
    ReDim w(LBound(a) To UBound(a))

    ' create cumulative
    For i = LBound(w) To UBound(w)
        w(i) = CDbl(a(i))
        If i > LBound(w) Then w(i) = w(i - 1) + w(i)
    Next i

    ' get random index
    r = Rnd() * w(UBound(w))

    ' find its weighted position
    For i = LBound(w) To UBound(w)
        If (r <= w(i)) Then
            biasedRandom = p(i)
            Exit Function
        End If
    Next i
End Function
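One way to check that the function behaves as intended is to call it many times and count the results. The sketch below is a hypothetical helper (tallyBiasedRandom is not part of the original code). It assumes the weights list simply repeats the population figures in the same order as the possibilities, and that your locale uses a period as the decimal separator so that CDbl reads "0.004" correctly.

```vba
' Hypothetical helper, not part of the original post: draws nRows values
' from biasedRandom and prints how often each possibility came up.
Sub tallyBiasedRandom()
    ' Assumed list contents: populations in millions, same order as the names
    Const possibilities As String = "Antarctica,Asia,Africa,Europe,North America,South America,Oceania"
    Const weights As String = "0.004,4140,994,738,528,385,36"
    Const nRows As Long = 1000

    Dim p As Variant, counts() As Long, pick As String
    Dim i As Long, j As Long

    p = Split(possibilities, ",")
    ReDim counts(LBound(p) To UBound(p))

    Randomize   ' seed Rnd from the system timer so each run differs

    For i = 1 To nRows
        pick = biasedRandom(possibilities, weights)
        ' find which possibility was returned and count it
        For j = LBound(p) To UBound(p)
            If p(j) = pick Then
                counts(j) = counts(j) + 1
                Exit For
            End If
        Next j
    Next i

    ' name, count, and observed share in the Immediate window
    For j = LBound(p) To UBound(p)
        Debug.Print p(j), counts(j), Format(counts(j) / nRows, "0.0%")
    Next j
End Sub
```

Because the draws are random, the counts will vary from run to run, but they should stay close to the relative sizes of the weights. On a worksheet, a formula along the lines of =biasedRandom($D$3,$D$4) filled down 1000 rows should give the same kind of distribution, assuming the two lists sit in D3 and D4 as described.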