Q: How do I generate random numbers of specified frequency distribution?

A: Visual C++ has a function to generate random numbers (integers) between 0 and predefined integer ‘RAND_MAX’. This can be used to generate random numbers of integers within a range, say ‘IMax’ and ‘IMin’:

Code:

int I = IMin + rand() % (IMax - IMin);

or of double type between ‘XMax’ and ‘XMin’:

Code:
double X = XMin + rand() * (XMax - XMin) / RAND_MAX;

When you generate integer random numbers within a range, make sure that ‘IMax > IMin’ since a modulus operation is involved.

Generating random numbers according to a specified distribution function involves some things more. Let us say that we have to generate random numbers between ‘XMin’ and ‘XMax’ by the distribution function ‘Fun(x)’. Evaluation of this function of any value between these limits does not generate errors like division by zero, square root of negative number etc.

First thing is to evaluate the minimum and maximum values of this function (say ‘YMin’ and ‘YMax’) within the specified limits ‘XMin’ and ‘XMax’.

Generate random numbers in pairs, say ‘X’ and ‘Y’ – ‘X’ between ‘XMin’ and ‘XMax’ and ‘Y’ between ‘YMin’ and ‘YMax’. Discard all random numbers ‘X’ where the value ‘Y’ is below ‘Fun(X)’. Remaining numbers will have the required frequency distribution.

The function shown below generates a random number each time between ‘xmin’ and ‘xmax’, repeated use of which will generate random numbers of fairly accurate distribution function ‘fun()’.

Code:
double Random(double (*fun)(double), double xmin = 0, double xmax = 1)
{
static double (*Fun)(double) = NULL, YMin, YMax;
static bool First = true;

// Initialises random generator for first call
if (First)
{
First = false;
srand((unsigned) time(NULL));
}

// Evaluates maximum and minimum of function
if (fun != Fun)
{
Fun = fun;
YMin = YMax = Fun(xmin);
for (int iX = 1; iX < RAND_MAX; iX++)
{
double X = xmin + (xmax - xmin) * iX / RAND_MAX;
double Y = Fun(X);
YMin = Y < YMin ? Y : YMin;
YMax = Y > YMax ? Y : YMax;
}
}

// Gets random values for X & Y
double X = xmin + (xmax - xmin) * rand() / RAND_MAX;
double Y = YMin + (YMax - YMin) * rand() / RAND_MAX;

// Returns if valid and try again if not valid
return Y < fun(X) ? X : Random(Fun, xmin, xmax);
}

I have added codes for initialising random number generator and to calculate the maximum and minimum values of the function.

On later thought, it is felt as advantageous to take ‘YMin’ as zero. There are three reasons for that.
  • The code will execute a bit faster since operations during checking of minimum and maximum is reduced to half
  • Since it is frequency distribution (normalised / unnormalised), all y values are positive
  • Chance of no random number generated corresponding to the ‘YMin’ point exists in earlier code, if ‘YMin’ is not equal to zero.

So part of the code is modified to

Code:
// Evaluates maximum of function
if (fun != Fun)
{
Fun = fun;
YMin = 0, YMax = Fun(xmin);
for (int iX = 1; iX < RAND_MAX; iX++)
{
double X = xmin + (xmax - xmin) * iX / RAND_MAX;
double Y = Fun(X);
YMax = Y > YMax ? Y : YMax;
}
}
I thought of adding a snap shot of frequency distribution of random numbers generated using the method described above. There were 100000 numbers generated between -1 and 1 as per the distribution function

Code:

double Fun(double x)
{
return 2.5 + sin(10 * x) / x;
}


Distribution of random numbers are shown in 50 intervals.





The first formula
Code:
int I = IMin + rand() % (IMax - IMin);

generate numbers in the interval [IMin, IMax), thus excluding IMax.

To include the IMax among the possible generated numbers use this formula:

Code:
int I = IMin + rand() % (IMax - IMin + 1);

The second formula is problematic too,

Code:
double X = XMin + rand() * (XMax - XMin) / RAND_MAX;

It generates a random double between XMin and up to and including XMax. This is not standard. Normally the upper limit is not inclusive.

So the first formula doesn’t include the upper limit which it should, and the second formula does include the upper limit which it shouldn’t.

While the first formula is downright wrong the second can be used if you accept the upper limit is inclusive. But make sure the variables are floating point. If they’re integers the formula may fail.

A better formula is,

Code:
double r = double(rand())/(double(RAND_MAX)+1.0) // random double in range 0.0 to 1.0 (non inclusive)
double X = XMin + r*(XMax - XMin); // transform to wanted range


reflink: http://vuau.wordpress.com/2010/04/16/how-do-i-generate-random-numbers-of-specified-frequency-distribution/