#### 10.6.15 Randoms

Functions concerned with random number generation.

There are two flavours of functions here: index-based (`random*`) and sequential (`nextRandom*`). Briefly, the index-based ones are safer to use but provide poorer random statistics, while the sequential ones provide decent randomness but are unsuitable wherever an expression is expected to yield a fixed value. They are documented separately below.

Index-based functions

The functions named `random*` all take an `index` parameter which determines the value of the result; the same index always leads to the same output, but there is not supposed to be any obvious relationship between index and output. An explicit index is required to ensure that a given cell always has the same value, since cell values are in general calculated on demand. The quality of the randomness for these functions may not be that good.

In most cases, the table row index, available as the special token `$0`, is a suitable value for the `index` parameter.

If several different random values are required in the same table row, one way is to supply a different row-based index value for each one, e.g. `random(2*$0)` and `random(2*$0+1)`. However, this tends to introduce a correlation between the random values in the same row, so a better (though in some cases slower) solution is to use one of the array-generating functions and take a different element for each value, e.g. `randomArray($0,2)[0]` and `randomArray($0,2)[1]`.

The output is deterministic, in the sense that the same invocation will always generate the same "random" number, even across different machines. However, in view of the comments in the implementation note below, the output may be subject to change in the future if some improved algorithm can be found, so this guarantee does not necessarily hold across software versions.

Implementation Note: The requirement for mapping a given input index deterministically to a pseudo-random number constrains the way that the random number generation is done; most well-studied RNGs generate sequences of random numbers, but that approach cannot be used here, since these sequences do not allow random access. What we do instead is to scramble the input index somewhat and use that as the seed for an instance of Java's `Random` class, which is then used to produce one or more random numbers per input index. Some thought and experimentation have gone into the current implementation (I bought a copy of Knuth Vol. 2 specially!) and an eyeball check of the results doesn't look all that bad, but it's still probably not very good, and is not likely to pass random number quality tests (though I haven't tried). A more respectable approach might be to use a cryptographic-grade hash function on the supplied index, but that's likely to be much slower. If there is demand, something like that could be added as an alternative option. In the meantime, beware if you use these random numbers for scientifically sensitive output.
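The scheme described in the note can be sketched roughly as follows. This is an illustration only, not the actual STILTS code: the class and method names are hypothetical, and the `scramble` step is an assumption that borrows a SplitMix64-style bit mixer, since the real scrambling is not specified here.

```java
import java.util.Random;

/** Illustrative sketch only; not the actual STILTS implementation. */
public class IndexedRandomSketch {

    // Hypothetical scrambling step (SplitMix64-style bit mixing),
    // chosen purely for illustration.
    static long scramble(long index) {
        long z = index + 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // One generator per index: the same index always yields the same seed,
    // so the same index always yields the same output.
    static Random generatorFor(long index) {
        return new Random(scramble(index));
    }

    // Uniform value in the range 0 <= x < 1 for the given index.
    static double random(long index) {
        return generatorFor(index).nextDouble();
    }

    // Gaussian value (mean 0, standard deviation 1) for the given index.
    static double randomGaussian(long index) {
        return generatorFor(index).nextGaussian();
    }

    // Several uniform values drawn from the single generator for this index.
    static double[] randomArray(long index, int n) {
        Random rnd = generatorFor(index);
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = rnd.nextDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        // Re-running this prints the same numbers every time.
        for (long row = 1; row <= 3; row++) {
            double[] pair = randomArray(row, 2);
            System.out.println(row + ": " + pair[0] + ", " + pair[1]);
        }
    }
}
```

Because each call builds a fresh generator from the index, no state is carried between calls, which is what allows a cell to be recalculated on demand and still show the same value. The cost is that the statistical quality depends heavily on how well the scrambling step decorrelates neighbouring seeds.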

Sequential functions

The functions named `nextRandom*` have no arguments, and supply the next value in a global sequence when they are evaluated. These can be used if scanning through a table once (for instance when writing a table using STILTS), but they are not suitable for contexts that should supply a fixed value. For instance, if you use them to define the value of a table cell in TOPCAT, that cell may have a different value every time you look at it, which may have disconcerting results. These use the `java.util.Random` class in a more standard way than the index-based functions and should provide random numbers of reasonable quality.
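A minimal sketch of the sequential behaviour, under the assumption that a single shared `java.util.Random` instance stands in for the global sequence. The class name and the way the generator is created are hypothetical; how STILTS actually seeds or shares its generator is not stated here.

```java
import java.util.Random;

/** Illustrative sketch only; not the actual STILTS implementation. */
public class SequentialRandomSketch {

    // A single shared generator standing in for the "global sequence".
    // Seeding and sharing details are assumptions made for this sketch.
    private static final Random GLOBAL = new Random();

    // Next uniform value in the range 0 <= x < 1; advances the shared sequence.
    static double nextRandom() {
        return GLOBAL.nextDouble();
    }

    // Next Gaussian value (mean 0, standard deviation 1); advances the shared sequence.
    static double nextRandomGaussian() {
        return GLOBAL.nextGaussian();
    }

    public static void main(String[] args) {
        // Simulates evaluating the same cell expression twice when
        // values are calculated on demand: each look advances the sequence.
        double firstLook = nextRandom();
        double secondLook = nextRandom();
        System.out.println("first look:  " + firstLook);
        System.out.println("second look: " + secondLook);
    }
}
```

Since every evaluation consumes the next value from the shared sequence, the result depends on how many times, and in what order, the expression has been evaluated. That is harmless in a single pass over a table but gives unstable values anywhere a cell may be evaluated more than once.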

`random( index )`
Generates a pseudo-random number sampled from a uniform distribution between 0 and 1.

Note: The randomness may not be very high quality.

• Parameters:
• `index` (long integer): input value, typically row index "`$0`"
• Return value
• (floating point): random number between 0 and 1

`randomGaussian( index )`
Generates a pseudo-random number sampled from a Gaussian distribution with mean of 0.0 and standard deviation of 1.0.

Note: The randomness may not be very high quality.

• Parameters:
• `index` (long integer): input value, typically row index "`$0`"
• Return value
• (floating point): random number

`randomArray( index, n )`
Generates an array of pseudo-random numbers sampled from a uniform distribution between 0 and 1.

Note: The randomness may not be very high quality.

• Parameters:
• `index` (long integer): input value, typically row index "`$0`"
• `n` (integer): size of output array
• Return value
• (array of floating point): `n`-element array of random numbers between 0 and 1

`randomGaussianArray( index, n )`
Generates an array of pseudo-random numbers sampled from a Gaussian distribution with mean of 0.0 and standard deviation of 1.0.

Note: The randomness may not be very high quality.

• Parameters:
• `index` (long integer): input value, typically row index "`$0`"
• `n` (integer): size of output array
• Return value
• (array of floating point): `n`-element array of random numbers

`nextRandom( )`
Returns the next value in a random sequence, sampled from a uniform distribution between 0 and 1.

This function will give a different result every time, hence it is not suitable for use in an expression which should have a fixed value, for instance to define a TOPCAT column.

• Return value
• (floating point): random number between 0 and 1

`nextRandomGaussian( )`
Returns the next value in a random sequence, sampled from a Gaussian distribution with mean of 0.0 and standard deviation of 1.0.

This function will give a different result every time, hence it is not suitable for use in an expression which should have a fixed value, for instance to define a TOPCAT column.

• Return value
• (floating point): random number

STILTS - Starlink Tables Infrastructure Library Tool Set