The Extreme Data Set project is intended to allow processing of "unusually" large data sets by Starlink software, although the sizes for which special measures are required will become less and less unusual as time goes on. The principal underlying problem is that as images get larger, 32 bits are no longer enough to index into an image. The largest integer that can be stored in 32 bits is approximately 4 x 10^9 (unsigned) or 2 x 10^9 (signed). If the operating system itself uses unsigned 32-bit pointers to address bytes in memory, it is impossible to map into memory an image of more than 4 Gbyte or, say, two images of half that size. This could correspond to, for instance, an input and an output image mapped simultaneously, each with an HDS type of _REAL and a size of 23k pixels square.
For this sort of work, therefore, an operating system with 64-bit pointers is required.
For the systems supported by Starlink this currently means that Compaq Tru64 Unix can be used, as can Solaris running in 64-bit mode. On appropriate hardware the Solaris kernel may be compiled for 32-bit or 64-bit mode, but almost all binaries which run on the 32-bit version will run equally well on the 64-bit version, so that reconfiguring a system from 32-bit to 64-bit should be fairly painless from a software point of view. You can tell if your Solaris kernel is 64-bit by using the isainfo -v command; on a 64-bit system the following response will be given:
   % isainfo -v
   64-bit sparcv9 applications
   32-bit sparc applications
User code will run up against similar problems to those faced by the operating system when coping with large images. It is often necessary to count the pixels, or the bytes, in an image, and this is typically done using a Fortran INTEGER or a C int. These are normally signed 32-bit values, with a maximum value of about 2 x 10^9; the pixel count of a 47k x 47k image, or the byte count of a 16k x 16k _DOUBLE image, will overflow this limit.
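The overflow described above can be checked with a little arithmetic. The following is a minimal sketch in C, not code from the EXTREME package: the function names pixel_count, byte_count and fits_in_int32 are hypothetical, and the element sizes of 4 bytes for _REAL and 8 bytes for _DOUBLE are assumed. The counts are computed in 64-bit integers and then compared with the range of a signed 32-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of pixels in an nx-by-ny image, computed
   in 64-bit arithmetic so that the product does not wrap for the image
   sizes discussed here. */
int64_t pixel_count(int64_t nx, int64_t ny)
{
    assert(nx >= 0 && ny >= 0);
    return nx * ny;
}

/* Hypothetical helper: number of bytes needed for the image, given the
   size of one pixel (assumed 4 for _REAL, 8 for _DOUBLE). */
int64_t byte_count(int64_t nx, int64_t ny, int64_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return pixel_count(nx, ny) * bytes_per_pixel;
}

/* Returns 1 if the value can be held in a signed 32-bit integer
   (the usual size of a Fortran INTEGER or a C int), otherwise 0. */
int fits_in_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

With these definitions, fits_in_int32(pixel_count(47000, 47000)) is 0, since 47000 x 47000 = 2209000000 exceeds 2147483647. Taking 16k to mean 16384, byte_count(16384, 16384, 8) is exactly 2^31 = 2147483648, one more than the largest signed 32-bit value, so it does not fit either.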
Another common requirement is holding a pointer to allocated memory, which has ultimately been acquired from a C routine such as malloc, in a variable. In C this will be taken care of automatically because the compiler ensures that pointer types are long enough to hold memory addresses. In Fortran 77, however, there is no pointer type, so INTEGERs, which are normally 32 bits and therefore too short to hold a 64-bit address, have to be used. The solution to this, explained in SUN/209, is to use the CNF_PVAL function.
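The reason a 32-bit integer is not a safe home for a pointer can be illustrated from the C side. The sketch below is an illustration only and says nothing about how CNF_PVAL is implemented; the names pointer_bits, address_fits_in_32_bits and address_to_uint32 are hypothetical. It reports the pointer width of the current build and tests whether a given address would survive being squeezed into 32 bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Width of a data pointer, in bits, for the current build
   (typically 32 or 64). */
int pointer_bits(void)
{
    return (int) (sizeof(void *) * CHAR_BIT);
}

/* Returns 1 if the address can be stored in a 32-bit integer without
   losing any bits, otherwise 0. */
int address_fits_in_32_bits(const void *p)
{
    uintptr_t addr = (uintptr_t) p;
    return (uintptr_t) (uint32_t) addr == addr;
}

/* Hypothetical illustration of holding an address in a 32-bit integer,
   as a 32-bit Fortran INTEGER would have to. The assertion fails when
   the address needs more than 32 bits. */
uint32_t address_to_uint32(const void *p)
{
    assert(address_fits_in_32_bits(p));
    return (uint32_t) (uintptr_t) p;
}
```

On a 32-bit build every address passes the check. On a 64-bit build, memory returned by malloc may lie above the 4 Gbyte boundary, in which case address_fits_in_32_bits returns 0 and a plain 32-bit integer would silently lose the upper half of the address.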
The issues addressed in this document apply to user programs which link against Starlink libraries as well as to the code which forms the USSC; if the USSC has been built for a 64-bit system, then user code which uses its libraries will need to be modified at the source level in order to work. Depending on the complexity of the code, it may be easier to do this with a few manual adjustments than by using the automatic tools supplied with the EXTREME package. The discussion here should be of use in any case.
This package provides some tools and instructions for software maintainers to use in modifying their source code to take advantage of a 64-bit environment. The rest of this document is organised as follows: