Transferring a numerical problem to a computer requires considerable care, since the choice of programming language, operating system and CPU can all have consequences. The LT CCD software must run on a SUN workstation, because the CCD camera drivers, written by the CCD designers at San-Diego University, were coded specifically for such a machine. To reduce hardware problems such as inter-operability, the LT PL will also operate on a SUN workstation.
The LT CS is written in the C programming language, with the C programs connected by Java programs that control the input/output of data between the actual control programs; these have been termed “Java laminate control structures”. This decision was made to enable easy integration with the telescope’s core software, which is written in C and produced by the telescope manufacturers, Telescope Technologies Limited (TTL). To ensure the PL can be integrated into the CS structure, most routines will be written in C. The numerically intensive photometry code, however, is to be written in FORTRAN 95. This is because FORTRAN was designed to be efficient at numerical calculations; for a numerically intensive application, where processing speed is very important, FORTRAN is the only sensible language (Gray, 1998; Wallace, 2001). This is most easily illustrated by an example presented by Gray (1998), based on the two languages’ loop semantics:
In Fortran (loop 1):
      do 10, i=1,n
c       process array element a(i)
   10 continue
In C:
for (i=0; i<n; i++){
    /* process array element a[i] */
}
In Fortran, unrolled by four:
      do 10, i=1,n,4
c       process array element a(i)
c       process array element a(i+1)
c       process array element a(i+2)
c       process array element a(i+3)
   10 continue
Fully unrolled:
c       process array element a(i)
c       process array element a(i+1)
c       process array element a(i+2)
c       process array element a(i+3)
c       ...
c       process array element a(i+n)
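The same unrolling can be written by hand in C. The following is a minimal sketch and is not taken from Gray (1998): the function names `sum_simple` and `sum_unrolled4` are hypothetical, and summing the elements stands in for "process array element". A clean-up loop is assumed to be needed for the elements left over when n is not a multiple of four.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one element per iteration. */
long sum_simple(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}

/* Unrolled by four: four elements per iteration, followed by a
   clean-up loop for the 0-3 elements left over. */
long sum_unrolled4(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    size_t i = 0;
    for (; i + 3 < n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }
    for (; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Both functions return the same result; the unrolled form simply tests the loop condition a quarter as often.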
In using two programming languages it is important to understand how variables are stored in memory (see Section 5.3.1 for a brief summary) and how that memory is accessed, to facilitate efficient coding. This is particularly important for multi-dimensional arrays, which are stored as contiguous single-dimension blocks of memory. In FORTRAN, array indices run from 1 to n, with the left-most index varying fastest. In C, array indices run from 0 to n - 1, with the right-most index varying fastest. It is important to progress through an array in the “natural” order lest significant time be wasted. This can be shown as follows (loop 3):
      do 10, j=1,1000
        do 20, i=1,1000
          call some_calc(array(i,j))
   20   continue
   10 continue
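In C the nesting is reversed, because the right-most index varies fastest. The sketch below is an illustration only: the function names `offset_c`, `offset_fortran` and `sum_natural` are hypothetical, and the offsets assume the array is stored as one contiguous block with no padding.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element [i][j] in a C array with ncols columns
   (zero-based, right-most index varies fastest). */
size_t offset_c(size_t i, size_t j, size_t ncols)
{
    assert(j < ncols);
    return i * ncols + j;
}

/* Offset of element (i,j) in a Fortran array with nrows rows
   (one-based, left-most index varies fastest). */
size_t offset_fortran(size_t i, size_t j, size_t nrows)
{
    assert(i >= 1 && i <= nrows && j >= 1);
    return (j - 1) * nrows + (i - 1);
}

/* Natural-order traversal in C: the inner loop runs over the
   right-most index, so successive iterations touch adjacent memory. */
long sum_natural(size_t nrows, size_t ncols, const int *array)
{
    long total = 0;
    for (size_t i = 0; i < nrows; i++) {
        for (size_t j = 0; j < ncols; j++) {
            total += array[offset_c(i, j, ncols)];
        }
    }
    return total;
}
```

In a 1000×1000 array, stepping the "wrong" index in the inner loop moves 1000 elements through memory at each iteration instead of one.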
If the array is used to hold only 10 integers, and therefore does not require all the memory allocated to it (assuming 1 integer requires 1 block), 100 blocks have still been allocated and are not available to any other part of the program or to any other program running on the computer. Conversely, if 100 blocks is too little memory, an array overflow occurs. The resulting errors differ with platform and programming language, and are sometimes imperceptible until much later in the program. Statically allocated memory cannot be resized to accommodate a data set that changes in size.
Dynamic allocation is a means by which a program can overcome this problem by obtaining memory whilst it is running. Whilst global variables are allocated at compile time, non-static local variables are allocated from the stack and dynamically allocated memory is allocated from the heap, that is, the free memory available after static allocations. A dynamically allocated version of the integer array would simply involve creating space for the start of an integer array (e.g., 1 block). Each time an integer is added to the array, the memory allocated to the array increases by 1 block. If the array is not used during the program, only 1 block (the start block) has been “wasted”, as opposed to 100 blocks in the statically allocated example.
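A growable integer array of this kind can be sketched in C with the standard `realloc` and `free` calls. This is an illustration under stated assumptions, not the LT PL code: the type `IntArray` and the functions `int_array_append` and `int_array_free` are hypothetical names, and growth by exactly one element per append follows the description above (production code often grows in larger steps to reduce the number of allocations).

```c
#include <assert.h>
#include <stdlib.h>

/* A growable integer array (hypothetical type for illustration). */
typedef struct {
    int *data;     /* heap storage, NULL while the array is empty */
    size_t count;  /* number of integers currently stored */
} IntArray;

/* Append one integer, enlarging the allocation by exactly one element.
   Returns 0 on success, -1 if the heap cannot supply the memory. */
int int_array_append(IntArray *arr, int value)
{
    assert(arr != NULL);
    int *grown = realloc(arr->data, (arr->count + 1) * sizeof(int));
    if (grown == NULL) {
        return -1;  /* the existing storage is still valid and unchanged */
    }
    grown[arr->count] = value;
    arr->data = grown;
    arr->count += 1;
    return 0;
}

/* Return the memory to the heap and reset the array to empty. */
void int_array_free(IntArray *arr)
{
    assert(arr != NULL);
    free(arr->data);
    arr->data = NULL;
    arr->count = 0;
}
```

An array declared as `IntArray arr = {NULL, 0};` occupies no heap memory until the first append.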
There are of course problems inherent in dynamic memory allocation (DMA). The most common concern the freeing of memory and the fragmentation of memory. To make still more efficient use of memory, DMA allows memory to be “freed” once the data it holds are no longer needed. This is a simple process, but a considerable responsibility for the programmer. If memory is not freed, the available memory is reduced; if memory that is still needed is freed, the program will not function. Most importantly, freeing memory that has not been allocated has serious and “undefined” consequences for the allocation system (Schildt, 2000). Memory is allocated in complete blocks. Suppose a DMA requests 50 blocks of a 100 block heap and the memory is allocated at a random starting position. If this position is the 26th block, the memory will occupy blocks 26-75, leaving two free stretches of 25 blocks each. If another DMA of 26 blocks is required before the first is freed, the allocation cannot take place, even though 50 blocks are free in total. Had the 50 block DMA started at block 1, there would have been a single free stretch of 50 blocks, ample for the 26 block request. The lack of usable memory due to this effect is called fragmentation.
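The 100 block example can be checked with a toy model of the heap. The sketch below is purely illustrative: `ToyHeap`, `toy_heap_claim` and `toy_heap_largest_free_run` are hypothetical names, and real allocators do not place memory at random positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HEAP_BLOCKS 100

/* Toy heap: used[k] is true when block k+1 is allocated. */
typedef struct {
    bool used[HEAP_BLOCKS];
} ToyHeap;

/* Mark `length` blocks as allocated, starting at 1-based block `first`. */
void toy_heap_claim(ToyHeap *heap, size_t first, size_t length)
{
    assert(first >= 1 && first + length - 1 <= HEAP_BLOCKS);
    for (size_t k = first - 1; k < first - 1 + length; k++) {
        heap->used[k] = true;
    }
}

/* Length of the longest run of contiguous free blocks. */
size_t toy_heap_largest_free_run(const ToyHeap *heap)
{
    size_t best = 0;
    size_t run = 0;
    for (size_t k = 0; k < HEAP_BLOCKS; k++) {
        run = heap->used[k] ? 0 : run + 1;
        if (run > best) {
            best = run;
        }
    }
    return best;
}
```

With blocks 26-75 claimed, the longest free run is 25 blocks, so a 26 block request fails although 50 blocks are free. With blocks 1-50 claimed, the longest free run is 50 blocks.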
Fragmentation can be reduced by using linked data structures. A linked data structure effectively allows allocation of non-contiguous blocks of memory. If an array of 10×10 integers requires 10×10 blocks of memory, then it is necessary to locate 100 contiguous blocks of memory. During a DMA the memory address of the start of the memory block is explicitly known, and the addresses of the rest of the block are implicitly known due to its contiguous nature. A linked data structure makes this implicit process explicit. The 10×10 array is split into ten 10 block arrays. The address of the 2nd array is stored in the first block of the 1st array (effectively increasing each memory block to 11), and so on up to the 10th array, whose first block holds a terminator indicating that it is the last allocation in the linked list. The memory can thus be spread across 10 smaller non-contiguous blocks instead of 1 large, and possibly un-allocatable, block.
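The linked arrangement described above can be sketched in C as a singly linked list of ten-integer pieces. This is an assumption-laden illustration rather than the LT PL implementation: the type `Chunk` and the functions `chunks_create`, `chunks_at` and `chunks_destroy` are hypothetical names, and a NULL pointer is assumed to play the role of the terminator.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_LEN 10

/* One 10-integer piece of the array plus the address of the next piece;
   next == NULL is the terminator marking the last piece. */
typedef struct Chunk {
    struct Chunk *next;
    int values[CHUNK_LEN];
} Chunk;

/* Free every piece in the list. */
void chunks_destroy(Chunk *head)
{
    while (head != NULL) {
        Chunk *next = head->next;
        free(head);
        head = next;
    }
}

/* Allocate nchunks pieces separately, so that no single large contiguous
   block is required. Returns the list head, or NULL on failure. */
Chunk *chunks_create(size_t nchunks)
{
    Chunk *head = NULL;
    for (size_t k = 0; k < nchunks; k++) {
        Chunk *piece = calloc(1, sizeof *piece);  /* zero-initialised */
        if (piece == NULL) {
            chunks_destroy(head);
            return NULL;
        }
        piece->next = head;  /* push onto the front of the list */
        head = piece;
    }
    return head;
}

/* Address of element (row, col), found by following `row` links.
   Returns NULL if the list has fewer than row + 1 pieces. */
int *chunks_at(Chunk *head, size_t row, size_t col)
{
    assert(col < CHUNK_LEN);
    Chunk *piece = head;
    while (row > 0 && piece != NULL) {
        piece = piece->next;
        row--;
    }
    return piece != NULL ? &piece->values[col] : NULL;
}
```

The price of this flexibility is access time: reaching a given row means following links rather than computing a single offset.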
The LT PL uses these techniques to ensure efficient memory usage in both the C and FORTRAN 95 codes it implements.
To explain what endianness means in computing, it is necessary to understand how computers store data. Data is a series of bits, each 0 or 1. Eight bits comprise a byte, and it is these eight-bit bytes that the computer stores. Big endian systems store the most significant byte at the lowest byte address. Little endian systems store the least significant byte at this address. A variety of UNIX variants exist today, each running in either big endian or little endian mode. Sun’s Solaris on SPARC, Hewlett-Packard’s HP-UX and the Internet protocols are big endian, while Sun’s Solaris on x86 (e.g., INTEL) and Compaq’s UNIX variants are little endian (SUN, 2001). A little endian system may interpret a pair of stored bytes as the decimal value 1, where a big endian system would interpret the same two bytes as 256. Assumptions made by programmers as to the endianness of the processor can result in severe inaccuracies in data.
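The effect of byte order on a stored value can be demonstrated with a short C sketch. The function names `read_u16_big_endian`, `read_u16_little_endian` and `host_is_little_endian` are hypothetical, and a two-byte unsigned integer is assumed for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret two stored bytes with the most significant byte first. */
uint16_t read_u16_big_endian(const unsigned char *bytes)
{
    assert(bytes != 0);
    return (uint16_t)((bytes[0] << 8) | bytes[1]);
}

/* Interpret the same two bytes with the least significant byte first. */
uint16_t read_u16_little_endian(const unsigned char *bytes)
{
    assert(bytes != 0);
    return (uint16_t)(bytes[0] | (bytes[1] << 8));
}

/* Report the byte order of the machine running the code:
   returns 1 on a little endian host, 0 on a big endian host. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

For the stored bytes 0x01 0x00, the little endian reading is 1 and the big endian reading is 256. Because the two read functions work on individual bytes, they give the same answers on any host.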
The floating-point values in a FITS data array follow the ANSI/IEEE-754 standard (IEEE, 1985). The bytes of each word are stored in order of decreasing significance, with the sign bit first, that is, big-endian. Thus little-endian machines need to implement byte-swapping algorithms before the data can be operated on, and must swap back before re-writing FITS files, whereas big-endian machines, like those the LT PL will operate on, can use the data as stored.
Papers by Zucker (1998) and SUN (2001) discuss the fruitless debate over which system of endianness is better: big endian and little endian systems perform equally well, and neither is inherently better than the other. The difficulty arises from issues of inter-operability. The complexities faced in a networked environment comprising different endian systems and endian-dependent applications are overcome in the LT PL by reading the FITS file directly from a local disc. As such, the LT PL is a stand-alone software package designed to operate under Sun Solaris on SPARC architecture.
The SUN FORTRAN compiler defines the magnitude of a normalized single-precision floating-point value (a real) to be in the approximate range (1.175494E-38, 3.402823E+38); anything outside this range must use double precision. The magnitude of an IEEE normalized double-precision floating-point value must be in the approximate range (2.225074D-308, 1.797693D+308) (SUN, 1996).
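In C the same single-precision limits are available as `FLT_MIN` and `FLT_MAX` in the standard header `<float.h>`. The sketch below assumes an IEEE-754 platform, on which these constants match the range quoted above; the function name `fits_in_single` is hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if x is zero or its magnitude lies within the normalized
   single-precision range, 0 if it would underflow or overflow. */
int fits_in_single(double x)
{
    assert(x == x);  /* reject NaN, which compares unequal to itself */
    double magnitude = x < 0.0 ? -x : x;
    if (magnitude == 0.0) {
        return 1;
    }
    return magnitude >= FLT_MIN && magnitude <= FLT_MAX;
}
```

A check of this kind shows whether an intermediate result could be held in a real or would need double precision.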
An intermediate value mid-way through a calculation may require double precision even though the final answer fits in a real. Double precision arithmetic makes the calculation slower and computationally more intensive, because a real occupies 4 bytes (32 bits) while a double precision number occupies 8 bytes (64 bits). Careful algorithm construction can reduce or even remove the need for double precision variables. Table 5.3 shows two programs which implement different algorithms to generate the same answer. The maximum size of a “real” variable with the LT PL computer configuration is approximately 3.4E38. Program Overflow necessitates a double precision variable to avoid numerical overflow, while Program Efficient is able to generate the result in a real variable and avoid overflow problems. The simple example described in Table 5.3 has been extracted from the LT PL code.
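The following sketch is not the code of Table 5.3; it is a separate, hypothetical illustration in C of the same idea, using the mean of a set of large values. The function names `mean_naive`, `mean_scaled` and `overflowed` are assumptions made for the example, and IEEE-754 single precision is assumed.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* True if x has overflowed to +/- infinity in single precision. */
int overflowed(float x)
{
    return x > FLT_MAX || x < -FLT_MAX;
}

/* Sum first, divide last: the running sum can exceed the single
   precision range even when the mean itself does not. */
float mean_naive(const float *v, size_t n)
{
    assert(n > 0);
    /* volatile forces each partial sum to be rounded to single precision */
    volatile float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum = sum + v[i];
    }
    return sum / (float)n;
}

/* Divide each term before adding: every partial sum stays no larger
   than the largest input, so no wider type is needed. */
float mean_scaled(const float *v, size_t n)
{
    assert(n > 0);
    volatile float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum = sum + v[i] / (float)n;
    }
    return sum;
}
```

For four values of 2E38 the naive sum passes 3.4E38 and overflows, whereas the scaled version returns approximately 2E38 without leaving single precision.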