Next: Plan of work Up: Proposed research and context Previous: Project overview and goals

Background and supporting work

The allocation of computational and data resources is one of the most significant concerns in Grid computing. The Globus toolkit contains a dedicated resource manager, the Globus Resource Allocation Manager (GRAM), and an allocator, the Globus Architecture for Reservation and Allocation (GARA) [5]. Grid applications frequently require the simultaneous co-allocation of resources of different types in different locations. Resources are assigned either by advance reservation or by immediate reservation. Resource reservations are specified in a dedicated Resource Specification Language (RSL).

As noted in [6], for these reservations to be meaningful "(1) a controller must have a hard enforcement mechanism, or (2) all users must engage the controller and observe policies. If this is not the case, then there is nothing to prevent a rogue (or simply unaware) user from consuming excessive resource capacity." Our work will remove the assumption that users are well-behaved and well-informed and will respect resource usage policies. User codes will carry attached certificates of resource consumption: unforgeable, mathematically checkable proofs. A controller can inspect these certificates and refuse access to rogue or erroneous codes which would consume excessive capacity. Our work will develop methods for producing proofs of resource consumption via enhanced type systems and extended static analysis, in some cases with inference of resource bounds.
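The controller's decision can be illustrated with a minimal sketch. It assumes that a certificate has already been validated by a proof checker (not shown) and that it establishes upper bounds on CPU time and memory. The names `Certificate`, `Reservation` and `admit` are hypothetical and are not part of Globus, GRAM or GARA.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Certificate:
    """Hypothetical stand-in for a validated resource consumption certificate.

    A real certificate would be a machine-checkable proof; this sketch keeps
    only the upper bounds that such a proof is assumed to establish.
    """
    cpu_seconds: float   # certified upper bound on CPU time
    memory_bytes: int    # certified upper bound on memory use


@dataclass(frozen=True)
class Reservation:
    """Hypothetical record of the capacity reserved for a user code."""
    cpu_seconds: float
    memory_bytes: int


def admit(cert: Optional[Certificate], reservation: Reservation) -> bool:
    """Admit a code only if its certified bounds fit inside the reservation."""
    if cert is None:
        # No certificate means nothing is known about consumption: refuse.
        return False
    return (cert.cpu_seconds <= reservation.cpu_seconds
            and cert.memory_bytes <= reservation.memory_bytes)
```

Under these assumptions a code is refused before it runs, either because it carries no certificate or because one of its certified bounds exceeds the reserved capacity.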

Our approach to Grid resource consumption improves on the present state of the art both technically and practically. By refusing to run codes which cannot successfully complete, we conserve Grid resources for well-behaved codes. This is a significant improvement over the current practice, whereby an application which executes for a period and then runs out of resources is terminated by an upcall from the computing fabric. A thoroughly robust application might be able to recover and restart, but it is more likely that the run will simply be terminated. At best, part of the computational effort invested in the run is wasted. More generally, all of the computational effort and the attendant resource reservation and allocation are wasted. Failures such as these mean that a significant proportion of the work of the world's most powerful computers is wasted when the failure of the code could instead have been predicted in advance.

Our work supports the mobile-code paradigm, in which the program is sent to run on the host where the data resides. This approach is highly relevant to e-Science: given 1 Pbyte of sky survey data, a 40 Gb/s link and 100 Kbytes of application code, it would be impractical to ship the data to the code, since the data transfer alone would take more than two days even at the full line rate.
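To make the comparison concrete, the following sketch computes both transfer times. It assumes decimal units (1 Pbyte = 10^15 bytes, 1 Kbyte = 10^3 bytes) and a link that sustains its full nominal rate with no protocol overhead; the function and constant names are illustrative only.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealised transfer time: payload size in bits divided by the line rate."""
    return size_bytes * 8 / link_bits_per_second


# Assumption: decimal (SI) units and full utilisation of the link.
PETABYTE = 10**15
KILOBYTE = 10**3
LINK_BPS = 40 * 10**9  # 40 Gb/s

data_seconds = transfer_seconds(1 * PETABYTE, LINK_BPS)    # ship data to code
code_seconds = transfer_seconds(100 * KILOBYTE, LINK_BPS)  # ship code to data

data_hours = data_seconds / 3600
```

Under these assumptions, moving the data takes 200,000 seconds (about 55.6 hours), whereas moving the code takes about 20 microseconds.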

Traditional all-or-nothing security analyses are inappropriate in the context of a resource-sharing, service-provisioning system such as the Grid. In such a system it is a given that resources will be accessed; what is needed is the ability to quantify the amount of resource consumption. When dealing with very large data sets, as in e-Science, asymptotic complexity-theoretic analysis of an algorithm is simply not accurate enough. It is not sufficient to know that the run-time of an algorithm is O(n log n); it is also necessary to expose the multiplicative constant of the run-time analysis. The resource-bound certification methods of Aspinall and Hofmann [7,8,9], which we use, compute the resource consumption of an algorithm as a function of the input size and do not elide constants of proportionality. With this degree of accuracy it is possible to distinguish a run-time of (say) 2n log n from 100n log n. The former may be effectively computable in the available resources whereas the latter may not.
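The effect of the constant can be seen in a small numerical sketch. It only evaluates the two closed-form bounds mentioned above and does not implement the certification methods of [7,8,9]. The logarithm base (2), the input size and the budget of abstract steps are all assumptions chosen for illustration.

```python
import math


def bound(constant: float, n: int) -> float:
    """Evaluate the bound constant * n * log2(n) for input size n."""
    return constant * n * math.log2(n)


def fits(constant: float, n: int, budget: float) -> bool:
    """Check whether the bound for input size n stays within the budget."""
    return bound(constant, n) <= budget


# Hypothetical figures: one million input items, a budget of 10^9 steps.
N = 10**6
BUDGET = 10**9

small_constant_ok = fits(2, N, BUDGET)    # 2n log n, roughly 4.0e7 steps
large_constant_ok = fits(100, N, BUDGET)  # 100n log n, roughly 2.0e9 steps
```

With these figures the 2n log n bound fits within the budget and the 100n log n bound does not, although both have the same asymptotic complexity.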

Java™ is the leading programming language for the Grid. Many vendors provide direct support for it and there is a substantial user community. A specialised version of the Open Grid Services Infrastructure [10] is defined for Java (the Java OGSI [11]). The Java Community Grid Kit (Java CoG) provides access to Grid services through the Java framework [12]. Modern Grid toolkits are implemented in Java [13,14]. Globus itself is implemented in Java.

In addition, Java provides built-in support for networking, database access, reading and writing semi-structured data in formats such as XML, visualisation, object serialisation and more. Scientific and numerical codes can make use of Java versions of numerical libraries such as BLAS. Legacy Fortran codes can be cross-compiled to run on the Java Virtual Machine (JVM). Java can inter-operate with C routines via the Java Native Interface (JNI). Java is increasingly used in parallel, distributed and high-performance computing.

