Java for Scientific Computing, Pros and Cons
Total Page:16
File Type:pdf, Size:1020Kb
Journal of Universal Computer Science, vol. 4, no. 1 (1998), 11-15 submitted: 25/9/97, accepted: 1/11/97, appeared: 28/1/98 Springer Pub. Co. Java for Scienti c Computing, Pros and Cons Jurgen Wol v. Gudenb erg Institut fur Informatik, Universitat Wurzburg wol @informatik.uni-wuerzburg.de Abstract: In this article we brie y discuss the advantages and disadvantages of the language Java for scienti c computing. We concentrate on Java's typ e system, investi- gate its supp ort for hierarchical and generic programming and then discuss the features of its oating-p oint arithmetic. Having found the weak p oints of the language we pro- p ose workarounds using Java itself as long as p ossible. 1 Typ e System Java distinguishes b etween primitive and reference typ es. Whereas this distinc- tion seems to b e very natural and helpful { so the primitives which comprise all standard numerical data typ es have the usual value semantics and expression concept, and the reference semantics of the others allows to avoid p ointers at all{ it also causes some problems. For the reference typ es, i.e. arrays, classes and interfaces no op erators are available or may b e de ned and expressions can only b e built by metho d calls. Several variables may simultaneously denote the same ob ject. This is certainly strange in a numerical setting, but not to avoid, since classes have to b e used to intro duce higher data typ es. On the other hand, the simple hierarchy of classes with the ro ot Object clearly b elongs to the advantages of the language. It is also used to intro duce further concepts like exception handling, which thus is fully integrated in the ob ject oriented avor of the language and as extendeble as all other classes. Multidimensional arrays in Javahave to b e handled as arrays of arrays. This means that there is no guarantee that a matrix is allo cated in contiguous space. Certainly most numerical algorithms will su er from that fact. In a p olymorphic environment it often is helpful to get information ab out the current sp eci c typ e of an ob ject at runtime. This can b e achieved by the metho d getClass which returns the typ e as a constant ob ject of the class Class. In that class metho ds to get more information ab out the interface or name of the typ e are provided. 1.1 Workaround: Op erator Overloading Esp ecially the lack of op erator overloading is annoying. It can b e overcome by a precompiler which maps the op erator to metho d calls. Since metho ds may return arbitrary classes this precompiler only has to change the notation of each expression lo cally. So the corresp ondence between the original source and the generated Java program is quite obvious. Using such a precompiler we can also allow for the de nition of user de ned op erator names or symb ols including the determination of its precedence. 12 v. Gudenberg J.W.: Java for Scientific Computing, Pros and Cons 2 Hierarchical Programming The paradigm of hierarchical programming means the classi cation of the data using inheritance. Base class comp osition or programming abstract data typ es are alternate names. We recommend to separate b etween interfaces and imple- mentations of classes as well as to collect common b ehavior in general abstract sup erclasses which then can b e extended for the sp eci c application. This design leads to reusable, easily extendable classes. The separation of interfaces and classes and the use of packages in Java provide a go o d supp ort for clearly structured programs. Since there is only one hierarchy of classes, we always can cast or convert an ob ject of a typ e to its sup ertyp e. This is the usual \is-a" relation in ob ject ori- ented inheritance structures. A conversion in the other direction from sup ertyp e to subtyp e is also p ossible. Hence, if we know the sp eci c typ e of an ob ject, we can make use of its particular features. These downcasts are obtained by writing the name of the destination class in front of the ob ject. In a similar wayupward and downward casting of primitive typ es is p ossible, where we now have the following subtyp e relation. int long float double Note that this relation only indicates the upward direction for casts and do es not mean that there is no information lost, when you convert an int or long to a float. Metho ds are b ound dynamically by default. The mo di er final,however, which can be attached either to a metho d meaning that this metho d can not b e rede ned, or to a whole class which then prohibits the inheritance from this class, op ens p ossibilities for optimization. 3 Generic Programming The paradigm of generic programming in general deals with homogeneous con- tainers, i.e. data structures comp osed of elements of the same typ e. These con- tainers de ne one or several iterators to access all or some of their elements in a sp eci ed order. These iterators then link generally de ned algorithms to the container. In the package java.util a simple iterator interface is provided which can easily b e extended to the user's needs. This style of programming yields a high degree of reuse, since the same structure is preserved. The data structures generally are parameterised by the elementtyp es. The algorithms are further generic with resp ect to the iterators. This kind of parametrization is usually provided by templates, a feature which unfortunatedly is lacking in Java. 3.1 Workaround: Templates In the language manuals it is suggested to use containers of the typ e Object instead and p erform appropriate typ e casts. This do es not work well, since prim- itive typ es are no ob jects, so we can not ll a container with primitive typ es. To cop e with this situation wrapp er classes are included in the language, which simply wrap a constantvalue of the corresp onding primitivetyp e. Since they are ob jects, no op erations are available for wrapp er classes, not even an assignment v. Gudenberg J.W.: Java for Scientific Computing, Pros and Cons 13 of a new value is p ossible. Hence, if wewant to write generic co de for primitive data, wehave to explicitly check the currenttyp e and cast. This kind of co de { a large conditional satement dep ending on typ e of the current ob ject{ has not much to do with ob ject orientation, even if it is encapsulated in a help er class, like NumberHelper [8]. This class provides metho ds for arithmetic op erations for the wrapp er classes which use the prede ned primitive op erations. So this approach only helps for the limited numb er of wrapp er classes. We can provide a similar help er class for higher arithmetic data typ es. The main disadvantage of this approach is that it is not extendable. If we add a new typ e with the same interface, wehave to edit the help er class and to add a new branch in the conditional statement. We might try to overcome the diculty by using the runtime information to get the name of the class of the elements and then cast to it. This, however, can not be compiled, since the syntax requires an identi er and not a string as a typ ename. Therefore, we write a factory class which pro duces a help er class for each elementtyp e. The name of this help er class contains the name of the element class. All help er classes implement one generic interface whichnow replaces the template. The application program calls the makeHelper metho d of the factory class with an ob ject of elementtyp e as parameter. This lo oks for the corresp onding help er class and loads it, if available. If not, it writes the source to a le inserting the ob ject's typ e name where appropriate, compiles this le on the y and then loads the corresp onding class. Since all help ers have identical structure this pro cess of writing and compiling can b e made automaticly. 4 Arithmetic Java is one of very few languages which completely sp ecify the semantics of their oating-p ointtyp es. The IEEE 754 formats sing l e and doubl e have to b e used for the float and double data typ e, resp ectively. All 4 basic arithmetic op era- tions are to b e p erformed with full precision using the rounding to the nearest neighb or. Expression evaluation has to use these basic op erations to compute eachintermediate result, i.e. round after each op eration. This rule ignores the facilities which some hardware units supply, like fused multiply and add or ex- tended temp oraries, and therefore has lead to some discussion [4]. In our opinion this feature of strict expression evaluation is mandatory for a language which claims to get identical results on all computers. And, indeed, this goal is achieved by that rule. So even programs using oating p oint op erations are completely p ortable. An interesting Java extension where expression evaluation may b e switched between this strict mo de and a so-called natural mo de which makes use of ex- tended hardware features and hence pro duces results which are at least as go o d, is discussed in [3].