1 2

NHSE

Netlib, NHSE and other Sources

 National HPCC Software Exchange

 NASA funded CRPC pro ject

Jack Dongarra

 Center for ResearchonParallel Computation CRPC

{ Argonne National Lab oratory

Computer Science Department

{ California Institute of Technology

UniversityofTennessee

{ Rice University

{ Syracuse University

and

{ UniversityofTennessee

Mathematical Sciences Section

 Uniform interface to distributed HPCC software rep os-

Oak Ridge National Lab oratory

itories

 Facilitation of cross-agency and interdisciplinary soft-

ware reuse

http://w ww .ne tl ib. org /ut k/p e opl e/J ackDo ngarra.html

 Material from ASTA, HPCS, and I ITA comp onents of

the HPCC program

 http://www.netlib.org/nhse/

3 4

Goals:

NHSE Comp onents

 Outreach and technology transition

{To the HPCC user community and industry

 To facilitate an active exchange program for HPCC

software and enabling technologies via the National In-

 Distribution via the WWW

formation Infrastructure.

 Discipline oriented rep ositories

 Uniform user interface

 Software review

 To promote contributions and use by Grand Challenge

teams, as well as other memb ers of the high p erfor-

{ Simple submission

mance computing community.

{ On going review

 Measurement

 Hyp ertext road map

Software here includes algorithms, sp eci cations, designs,

do cumentations, rep orts...

 Publishing to ols

{ Rep ository-in-a-b ox

 Naming & authentication architecture

 Selective capitalization of emerging technologies

5 6

Physical Physical Repository Physical Repository 2 Repository Bene ts 1

n

1. Faster development of high-quality software so that sci-

entists can sp end less time writing and debugging pro-

catalog catalog h problems.

info info grams and more time on researc

ware development e ort by shar- catalog 2. Less duplication of soft

info file

ware mo dules.

request ing of soft

3. Less time and e ort sp ent in lo cating relevant software

and information through the use of appropriate index-

h mechanisms and domain-sp eci exp ert retrieved ing and searc

file help systems.

NHSE

verload through the use of lters

Search/Browse 4. Reducing information o

Interface

and automatic search mechanisms. search results

search request

Virtual Repository Architecture

7 8

NHSE

 Based on Existing Technologies

{ WWW Browser Mosaic / Netscap e / etc

Intended Audience

 Distributed / Scalable

 HPCC application and computer science community

 URL: http://www.netlib.org/nhse/

{ Source of material for NHSE

{ Netlib

 Rep ository for math software since 1985

 Users of NASA, NSF, DOE and other sup ercomputer

centers

 Rep ositories Currently Available

{ Go o d targets for NHSE

{ Netlib, Softlib, CITlib

{ Natural supp ort organization: sup ercomputer cen-

{ ASSET - Asset Source for SW Engineering Tech.

ter sta

{ CARDS - Comprehensive Approachto

 Other users of high p erformance computers

Reusable Defense SW

{ ELSA - Electronic Library Services and Appl.

{ Current and p otential industrial users

{ GAMS Virtual Software Rep ository

{ No natural supp ort organization

{STARS - SW Technology for Adaptable,

 Applicable to other domains

Reliable Systems

{ Many examples related to GC problems

 Currently Available Information

{ NHSE currently p oints to 200+ mo dules

 software catalog

 tech rep orts and pap ers

 parallel pro cessing to ols

 libraries of reuseable comp onents

 Grand Challenge prototyp e co des

9 10

Netlib - Network Access to

Mathematical Software and Data

 Began in 1985

{ JD and Eric Grosse, AT&T

 Motivated by the need for cost-e ective, timely dis-

Requests Made to the Netlib Repositories

y mathematical software to the

tribution of high-qualit at the Univ. of Tennessee & ORNL

community.

8,345,418 total requests to these repositories as of Dec 20, 1995

 Designed to send, by return electronic mail, requested items.

4,959,282

 Automatic mechanism for the disseminate of public do-

ware.

main soft 4000

{ Still in use and growing

3000

{ Mirrored at a numb er of sites

 netlib2.cs.utk.edu 2000

Requests x 1,000

 netlib1.epm.ornl.gov

 research.att.com

1000

 netlib.no

 unix.hensa.ac.uk 0

1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995

 ftp.zip-b erlin.de Year

HTTP Gopher FTP XNetlib EMail

Data as of - Dec 20, 1995 at 02:21:52

11 12

wing typ es of software are b eing made

The follo

vailable: Breakdown of requests to each Netlib library a

(an alphabetical listing is also available.)

Systems software and software to ols. Data as of 12/01/95 at 02:42:11  Library Name Number of accesses

469,634 { compilers

pvm3 375,582

{ message-passing communication subsystems linpack 254,992

245,724 { parallel monitors and debuggers. blas 176,606

clapack 128,081

 Basic building blo cks for accomplishing common com-

linalg 126,834 unication tasks. 125,404 putational and comm

slatec/src 117,374

ks are meant to b e used by Grand Chal- toms 115,825 { Building blo c

f2c 96,255 lenge teams

c++ 96,182

Research co des that have b een develop ed to solve di- benchmark 85,217  master 69,587 f2c/src 65,913 cult computational problems.

minpack 60,158

yhave b een develop ed to solve sp eci c problems fn 59,266 { Man

fftpack 58,083

{ Serve as pro ofs of concept

na-digest 50,750 eloping general-purp ose reusable soft- port 48,820 { Mo dels for dev

slatec/lin 46,232

ware hence 45,103 confdb 42,631 c++/answerbook 37,298 slatec/chk 37,203

13 14

 NHSE Software Catalog

Review Pro cedure

 Benchmark and example programs 4

 Treat the review and submission as indep endent pro-

 Data analysis and visualization 22

cesses.

 Numerical libraries and routines 57

 Submissions are \sub ject" to ongoing review.

{ Computational geometry 7

 Review status abstract for each submission.

{ Linear algebra 18

{ Based on author comments

{ Optimization 4

{Package do cumentation

{Partial di erential equations 3

{ Our indep endent reviewer testing

{ Other 25

{ Comments from users.

 Parallel pro cessing to ols

{ Communication libraries 25

{ Execution and p erformance analyzers 31

{Parallel I/O systems 5

{Parallel programming environments 12

{Parallel programming languages and compilers 26

{Parallel runtime systems 10

{ Source co de analyzers and restructurers 7

{ Miscellaneous 16

 Scienti c and engineering applications 66

15 16

NHSE Software Submission

Technical and Political Issues

Goals

 How will the naive user nd the right software?

 Ensure xity of publication

 le ngerprints, unique name

{ Answer: Via the NHSE search/browser interface

and the Road Map

 Prevent imp ersonation and unauthorized changes

 How will authentication, integrity, and version control

digital signatures

b e implemented?

 Exercise quality control

{ Answer: By a publishing system that includes unique

review classi cation

naming & digital signatures.

 Promote interop erability

 Will the NHSE supp ort distribution of software that is

RIG Basic Interop erability Data Mo del

not free or cannot b e freely distributed?

{ Answer: Yes, but...

 only provide classi cation, review, and access

 use of encryption and separate key distribution

 NHSE will not have an accounting department

 Will the NHSE b e resp onsible for supp ort of software?

{ Answer: NO!

 Any supp ort will be by author or appropriate

agent

17 18

User Pro les

 Currently pro cessed manually

Aliations, Collab orations, and Intimacies

{ Click on Submit: User Pro le on NHSE home page

{ Fill in email address, software needs, information

needs

3

 W3 Consortium aliate member W C 

{ NHSE Librarian sends you search results and keeps

you p osted on future items of interest enditemiza

 Reuse Library Interop erablity Group RIG Member

{Future - automate pro cessing

 Active within IETF

 Handle larger volume more quickly

 Interaction with Corp oration for National Research Ini-

 Intelligent information agents

tiatives CNRI

 Semantic ltering

19 20

Road Map

Di erent disciplines will maintain their own soft-

ware rep ositories

 Structured Knowledge Base on Software Comp onents

 Users should not need to access each of these rep osito-

{ Similar to a hyp ertext encyclop edia

ries separately

{ Assembled with the help of panels of exp erts

 NHSE will provide a uniform interface to a virtual HPCC

 various-p ersp ective catalog

software rep ository which will b e built on top of the dis-

{ Glossaries tributed set of discipline-oriented rep ositories.

{ Algorithms

 The interface will assist the user in lo cating relevant

{ Applications categories resources and in retrieving these resources.

{ Enabling software

 A combined browse/searchinterface will allow the user

to explore the various HPCC areas and b ecome familiar { Benchmarks

with the available resources.

{ Hardware

 A longer term goal of the NHSE is to provide users with { Education and training

domain-sp eci c exp ert help in lo cating and understand-

{ Links to other sites

ing relevant resources.

 Prototyp e Currently Available

{Topic: available applications co des and their comp o-

nent HPCC technologies

21 22

Research Issues Needs for Browsing and Searching

 Rep ository in a Box  CurrentWeb interfaces are dicult and frustrating for

the user who is attempting to lo cate sp eci c informa-

{Everything required for a site to setup a rep ository

tion.

and get connected to the NHSE

 Browsing by following hyp ertext links is slow and can

 publishing to ols

b e disorienting.

 ftp, http, gopher, email servers

 Keyword searching su ers from the vo cabulary mis-

 logging and statistical to ols

match problem and is unsuitable for users with impre-

 integration with replication service

cise information and software needs.

 guidance and supp ort

 LIFN

{ assigned to a particular sequence of bytes,

{ once the assignment has b een made, the same LIFN

cannot subsequently b e used to name any other se-

quence of bytes.

 Indexing and Searching

{ Currently limited to syntatic keyword matching

{ Harvest system provides a set of customizable to ols

23 24

Capitalization of Emerging Technologies

Combined Searching and Browsing Interface

 Three Levels of Software Development

 Augment a users mental p erception and knowledge of

Pasadena I Workshop

sp eci c domains,

{ Research prototyp e

 Improve a users understanding of the problem to be

{ Advanced development prototyp e

solved, and

{ Commercial pro duct

 Form successively more fo cused and b etter articulated

queries.

 Fo cus:

The interface will be in the form of a thesaurus-based

{Move from researchto advanced development pro-

roadmap.

totyp e

 Strategy

{ Convene outside review panel

{ Select as many pro jects as budget will p ermit

{ Imp ose requirements for inclusion at Level 2, Re-

viewed Level

25 26

HPCC Thesaurus HPCC Cataloging

 The NHSE will de ne the top levels of an HPCC the-  To enable searching, cataloging information must be

saurus, drawing on an existing HPCC glossary and on made available for NHSE assets.

the current NHSE contents to generate thesaurus terms.

 Eachphysical rep ository will b e resp onsible for main-

 Sub ject area sp ecialists will b e called up on to re ne the taining a network-accessible le containing such cata-

lower levels. loging information.

 The thesaurus, along with a high-level classi cation scheme,  These les will b e retrieved and indexed by an NHSE

will form the basis of a hyp ertext roadmap. indexer on a regular basis, and the resulting searchable

index will b e replicated for reliability.

 The roadmap will include scop e notes and annotations

to familiarize users with various HPCC areas and will  The NHSE will use the Harvest system from University

serve as a springb oard for thesaurus-assisted searches. of Colorado to do the collection, indexing, and index

replication.

27 28

Naming and Authentication

Quality Control

 The main advantage of distributing the rep ository is to

 Users of the NHSE need to have con dence that the

allow the software to b e maintained by those in the b est

software they obtain is high-quality and well-tested.

p osition to keep it up-to-date.

 Quality control will be imp ossible to automate com-

 Copies of p opular software packages may b e transpar-

pletely b ecause it requires human judgement and eval-

ently mirrored to increase availability, improve resp onse

uation.

time, and prevent b ottlenecks.

 To ols and pro cedures need to b e develop ed to facilitate

Research Issues

the testing and classi cation or lab eling of software with

 Assignment of unique identi ers to retrievable assets

resp ect to its quality and reliability.

 Collection and merging of cataloging information, so

 Research is needed to determine what quality control

that, veri cation of the authenticity, integrity of re-

information is most useful and can reasonably be ob-

trieved assets, and tracking can take place

tained, how to acquire this information, and in what

NHSE's approach to these issues

format it should b e stored for easy access.

 Implement a lo cation-indep endent naming architecture

 Academic journal mo del

that unambiguously asso ciates a unique name with the

byte contents of a published asset and that includes

mechanisms for authentication and integritychecking.

 Publishing to ols will be made available to assist pub-

lishers with naming and cryptographically signing pub-

lished assets, and with exp orting asset descriptions to

an NHSE search service.

 Authentication of assets will be handled by an asym-

metric public-private key encryption system.

29 30

Interop eration Measurement

 The NHSE will catalog and provide access to software  Keep statistics on downloads from NHSE pages

and software-related artifacts from all the HPCC soft-

{ Including userid of request source

ware rep ositories.

 not always p ossible

 Assets accessible from other existing software rep osi-

 invasion of privacy

tories, such as ASSET, CARDS, DSRS, and ESLA, to

name just a few, may also b e of interest to NHSE users.

 Statistical Survey of Unreviewed and Partially Reviewed

Software

 The NHSE will b e participating in a small-scale inter-

op erability exp eriment with the ab ove rep ositories to

{ Identi cation of candidates for full review

help de ne requirements for further interop eration ef-

{ Email inquiries to determine usage pattern

forts.

 Systematic Survey of Reviewed Software

 The NHSE will also b e working with the Reuse Library

Interop erability Group RIG on establishing standards

{ Application

for unique naming, asset description and classi cation,

{ Usage pattern

and asset evaluation.

{ Industry versus academia

 In the future, the NHSE will interop erate with these

other rep ositories so that software from them maybe

retrieved directly from the NHSE interface.

31 32

Software Survey

Reliable File Mirrowing

 200+ software items from NHSE collection whichhave

 Current mirroring systems inadequate

b een abstracted and keyword indexed

 Replication service to provide:

 testb ed for search strategies

{Fast selective le transfer

 software review candidates

{ Indep endent of machine architecture and op erating

system dep endenices

 Integrated with lo cation lo okup mechanisms

33 34

NHSE at ANL

 Mo dular web rob ot

 Parallel web indexing engine

 DNS/geographical mapping

 Autonomous agents for collecting information on the

web

35 36

Overall Strategy for the NHSE:

Naming of Distributed Resources

 E ectiveness of the NHSE will dep end on discipline-

oriented groups and Grand Challenge teams having own-

ership of the discipline-oriented software rep ositories.

 Need single name to refer to an ob ject

 The information and software residing in these rep osi-

 Mayhavemultiple lo cations

tories will b e b est maintained and kept up-to-date by

the individual disciplines, rather than by centralized

 Maychange over time

administration.

 2 kinds

 Central administration will be used instead to handle

{ urn:netlib:Welcome Dynamic

interop eration and meet common needs.

{ lifn:netlib:Welcome_940328 Fixed

 Although the various disciplines will haveownership of

 Fixed reference to a published ob ject

the rep ositories, they should not b e exp ected to develop

the software and to ols for building, managing, and in-

terfacing to their rep ositories.

 Much useful information retrieval IR software is cur-

rently available, b oth in the form of client and server

programs e.g., http servers and WWW browsers, as

and this software should b e incorp orated into the NHSE.

37 38

Summary Summary continued

 Initial implementation built on existing technologies  Measurement and evaluation

{ WWW { Statistics for unreviewed and partially reviewed

 Distributed {Evaluations for reviewed

 Scalable

 Outreach and technology transition

{ Netlib, etc

{ Educational activities aimed at the user community

{ Rapid deployment

{Fostering technology developmentby industry

 Multilevel review and classi cation scheme

 Capitalization of emerging technologies

{ Unreviewed

{Foster transition from research to advanced devel-

{Partially reviewed

opment prototyp e

{ Reviewed

 Working on standardization within WWW community

 Road Map

{ memb er of RIG, IETF, IESG, WWW consort

39

Some Url's Related to Scienti c Computing

 National HPCC Software Exchange

http://www.netlib.org/nhse/

 NHSE Software Catalog

http://www.netlib.org/nhse/sw

catalog/index.html

 Netlib Rep ository

http://www.netlib.org/

 Computational Science Education

http://www.netlib.org/nhse/cse

edu.html

 Bo oks, Course Materials, and Tutorials

http://www.netlib.org/nhse/cse

edu.htmlb o oks

 CS267 - Applications of Parallel Computers

U.C. Berkeley CS267: Spring 1995 Jim Demmel

http://www.icsi.b erkeley.edu/cs267/

 18.337 Parallel Scienti c Computing

MIT Spring, 1995 Alan Edelman

http://web.mit.edu/18.337/WWW/home.html

 North Carolina State University

Visualizations in Materials Science

http://vims.ncsu.edu/cgi/index.acgi

 Computational Science Education Pro ject

http://csep1.phy.ornl.gov/csep.html