Beyond Ball-and-Stick

Part 2: Practical Chemistry Visualization

Mario Valle
Swiss National Supercomputing Centre (CSCS)

What we have covered

1. Fundamentals
2. Basic chemistry data types and representations
3. Some ideas about breaking barriers and increasing the usefulness of visualization

Beyond Ball-and-Stick Tutorial – Mario Valle – CSCS User Day 27/09/2005

What we cover now

1. Data and data management
2. Good visualization tool characteristics
3. Visualization tools here at CSCS
4. Digital storytelling tools and ideas

Data issues

▪ Representation follows the data's logical structure and intended usage
▪ Horror stories from not knowing the data
▪ The logical data format is one thing, the physical data file format is another

Usual chemistry data types

[Figure panels: Structures; 1D and 2D tables; Scalar volumes]

Non quantitative data

[Figure: structural formula with atom labels O, O, OH]

Data from prof. A. Oganov – ETH Zürich

Structure

▪ Atom coordinates
▪ Atom types
▪ Optionally, bond data
▪ Optional scalar values (like charges) or vector values (like vibration modes)

Structure visualization goals

▪ Show spatial configurations
▪ Over time, show peculiar movements
▪ Show correlations between position/structure and other quantities
▪ Show matching or spatially related configurations
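The structure data listed above (coordinates, atom types, optional bonds, optional per-atom scalars and vectors) can be sketched as one small container. This is only an illustration: the class name `Structure` and its field names are hypothetical and do not come from any of the tools discussed in this tutorial, and the water coordinates and charges are approximate, made-up sample values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Structure:
    """Hypothetical container for one molecular structure (illustrative names)."""
    coords: List[Vec3]                                               # one (x, y, z) per atom
    types: List[str]                                                 # atom types, here element symbols
    bonds: List[Tuple[int, int]] = field(default_factory=list)       # optional bond data (atom index pairs)
    scalars: Dict[str, List[float]] = field(default_factory=dict)    # e.g. charges
    vectors: Dict[str, List[Vec3]] = field(default_factory=dict)     # e.g. one vibration mode

    def validate(self) -> None:
        """Raise ValueError if per-atom arrays or bond indices are inconsistent."""
        n = len(self.coords)
        if len(self.types) != n:
            raise ValueError("one atom type is needed per coordinate")
        for i, j in self.bonds:
            if not (0 <= i < n and 0 <= j < n):
                raise ValueError(f"bond ({i}, {j}) refers to a missing atom")
        per_atom = list(self.scalars.items()) + list(self.vectors.items())
        for name, values in per_atom:
            if len(values) != n:
                raise ValueError(f"'{name}' must hold one value per atom")


# Usage: a water molecule with approximate coordinates (angstrom) and made-up charges.
water = Structure(
    coords=[(0.000, 0.000, 0.000), (0.957, 0.000, 0.000), (-0.240, 0.927, 0.000)],
    types=["O", "H", "H"],
    bonds=[(0, 1), (0, 2)],
    scalars={"charge": [-0.8, 0.4, 0.4]},
)
water.validate()
```

Keeping the optional quantities in named dictionaries means a representation can ask for "charge" or a vibration mode by name and fall back gracefully when a file did not provide it.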

Basic structure representation

[Figure panels: CPK; Ball and Stick; Licorice]

High level structures

[Figure panels: Surfaces; Accessible surface; Secondary structure]

Less is more

Problem: structures are too big

Data from prof. A. Oganov – ETH Zürich

Related scalar quantities

[Figure panels: Colored by atom type; Colored by charge]

Vector data (vibration modes)

▪ Static (arrows)
▪ Animated

Visualization techniques

▪ Select part of a structure
▪ Show two or more structures together
▪ Show surfaces
▪ Add high level “summary” geometries (polyhedrons, planes, etc.)

Data formats

Base for almost everything:
▪ PDB
▪ Cube
▪ Etc.

But there are incomplete formats (missing atom types, unit cell, etc.)
▪ Other?
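To make the "incomplete formats" remark concrete, here is a minimal sketch of reading one PDB coordinate record, assuming the standard fixed-column ATOM/HETATM layout (atom name in columns 13-16, coordinates in columns 31-54, element symbol in columns 77-78). The names `Atom`, `parse_atom_record` and `make_record` are hypothetical helpers written for this example, and the sample values are invented. When a file leaves the element columns blank or truncates the line, the atom type is simply reported as missing.

```python
from typing import NamedTuple, Optional, Tuple


class Atom(NamedTuple):
    name: str
    element: Optional[str]            # None when the file does not give the element
    xyz: Tuple[float, float, float]


def parse_atom_record(line: str) -> Optional[Atom]:
    """Parse one fixed-column ATOM/HETATM record; return None for other records."""
    if line[:6] not in ("ATOM  ", "HETATM"):
        return None
    name = line[12:16].strip()                                            # columns 13-16
    xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))    # columns 31-54
    element = line[76:78].strip() or None                                 # columns 77-78
    return Atom(name, element, xyz)


def make_record(serial: int, name: str, x: float, y: float, z: float, element: str) -> str:
    """Build a sample ATOM record (residue MET 1, chain A) for testing the parser."""
    fmt = ("ATOM  " "%5d" " " "%-4s" " " "%3s" " " "%1s" "%4d" + " " * 4
           + "%8.3f%8.3f%8.3f" "%6.2f%6.2f" + " " * 10 + "%2s")
    return fmt % (serial, name, "MET", "A", 1, x, y, z, 1.0, 0.0, element)


complete = parse_atom_record(make_record(1, "N", 27.340, 24.430, 2.614, "N"))
incomplete = parse_atom_record(make_record(2, "CA", 1.0, 2.0, 3.0, ""))
```

A reader that meets `element is None` has to guess the atom type from the atom name, which is exactly where wrong colors and radii tend to creep into a picture.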

Time dependent data

Sergey Churakov – PSI Villigen

Frame-by-frame or summary?

[Figure panels: Frame-by-frame; Summary over trajectory]

Images from AmiraMol
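One possible "summary over trajectory" is the mean position of each atom together with its RMS fluctuation around that mean. The sketch below only illustrates the idea. It is not the method used to produce the AmiraMol images, it assumes every frame lists the same atoms in the same order, and it ignores periodic boundary conditions. The function name `summarize` is hypothetical.

```python
from math import sqrt
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Frame = List[Vec3]  # one position per atom


def summarize(trajectory: List[Frame]) -> Tuple[List[Vec3], List[float]]:
    """Return per-atom mean positions and RMS fluctuations around those means."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    means: List[Vec3] = []
    rmsf: List[float] = []
    for a in range(n_atoms):
        # Average each coordinate of atom `a` over all frames.
        mean = tuple(sum(frame[a][k] for frame in trajectory) / n_frames for k in range(3))
        # Mean squared distance of atom `a` from its average position.
        msd = sum(
            sum((frame[a][k] - mean[k]) ** 2 for k in range(3)) for frame in trajectory
        ) / n_frames
        means.append(mean)
        rmsf.append(sqrt(msd))
    return means, rmsf


# Usage: atom 0 stays fixed, atom 1 oscillates along x.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
mean_positions, fluctuations = summarize(frames)
```

The fluctuation values can then be mapped to color or sphere radius on a single static picture, in place of playing the whole animation.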

Time-dependent visualization goals

▪ Show spatial configurations changing over time
▪ Show phase transitions

Trajectory data formats

▪ Kino
▪ PDB
▪ List of PDB
▪ DCD

Working around limitations: Kino + file with base vectors
▪ Other?

Volume data

Volume data file formats

The following formats contain a structure plus a uniform grid of scalar values:
▪ Gaussian Cube
▪ CHGCAR
▪ Other?
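As an illustration of "structure + uniform grid of scalar values", here is a minimal sketch of a Gaussian Cube reader. It assumes the plain layout: two comment lines, one line with the atom count and grid origin, three lines with the number of grid points and the step vector of each axis, one line per atom (atomic number, charge, x, y, z), then the grid values. It assumes positive atom and grid counts and does not handle the orbital variant of the format. The function name `read_cube` and the tiny sample file are made up for this example.

```python
from typing import Dict, List, Tuple


def read_cube(text: str) -> Dict[str, object]:
    """Read the structure and the scalar grid from Gaussian Cube text (simple layout only)."""
    lines = text.splitlines()
    head = lines[2].split()                       # lines 0-1 are free-form comments
    n_atoms = int(head[0])
    origin = tuple(float(v) for v in head[1:4])
    shape: List[int] = []
    axes: List[Tuple[float, ...]] = []
    for line in lines[3:6]:                       # one line per grid axis
        parts = line.split()
        shape.append(int(parts[0]))
        axes.append(tuple(float(v) for v in parts[1:4]))
    atoms = []
    for line in lines[6:6 + n_atoms]:             # atomic number, charge, x, y, z
        parts = line.split()
        atoms.append((int(parts[0]), tuple(float(v) for v in parts[2:5])))
    values = [float(v) for line in lines[6 + n_atoms:] for v in line.split()]
    if len(values) != shape[0] * shape[1] * shape[2]:
        raise ValueError("number of grid values does not match the header")
    return {"origin": origin, "shape": shape, "axes": axes, "atoms": atoms, "values": values}


# Made-up 2 x 2 x 2 grid around a single hydrogen atom.
SAMPLE = """\
tiny example
second comment line
    1    0.000000    0.000000    0.000000
    2    1.000000    0.000000    0.000000
    2    0.000000    1.000000    0.000000
    2    0.000000    0.000000    1.000000
    1    0.000000    0.500000    0.500000    0.500000
 1.0E-01 2.0E-01 3.0E-01 4.0E-01 5.0E-01 6.0E-01
 7.0E-01 8.0E-01
"""
cube = read_cube(SAMPLE)
```

The consistency check between header and data is the cheap safeguard here: a truncated file is rejected instead of being rendered as a misleading volume.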

1D data

Choose the right mapping

[Figure: “ABC analyzer warm-up” plots; axes: temperature (°C), 25 to 33, versus time from power-on (min), 0 to 18]

Davide Donadio – ETH Zürich

2D Tables

[Figure panels: COSY NMR Spectra; Dotplot for protein]

3D time dependent data

[Figure panels: Trajectory in parameter space; Added perceptual cues to help understand 3D trajectory in parameter space]

1D & 2D tools

▪ Gnuplot
▪ Grace
▪ Scigraphica
▪ R
▪ Matlab
▪ Other?

Trend in scientific data lifecycle

From this (publish and forget)…

Trend in scientific data lifecycle

…to this (use, reuse, recycle)…

Metadata are not evil!

Metadata is information about data. We often use metadata without even knowing it.

If you had two cans without labels, which would you eat? Without a label, how would you know which was tuna and which was cat food?

[Figure: two unlabeled cans captioned “Tuna?” and “Cat Food?”]

Metadata inside/outside data

▪ Inside the file
  • E.g. PDB
    ▪ TITLE (describe what the PDB is about)
    ▪ KEYWDS (some keywords to retrieve the file)
    ▪ AUTHOR (who to blame for bad data)
    ▪ REMARK 6 – 99 (free form remarks)
▪ Encoded in the filename or file path
  /simulations/20050921/param1=0.033/result_run_1.pdb
▪ In a companion file
  • Protein.pdb + Protein.xml
  • Desc.xml for a set of files

Multiple files, different parameters

[Figure panels: Noise variance = 0; 0.003; 0.010]
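Metadata encoded in a file path, as in the example path above, can be recovered mechanically. The sketch below assumes two conventions that the slide only hints at: directories of the form `key=value` hold parameters, and an eight-digit directory name is a date written as YYYYMMDD. The function name `metadata_from_path` is hypothetical.

```python
from datetime import date, datetime
from pathlib import PurePosixPath
from typing import Dict, Union

Value = Union[str, float, date]


def metadata_from_path(path: str) -> Dict[str, Value]:
    """Collect metadata encoded in the directory names of a result file path."""
    parts = PurePosixPath(path).parts
    meta: Dict[str, Value] = {"file": parts[-1]}
    for part in parts[:-1]:
        if "=" in part:
            # Assumed convention: a "key=value" directory holds one parameter.
            key, raw = part.split("=", 1)
            try:
                meta[key] = float(raw)
            except ValueError:
                meta[key] = raw          # keep non-numeric values as text
        elif len(part) == 8 and part.isdigit():
            # Assumed convention: eight digits are a YYYYMMDD date.
            meta["date"] = datetime.strptime(part, "%Y%m%d").date()
    return meta


meta = metadata_from_path("/simulations/20050921/param1=0.033/result_run_1.pdb")
```

The weakness of this approach is that the metadata is lost as soon as the file is copied or renamed, which is one argument for also recording it inside the file or in a companion file.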

My chemistry file collection

[Figure panels: Description file editor; Generated page view in browser]

The quest for the perfect tool

The perfect tool makes everyone happy. But:
▪ Chemistry is a very wide design space
▪ No single definition of perfection
▪ Visualization is still an art

The visualization “Holy Grail”

[Diagram: Your data → Smart system → Perfect visualizations!]

Perfect tools – somewhere else

Chemistry visualization tools

What a chemistry visualization tool should provide:
▪ Load and display everything I’m working on
▪ Enable exploration and comparison
▪ Produce high quality images and movies for publication
▪ Offer flexibility in adding or customizing visualization techniques and analysis scripts
▪ Let me experiment with techniques and rendering modes

But there are difficulties…

Difficulty 1: tools inflexibility

Nice crystallography programs, but animation missing and no way to extend them

Difficulty 2: formats galore

PDB, Gaussian Log, Gaussian Cube, Plain coordinates, Kino, SHELX, VASP POSCAR, VASP XDATCAR, ADF, DCD, DL_POLY, Concatenated VASP POSCAR

Difficulty 3: minimal integration

ChemViz@CSCS

▪ We do not endorse any specific program, but encourage you to use the tool best suited to your research
▪ We use the STM3 platform to implement unusual and advanced techniques
▪ The other two applications we suggest (if you ask) are:
  • VMD
  • Molekel
▪ The OpenBabel file type converter could be useful for accessing strange file formats

STM3 toolkit

STM3 is a framework in which to develop unusual and enhanced techniques for molecular visualization.

The goal of STM3 is not to supplant existing tools.

The toolkit is built on top of the commercial visualization environment AVS/Express.

STM3 modules

VMD

(new) Molekel

Tools resources

▪ STM3
  • http://www.cscs.ch/~mvalle/ChemViz/
  • http://www.cscs.ch/projects/AVSChemistry.php
▪ VMD
  • http://www.ks.uiuc.edu/Research/vmd/
  • http://www.theochem.ruhr-uni-bochum.de/~axel.kohlmeyer/cpmd-vmd/
▪ Molekel
  • http://www.cscs.ch/a-display.php?id=138
▪ OpenBabel
  • http://openbabel.sourceforge.net/

Digital storytelling

We “see” a good story develop in our mind

Different roles of visualization

1. Help understanding data (www.smartmoney.com/marketmap)
2. Communicate and present results (www.peets.com/selector_coffee/coffee_selector.asp)

Why are scientists often boring?

“...drawing graphs, like motor-car driving and love-making, is one of those activities which almost every researcher thinks he or she can do well without instruction.”

Wainer & Thissen, 1991, Annual Review of Psychology

A not so unusual presentation…

When we added no drug the viral infection spread and led to cell death within 17 hours of the initial challenge. When we added the drug at 5 nM the viral infection persisted and spread in the culture leading to cell death within 18 hours post challenge. When we added the drug at 5 nM levels 78% of cells survived viral infection and grew slowly without cell division. When we added the drug at 10 nM levels 73% of cells survived viral infection and grew slowly without cell division. When we added the drug at 15 nM levels 6% of the cells resisted viral infection but did not grow or divide. When we added the drug at >15 nM levels all of the cells died within 10 hours.

… that could be better

Protection from viral infection

         [Drug]    Survival time    Growth/     Toxicity
                   post challenge   division
control  0         17 hrs           +/+         0%
1        5 nM      18 hrs           +/+         22%
2        10 nM     ∞                +/-         27%
3        15 nM     ∞                -/-         94%
4        > 15 nM   10 hrs           -/-         100%

Excellence in thinking

“[…] clarity and excellence in thinking is very much like clarity and excellence in the display of data. When principles of design replicate principles of thought, the act of arranging information becomes an act of insight.”

Edward Tufte 1998 p. 9

Visual messages capture attention

PowerPoint 1.0 derives from a product called “Presenter” developed by Forethought Inc. at the beginning of 1987. Microsoft bought Presenter in August 1987 for 14 million dollars.

[Image from: albinoblacksheep.com]

Visual preempts verbal messages

A verbal message never dominates over a non-verbal one.

[Figure: chart of the 1999 and 2000 results with a vertical axis running from 100 to 105]

We walk away from this chart thinking that the 2000 results are twice the ones from 1999 (instead of a mere +2%).
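The effect described for the 1999/2000 chart can be checked with a little arithmetic. The sketch below uses hypothetical values (102 and 104, since the slide does not give the real numbers) and an axis that starts at 100 as in the chart. It compares the ratio of the drawn bar heights with the real change, and reports the "lie factor" as defined by Tufte: the size of the effect shown in the graphic divided by the size of the effect in the data.

```python
def apparent_ratio(old: float, new: float, baseline: float) -> float:
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    return (new - baseline) / (old - baseline)


def relative_change(old: float, new: float) -> float:
    """Real relative change in the data."""
    return (new - old) / old


def lie_factor(old: float, new: float, baseline: float) -> float:
    """Effect shown in the graphic divided by the effect present in the data."""
    shown = apparent_ratio(old, new, baseline) - 1.0
    return shown / relative_change(old, new)


# Hypothetical values: the slide only says the real difference is about +2%.
results_1999, results_2000 = 102.0, 104.0
axis_start = 100.0

looks_like = apparent_ratio(results_1999, results_2000, axis_start)   # bar height ratio
really_is = relative_change(results_1999, results_2000)               # fraction, not percent
exaggeration = lie_factor(results_1999, results_2000, axis_start)
honest = lie_factor(results_1999, results_2000, 0.0)                  # axis starting at zero
```

With these numbers the second bar is drawn twice as tall as the first while the data grew by about 2%, and starting the axis at zero brings the lie factor back to 1.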

Visual messages capture attention

Do you remember how much Microsoft paid for Presenter?

Do you remember what the guy was doing in the photo?

14 Ways To Say Nothing…

[Figure panels: Right!; Wrong (violates rules)]

“14 Ways to Say Nothing with Scientific Visualization”
Al Globus, Eric Raible – NASA – July 1994

14 Ways To Say Nothing…

1. Never Include a Color Legend
2. Avoid Annotation
3. Never Mention Error Characteristics
4. When in Doubt, Smooth
5. Avoid Providing Performance Data
6. Quietly Use Stop-Frame Video Techniques
7. Never Learn Anything About the Data or Scientific Discipline
8. Never Compare Your Results with Other Visualization Techniques
9. Avoid Visualization Systems (e.g. AVS)
10. Never Cite References for the Data
11. Claim Generality but Show Results from a Single Data Set
12. Use Viewing Angle to Hide Blemishes
13. If Viewing Angle Fails, Try Specularity or Shadows
14. “This is easily extended to 3-D”

Siggraph 1993: VIZ-O-MATIC

Why should chemistry be boring?

It is not a medium, but it is important

Design your presentation remembering that not all your readers are perfect.

There are color blind people:
▪ Don’t rely on color alone for decisions
▪ Use appropriate color schemes

There are people with low sight:
▪ Avoid low contrast in visualizations and icons
▪ Avoid tiny fonts

Media and tools

▪ Images
  • Usual problems
  • Tools
▪ Movies
  • Usual problems
  • Tools
▪ Web
▪ New, unusual output platforms?

Images – usual concerns

▪ Image file formats
  • No JPEG, better using PNG
  • Leave TIFF for process printing
  • Images for the web: JPEG, PNG, Animated GIF
▪ Quality
  • Antialiasing lines and borders
▪ Resolution
  • Screen: 75 dpi
  • Better than 1000 dpi for printers
▪ Colors
  • Unreadable colors (especially on conference beamers!)
  • Bad colors in print vs. good on screen
  • Black & White requested by some publications

Image formats

[Figure panels: JPEG, with “mosquitoes” around sharp edges; PNG]

Image quality

Edges on printed paper look terrible at the printer resolution

[Figure panels: Screen; Paper]

Color gamut

Each medium has a set of colors that it can reproduce. This set is called the device color gamut.

The number of colors available quantizes the color gamut. If the set is too small, artifacts (color banding) arise.

Swiss Knife for images

ImageMagick

    convert img-from img-to
    convert -scale 50% img-from img-to
    convert -flip img-from img-to
    convert img-from -colorspace Gray img-to
    convert img-from -annotate 0x0+50+10 '© Mario Valle 2005' img-to
    convert img-from -draw 'image Over 50,10 0,0 logo.tif' img-to
    montage -tile 2x2 -geometry 512x512 img[1234].png tot.png
    identify -verbose image.jpg
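The color banding mentioned under "Color gamut" can be reproduced with a few lines of arithmetic. The sketch below reduces a smooth 256-step gray ramp to a small number of evenly spaced levels. The function name `quantize` and the choice of four levels are assumptions made only for this illustration. Real devices and file formats quantize in more elaborate ways.

```python
from typing import List


def quantize(value: int, levels: int) -> int:
    """Map a 0..255 intensity to the nearest of `levels` evenly spaced intensities."""
    step = 255 / (levels - 1)
    return round(round(value / step) * step)


gradient: List[int] = list(range(256))             # smooth ramp, 256 distinct grays
banded = [quantize(v, 4) for v in gradient]        # same ramp with only 4 grays available
bands = sorted(set(banded))                        # the intensities that survive
widest_band = max(banded.count(b) for b in bands)  # pixels sharing one flat gray
```

Instead of a smooth ramp the result has four flat stripes, the widest covering 85 of the 256 pixels, which is the banding artifact seen when a gradient is saved with too few colors.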

Other useful image tools

▪ GIMP
▪ Littlecms
▪ Photoshop

LittleCMS example of processing

[Figure panels: Original image; Color corrected for HP1200PS; Check]

Different rendering intents

[Figure panels: Rendering for visualization; Rendering for presentation]

Movies – usual concerns

Codec availability
▪ Impacts playability on different machines. Unfortunately there is no universal no-fuss coding method
▪ Codec and container file format are two distinct things

Quality
▪ Depends strongly on bitrate and compression method (trades compression time for quality)
▪ Suggestions for natural movies do not apply to sharp-edged ‘cartoon-like’ movies

File size
▪ Related to bitrate (quality), frame size, compressibility

Output usage
▪ Personal projection, web, TV program, DVD burning, etc.

Input to movie creation

▪ Usually a set of frames (image files)
▪ Better if the images are uncompressed (not to lose quality)
▪ A framerate should be chosen (usually 10-12 fps, but some standards have a fixed framerate)
▪ Not too many frames! (500 frames @ 12 fps ≈ 42 sec)

Codecs and containers

Normally everything is perfect on your own workstation... and nothing works on the conference room PC.

Codecs
▪ MPEG1
▪ MPEG4 (different implementations: MS Mpeg4 V2, Divx, xvid)
▪ X264

Containers
▪ AVI
▪ MOV
▪ MPEG
▪ MP4
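The frame count guideline under "Input to movie creation" is simple arithmetic, and the same numbers give a rough file size once a bitrate is chosen. In the sketch below the 1000 kbit/s bitrate is a hypothetical value picked only for illustration, the function names are made up, and container overhead and audio are ignored.

```python
def movie_duration(n_frames: int, fps: float) -> float:
    """Playing time in seconds for `n_frames` shown at `fps` frames per second."""
    return n_frames / fps


def estimated_size_bytes(duration_s: float, bitrate_kbit_s: float) -> float:
    """Rough video stream size: bitrate times duration, converted from bits to bytes."""
    return duration_s * bitrate_kbit_s * 1000 / 8


duration = movie_duration(500, 12)             # the example from the slide
size = estimated_size_bytes(duration, 1000)    # hypothetical 1000 kbit/s encoding
print(f"{duration:.1f} s, about {size / 1e6:.1f} MB")
```

Running the numbers the other way is often more useful: the time slot available in a talk fixes the duration, and the duration times the framerate tells how many frames are worth rendering.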

Movie tools

Mencoder
▪ My preferred workhorse
▪ It has a myriad of options (difficult to master)

Adobe Premiere
▪ Everything you can dream of
▪ Perfect for adding titles and transitions

VirtualDub
▪ Simple editing

Other
▪ MJPEG, transcode, ffmpeg, ffmpegX, etc.

Movie players

Default ones
▪ Windows Media Player
▪ Quick Time Player (a triumph of non-usability)

Other ones
▪ Xine (plays AVI on )
▪ Mplayer (plays almost everything on Linux and Windows)
▪ VLC (quick and multiplatform)

GUI for mencoder

Web page

▪ Offers a unique opportunity for a more interactive experience. Applets can be used in place of static images.
▪ The browser is a universal user interface for data collections and some decision support systems.
▪ Limits are the available screen space and the heterogeneity of the client browsers (platform, type and installed plugins).
▪ Load time is a critical factor: if it is too long, the user is discouraged and may go away.

Consider for web display

Movies
▪ Download time
▪ Provide more than one resolution

Images
▪ Reduce number of colors
▪ Reduce size

Applets
▪ Chime

Animated GIF
▪ Quick and light

Unusual media: PDA & cellphones

Currently very limited and not a usual medium, but who knows...

Immersive visualization

▪ Enhanced interaction experience: you are “inside” your data.
▪ Needs special hardware to view, interact and track user position
▪ Besides stereo projection it adds viewpoint change

What we have covered

1. Data and data management
   ▪ Know your data
   ▪ Record useful metadata
2. Good visualization tool characteristics
   ▪ Do not search for the perfect tool, but think about your goal
3. Visualization tools here at CSCS
   ▪ Ask and consult with us!
4. Digital storytelling tools and ideas
   ▪ Present to persuade and communicate

Beyond Ball-and-Stick

Thanks for your attention!

Mario Valle

[email protected] http://www.cscs.ch/~mvalle/
