Capacity Building Workshop

Python for Climate Data Analysis

Christoph Menz, November 26, 2019

Module Overview

• Lecture: Introduction to Python
• Hands-On: Python for Beginners
• Exercise: Python
• Lecture: Introduction to Python Libraries
• Hands-On: Access and analysis of netCDF Data with Python
• Lecture: Python Libraries for Data Visualization
• Hands-On: Visualization of Scientific Data with matplotlib
• Hands-On: Visualization of Geospatial Data with cartopy
• Exercise: Analysis and Visualization of netCDF Data with Python

Christoph Menz, RD II: Climate Resilience

Introduction to Python

• High-level, general-purpose programming language
• Emerged in the late 1980s and early 1990s (first release 1991)
• Based on the teaching/prototyping language ABC
• Freely available under the Python Software Foundation License
• Major design philosophy: readability and performance
• Important features:
  • Dynamic types (types are determined and checked automatically at runtime)
  • Automatic memory management
  • Objects, loops, functions
  • Easily extensible through various libraries (numpy, netCDF4, scikit-learn, ...)

Scope of this Course

• Basic programming and scripting with Python
• Reading, preparation, statistics and visualization of netCDF-based data
• Focus on Python version 3.x using the Anaconda platform
• Online tutorials:
  • Anaconda Tutorials
  • https://docs.python.org/3/tutorial
  • https://www.tutorialspoint.com/python3
  • https://scipy.org & http://scikit-learn.org
  • https://matplotlib.org/
  • http://scitools.org.uk/cartopy

Anaconda - Data Science Platform

Jupyter - Interactive Computing Notebook

Spyder - Scientific Python Development Environment

Variables in Python

• Variable types are determined automatically at runtime

variable_name = value

• Python uses implicit (dynamic) and explicit (static) type casting:
  • 5*5.0 is a float and "Hello world" is a string
  • str(5) is a string and int("9") is an integer
• Python has 13 different built-in types:
  • bool, int, float, str, list, tuple, dict, bytearray, bytes, complex, ellipsis, frozenset, set
• You can create your own types for object-oriented programming (class statement)
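The two kinds of casting described above can be tried out directly. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
# Implicit casting: mixing int and float in an expression yields a float.
product = 5 * 5.0
print(type(product).__name__, product)   # float 25.0

# Explicit casting: conversion functions build a new value of the requested type.
as_text = str(5)          # int -> str
as_number = int("9")      # str -> int
as_float = float("3.5")   # str -> float

# Casting can fail; int() refuses text that is not a whole number.
try:
    int("9.5")
except ValueError as err:
    cast_error = str(err)
    print("conversion failed:", cast_error)
```

A failed conversion raises a ValueError instead of silently returning a wrong value, which is usually what you want when reading data from text files.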

Variable Types

• Basic variable types

Boolean - bool
In [1]: x = True
In [2]: y = False
In [3]: Y = True

Int, Float and Complex - int, float, complex
In [1]: x = 5
In [2]: y = 5.0
In [3]: z = 5.0+2.0j

Characters and Strings - str
In [1]: char = "a"
In [2]: string = 'python'
In [3]: SeNtEnCe = "This is a sentence."
In [4]: x = """This is a sentence
   ...: across multiple lines"""
In [5]: string[0:2]
Out [5]: 'py'

Variable Types - Lists

• Lists are sequences of values of arbitrary type (lists of lists of lists ... are also possible)
• Lists are mutable
• Single elements of a list can be accessed by indexing (from 0 to length - 1)

List
In [1]: List = [2.0, 5.0, True, 7.0, "text"]
In [2]: ListList = [[2.0, 5.0], [True, 7.0, "more text"]]
In [3]: ListList[0] = List[4]
In [4]: ListList
Out [4]: ['text', [True, 7.0, 'more text']]
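Indexing and mutability of lists can be illustrated with a few more operations. This is a small sketch using the same example values as the slide; slicing and the pop() method are additions not shown there.

```python
values = [2.0, 5.0, True, 7.0, "text"]

first = values[0]        # indexing starts at 0
last = values[-1]        # negative indices count from the end
middle = values[1:3]     # slice: elements 1 and 2 (the end index is excluded)

# Lists are mutable: elements can be replaced, appended or removed in place.
values[2] = False
values.append("new")
removed = values.pop(0)  # removes and returns the first element

print(first, last, middle)
print(values, removed)
```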

Variable Types - Tuples

• Tuples are similar to lists
• But tuples are immutable

Tuple
In [1]: Tuple = (2.0, 5.0, True, 7.0, "text")
In [2]: TupleTuple = ((2.0, 5.0), (True, 7.0, "more text"))
In [3]: TupleTuple[0] = Tuple[4]
------
TypeError   Traceback (most recent call last)
in ()
----> 1 TupleTuple[0] = Tuple[4]

TypeError: 'tuple' object does not support item assignment

Variable Types - Dictionaries

• Dictionaries are collections of key-value pairs with arbitrary values (insertion order is preserved since Python 3.7, but elements are not accessed by position)
• Dictionaries are mutable
• Elements of dictionaries are accessed by keys instead of indices
• Keys in dictionaries are unique

Dictionary
In [1]: my_dict = {"a":2.0, "b":5.0, "zee":[True, True]}
In [2]: my_dict["b"] = 23.0
In [3]: my_dict
Out [3]: {'a': 2.0, 'b': 23.0, 'zee': [True, True]}
In [4]: {"a":2.0, "b":5.0, "zee":[True, True], "a":7}
Out [4]: {'a': 7, 'b': 5.0, 'zee': [True, True]}
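A few more everyday dictionary operations, as a hedged sketch: the station record and its values are made up for illustration, and get() and items() are standard dictionary methods not shown on the slide.

```python
# Hypothetical station record; the values are only illustrative.
station = {"name": "Manila", "lat": 14.6, "lon": 121.0}

station["elevation"] = 13          # add a new key
station["lat"] = 14.58             # overwrite an existing key

# get() returns a default instead of raising KeyError for a missing key.
country = station.get("country", "unknown")

# items() yields (key, value) pairs.
summary = [f"{key}={value}" for key, value in station.items()]
print(country)
print(summary)
```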

Operations

• Python supports the usual mathematical operations on float, int and complex
• Dynamic casting depends on operator and variable type

Addition & Subtraction
In [1]: 3 + 5.0
Out [1]: 8.0
In [2]: 3 - 5
Out [2]: -2

Multiplication & Division
In [1]: 4 * 4
Out [1]: 16
In [2]: 8 / 2
Out [2]: 4.0
In [3]: 7 // 3
Out [3]: 2
In [4]: 7 % 3
Out [4]: 1

Power & Root
In [1]: 4**2
Out [1]: 16
In [2]: 4**2.5
Out [2]: 32.0
In [3]: 16**0.5
Out [3]: 4.0

Boolean Operations

• Python uses the usual comparison operations
• The in operator permits an easy search functionality

Comparisons
In [1]: 5 > 3
Out [1]: True
In [2]: 5 >= 3
Out [2]: True
In [3]: 5 < 3
Out [3]: False
In [4]: 5 <= 3
Out [4]: False
In [5]: 5 == 3
Out [5]: False
In [6]: 5 != 3
Out [6]: True

in-Operator
In [7]: 7 in [1, 2, 3, 4, 5]
Out [7]: False
In [8]: "b" in {"a":4, "b":6, "c":8}
Out [8]: True

Boolean Operators

• Python supports the basic logical operators to combine booleans

Logical NOT
Operator     Result
not True     False
not False    True

Logical AND
x       Operator   y       Result
True    and        True    True
True    and        False   False
False   and        True    False
False   and        False   False

Logical OR
x       Operator   y       Result
True    or         True    True
True    or         False   True
False   or         True    True
False   or         False   False

Methods of Objects/Variables

• Python variables are not just atomic variables
• Python variables are objects themselves
• Each variable already comes with associated methods
• Syntax: variable.method

Object methods
In [1]: x = []
In [2]: x.append(3)
In [3]: x.append(5)
In [4]: print(x)
[3, 5]
In [5]: y = {"a":1,"b":2,"c":3}
In [6]: print(y.keys())
dict_keys(['a', 'b', 'c'])
In [7]: "This is a sentence".split(" ")
Out [7]: ['This', 'is', 'a', 'sentence']
In [8]: " ".join(["This","is","a","list"])
Out [8]: 'This is a list'

You can use the dir() function to get an overview of all methods available for a given variable.

Condition and Indentation

• Conditions start with if and end with : (equivalent to "then" in other languages)
• Syntax: if expression : statement
• Python uses indentation (leading whitespace) instead of brackets to separate code blocks

if
In [1]: x = 7
In [2]: if x >= 5 and x <= 10:
   ...:     print("x is above 5")
   ...:     print("x is below 10")
   ...:
x is above 5
x is below 10

Condition and Indentation

• if conditions support an arbitrary number of elif conditions and one optional else condition

if ... elif ... else
In [1]: x = 20
In [2]: if x >= 5 and x <= 10:
   ...:     print("x is between 5 and 10")
   ...: elif x < 5:
   ...:     print("x is below 5")
   ...: elif x in [15,20,25]:
   ...:     print("x is 15, 20 or 25")
   ...: else:
   ...:     print("x is out of bound")
   ...:
x is 15, 20 or 25
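To see which branch is taken for different inputs, the same conditions can be wrapped in a function and called several times. This is a sketch only; the function name classify is hypothetical.

```python
def classify(x):
    """Return a short description of x using the branches of the slide example."""
    if 5 <= x <= 10:            # chained comparison, same as x >= 5 and x <= 10
        return "between 5 and 10"
    elif x < 5:
        return "below 5"
    elif x in [15, 20, 25]:
        return "15, 20 or 25"
    else:
        return "out of bound"

results = [classify(value) for value in (7, 3, 20, 12)]
print(results)
```

Only the first matching branch runs; the else branch is reached only if every condition before it is False.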

Loops

• A for loop runs its body once for each element of an iterable
• Syntax: for variable in iterable : statement
• Iterables are objects you can iterate over (list, tuple, dict, iterators, etc.)

for-Loop
In [1]: for x in [2,4,6,8]:
   ...:     print(x*2)
   ...:
4
8
12
16
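Besides lists, for loops work with any iterable. The sketch below shows three common cases (range, a dictionary, and enumerate); the rainfall numbers are made up for illustration.

```python
# range(start, stop, step) produces integers without building a list first.
squares = []
for n in range(1, 10, 2):          # 1, 3, 5, 7, 9
    squares.append(n ** 2)

# Iterating over a dict yields its keys; items() yields key/value pairs.
monthly_rain = {"Jan": 20, "Feb": 10, "Mar": 15}   # hypothetical values
total = 0
for month, amount in monthly_rain.items():
    total += amount

# enumerate() adds a running index to any iterable.
labels = []
for index, month in enumerate(monthly_rain):
    labels.append(f"{index}:{month}")

print(squares, total, labels)
```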

Built-In Functions

• Python ships with several built-in functions for daily usage
• Syntax: function(arguments)
• Function arguments are comma-separated values

print() Function
In [1]: print("123")
123
In [2]: print(123)
123
In [3]: print(1,2,3,"123")
1 2 3 123

len() Function
In [1]: len("123456")
Out [1]: 6
In [2]: len([3, 5, 8])
Out [2]: 3
In [3]: len({"a":13,"b":21})
Out [3]: 2

Type Related Built-In Functions

• Use the type() function to get the type of any variable
• Type conversion can be done using one of the following functions: bool(), int(), float(), str(), list(), tuple(), dict()

type() Function
In [1]: type("PyThOn")
Out [1]: str
In [2]: type(3)
Out [2]: int
In [3]: type(3.0)
Out [3]: float
In [4]: type({"a":13,"b":21})
Out [4]: dict

Type Conversion I
In [1]: bool(0)
Out [1]: False
In [2]: bool(2.2)
Out [2]: True
In [3]: int(2.8)
Out [3]: 2

Type Conversion II
In [1]: list((2,3,5))
Out [1]: [2, 3, 5]
In [2]: tuple([2,3,5])
Out [2]: (2, 3, 5)
In [3]: float("3.14")
Out [3]: 3.14

Mathematical Built-In Functions

• Python supports basic mathematical operations
• Work on numbers: abs and round
• Work on lists and tuples: min, max, sum and sorted

abs() and round()
In [1]: abs(-5)
Out [1]: 5
In [2]: round(24.03198)
Out [2]: 24
In [3]: round(24.03198,3)
Out [3]: 24.032

min(), max(), sum() and sorted()
In [1]: min([55,89,144,233])
Out [1]: 55
In [2]: max([55,89,144,233])
Out [2]: 233
In [3]: sum([55,89,144,233])
Out [3]: 521
In [4]: sorted([12,3,17,3])
Out [4]: [3, 3, 12, 17]
In [5]: sorted(["b","aca","aaa","cd"])
Out [5]: ['aaa', 'aca', 'b', 'cd']

Help Built-In Function

• The most important built-in function is help()
• Gives you a short description of the given argument (variables or other functions)

help()
In [1]: help(max)
Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value

    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.

User-Defined Functions

• Python also supports user-defined functions
• Arbitrary number of function parameters (optional arguments are also possible)

User-Defined Function: my_function
(x and y are mandatory parameters; opt_arg1 and opt_arg2 are optional parameters)
In [1]: def my_function(x, y, opt_arg1 = 1, opt_arg2 = "abc"):
   ...:     out = x + y
   ...:     print(opt_arg1)
   ...:     print(opt_arg2)
   ...:     return out
In [2]: z = my_function(2, 3, opt_arg1 = "cba")
cba
abc
In [3]: print(z)
5

Hands-On: Python for Beginners

Exercise: Python

1. Test if the following operations between various types are possible: float*int, bool*int, bool*float, bool+bool, string*bool, string*int, string*float, string+int
2. What is the result of the following operations: ["a","b","c"]*3, (1,2,3)*3 and {"a":1,"b":2,"c":3}*3. Could you explain why the last operation isn't working?
3. Print all even numbers between 0 and 100 to the screen (hint: use a for loop and if condition).
4. Write a function that calculates the mean of a given list of floats (hint: use sum() and len()).
5. Write a function that calculates the median of a given list of floats (hint: use sorted() and len() to determine the central value of the sorted list, use an if condition to distinguish between even and odd length lists).
6. Test your mean and median function with the following lists:

list               mean    median
[4,7,3,2,7,4,2]    4.143   4.0
[2,6,3,1,8,5,4]    4.143   4.0
[2,1,4,5,7,9]      4.667   4.5
[2,7,4,8,5,1]      4.500   4.5

Introduction to Python Libraries

Libraries

• Basic functionality of Python is limited
• Libraries extend the functionality of Python to various fields (I/O of various formats, math/statistics, visualization, etc.)
• Import syntax: import <library>
• Sublibrary/function import: from <library> import <function>
• Use syntax: <library>.<function>

Libraries
In [1]: import os
In [2]: from os import listdir
In [3]: listdir("/")
Out [3]: ['root','etc','usr','bin', ... ,'srv','tmp','mnt']
In [4]: import numpy as np
In [5]: np.sqrt(2)
Out [5]: 1.4142135623730951
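The three import forms can be compared side by side. The sketch below uses only modules from the Python standard library so that it runs without extra installation; the file path is hypothetical and is only parsed as text.

```python
import math                      # import a whole library
from os import path              # import one sublibrary or function
import statistics as stats       # import a library under a shorter alias

root = math.sqrt(2)                            # called as library.function
file_name = path.basename("/data/pr_day.nc")   # hypothetical path
average = stats.mean([1, 2, 3, 4])

print(root, file_name, average)
```

The alias form is the convention used for numpy (import numpy as np) throughout the rest of the course.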

Python Package Index

• Search for libraries on the web
• Short description, install instructions and source files
• https://pypi.org

Install with Anaconda Navigator

• Anaconda Navigator can install libraries (→ Environment)
• You can install multiple environments with different libraries

Important Libraries

os            OS routines implementation in Python
cftime        Implementation of date and time objects
numpy         Fast general-purpose processing of multi-dimensional arrays
scikit-learn  Machine-learning routines in Python
pandas        Easy and intuitive handling of structured and time series data
netCDF4       I/O of netCDF files
matplotlib    Basic 2D visualization in Python
cartopy       Draw geospatial data in Python

Introduction to python-numpy

Fast general-purpose processing for large multidimensional arrays

• Implements a powerful N-dimensional array type (huge improvement over lists/tuples)
• Basic linear algebra, Fourier transform, and random number capabilities
• I/O of formatted and unformatted data
• Based on C and FORTRAN77 routines in the background
• Requirement for most scientific Python libraries (matplotlib, pandas, netCDF4, etc.)

Import Numpy
In [1]: import numpy as np

Numpy ndarray

• Key element of numpy is the new variable class: ndarray

Create ndarray
In [2]: x = np.array([1,2,3])
In [3]: type(x)
Out [3]: numpy.ndarray

• Ndarrays implement a couple of new attributes and methods

.ndim       get number of dimensions
.shape      get shape of array
.size       get total size of array
.reshape    change shape of array
.flatten    make array flat
.swapaxes   swap dimensions

Ndarray methods
In [4]: y = np.array([[1,2,3],[4,5,6]])
In [5]: y.shape
Out [5]: (2, 3)
In [6]: y.flatten()
Out [6]: array([1, 2, 3, 4, 5, 6])
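The remaining entries of the list can be tried on the same small array. A minimal sketch, assuming numpy is installed:

```python
import numpy as np

y = np.array([[1, 2, 3], [4, 5, 6]])

print(y.ndim, y.shape, y.size)     # 2 (2, 3) 6

reshaped = y.reshape(3, 2)         # same 6 values, new shape (3, 2)
swapped = y.swapaxes(0, 1)         # exchanges rows and columns, shape (3, 2)
flat = y.flatten()                 # 1-D copy of all values

print(reshaped)
print(swapped)
```

Note that reshape keeps the values in their original reading order, while swapaxes transposes them, so the two results differ although both have shape (3, 2).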

Numpy Functions

• Numpy implements several array functions

arange     numpy version of range
zeros      array filled with 0
ones       array filled with 1
repeat     repeat n times
linspace   vector from interval
meshgrid   matrices from vectors
random     random numbers

Numpy Functions
In [1]: np.linspace(1, 3, 5)
Out [1]: array([1., 1.5, 2., 2.5, 3.])
In [2]: np.random.randint(1, 100, 5)
Out [2]: array([52, 75, 29, 52, 24])
In [3]: np.zeros([3, 5])
Out [3]: array([[0., 0., 0., 0., 0.],
                [0., 0., 0., 0., 0.],
                [0., 0., 0., 0., 0.]])
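The functions from the list that are not demonstrated above behave as sketched below. The longitude and latitude vectors are a hypothetical grid chosen for illustration.

```python
import numpy as np

steps = np.arange(0, 10, 2.5)            # like range, but floats are allowed
ones = np.ones((2, 3))
repeated = np.repeat([1, 2], 3)          # each element repeated 3 times

# meshgrid turns coordinate vectors into 2-D coordinate matrices,
# e.g. to attach a longitude and latitude to every grid cell.
lon = np.linspace(110.0, 130.0, 5)       # hypothetical grid
lat = np.linspace(0.0, 20.0, 3)
lon2d, lat2d = np.meshgrid(lon, lat)

print(steps, repeated)
print(lon2d.shape, lat2d.shape)
```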

Math Functions

• Mathematical functions for elementwise evaluation
• exp and log are defined as natural exponential and logarithm (base e)
• log is the inverse of exp

Exponential
In [1]: np.exp([0, 1, np.log(2)])
Out [1]: array([1. , 2.71828183, 2. ])
In [2]: np.log([1, np.e, np.e**0.5])
Out [2]: array([0. , 1. , 0.5])

Trigonometric Functions
In [1]: x = np.array([0, np.pi, 0.5*np.pi])
In [2]: np.sin(x)
Out [2]: array([0., 0., 1.])
In [3]: np.cos(x)
Out [3]: array([1., -1., 0.])
In [4]: np.tan(x)
Out [4]: array([ 0., -0., 1.633e+16])

(Outputs are rounded. The tangent is undefined at 0.5*pi, so numpy returns a very large number there rather than 0.)

Further Functions
arcsin, arccos, arctan, deg2rad, rad2deg, sinh, cosh, tanh, arcsinh, arccosh, arctanh, sqrt, log2, log10, exp2, ...

Statistical Functions

• Numpy implements the usual statistical functions
• Implemented as function (np.mean) and as array method (x.mean); median is available as a function only

mean:      mean(x, axis = <axis>)
sum:       sum(x, axis = <axis>)
median:    median(x, axis = <axis>)
maximum:   max(x, axis = <axis>)
minimum:   min(x, axis = <axis>)

<axis>: dimensions along which to evaluate (int or tuple of ints)

Statistic Functions I
In [1]: x = np.random.random((4,2,8))
In [2]: np.mean(x)
Out [2]: 0.46376
In [3]: x.sum(axis = (0,2))
Out [3]: array([15.59966 , 14.08082])
In [4]: np.median(x)
Out [4]: 0.38988

Statistic Functions II
In [5]: np.min(x, axis = 2)
Out [5]: array([[0.0381, 0.2301],
                [0.0220, 0.1045],
                [0.1903, 0.2746],
                [0.0539, 0.0203]])
In [6]: x.max()
Out [6]: 0.9788
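Because the random numbers above differ on every run, the effect of the axis argument is easier to follow on a small deterministic array. The sketch below assumes a hypothetical (time, lat, lon) layout, which is how gridded climate data is often stored.

```python
import numpy as np

# Hypothetical data with shape (time, lat, lon) = (2, 3, 4).
x = np.arange(24).reshape(2, 3, 4)

overall_mean = x.mean()               # one number for the whole array
time_mean = x.mean(axis=0)            # average over time  -> shape (3, 4)
field_mean = x.mean(axis=(1, 2))      # average over space -> shape (2,)
lon_max = np.max(x, axis=2)           # maximum along lon  -> shape (2, 3)

print(overall_mean)
print(time_mean.shape, field_mean, lon_max.shape)
```

The dimensions named in axis are the ones that disappear from the result; all other dimensions are kept.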

Statistical Functions

• Specific percentile of a given array: percentile(x, q = <q>, axis = <axis>)
  <q>: percentile in [0,100]

Statistic Functions IV
In [1]: x = np.random.normal(0,1,1000)
In [2]: np.percentile(x, q = 15)
Out [2]: -1.07467
In [3]: np.percentile(x, q = 85)
Out [3]: 1.04699
In [4]: np.percentile(x, q = (2.5, 97.5))
Out [4]: array([-1.85338831, 2.011201 ])
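With known input values the percentile results can be checked by hand. This is a small sketch using numpy's default (linear interpolation) behaviour; the arrays are chosen only for illustration.

```python
import numpy as np

x = np.arange(101)                       # the integers 0 ... 100

median = np.percentile(x, q=50)
lower, upper = np.percentile(x, q=(2.5, 97.5))

# With axis, percentiles are computed separately along that dimension.
table = np.array([[1, 2, 3, 4, 5],
                  [10, 20, 30, 40, 50]])
row_medians = np.percentile(table, q=50, axis=1)

print(median, lower, upper, row_medians)
```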

What else is numpy capable of

• Logical functions: isfinite(x), isnan(x), equal(x, y), all(b, axis = <axis>), any(b, axis = <axis>), ...
• Various functions for linear algebra (ordinary matrix multiplication, matrix decomposition, eigenvalues and eigenvectors, determinant, solving linear equations)
• I/O functions to read and write formatted ASCII or unformatted (raw binary) files
• Draw random numbers from various distributions (uniform, Gaussian, binomial, Poisson, chi-square, ...)
• ...
• https://docs.scipy.org/doc/numpy/reference/routines.html
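Two of these capabilities, logical functions and linear algebra, are sketched below. The temperature values and the small linear system are invented for illustration.

```python
import numpy as np

# Logical functions: find and skip missing values (NaN).
temps = np.array([26.5, np.nan, 27.1, 28.0])   # hypothetical values
valid = np.isfinite(temps)
mean_valid = temps[valid].mean()
has_gap = np.any(np.isnan(temps))

# Linear algebra: solve A @ v = b and compute a determinant.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
v = np.linalg.solve(A, b)
det = np.linalg.det(A)

print(mean_valid, has_gap)
print(v, det)
```

Indexing with the boolean array valid keeps only the entries where it is True, which is a common way to exclude missing data before computing statistics.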

Introduction to python-netCDF4

• Read and write netCDF4 files in Python
• Based on the Unidata netCDF4-C libraries
• Uses python-numpy arrays to store data in Python
• We will cover only the read functionality in this course

python-netCDF4
In [1]: from netCDF4 import Dataset
In [2]: from cftime import num2date

Dataset    Main object to read and write netCDF files
num2date   Function to translate numerical time values into dates

Read a netCDF Dataset

• netCDF files can be accessed by: Dataset()
• New object type netCDF4._netCDF4.Dataset
• Can access every detail of the netCDF file (dimensions, variables, attributes)

python-netCDF4
In [1]: from netCDF4 import Dataset
In [2]: nc = Dataset("some_netcdf_file_name.nc")

...

In [999]: nc.close()

• The new object nc implements various object methods (nc.<method>) to access the netCDF file
• nc needs to be closed after everything is done: nc.close()

Access Global Attributes

• Get list of all global attributes: nc.ncattrs()
• Get value of specific attribute: nc.getncattr("<attribute>")

Access Global Attributes
In [3]: nc.ncattrs()
Out [3]: ['institution', 'institute_id', 'experiment_id', ... 'cmor_version']
In [4]: nc.getncattr("institution")
Out [4]: 'Max Planck Institute for '
In [5]: nc.getncattr("experiment")
Out [5]: 'RCP8.5'

Access Dimensions

• Get a dictionary of all dimensions: nc.dimensions (not a function)

Access Dimensions
In [3]: nc.dimensions.keys()
Out [3]: odict_keys(['time', 'lat', 'lon', 'bnds'])
In [4]: nc.dimensions["time"].name
Out [4]: 'time'
In [5]: nc.dimensions["time"].size
Out [5]: 1461
In [6]: nc.dimensions["time"].isunlimited()
Out [6]: True

nc.dimensions["<dimension>"].name            name of <dimension>
nc.dimensions["<dimension>"].size            size of <dimension>
nc.dimensions["<dimension>"].isunlimited()   True if <dimension> is a record dimension (the size of a record dimension such as time can grow without limit)

Access Variables

• Get a dictionary of all variables: nc.variables (not a function)

Access Variables
In [3]: nc.variables.keys()
Out [3]: odict_keys(['lon', 'lat', 'time', 'time_bnds', 'pr'])
In [4]: nc.variables["pr"].ncattrs()
Out [4]: ['standard_name', 'long_name', 'units', ... 'comment']
In [5]: nc.variables["pr"].getncattr("standard_name")
Out [5]: 'precipitation_flux'

Access Variables

• Access data of a given variable: nc.variables["<variable>"][:]
• Data is represented by a numpy array (by default a masked array, in which missing values are hidden)

Access Variables
In [3]: data = nc.variables["pr"][:]
In [4]: type(data)
Out [4]: numpy.ma.core.MaskedArray
In [5]: data.mean()
Out [5]: 0.5545673
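With its default settings, python-netCDF4 typically returns the data as a numpy masked array, in which missing values are excluded from statistics. The sketch below imitates this with plain numpy so that it runs without a netCDF file; the values and the fill value -999.0 are assumptions for illustration.

```python
import numpy as np

# Hypothetical precipitation values; -999.0 stands in for a fill value.
raw = np.array([0.5, -999.0, 1.5, 2.0])
data = np.ma.masked_values(raw, -999.0)

print(type(data).__name__)        # MaskedArray
print(data.mean())                # mean of the three valid values
print(data.filled(np.nan))        # back to a plain ndarray with NaN gaps
```

Masked arrays support the same methods as ordinary ndarrays (mean, sum, max, ...), so the statistical functions from the numpy part of the course can be applied unchanged.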

Access Time

• The time variable is usually saved as numerical values in given units and calendar
• The function num2date can be used to translate the numerical values to datetime objects
• It returns a numpy array of datetime objects (containing: year, month, day, ...)

Access Time
In [3]: time = nc.variables["time"][:]
In [4]: type(time)
Out [4]: numpy.ma.core.MaskedArray
In [5]: time
Out [5]: masked_array(data=[ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10., 11., 12., 13., 14., 15., 16., 17., 18., 19., 20., 21., ...

Convert Time
In [3]: time = nc.variables["time"][:]
In [4]: units = nc.variables["time"].units
In [5]: calendar = nc.variables["time"].calendar
In [6]: num2date(time, units = units, calendar = calendar)
Out [6]: array([cftime.datetime(1979, 1, 1, 0, 0),
                cftime.datetime(1979, 1, 2, 0, 0),
                cftime.datetime(1979, 1, 3, 0, 0), ...

units: 'days since 1979-1-1 00:00:00'
calendar: 'standard'
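Conceptually, the conversion adds each numerical value to the reference date named in the units attribute. The sketch below reproduces this for the units and calendar shown above using only Python's standard datetime module; it is not how cftime is implemented, and it does not cover the non-standard calendars (for example 360-day or no-leap calendars) that climate models often use and that num2date handles.

```python
from datetime import datetime, timedelta

# Values as they might appear in a time variable with
# units "days since 1979-1-1 00:00:00" and a standard calendar.
time_values = [0.0, 1.0, 2.0, 31.0]
reference = datetime(1979, 1, 1)

dates = [reference + timedelta(days=value) for value in time_values]
years = [date.year for date in dates]
months = [date.month for date in dates]

print(dates[-1])        # 1979-02-01 00:00:00
```

Lists such as years and months are handy for selecting a time period or season from the data array.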

Hands-On: Access and analysis of netCDF Data with Python

Python Libraries for Data Visualization

Introduction to python-matplotlib

• Library for 2D plotting in Python
• Originates in emulating MATLAB graphics commands
• Produces nice-looking plots quickly and easily, while the user still has the power to change every detail (line properties, font, ticks, colors, etc.)
• https://matplotlib.org

Basic Code Layout

Import plotting module from matplotlib:
from matplotlib import pyplot

Creating plotting environment:
fig = pyplot.figure(figsize = (4,4))
ax = fig.add_subplot(1,1,1)

Saving to file and closing plotting environment:
fig.savefig("<filename>")
pyplot.close(fig)
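Put together, the three steps give a complete script. The following is a minimal sketch, assuming matplotlib and numpy are installed; the sine curve, the styling and the output file name are chosen for illustration, and the "Agg" backend is selected so that the script also runs on machines without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")                 # render to files, no display needed
from matplotlib import pyplot
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)

# Creating plotting environment
fig = pyplot.figure(figsize=(4, 4))
ax = fig.add_subplot(1, 1, 1)

# Plotting and labelling
ax.plot(x, y, color="tab:blue", ls="--", lw=1.5, marker="o", ms=3)
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.set_title("Line plot")

# Saving to file and closing plotting environment
# (hypothetical output location in the system's temporary directory)
out_file = os.path.join(tempfile.gettempdir(), "lineplot_example.png")
fig.savefig(out_file)
pyplot.close(fig)
```

The file format is chosen from the file name extension, so replacing .png by .pdf would produce a vector graphic instead.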

Line Plot

• Line plot y versus x (with or without point markers):

ax.plot(x, y, color = <color>, ls = <style>, lw = <width>, marker = <marker>, ms = <size>, ...)

Optional Parameters
color    color of plot (string, hex, ...)
lw       line width (float)
ls       line style ("-", "--", ...)
marker   marker style ("x", "o", ".", ...)
ms       marker size (float)

Bar Plot

• Bar plot height versus x:

ax.bar(x, height, width = <width>, yerr = <error>, fc = <color>, ec = <color>, ...)

Optional Parameters
width   width of each bar (float or array)
yerr    optional error (float or array)
fc      color of bar faces (string, hex, ...)
ec      color of bar edges (string, hex, ...)

Histogram

• Histogram plot of given values:

ax.hist(x, bins = <bins>, density = <density>, histtype = <type>, fc = <color>, ec = <color>, ...)

Optional Parameters
bins       bins of histogram (integer or vector)
density    count or density (True/False)
histtype   type of histogram ('bar', 'barstacked', 'step', 'stepfilled')

Plot Layout

(Figure: example plot annotated with the axes methods that control each element.)

ax.set_title()
ax.set_xlabel()
ax.set_ylabel()
ax.set_xlim()
ax.set_ylim()
ax.set_xticks()
ax.set_yticks()
ax.set_xticklabels()
ax.set_yticklabels()

Hands-On: Visualization of Scientific Data with matplotlib

Introduction to python-cartopy

Mesh Plot

• Plot a colored map with given longitude, latitude and data:

ax.pcolormesh(lon, lat, data, cmap = <colormap>, vmin = <minimum>, vmax = <maximum>, ...)

Optional Parameters
cmap   color definition of the map (Colormap)
vmin   minimum value for color bounds (float)
vmax   maximum value for color bounds (float)

Introduction to python-cartopy

• Matplotlib can only plot raw data without referencing underlying geographical information (no countries, no lakes, no projection, ...)
• Cartopy builds on matplotlib and implements advanced mapping features
• Developed by the UK Met Office
• Added features:
  • Boundaries of continents, countries and states
  • Adding rivers and lakes to the map
  • Adding content from shapefiles to the map
  • Relating the map to a projection and translating between different projections

Basic Code Layout

Import matplotlib.pyplot, colormaps (cm) and the coordinate reference system (crs) from cartopy:
from matplotlib import pyplot, cm
import cartopy.crs as ccrs

Creating figure environment:
fig = pyplot.figure(figsize = (4,4))

Creating axes with specific map projection:
proj_map = ccrs.Robinson()
ax = fig.add_subplot(1,1,1, projection = proj_map)

Adding mesh plot with projection of given data:
proj_data = ccrs.PlateCarree()
ax.pcolormesh(lon, lat, data, cmap = cm.jet, transform = proj_data)

Projections: Overview

Map Projection
ccrs.PlateCarree()
ccrs.Robinson()
ccrs.Orthographic()
...

Data Transformation
Transformation between projections:
proj_cyl = ccrs.PlateCarree()
proj_rot = ccrs.RotatedPole(77, 43)
lon = [-170, 170, 170, -170, -170]
lat = [-30, -30, 30, 30, -30]
ax.fill(lon, lat, transform = proj_cyl)
ax.fill(lon, lat, transform = proj_rot)

Adding Features to Map

• Cartopy implements various map features

import cartopy.feature as cfeature

coastline = cfeature.COASTLINE
borders = cfeature.BORDERS
lakes = cfeature.LAKES
rivers = cfeature.RIVERS

ax.add_feature(<feature>)

• Features are available in 3 different resolutions (map scales 1:110m, 1:50m and 1:10m) from www.naturalearthdata.com
• External shapefiles can also be plotted

Colorbar

• Add a colorbar to an existing map plot:

map = ax.pcolormesh(lon, lat, data)
fig.colorbar(map, ax = <axes>, label = <label>)

Further Plotting Routines

• Besides pcolormesh, matplotlib/cartopy supports additional plotting routines

Contour Plot
ax.contour(lon, lat, data)

Filled Contour Plot
ax.contourf(lon, lat, data)

Wind Vector Plot
ax.quiver(lon, lat, U, V)
U and V are the zonal and meridional wind components

Hands-On: Visualization of Geospatial Data with cartopy

Exercise: Analysis and Visualization of netCDF Data with Python

1. Create a line plot showing the annual temperature anomaly time series of observation and GCM simulation for the Manila grid box. The anomaly is defined as the temperature of each year minus the average of 1981 to 2000.
Hints:
• Use read_single_data() to read the data from file.
• Use ilon = 12; ilat = 19 as coordinates of Manila.
• Select the timeframe 1981 to 2000 using get_yindex().
• Calculate the average using either the np.mean() function or the data.mean() method.
• Use create_lineplot() and save_plot() to create and save the plot.
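The anomaly definition can be tried on synthetic numbers before working with the real files. The sketch below uses plain numpy with a boolean mask to select the reference period; it does not use the course helper functions (read_single_data(), get_yindex(), ...), and the temperature series is invented.

```python
import numpy as np

# Synthetic annual mean temperatures (degC) for 1971-2010; not real data.
years = np.arange(1971, 2011)
temperature = 26.0 + 0.02 * (years - 1971)

# Boolean mask selecting the reference period 1981-2000.
in_reference = (years >= 1981) & (years <= 2000)
reference_mean = temperature[in_reference].mean()

# Anomaly: each year's value minus the reference-period average.
anomaly = temperature - reference_mean

print(reference_mean)
print(anomaly[in_reference].mean())    # close to zero by construction
```

By construction the anomalies average to zero over the reference period, which is a useful check for your own solution.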

2. Create a map plot of the GCM temperature bias (for the period 1981 to 2000). Here the bias is defined as the difference of the long-term averages (1981 to 2000) between GCM simulation and observation (GCM minus observation).
Hints:
• Use read_single_data() to read the data from file.
• Select the timeframe 1981 to 2000 using get_yindex().
• Use create_mapplot() and save_plot() to create and save the plot.
