<<

Lists Also see Arrays Libraries Python For Data Science Cheat Sheet >>> a = 'is' Import libraries Python Basics >>> b = 'nice' >>> import numpy Learn More Python for Data Science Interactively at www.datacamp.com >>> my_list = ['my', 'list', a, b] >>> import numpy as np >>> my_list2 = [[4,5,6,7], [3,4,5,6]] Selective import >>> from math import pi Scientific computing 2D plotting Variables and Data Types Selecting List Elements Index starts at 0 Install Python Variable Assignment Subset >>> my_list[1] Select item at index 1 >>> x=5 >>> my_list[-3] Select 3rd last item >>> x Slice 5 >>> my_list[1:3] Select items at index 1 and 2 Calculations With Variables >>> my_list[1:] Select items after index 0 Select items before index 3 Leading open data science platform Free IDE that is included Create and share Sum of two variables >>> my_list[:3] >>> x+2 Copy my_list powered by Python with Anaconda documents with live code, >>> my_list[:] visualizations, text, ... 7 Subset Lists of Lists >>> x-2 Subtraction of two variables my_list[list][itemOfList] >>> my_list2[1][0] 3 Numpy Arrays Also see Lists >>> my_list2[1][:2] >>> x*2 Multiplication of two variables >>> my_list = [1, 2, 3, 4] 10 List Operations >>> x**2 Exponentiation of a variable >>> my_array = np.array(my_list) 25 >>> my_list + my_list >>> my_2darray = np.array([[1,2,3],[4,5,6]]) >>> x%2 Remainder of a variable ['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice'] Selecting Numpy Array Elements Index starts at 0 1 >>> my_list * 2 >>> x/float(2) Division of a variable ['my', 'list', 'is', 'nice', 'my', 'list', 'is', 'nice'] Subset Select item at index 1 2.5 >>> my_list2 > 4 >>> my_array[1] True 2 Types and Type Conversion Slice List Methods Select items at index 0 and 1 str() '5', '3.45', 'True' Variables to strings >>> my_array[0:2] Get the index of an item array([1, 2]) >>> my_list.index(a) Count an item Subset 2D Numpy arrays int() 5, 3, 1 Variables to integers >>> my_list.count(a) Append an item at a time my_2darray[rows, columns] >>> my_list.append('!') >>> my_2darray[:,0] Remove an item array([1, 4]) float() 5.0, 1.0 Variables to floats >>> my_list.remove('!') >>> del(my_list[0:1]) Remove an item Numpy Array Operations Reverse the list bool() True, True, True Variables to booleans >>> my_list.reverse() Append an item >>> my_array > 3 >>> my_list.extend('!') array([False, False, False, True], dtype=bool) Remove an item >>> my_list.pop(-1) >>> my_array * 2 Asking For Help Insert an item >>> my_list.insert(0,'!') array([2, 4, 6, 8]) Sort the list >>> help(str) >>> my_list.sort() >>> my_array + np.array([5, 6, 7, 8]) Strings array([6, 8, 10, 12]) Numpy Array Functions >>> my_string = 'thisStringIsAwesome' String Operations Index starts at 0 >>> my_string >>> my_array.shape Get the dimensions of the array 'thisStringIsAwesome' >>> my_string[3] >>> np.append(other_array) Append items to an array Insert items in an array String Operations >>> my_string[4:9] >>> np.insert(my_array, 1, 5) >>> np.delete(my_array,[1]) Delete items in an array String Methods Mean of the array >>> my_string * 2 >>> np.mean(my_array) String to uppercase Median of the array 'thisStringIsAwesomethisStringIsAwesome' >>> my_string.upper() >>> np.median(my_array) String to lowercase Correlation coefficient >>> my_string + 'Innit' >>> my_string.lower() >>> my_array.corrcoef() Count String elements Standard deviation 'thisStringIsAwesomeInnit' >>> my_string.count('w') >>> np.std(my_array) Replace String elements >>> 'm' in my_string >>> my_string.replace('e', 'i') True >>> my_string.strip() Strip whitespaces DataCamp Learn Python for Data Science Interactively Python For Data Science Cheat Sheet Working with Different Programming Languages Widgets Kernels provide computation and communication with front-end interfaces Notebook widgets provide the ability to visualize and control changes Jupyter Notebook like the notebooks. There are three main kernels: in your data, often as a control like a slider, textbox, etc. Learn More Python for Data Science Interactivelyat www.DataCamp.com You can use them to build interactive GUIs for your notebooks or to IRkernel IJulia synchronize stateful and stateless information between Python and Installing Jupyter Notebook will automatically install the IPython kernel. JavaScript. Saving/Loading Notebooks Restart kernel Interrupt kernel Create new notebook Restart kernel & run Interrupt kernel & Download serialized Save notebook clear all output state of all widget Open an existing all cells with interactive models in use Make a copy of the notebook Connect back to a widgets Restart kernel & run remote notebook current notebook all cells Embed current Rename notebook Run other installed widgets kernels Revert notebook to a Save current notebook previous checkpoint and record checkpoint Command Mode: Download notebook as Preview of the printed - IPython notebook 15 notebook - Python - HTML Close notebook & stop - Markdown 13 14 - reST running any scripts - LaTeX - PDF 1 2 3 4 5 6 7 8 9 10 11 12 Writing Code And Text Code and text are encapsulated by 3 basic cell types: markdown cells, code cells, and raw NBConvert cells. Edit Cells Edit Mode: 1. Save and checkpoint 9. Interrupt kernel 2. Insert cell below 10. Restart kernel Cut currently selected cells 3. Cut cell 11. Display characteristics Copy cells from 4. Copy cell(s) 12. Open command palette to clipboard clipboard to current 5. Paste cell(s) below 13. Current kernel cursor position 14. Kernel status Paste cells from 6. Move cell up Executing Cells 7. Move cell down 15. Log out from notebook server clipboard above Paste cells from Run current cells down 8. Run current cell current cell clipboard below Run selected cell(s) and create a new one current cell Paste cells from below Asking For Help clipboard on top Run current cells down Delete current cells of current cel and create a new one Walk through a UI tour Split up a cell from above Run all cells Revert “Delete Cells” List of built-in keyboard current cursor Run all cells above the Run all cells below invocation shortcuts position current cell the current cell Edit the built-in Merge current cell Merge current cell keyboard shortcuts Change the cell type of toggle, toggle Notebook help topics with the one above with the one below current cell scrolling and clear Description of Move current cell up Move current cell toggle, toggle current outputs markdown available Information on down Adjust metadata scrolling and clear in notebook unofficial Jupyter all output Notebook extensions underlying the Find and replace Python help topics current notebook in selected cells View Cells IPython help topics Remove cell Copy attachments of NumPy help topics attachments Toggle display of Jupyter SciPy help topics current cell Toggle display of toolbar logo and filename help topics Paste attachments of Insert image in SymPy help topics current cell selected cells Toggle display of cell help topics action icons: Insert Cells - None About Jupyter Notebook - Edit metadata Toggle line numbers - Raw cell format Add new cell above the Add new cell below the - Slideshow in cells current one current one - Attachments - Tags DataCamp Learn Python for Data Science Interactively Python For Data Science Cheat Sheet Inspecting Your Array Subsetting, Slicing, Indexing Also see Lists >>> a.shape Array dimensions Length of array Subsetting >>> len(a) 1 2 3 Select the element at the 2nd index NumPy Basics Number of array dimensions >>> a[2] Learn Python for Data Science Interactively at www.DataCamp.com >>> b.ndim 3 >>> e.size Number of array elements 1.5 2 3 >>> b[1,2] Select the element at row 0 column 2 >>> b.dtype Data type of array elements 6.0 4 5 6 (equivalent to b[1][2]) >>> b.dtype.name Name of data type >>> b.astype(int) Convert an array to a different type Slicing NumPy >>> a[0:2] 1 2 3 Select items at index 0 and 1 2 array([1, 2]) Asking For Help 1.5 2 3 The NumPy is the core library for scientific computing in >>> b[0:2,1] Select items at rows 0 and 1 in column 1 >>> np.info(np.ndarray.dtype) array([ 2., 5.]) 4 5 6 Python. It provides a high-performance multidimensional array 1.5 2 3 object, and tools for working with these arrays. Array >>> b[:1] Select all items at row 0 array([[1.5, 2., 3.]]) 4 5 6 (equivalent to b[0:1, :]) Same as Use the following import convention: Arithmetic Operations >>> [1,...] [1,:,:] array([[[ 3., 2., 1.], Subtraction [ 4., 5., 6.]]]) >>> import numpy as np >>> g = a - b Reversed array array([[-0.5, 0. , 0. ], >>> a[ : :-1] a NumPy Arrays array([3, 2, 1]) [-3. , -3. , -3. ]]) Subtraction Boolean Indexing 1D array 2D array 3D array >>> np.subtract(a,b) Select elements from less than 2 >>> b + a Addition >>> a[a<2] 1 2 3 a axis 1 axis 2 array([[ 2.5, 4. , 6. ], array([1]) 1 2 3 axis 1 [ 5. , 7. , 9. ]]) Fancy Indexing 1.5 2 3 Addition Select elements (1,0),(0,1),(1,2) and (0,0) axis 0 axis 0 >>> np.add(b,a) >>> b[[1, 0, 1, 0],[0, 1, 2, 0]] 4 5 6 >>> a / b Division array([ 4. , 2. , 6. , 1.5]) array([[ 0.66666667, 1. , 1. ], >>> b[[1, 0, 1, 0]][:,[0,1,2,0]] Select a subset of the ’s rows [ 0.25 , 0.4 , 0.5 ]]) array([[ 4. ,5. , 6. , 4. ], and columns >>> np.divide(a,b) Division [ 1.5, 2. , 3. , 1.5], Multiplication [ 4. , 5. , 6. , 4. ], Creating Arrays >>> a * b [ 1.5, 2. , 3. , 1.5]]) array([[ 1.5, 4. , 9. ], >>> a = np.array([1,2,3]) [ 4. , 10. , 18. ]]) >>> b = np.array([(1.5,2,3), (4,5,6)], dtype = float) >>> np.multiply(a,b) Multiplication Array Manipulation >>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]], >>> np.exp(b) Exponentiation dtype = float) >>> np.sqrt(b) Square root Transposing Array >>> np.sin(a) Print sines of an array >>> i = np.transpose(b) Permute array dimensions Initial Placeholders >>> np.cos(b) Element-wise cosine >>> i.T Permute array dimensions >>> np.log(a) Element-wise natural logarithm Create an array of zeros Dot product Changing Array Shape >>> np.zeros((3,4)) >>> e.dot(f) Flatten the array >>> np.ones((2,3,4),dtype=np.int16) Create an array of ones array([[ 7., 7.], >>> b.ravel() >>> d = np.arange(10,25,5) Create an array of evenly [ 7., 7.]]) >>> g.reshape(3,-2) Reshape, but don’t change data spaced values (step value) Create an array of evenly Adding/Removing Elements >>> np.linspace(0,2,9) Comparison Return a new array with shape (2,6) spaced values (number of samples) >>> h.resize((2,6)) Create a constant array Element-wise comparison >>> np.append(h,g) Append items to an array >>> e = np.full((2,2),7) >>> a == b Insert items in an array >>> f = np.eye(2) Create a 2X2 identity matrix array([[False, True, True], >>> np.insert(a, 1, 5) Create an array with random values [False, False, False]], dtype=bool) >>> np.delete(a,[1]) Delete items from an array >>> np.random.random((2,2)) Element-wise comparison >>> np.empty((3,2)) Create an empty array >>> a < 2 Combining Arrays array([True, False, False], dtype=bool) Concatenate arrays >>> np.array_equal(a, b) Array-wise comparison >>> np.concatenate((a,d),axis=0) I/O array([ 1, 2, 3, 10, 15, 20]) Aggregate Functions >>> np.vstack((a,b)) Stack arrays vertically (row-wise) array([[ 1. , 2. , 3. ], Saving & Loading On Disk [ 1.5, 2. , 3. ], >>> a.sum() Array-wise sum [ 4. , 5. , 6. ]]) >>> np.save('my_array', a) >>> a.min() Array-wise minimum value >>> np.r_[e,f] Stack arrays vertically (row-wise) >>> np.savez('array.npz', a, b) >>> b.max(axis=0) Maximum value of an array row >>> np.hstack((e,f)) Stack arrays horizontally (column-wise) >>> np.load('my_array.npy') >>> b.cumsum(axis=1) Cumulative sum of the elements array([[ 7., 7., 1., 0.], Saving & Loading Text Files >>> a.mean() Mean [ 7., 7., 0., 1.]]) >>> b.median() Median >>> np.column_stack((a,d)) Create stacked column-wise arrays >>> a.corrcoef() Correlation coefficient array([[ 1, 10], >>> np.loadtxt("myfile.txt") Standard deviation [ 2, 15], >>> np.genfromtxt("my_file.csv", delimiter=',') >>> np.std(b) [ 3, 20]]) >>> np.savetxt("myarray.txt", a, delimiter=" ") >>> np.c_[a,d] Create stacked column-wise arrays Copying Arrays Splitting Arrays >>> np.hsplit(a,3) Split the array horizontally at the 3rd Data Types >>> h = a.view() Create a view of the array with the same data [array([1]),array([2]),array([3])] index Signed 64-bit integer types >>> np.copy(a) Create a copy of the array >>> np.int64 Create a deep copy of the array >>> np.vsplit(c,2) Split the array vertically at the 2nd index >>> np.float32 Standard double-precision floating point >>> h = a.copy() [array([[[ 1.5, 2. , 1. ], >>> np.complex Complex numbers represented by 128 floats [ 4. , 5. , 6. ]]]), Boolean type storing and values array([[[ 3., 2., 3.], >>> np.bool TRUE FALSE Sorting Arrays [ 4., 5., 6.]]])] >>> np.object Python object type >>> np.string_ Fixed-length string type >>> a.sort() Sort an array Fixed-length unicode type Sort the elements of an array's axis DataCamp >>> np.unicode_ >>> c.sort(axis=0) Learn Python for Data Science Interactively Python For Data Science Cheat Sheet Also see NumPy You’ll use the and modules. Note that contains and expands on . SciPy - Linear Algebra linalg sparse .linalg numpy.linalg Learn More Python for Data Science Interactively at www.datacamp.com >>> from scipy import linalg, sparse Matrix Functions Creating Matrices Addition >>> np.add(A,D) Addition >>> A = np.matrix(np.random.random((2,2))) Subtraction SciPy >>> B = np.asmatrix(b) Subtraction >>> C = np.mat(np.random.random((10,5))) >>> np.subtract(A,D) The SciPy library is one of the core packages for >>> D = np.mat([[3,4], [5,6]]) Division Division scientific computing that provides mathematical Basic Matrix Routines >>> np.divide(A,D) and convenience functions built on the Multiplication Multiplication NumPy extension of Python. Inverse >>> np.multiply(D,A) >>> np.dot(A,D) Dot product >>> A.I Inverse Vector dot product Inverse >>> np.vdot(A,D) Interacting With NumPy Also see NumPy >>> linalg.inv(A) Inner product >>> A.T Tranpose matrix >>> np.inner(A,D) >>> np.outer(A,D) Outer product >>> import numpy as np >>> A.H Conjugate transposition Tensor dot product Trace >>> np.tensordot(A,D) >>> a = np.array([1,2,3]) >>> np.trace(A) >>> np.kron(A,D) Kronecker product >>> b = np.array([(1+5j,2j,3j), (4j,5j,6j)]) Norm Exponential Functions >>> c = np.array([[(1.5,2,3), (4,5,6)], [(3,2,1), (4,5,6)]]) >>> linalg.norm(A) Frobenius norm >>> linalg.expm(A) Matrix exponential Matrix exponential (Taylor Series) Index Tricks >>> linalg.norm(A,1) L1 norm (max column sum) >>> linalg.expm2(A) L inf norm (max row sum) >>> linalg.expm3(D) Matrix exponential (eigenvalue Create a dense meshgrid >>> linalg.norm(A,np.inf) decomposition) >>> np.mgrid[0:5,0:5] Rank >>> np.ogrid[0:2,0:2] Create an open meshgrid Logarithm Function Stack arrays vertically (row-wise) Matrix rank >>> np.r_[[3,[0]*5,-1:1:10j] >>> np.linalg.matrix_rank(C) >>> linalg.logm(A) Matrix logarithm >>> np.c_[b,c] Create stacked column-wise arrays Determinant Trigonometric Tunctions Determinant Shape Manipulation >>> linalg.det(A) >>> linalg.sinm(D) Matrix sine Solving linear problems >>> linalg.cosm(D) Matrix cosine >>> np.transpose(b) Permute array dimensions Solver for dense matrices >>> linalg.tanm(A) Matrix tangent Flatten the array >>> linalg.solve(A,b) >>> b.flatten() >>> E = np.mat(a).T Solver for dense matrices Hyperbolic Trigonometric Functions >>> np.hstack((b,c)) Stack arrays horizontally (column-wise) >>> linalg.lstsq(D,E) Least-squares solution to linear matrix Hypberbolic matrix sine Stack arrays vertically (row-wise) equation >>> linalg.sinhm(D) >>> np.vstack((a,b)) >>> linalg.coshm(D) Hyperbolic matrix cosine >>> np.hsplit(c,2) Split the array horizontally at the 2nd index Generalized inverse Hyperbolic matrix tangent Split the array vertically at the 2nd index >>> linalg.tanhm(A) >>> np.vpslit(d,2) >>> linalg.pinv(C) Compute the pseudo-inverse of a matrix Matrix Sign Function (least-squares solver) Polynomials >>> np.sigm(A) Matrix sign function >>> linalg.pinv2(C) Compute the pseudo-inverse of a matrix (SVD) Matrix Square Root >>> from numpy import poly1d Matrix square root Create a polynomial object >>> linalg.sqrtm(A) >>> p = poly1d([3,4,5]) Creating Sparse Matrices Arbitrary Functions Vectorizing Functions >>> linalg.funm(A, lambda x: x*x) Evaluate matrix function >>> F = np.eye(3, k=1) Create a 2X2 identity matrix >>> def myfunc(a): Create a 2x2 identity matrix if a < 0: >>> G = np.mat(np.identity(2)) Decompositions return a*2 >>> C[C > 0.5] = 0 else: >>> H = sparse.csr_matrix(C) Compressed Sparse Row matrix Eigenvalues and Eigenvectors return a/2 Compressed Sparse Column matrix Solve ordinary or generalized Vectorize functions >>> I = sparse.csc_matrix(D) >>> la, v = linalg.eig(A) >>> np.vectorize(myfunc) >>> J = sparse.dok_matrix(A) Dictionary Of Keys matrix eigenvalue problem for square matrix >>> E.todense() Sparse matrix to full matrix Unpack eigenvalues Identify sparse matrix >>> l1, l2 = la Type Handling >>> sparse.isspmatrix_csc(A) >>> v[:,0] First eigenvector >>> v[:,1] Second eigenvector >>> np.real(c) Return the real part of the array elements Unpack eigenvalues Return the imaginary part of the array elements Sparse Matrix Routines >>> linalg.eigvals(A) >>> np.imag(c) Singular Value Decomposition >>> np.real_if_close(c,tol=1000) Return a real array if complex parts close to 0 Cast object to a data type Inverse >>> U,s,Vh = linalg.svd(B) Singular Value Decomposition (SVD) >>> np.cast['f'](np.pi) Inverse >>> sparse.linalg.inv(I) >>> M,N = B.shape Other Useful Functions Norm >>> Sig = linalg.diagsvd(s,M,N) Construct sigma matrix in SVD Norm LU Decomposition Return the angle of the complex argument >>> sparse.linalg.norm(I) >>> np.angle(b,deg=True) Solving linear problems >>> P,L,U = linalg.lu(C) LU Decomposition Create an array of evenly spaced values >>> g = np.linspace(0,np.pi,num=5) Solver for sparse matrices (number of samples) >>> sparse.linalg.spsolve(H,I) >>> g [3:] += np.pi Sparse Matrix Decompositions >>> np.unwrap(g) Unwrap Create an array of evenly spaced values (log scale) Sparse Matrix Functions >>> np.logspace(0,10,3) >>> la, v = sparse.linalg.eigs(F,1) Eigenvalues and eigenvectors >>> np.select([c<4],[c*2]) Return values from a list of arrays depending on Sparse matrix exponential SVD conditions >>> sparse.linalg.expm(I) >>> sparse.linalg.svds(H, 2) >>> misc.factorial(a) Factorial >>> misc.comb(10,3,exact=True) Combine N things taken at k time Asking For Help >>> misc.central_diff_weights(3) Weights for Np-point central derivative DataCamp >>> misc.derivative(myfunc,1.0) Find the n-th derivative of a function at a point >>> help(scipy.linalg.diagsvd) Learn Python for Data Science Interactively >>> np.info(np.matrix) Python For Data Science Cheat Sheet Asking For Help Dropping >>> help(pd.Series.loc) Pandas Basics >>> s.drop(['a', 'c']) Drop values from rows (axis=0) Selection Also see NumPy Arrays >>> df.drop('Country', axis=1) Drop values from columns(axis=1) Learn Python for Data Science Interactively at www.DataCamp.com Getting Sort & Rank >>> s['b'] Get one element -5 Pandas >>> df.sort_index() Sort by labels along an axis Get subset of a DataFrame >>> df.sort_values(by='Country') Sort by the values along an axis The Pandas library is built on NumPy and provides easy-to-use >>> df[1:] Assign ranks to entries Country Capital Population >>> df.rank() data structures and data analysis tools for the Python 1 India New Delhi 1303171035 programming language. 2 Brazil Brasília 207847528 Retrieving Series/DataFrame Information Selecting, Boolean Indexing & Setting Basic Information Use the following import convention: By Position (rows,columns) Select single value by row & >>> df.shape >>> import pandas as pd >>> df.iloc([0],[0]) >>> df.index Describe index Describe DataFrame columns 'Belgium' column >>> df.columns Pandas Data Structures >>> df.info() Info on DataFrame >>> df.iat([0],[0]) Number of non-NA values >>> df.count() Series 'Belgium' Summary A one-dimensional labeled array a 3 By Label Select single value by row & >>> df.sum() Sum of values capable of holding any data type b -5 >>> df.loc([0], ['Country']) Cummulative sum of values 'Belgium' column labels >>> df.cumsum() >>> df.min()/df.max() Minimum/maximum values c 7 Minimum/Maximum index value Index >>> df.at([0], ['Country']) >>> df.idxmin()/df.idxmax() d 4 'Belgium' >>> df.describe() Summary Mean of values >>> df.mean() By Label/Position >>> df.median() Median of values >>> s = pd.Series([3, -5, 7, 4], index=['a', 'b', 'c', 'd']) >>> df.ix[2] Select single row of DataFrame Country Brazil subset of rows Applying Functions Capital Brasília Columns Population 207847528 >>> f = lambda x: x*2 Country Capital Population >>> df.apply(f) Apply function A two-dimensional labeled >>> df.ix[:,'Capital'] Select a single column of >>> df.applymap(f) Apply function element-wise data structure with columns 0 Brussels subset of columns 0 Belgium Brussels 11190846 of potentially different types 1 New Delhi 2 Brasília Data Alignment 1 India New Delhi 1303171035 Index >>> df.ix[1,'Capital'] Select rows and columns Internal Data Alignment 2 Brazil Brasília 207847528 'New Delhi' NA values are introduced in the indices that don’t overlap: Boolean Indexing >>> data = {'Country': ['Belgium', 'India', 'Brazil'], >>> s3 = pd.Series([7, -2, 3], index=['a', 'c', 'd']) >>> s[~(s > 1)] Series s where value is not >1 'Capital': ['Brussels', 'New Delhi', 'Brasília'], >>> s[(s < -1) | (s > 2)] s where value is <-1 or >2 >>> s + s3 'Population': [11190846, 1303171035, 207847528]} >>> df[df['Population']>1200000000] Use filter to adjust DataFrame a 10.0 b NaN >>> df = pd.DataFrame(data, Setting c 5.0 columns=['Country', 'Capital', 'Population']) >>> s['a'] = 6 Set index a of Series s to 6 d 7.0 I/O Arithmetic Operations with Fill Methods You can also do the internal data alignment yourself with Read and Write to CSV Read and Write to SQL Query or Database Table the help of the fill methods: >>> pd.read_csv('file.csv', header=None, nrows=5) >>> from sqlalchemy import create_engine >>> s.add(s3, fill_value=0) >>> df.to_csv('myDataFrame.csv') >>> engine = create_engine('sqlite:///:memory:') a 10.0 >>> pd.read_sql("SELECT * FROM my_table;", engine) b -5.0 Read and Write to Excel c 5.0 >>> pd.read_sql_table('my_table', engine) d 7.0 >>> pd.read_excel('file.xlsx') >>> pd.read_sql_query("SELECT * FROM my_table;", engine) >>> s.sub(s3, fill_value=2) >>> pd.to_excel('dir/myDataFrame.xlsx', sheet_name='Sheet1') >>> s.div(s3, fill_value=4) read_sql()is a convenience wrapper around read_sql_table() and Read multiple sheets from the same file >>> s.mul(s3, fill_value=3) read_sql_query() >>> xlsx = pd.ExcelFile('file.xls') >>> df = pd.read_excel(xlsx, 'Sheet1') >>> pd.to_sql('myDf', engine) DataCamp Learn Python for Data Science Interactively Python For Data Science Cheat Sheet Create Your Model Evaluate Your Model’s Performance Scikit-Learn Supervised Learning Estimators Classification Metrics Learn Python for data science Interactively at www.DataCamp.com Linear Regression Accuracy Score Estimator score method >>> from sklearn.linear_model import LinearRegression >>> knn.score(X_test, y_test) >>> lr = LinearRegression(normalize=True) >>> from sklearn.metrics import accuracy_score Metric scoring functions Support Vector Machines (SVM) >>> accuracy_score(y_test, y_pred) Scikit-learn >>> from sklearn.svm import SVC Classification Report >>> svc = SVC(kernel='linear') >>> from sklearn.metrics import classification_report Precision, recall, f1-score Scikit-learn is an open source Python library that Naive Bayes >>> print(classification_report(y_test, y_pred)) and support implements a range of machine learning, Confusion Matrix >>> from sklearn.naive_bayes import GaussianNB preprocessing, cross-validation and visualization >>> from sklearn.metrics import confusion_matrix >>> gnb = GaussianNB() >>> print(confusion_matrix(y_test, y_pred)) algorithms using a unified interface. KNN Regression Metrics A Basic Example >>> from sklearn import neighbors >>> knn = neighbors.KNeighborsClassifier(n_neighbors=5) Mean Absolute Error >>> from sklearn import neighbors, datasets, preprocessing Unsupervised Learning Estimators >>> from sklearn.model_selection import train_test_split >>> from sklearn.metrics import mean_absolute_error >>> from sklearn.metrics import accuracy_score >>> y_true = [3, -0.5, 2] >>> iris = datasets.load_iris() Principal Component Analysis (PCA) >>> mean_absolute_error(y_true, y_pred) >>> X, y = iris.data[:, :2], iris.target >>> from sklearn.decomposition import PCA Mean Squared Error >>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=33) >>> pca = PCA(n_components=0.95) >>> from sklearn.metrics import mean_squared_error >>> scaler = preprocessing.StandardScaler().fit(X_train) K Means >>> mean_squared_error(y_test, y_pred) >>> X_train = scaler.transform(X_train) ² Score >>> X_test = scaler.transform(X_test) >>> from sklearn.cluster import KMeans >>> knn = neighbors.KNeighborsClassifier(n_neighbors=5) >>> k_means = KMeans(n_clusters=3, random_state=0) >>> from sklearn.metrics import r2_score >>> knn.fit(X_train, y_train) >>> r2_score(y_true, y_pred) >>> y_pred = knn.predict(X_test) Clustering Metrics >>> accuracy_score(y_test, y_pred) Model Fitting Adjusted Rand Index Supervised learning Also see NumPy & Pandas Fit the model to the data >>> from sklearn.metrics import adjusted_rand_score Loading The Data >>> lr.fit(X, y) >>> adjusted_rand_score(y_true, y_pred) >>> knn.fit(X_train, y_train) Your data needs to be numeric and stored as NumPy arrays or SciPy sparse >>> svc.fit(X_train, y_train) Homogeneity matrices. Other types that are convertible to numeric arrays, such as Pandas Unsupervised Learning >>> from sklearn.metrics import homogeneity_score Fit the model to the data >>> homogeneity_score(y_true, y_pred) DataFrame, are also acceptable. >>> k_means.fit(X_train) V-measure >>> pca_model = pca.fit_transform(X_train) Fit to data, then transform it >>> import numpy as np >>> from sklearn.metrics import v_measure_score >>> X = np.random.random((10,5)) >>> metrics.v_measure_score(y_true, y_pred) >>> y = np.array(['M','M','F','F','M','F','M','M','F','F','F']) >>> X[X < 0.7] = 0 Prediction Cross-Validation Supervised Estimators >>> from sklearn.cross_validation import cross_val_score Predict labels >>> print(cross_val_score(knn, X_train, y_train, cv=4)) Training And Test Data >>> y_pred = svc.predict(np.random.random((2,5))) >>> print(cross_val_score(lr, X, y, cv=2)) >>> y_pred = lr.predict(X_test) Predict labels >>> from sklearn.model_selection import train_test_split >>> y_pred = knn.predict_proba(X_test) Estimate probability of a label > > > X_train, X_test, y_train, y_test = train_test_split(X , Unsupervised Estimators Tune Your Model y, Predict labels in clustering algos random_state=0) >>> y_pred = k_means.predict(X_test) Grid Search >>> from sklearn.grid_search import GridSearchCV >>> params = {"n_neighbors": np.arange(1,3), Preprocessing The Data "metric": ["euclidean", "cityblock"]} >>> grid = GridSearchCV(estimator=knn, Standardization Encoding Categorical Features param_grid=params) >>> grid.fit(X_train, y_train) >>> from sklearn.preprocessing import StandardScaler >>> from sklearn.preprocessing import LabelEncoder >>> print(grid.best_score_) >>> print(grid.best_estimator_.n_neighbors) >>> scaler = StandardScaler().fit(X_train) >>> enc = LabelEncoder() >>> standardized_X = scaler.transform(X_train) >>> y = enc.fit_transform(y) >>> standardized_X_test = scaler.transform(X_test) Randomized Parameter Optimization Normalization Imputing Missing Values >>> from sklearn.grid_search import RandomizedSearchCV >>> params = {"n_neighbors": range(1,5), >>> from sklearn.preprocessing import Normalizer "weights": ["uniform", "distance"]} >>> from sklearn.preprocessing import Imputer >>> rsearch = RandomizedSearchCV(estimator=knn, >>> scaler = Normalizer().fit(X_train) >>> imp = Imputer(missing_values=0, strategy='mean', axis=0) param_distributions=params, >>> normalized_X = scaler.transform(X_train) >>> imp.fit_transform(X_train) cv=4, >>> normalized_X_test = scaler.transform(X_test) n_iter=8, random_state=5) Binarization Generating Polynomial Features >>> rsearch.fit(X_train, y_train) >>> print(rsearch.best_score_) >>> from sklearn.preprocessing import Binarizer >>> from sklearn.preprocessing import PolynomialFeatures >>> binarizer = Binarizer(threshold=0.0).fit(X) >>> poly = PolynomialFeatures(5) DataCamp >>> binary_X = binarizer.transform(X) >>> poly.fit_transform(X) Learn Python for Data Science Interactively Python For Data Science Cheat Sheet Plot Anatomy & Workflow Plot Anatomy Workflow Matplotlib Axes/Subplot The basic steps to creating plots with matplotlib are: Learn Python Interactively at www.DataCamp.com 1 Prepare data 2 Create plot 3 Plot 4 Customize plot 5 Save plot 6 Show plot >>> import matplotlib.pyplot as plt >>> x = [1,2,3,4] Step 1 >>> y = [10,20,25,30] >>> fig = plt.figure() Step 2 Matplotlib Step 3 Y-axis Figure >>> ax = fig.add_subplot(111) Matplotlib is a Python 2D plotting library which produces >>> ax.plot(x, y, color='lightblue', linewidth=3) Step 3, 4 >>> ax.scatter([2,4,6], publication-quality figures in a variety of hardcopy formats [5,15,25], and interactive environments across color='darkgreen', marker='^') platforms. X-axis >>> ax.set_xlim(1, 6.5) >>> plt.savefig('foo.png') Step 6 1 Prepare The Data Also see Lists & NumPy >>> plt.show() 1D Data 4 Customize Plot >>> import numpy as np Colors, Color Bars & Color Maps Mathtext >>> x = np.linspace(0, 10, 100) >>> y = np.cos(x) >>> plt.plot(x, x, x, x**2, x, x**3) >>> plt.title(r'$sigma_i=15$', fontsize=20) >>> z = np.sin(x) >>> ax.plot(x, y, alpha = 0.4) >>> ax.plot(x, y, c='k') Limits, Legends & Layouts 2D Data or Images >>> fig.colorbar(im, orientation='horizontal') >>> im = ax.imshow(img, Limits & Autoscaling >>> data = 2 * np.random.random((10, 10)) cmap='seismic') >>> ax.margins(x=0.0,y=0.1) Add padding to a plot >>> data2 = 3 * np.random.random((10, 10)) Set the aspect ratio of the plot to 1 >>> Y, X = np.mgrid[-3:3:100j, -3:3:100j] Markers >>> ax.axis('equal') >>> ax.set(xlim=[0,10.5],ylim=[-1.5,1.5]) Set limits for x-and y-axis >>> U = -1 - X**2 + Y Set limits for x-axis >>> V = 1 + X - Y**2 >>> fig, ax = plt.subplots() >>> ax.set_xlim(0,10.5) >>> from matplotlib.cbook import get_sample_data Legends >>> ax.scatter(x,y,marker=".") Set a title and x-and y-axis labels >>> img = np.load(get_sample_data('axes_grid/bivariate_normal.npy')) >>> ax.plot(x,y,marker="o") >>> ax.set(title='An Example Axes', ylabel='Y-Axis', Linestyles xlabel='X-Axis') Create Plot >>> ax.legend(loc='best') No overlapping plot elements 2 >>> plt.plot(x,y,linewidth=4.0) Ticks >>> import matplotlib.pyplot as plt >>> plt.plot(x,y,ls='solid') >>> ax.xaxis.set(ticks=range(1,5), Manually set x-ticks >>> plt.plot(x,y,ls='--') ticklabels=[3,100,-12,"foo"]) Figure >>> plt.plot(x,y,'--',x**2,y**2,'-.') >>> ax.tick_params(axis='y', Make y-ticks longer and go in and out >>> plt.setp(lines,color='r',linewidth=4.0) direction='inout', >>> fig = plt.figure() Text & Annotations length=10) >>> fig2 = plt.figure(figsize=plt.figaspect(2.0)) Subplot Spacing Adjust the spacing between subplots Axes >>> ax.text(1, >>> fig3.subplots_adjust(wspace=0.5, -2.1, hspace=0.3, All plotting is done with respect to anAxes . In most cases, a 'Example Graph', left=0.125, subplot will fit your needs. A subplot is an axes on a grid system. style='italic') right=0.9, >>> ax.annotate("Sine", top=0.9, >>> fig.add_axes() xy=(8, 0), bottom=0.1) xycoords='data', >>> fig.tight_layout() Fit subplot(s) in to the figure area >>> ax1 = fig.add_subplot(221) # row-col-num xytext=(10.5, 0), >>> ax3 = fig.add_subplot(212) textcoords='data', Axis Spines >>> fig3, axes = plt.subplots(nrows=2,ncols=2) arrowprops=dict(arrowstyle="->", >>> ax1.spines['top'].set_visible(False) Make the top axis line for a plot invisible >>> fig4, axes2 = plt.subplots(ncols=3) connectionstyle="arc3"),) >>> ax1.spines['bottom'].set_position(('outward',10)) Move the bottom axis line outward 3 Plotting Routines 5 Save Plot 1D Data Vector Fields Save figures >>> plt.savefig('foo.png') >>> fig, ax = plt.subplots() >>> axes[0,1].arrow(0,0,0.5,0.5) Add an arrow to the axes >>> lines = ax.plot(x,y) Draw points with lines or markers connecting them Plot a 2D field of arrows Save transparent figures Draw unconnected points, scaled or colored >>> axes[1,1].quiver(y,z) >>> ax.scatter(x,y) >>> axes[0,1].streamplot(X,Y,U,V) Plot a 2D field of arrows >>> plt.savefig('foo.png', transparent=True) >>> axes[0,0].bar([1,2,3],[3,4,5]) Plot vertical rectangles (constant width) >>> axes[1,0].barh([0.5,1,2.5],[0,1,2]) Plot horiontal rectangles (constant height) Data Distributions >>> axes[1,1].axhline(0.45) Draw a horizontal line across axes Show Plot Draw a vertical line across axes 6 >>> axes[0,1].axvline(0.65) >>> ax1.hist(y) Plot a histogram >>> ax.fill(x,y,color='blue') Draw filled polygons Make a box and whisker plot >>> plt.show() Fill between y-values and 0 >>> ax3.boxplot(y) >>> ax.fill_between(x,y,color='yellow') >>> ax3.violinplot(z) Make a violin plot 2D Data or Images Close & Clear Clear an axis >>> fig, ax = plt.subplots() Pseudocolor plot of 2D array >>> plt.cla() Colormapped or RGB arrays >>> axes2[0].pcolor(data2) >>> plt.clf() Clear the entire figure >>> im = ax.imshow(img, Pseudocolor plot of 2D array Close a window cmap='gist_earth', >>> axes2[0].pcolormesh(data) >>> plt.close() >>> CS = plt.contour(Y,X,U) Plot contours interpolation='nearest', Plot filled contours vmin=-2, >>> axes2[2].contourf(data1) >>> axes2[2]= ax.clabel(CS) Label a contour plot DataCamp vmax=2) Learn Python for Data Science Interactively Matplotlib 2.0.0 - Updated on: 02/2017 Python For Data Science Cheat Sheet 3 Plotting With Seaborn Seaborn Axis Grids Learn Data Science Interactively at www.DataCamp.com >>> g = sns.FacetGrid(titanic, Subplot grid for plotting conditional >>> h = sns.PairGrid(iris) Subplot grid for plotting pairwise col="survived", relationships >>> h = h.map(plt.scatter) relationships row="sex") >>> sns.pairplot(iris) Plot pairwise bivariate distributions >>> g = g.map(plt.hist,"age") >>> i = sns.JointGrid(x="x", Grid for bivariate plot with marginal >>> sns.factorplot(x="pclass", Draw a categorical plot onto a y="y", univariate plots y="survived", Facetgrid data=data) Statistical Data Visualization With Seaborn hue="sex", >>> i = i.plot(sns.regplot, data=titanic) sns.distplot) The Python visualization library Seaborn is based on >>> sns.lmplot(x="sepal_width", Plot data and regression model fits >>> sns.jointplot("sepal_length", Plot bivariate distribution across a FacetGrid matplotlib and provides a high-level interface for drawing y="sepal_length", "sepal_width", hue="species", data=iris, attractive statistical graphics. data=iris) kind='kde') Categorical Plots Regression Plots Make use of the following aliases to import the libraries: Plot data and a linear regression Scatterplot >>> sns.regplot(x="sepal_width", y="sepal_length", model fit >>> import matplotlib.pyplot as plt >>> sns.stripplot(x="species", Scatterplot with one data=iris, >>> import seaborn as sns y="petal_length", categorical variable data=iris) ax=ax) The basic steps to creating plots with Seaborn are: >>> sns.swarmplot(x="species", Categorical scatterplot with Distribution Plots y="petal_length", non-overlapping points 1. Prepare some data Plot univariate distribution data=iris) >>> plot = sns.distplot(data.y, 2. Control figure aesthetics Bar Chart kde=False, color="b") 3. Plot with Seaborn >>> sns.barplot(x="sex", Show point estimates and confidence intervals with Matrix Plots 4. Further customize your plot y="survived", hue="class", scatterplot glyphs >>> sns.heatmap(uniform_data,vmin=0,vmax=1) Heatmap data=titanic) >>> import matplotlib.pyplot as plt Count Plot >>> import seaborn as sns >>> sns.countplot(x="deck", Show count of observations Also see Matplotlib >>> tips = sns.load_dataset("tips") Step 1 Further Customizations data=titanic, 4 >>> sns.set_style("whitegrid") Step 2 Step 3 palette="Greens_d") >>> g = sns.lmplot(x="tip", Point Plot Axisgrid Objects y="total_bill", Show point estimates and Remove left spine data=tips, >>> sns.pointplot(x="class", >>> g.despine(left=True) confidence intervals as Set the labels of the y-axis aspect=2) y="survived", >>> g.set_ylabels("Survived") rectangular bars Set the tick labels for x >>> g = (g.set_axis_labels("Tip","Total bill(USD)"). hue="sex", >>> g.set_xticklabels(rotation=45) Set the axis labels set(xlim=(0,10),ylim=(0,100))) Step 4 data=titanic, >>> g.set_axis_labels("Survived", >>> plt.title("title") palette={"male":"g", "Sex") Set the limit and ticks of the >>> plt.show(g) Step 5 "female":"m"}, >>> h.set(xlim=(0,5), markers=["^","o"], ylim=(0,5), x-and y-axis linestyles=["-","--"]) xticks=[0,2.5,5], Boxplot yticks=[0,2.5,5]) Data Also see Lists, NumPy & Pandas Boxplot 1 >>> sns.boxplot(x="alive", Plot y="age", >>> import pandas as pd hue="adult_male", >>> plt.title("A Title") Add plot title >>> import numpy as np data=titanic) >>> plt.ylabel("Survived") Adjust the label of the y-axis >>> uniform_data = np.random.rand(10, 12) >>> sns.boxplot(data=iris,orient="h") Boxplot with wide-form data Adjust the label of the x-axis >>> data = pd.DataFrame({'x':np.arange(1,101), Violinplot >>> plt.xlabel("Sex") 'y':np.random.normal(0,4,100)}) >>> plt.ylim(0,100) Adjust the limits of the y-axis Violin plot Adjust the limits of the x-axis Seaborn also offers built-in data sets: >>> sns.violinplot(x="age", >>> plt.xlim(0,10) y="sex", >>> plt.setp(ax,yticks=[0,5]) Adjust a plot property Adjust subplot params >>> titanic = sns.load_dataset("titanic") hue="survived", >>> plt.tight_layout() >>> iris = sns.load_dataset("iris") data=titanic) Show or Save Plot Also see Matplotlib Figure Aesthetics Also see Matplotlib 5 2 >>> plt.show() Show the plot Context Functions Save the plot as a figure >>> plt.savefig("foo.png") Save transparent figure >>> f, ax = plt.subplots(figsize=(5,6)) Create a figure and one subplot Set context to >>> plt.savefig("foo.png", >>> sns.set_context("talk") "talk" transparent=True) >>> sns.set_context("notebook", Set context to "notebook", Seaborn styles font_scale=1.5, Scale font elements and rc={"lines.linewidth":2.5}) override param mapping Close & Clear Also see Matplotlib >>> sns.set() (Re)set the seaborn default >>> sns.set_style("whitegrid") Set the matplotlib parameters Color Palette Clear an axis Set the matplotlib parameters >>> plt.cla() >>> sns.set_style("ticks", >>> plt.clf() Clear an entire figure >>> sns.set_palette("husl",3) Define the color palette Close a window {"xtick.major.size":8, Use with to temporarily set palette >>> plt.close() "ytick.major.size":8}) >>> sns.color_palette("husl") with >>> sns.axes_style("whitegrid") Return a dict of params or use with >>> flatui = ["#9b59b6","#3498db","#95a5a6","#e74c3c","#34495e","#2ecc71"] to temporarily set the style Set your own color palette DataCamp with >>> sns.set_palette(flatui) Learn Python for Data Science Interactively Python For Data Science Cheat Sheet 3 Renderers & Visual Customizations Bokeh Glyphs Grid Layout Learn Bokeh Interactively at www.DataCamp.com, Scatter Markers >>> from bokeh.layouts import gridplot taught by Bryan Van de Ven, core contributor >>> p1.circle(np.array([1,2,3]), np.array([3,2,1]), >>> row1 = [p1,p2] fill_color='white') >>> row2 = [p3] >>> p2.square(np.array([1.5,3.5,5.5]), [1,4,3], >>> layout = gridplot([[p1,p2],[p3]]) color='blue', size=1) Line Glyphs Tabbed Layout Plotting With Bokeh >>> p1.line([1,2,3,4], [3,4,5,6], line_width=2) The Python interactive visualization library Bokeh >>> p2.multi_line(pd.DataFrame([[1,2,3],[5,6,7]]), >>> from bokeh.models.widgets import Panel, Tabs >>> tab1 = Panel(child=p1, title="tab1") enables high-performance visual presentation of pd.DataFrame([[3,4,5],[3,2,1]]), color="blue") >>> tab2 = Panel(child=p2, title="tab2") large datasets in modern web browsers. >>> layout = Tabs(tabs=[tab1, tab2]) Customized Glyphs Also see Data Linked Plots Bokeh’s mid-level general purpose Selection and Non-Selection Glyphs bokeh.plotting Linked Axes interface is centered around two main components: data >>> p = figure(tools='box_select') >>> p.circle('mpg', 'cyl', source=cds_df, >>> p2.x_range = p1.x_range and glyphs. selection_color='red', >>> p2.y_range = p1.y_range nonselection_alpha=0.1) Linked Brushing Hover Glyphs >>> p4 = figure(plot_width = 100, + = tools='box_select,lasso_select') >>> from bokeh.models import HoverTool >>> p4.circle('mpg', 'cyl', source=cds_df) data glyphs plot >>> hover = HoverTool(tooltips=None, mode='vline') >>> p5 = figure(plot_width = 200, >>> p3.add_tools(hover) The basic steps to creating plots with the tools='box_select,lasso_select') bokeh.plotting >>> p5.circle('mpg', 'hp', source=cds_df) interface are: US Colormapping >>> layout = row(p4,p5) Asia Europe 1. Prepare some data: >>> from bokeh.models import CategoricalColorMapper Python lists, NumPy arrays, Pandas DataFrames and other sequences of values >>> color_mapper = CategoricalColorMapper( 2. Create a new plot factors=['US', 'Asia', 'Europe'], 4 Output & Export palette=['blue', 'red', 'green']) 3. Add renderers for your data, with visual customizations >>> p3.circle('mpg', 'cyl', source=cds_df, Notebook 4. Specify where to generate the output color=dict(field='origin', transform=color_mapper), >>> from bokeh.io import output_notebook, show 5. Show or save the results legend='Origin') >>> output_notebook() >>> from bokeh.plotting import figure >>> from bokeh.io import output_file, show Legend Location HTML >>> x = [1, 2, 3, 4, 5] Step 1 Inside Plot Area Standalone HTML >>> y = [6, 7, 2, 4, 5] >>> p.legend.location = 'bottom_left' >>> p = figure(title="simple line example", Step 2 >>> from bokeh.embed import file_html x_axis_label='x', >>> from bokeh.resources import CDN >>> html = file_html(p, CDN, "my_plot") y_axis_label='y') Outside Plot Area >>> p.line(x, y, legend="Temp.", line_width=2) Step 3 >>> from bokeh.models import Legend >>> r1 = p2.asterisk(np.array([1,2,3]), np.array([3,2,1]) >>> from bokeh.io import output_file, show >>> output_file("lines.html") Step 4 Step 5 >>> r2 = p2.line([1,2,3,4], [3,4,5,6]) >>> output_file('my_bar_chart.html', mode='cdn') >>> show(p) >>> legend = Legend(items=[("One" ,[p1, r1]),("Two",[r2])], location=(0, -30)) Components >>> p.add_layout(legend, 'right') Also see Lists, NumPy & Pandas >>> from bokeh.embed import components 1 Data >>> script, div = components(p) Under the hood, your data is converted to Column Data Legend Orientation PNG Sources. You can also do this manually: >>> p.legend.orientation = "horizontal" >>> p.legend.orientation = "vertical" >>> import numpy as np >>> from bokeh.io import export_png >>> import pandas as pd >>> export_png(p, filename="plot.png") >>> df = pd.DataFrame(np.array([[33.9,4,65, 'US'], Legend Background & Border [32.4,4,66, 'Asia'], [21.4,4,109, 'Europe']]), >>> p.legend.border_line_color = "navy" SVG columns=['mpg','cyl', 'hp', 'origin'], >>> p.legend.background_fill_color = "white" index=['Toyota', 'Fiat', 'Volvo']) >>> from bokeh.io import export_svgs Rows & Columns Layout >>> from bokeh.models import ColumnDataSource >>> p.output_backend = "svg" >>> export_svgs(p, filename="plot.svg") >>> cds_df = ColumnDataSource(df) Rows >>> from bokeh.layouts import row Plotting >>> layout = row(p1,p2,p3) Show or Save Your Plots 2 Columns 5 >>> from bokeh.plotting import figure >>> from bokeh.layouts import columns >>> show(p1) >>> show(layout) >>> p1 = figure(plot_width=300, tools='pan,box_zoom') >>> layout = column(p1,p2,p3) >>> save(p1) >>> save(layout) >>> p2 = figure(plot_width=300, plot_height=300, Nesting Rows & Columns DataCamp x_range=(0, 8), y_range=(0, 8)) >>>layout = row(column(p1,p2), p3) >>> p3 = figure() Learn Python for Data Science Interactively