
Python Data Processing with Pandas

CSE 5542 Introduction to Data Visualization

Pandas

• A very powerful Python package for manipulating tables
• Built on top of NumPy, so it is efficient
• Saves you a lot of effort compared with writing lower-level Python code for manipulating, extracting, and deriving table-related information
• Easy visualization with Matplotlib
• Main data structures: Series and DataFrame
• First things first
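The usual first step is the import. The sketch below assumes the common community aliases `pd` and `np`; the slide's own snippet did not survive, so this is a convention rather than the author's exact code.

```python
import numpy as np
import pandas as pd

# Both libraries expose a version string, a quick way to confirm the install.
print(pd.__version__, np.__version__)
```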

• Series: an indexed 1D array
• Explicit index

• Access data
• Can work as a dictionary
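A minimal sketch of a Series with an explicit index, dictionary-style access, and slicing. The values and labels are made up for illustration.

```python
import pandas as pd

# A Series pairs a 1D array of values with an explicit index.
s = pd.Series([0.25, 0.5, 0.75, 1.0], index=["a", "b", "c", "d"])

values = s.to_numpy()    # the underlying 1D array
labels = list(s.index)   # the explicit index labels

# Dictionary-style access by label.
b_value = s["b"]
has_c = "c" in s         # membership tests the index, like dict keys

# A Series can also be built directly from a dictionary.
from_dict = pd.Series({"x": 10, "y": 20, "z": 30})

# Slicing by label includes the end label; slicing by position excludes it.
by_label = s["a":"c"]
by_position = s.iloc[0:2]
```

Here `by_label` holds the entries for a, b, and c, while `by_position` holds only a and b.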

• Access and slice data

DataFrame Object

• Generalized two-dimensional array with flexible row and column indices
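As a sketch of the idea, a DataFrame can wrap a plain 2D array and attach labels to both axes. The labels `r1`..`r3` and `x`, `y` are hypothetical.

```python
import numpy as np
import pandas as pd

# A 2D array wrapped with row labels (index) and column labels (columns).
data = np.arange(6).reshape(3, 2)
df = pd.DataFrame(data, index=["r1", "r2", "r3"], columns=["x", "y"])

# The same table given as a dictionary of equal-length columns.
df2 = pd.DataFrame({"x": [0, 2, 4], "y": [1, 3, 5]}, index=["r1", "r2", "r3"])

row_labels = list(df.index)
column_labels = list(df.columns)
```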


• Constructing a DataFrame from Pandas Series
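One way to build a DataFrame from Series is to pass a dictionary of them: each Series becomes a column, and rows are aligned by index label. The column names and values here are illustrative only.

```python
import pandas as pd

area = pd.Series({"a": 1.0, "b": 2.0, "c": 3.0})
count = pd.Series({"b": 20, "c": 30, "d": 40})

# Each Series becomes a column; the rows are the union of the two indexes.
df = pd.DataFrame({"area": area, "count": count})

# A label present in only one Series gets NaN in the other column.
missing_area = df.loc["d", "area"]
```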

• Another example

Viewing Data

• View the first or last N rows
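A short sketch using `head` and `tail` on a hypothetical 10-row table:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(40).reshape(10, 4), columns=list("ABCD"))

first_rows = df.head(3)   # first 3 rows
last_rows = df.tail(2)    # last 2 rows
default_rows = df.head()  # with no argument, 5 rows are returned
```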

• Display the index, columns, and data
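The three pieces can be read off as attributes. This is a sketch on a made-up two-row table; `to_numpy()` is used for the data, with `df.values` being the older spelling.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]}, index=["r1", "r2"])

index = df.index        # row labels
columns = df.columns    # column labels
values = df.to_numpy()  # the data as a 2D NumPy array
```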

• Quick statistics (for columns A, B, and D in this case)
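A sketch with `describe`, which by default summarizes only the numeric columns. The table below is hypothetical: it assumes column C is non-numeric, which would explain why only A, B, and D are summarized.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10, 20, 30, 40],
    "C": ["w", "x", "y", "z"],   # non-numeric, skipped by default
    "D": [0.5, 0.5, 1.5, 1.5],
})

# Rows of the result: count, mean, std, min, quartiles, max.
stats = df.describe()
```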

• Sorting: sort by the index (i.e., reorder columns or rows), not by the data in the table
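To contrast the two kinds of sorting, here is a sketch on a small made-up table: `sort_index` reorders by row or column labels, while `sort_values` reorders rows by the data in a chosen column.

```python
import pandas as pd

df = pd.DataFrame(
    {"B": [1, 3, 2], "A": [9, 7, 8]},
    index=["z", "x", "y"],
)

# Sort by labels: rows along axis 0, columns along axis 1.
by_row_labels = df.sort_index()
by_col_labels = df.sort_index(axis=1)

# Sort by the data itself, using the values in column B.
by_values = df.sort_values(by="B")
```

Sorting by row labels gives the order x, y, z; sorting by the values in B gives z, y, x.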

• Sorting: sort by the data values

Selecting Data

• Selecting using a label
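A minimal sketch of label-based selection, assuming a small table with hypothetical labels: square brackets pick a column, and `loc` picks a row.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["r1", "r2", "r3"])

col_a = df["A"]        # one column, returned as a Series
row_2 = df.loc["r2"]   # one row, returned as a Series indexed by column name
```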

• Multi-axis, by label
• Slicing by label: the last label is included
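With `loc`, rows and columns can be selected together by label. The sketch below uses made-up labels and shows that a label slice keeps its end point.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=["r1", "r2", "r3", "r4"],
    columns=["A", "B", "C"],
)

subset = df.loc[:, ["A", "C"]]          # all rows, two columns
block = df.loc["r2":"r3", ["A", "B"]]   # r2 AND r3 are both included
scalar = df.loc["r1", "B"]              # a single cell
```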

• Select by position
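Positional selection goes through `iloc` and follows Python's usual slicing rule, so the end position is excluded. A sketch on a hypothetical 4x3 table:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(4, 3), columns=["A", "B", "C"])

row = df.iloc[3]                    # the fourth row
block = df.iloc[1:3, 0:2]           # rows 1-2, columns 0-1 (ends excluded)
picked = df.iloc[[0, 2], [0, 2]]    # explicit lists of positions
scalar = df.iloc[1, 1]              # a single cell
```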

• Boolean indexing
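Boolean indexing filters rows with a True/False mask. The sketch below, on made-up data, shows a comparison, a membership test with `isin`, and two conditions combined with `&`.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [-1.0, 2.0, -3.0, 4.0],
    "B": ["one", "two", "two", "three"],
})

positive = df[df["A"] > 0]                        # rows where A is positive
in_set = df[df["B"].isin(["two", "three"])]       # rows whose B is in a set
combined = df[(df["A"] > 0) & (df["B"] == "two")] # both conditions must hold
```

Each condition needs its own parentheses when combined, because `&` binds more tightly than the comparisons.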

Setting Data
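A minimal sketch of setting a new column from a Series, where values are matched to rows by index label rather than by position. The column name `F` and the values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]}, index=["a", "b", "c"])

# The new values arrive in a different order, and with a different label set.
new_col = pd.Series([30, 10, 40], index=["c", "a", "d"])

# Assignment aligns on the index: "b" has no match and becomes NaN,
# and "d" is dropped because the DataFrame has no such row.
df["F"] = new_col
```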

• Setting a new column aligned by index

Operations

• Descriptive statistics
  – Across axis 0 (rows), i.e., column mean
  – Across axis 1 (columns), i.e., row mean
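The axis argument names the direction that gets collapsed. A sketch with `mean` on a small illustrative table:

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})

col_means = df.mean(axis=0)   # collapse the rows: one mean per column
row_means = df.mean(axis=1)   # collapse the columns: one mean per row
```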

• Apply
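A sketch of `apply`, which runs a function over each column (or each row with `axis=1`), together with `value_counts`, one common way to tabulate a histogram of the values in a Series. The data and the lambda functions are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 40]})

# Apply a function to every column: here, the range of each column.
col_range = df.apply(lambda col: col.max() - col.min())

# With axis=1 the function receives one row at a time.
row_total = df.apply(lambda row: row["A"] + row["B"], axis=1)

# Histogram of discrete values: how often each value occurs.
s = pd.Series([4, 2, 4, 1, 2, 4])
counts = s.value_counts()
```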

• Histogram

Merge Tables

• Join
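A sketch of a SQL-style join with `pd.merge`, plus appending rows with `pd.concat`. `pd.concat` is used here because `DataFrame.append` was removed in pandas 2.0; the tables and the `key` column are hypothetical.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# Inner join (the default): only keys present in both tables survive.
inner = pd.merge(left, right, on="key")

# Outer join: keep every key, filling the gaps with NaN.
outer = pd.merge(left, right, on="key", how="outer")

# Append rows from another table and renumber the index.
more = pd.DataFrame({"key": ["e"], "lval": [5]})
stacked = pd.concat([left, more], ignore_index=True)
```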

• Append

Grouping

File I/O

• CSV
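A minimal CSV round trip with `to_csv` and `read_csv`. The file name `example.csv` is a placeholder, written to a temporary directory so the sketch leaves nothing behind.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    df.to_csv(path, index=False)   # index=False skips writing the row labels
    restored = pd.read_csv(path)
```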

• Excel
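Excel works the same way through `to_excel` and `read_excel`, but pandas needs an optional engine for `.xlsx` files. The sketch assumes `openpyxl` may or may not be installed and simply skips the round trip if it is missing; the file and sheet names are placeholders.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

restored = None
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.xlsx")
    try:
        df.to_excel(path, sheet_name="Sheet1", index=False)
        restored = pd.read_excel(path, sheet_name="Sheet1")
    except ImportError:
        # No Excel engine (such as openpyxl) is available in this environment.
        restored = None
```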