Design of Casa::Tableplot for Casapy
Total Page:16
File Type:pdf, Size:1020Kb
Design of casa::TablePlot for Casapy Urvashi Rau 18 Dec 2007 Abstract This note describes the design of the casa::TablePlot classes and the control flow through them for data access and plot creation. The use of the Matplotlib plotting package is described, along with a performance analysis to quantify bot- tlenecks. Many thanks to S.D.Jaeger, G.Moellenbroek, and K.Golap for some bug-finding/fixing, and helpful suggestions about design changes, during their development of the ap- plication level classes that use casa::TablePlot. The work done on this code was funded via a Graduate Student Research Assistantship. 1 Contents 1 Classes and control flow 4 1.1 DataAccess.................................. 4 1.2 Plotting .................................... 5 1.3 UserInteraction................................ 5 1.4 FlagVersions ................................. 6 2 GUI and Python 7 2.1 DesignConstraints .............................. 7 2.2 C++toPythonbinding(plotting) . 7 2.3 Python to C++ Callback binding (user interaction) . ....... 8 3 User Applications 8 4 Performance analysis 9 4.1 TimingandMemoryUsage ......................... 9 4.2 Optimizationstrategies. .. 10 5 Miscellaneous 11 5.1 Knownproblems ............................... 11 5.2 Makingchanges................................ 11 6 AppendixA:SampleTaQLexpressions 13 7 Appendix B : List of Plot Options 15 2 Figure 1: Flow of control through a set of classes that provide 2D plotting functionality for data stored as casa::Table. Data is read from casa::Table and derived quantities are computed and plotted (black arrows). User interaction with the plotted data via queries and editing is accessible from the command line (maroon arrows) as well as from the plotter GUI (blue arrows). Custom computations are allowed via optional callback functions defined by the user application (light blue boxes). The pink and yellow layers at the bottom control the C++ Python binding and plotting package specific code. 3 1 Classes and control flow casa::TablePlot is a Singleton class that manages plot requests from multiple applica- tions (casa::MSPlot,casa::PlotCal,casac::tableplot). It receives input in the form of data Tables and Plot-Options for each plot and manages multiple panels on the plot window with multiple plot layers per panel. It maintains and sends commands to a single instance of a plotter GUI, and responds to user interaction both from the GUI and the command line. 1.1 Data Access 1. Table Access : Columns from any kind of casa::Table can be plotted against each other. Input tables can be Measurement Sets, MS Subtables, Calibration tables, etc.., or reference sub-sets of Tables generated via any Table selection mechanism. Multiple tables of different types can be opened simultaneously for plotting and interactive editing. Data sets can be iterated through to generate a series of plots from different subselections. Columns with finite and ordered values can be iterated upon. Iteration rules follow that of the casa::TableIterator class. Iterations can proceed only in one direction - forward. 2. Expression Evaluation : Expressions for derived quantities are written using the TaQL1 syntax. For data from ArrayColumns, in-row selections within the Array are done via TaQL indices (for example : AMPLITUDE(DATA[1:2,1:10:1])). For sample TaQL strings, please refer to Appendix A. Quantities that cannot be computed purely via TaQL expressions make use of conversion functions supplied by the application in the form of callbacks casa::TPConvertBase. 3. Data extraction : casa::BasePlot is responsible for evaluating TaQL expres- sions for X and Y axis data, performing conversions on the results, and storing the data in arrays to be plotted. It provides a set of query functions that the plotter class casa::TPPlotter will use to access data to be plotted. Derivatives of casa::BasePlot can implement customized data access, but must provide the same view of the final data to be plotted, in the form casa::Arrays and query functions that casa::TPPlotter expects. casa::CrossPlot is one derivative, used for plots where the X data comes from ArrayColumn Array indices (for example, channel number), and not TaQL string evaluations. Other derivatives can be im- plemented to read data without using TaQL expressions. Note that the rest of the framework does not depend on how the data is accessed internal to casa:BasePlot. 4. Flag Handling : Flags are read using the same in-row selection as the data itself, and stored in arrays. casa::TPPlotter uses the casa::BasePlot query functions to check flags with data and select points for plotting according to on- the-fly selection criteria (only unflagged, only flagged, every nth point, average n points, points within a certain range of values). Interactive editing modifies the flags in the casa::BasePlot flag storage arrays. The casa::BasePlot (or 1Aips++ Note 199 : http : //aips2.nrao.edu/stable/docs/notes/199/199.html 4 derivatives) are then responsible for writing these flags back to the Table, along with any translations or flag-expansions (for plots of averaged values). It is always ensured that the FLAG ROW column holds the logical AND of all flags per row in the FLAG column. The flags are written to disk after every interaction, to allow all currently visible plots using the same Table, to immediately reflect flag changes. Columns to be used for flags can be user specified, and default to FLAG and FLAG ROW. 1.2 Plotting 1. Plot Parameters : casa::PlotOptions manages user input. It contains defaults for all parameters, and functions to validate parameters and do error checking. Pa- rameters are divided into those common to all layers on a panel (panel location, size, axis labels..), and those that can vary across layers (plot colour, marker format...), and stored in a wrapper casa::PanelParams class. For a list of plot parameters and their functions, please refer to Appendix B. 2. Panel management : For each panel, casa::TablePlot maintains an instance of casa::PanelParams and a list of casa::BasePlots. Plots from multiple Tables, and overplots can generate multiple layers per panel. This class iterates through panels and layers and sends Vector<casa::BasePlot>, casa::PanelParams pairs to a single instance of casa::TPPlotter for plotting. 3. Plotting : casa::TPPlotter performs two tasks. First, it receives a list of casa::BasePlots, and queries each of them for the number of plot commands to run, and the number of points per plot. For each plot command, it reads data and flags through casa::BasePlot query functions, and assembles the selected points into plotting-package-specific data format. Then, it reads plot options from supplied casa::PanelParams, and constructs plot commands. Finally, the data and plot commands are combined and sent to the plotting package. 1.3 User Interaction The following functions allow the user to interact with any displayed plot. They are accessible from the command line by directly calling casa::TablePlot functions as well as from buttons on the GUI which internally call the same casa::TablePlot functions. These functions operate on the stored Vector<casa::BasePlot>, casa::PanelParams pairs per panel, and trigger refreshed plots, as well as the transfer of flags to and from the Table on disk. A no-GUI mode of plotting has also been provided. In this mode, the plots are created as described above, but are not rendered onto a plot window. Instead, additional commands can be used to directly save a plot into a file on disk. 1. Mark Regions: Rubber-band boxes can be drawn to mark rectangular regions on a plot. The same effect can be reached by sending in world-coordinate box specifications from the command line. 5 2. Zoom/Pan: The matplotlib TkAgg GUI provides buttons for Zoom/Pan modes. (Note that Matplotlib stores a full copy of all data points currently on the plot, in Double precision, to enable Zoom/Pan modes directly from the GUI.) Command-line in- teraction can be done via the plotrange plot option. This interface can be used if a new plotting package does not hold all data points in memory for Zoom/Pan. 3. Flag/Unflag: Buttons for flagging and unflagging have been added to the TkAgg matplotlib GUI. Their callbacks trigger casa::TablePlot functions that modify values stored in casa::BasePlot flag arrays, which then get transferred to the Table on disk. If multiple views of the same data are plotted on different panels, flags from one panel are reflected in the others. casa::TablePlot returns a flag summary as a casa::Record which holds all information necessary for recreating each flag/unflag command, and can be used to generate a flag history. Currently this information is sent to the logger, but it can be returned as python records for scripting. 4. Locate: A GUI button has been added for the operation of obtaining meta-data about se- lected points. Table row indices are inferred from the selected data points, and val- ues of specified Table Columns (or TaQL expressions) for only these rows are read and displayed. One entry is produced per selected row. The control flow is the same as for Flag/Unflag, and a record of meta-data is returned by casa::TablePlot for the selected points. 5. Iter-Next: A GUI button was provided to move to the next plot in an iteration series. 6. Quit: The GUI has been provided with a button that triggers the destruction of the plot window, as well as the clean-up of C++ data structures associated with all currently visible plots. The X button on the GUI window has also been bound to the same function to ensure the proper release of C++ allocated memory, when the plotter GUI is shut. For custom operations required by the applications, callback functions can be pro- vided via the casa::TPGuiCallBackHooks and casa::TPResetCallBack classes.