Synopsis
Create, or add to, a set of scatter plots.
Syntax
scatterplots(data, style='all', margin=0.15, gap=0.05, symbol='circle', color='blue', size=2, fill=True, ticks=4, overplot=False)
The routine is also available as chips_contrib.scatter.splots()
Description
This routine will create a grid of scatter plots, each plot displaying two of the variables in the input data set. It provides a quick way to see how the different variables are correlated.
The routine can be used to create a new plot, or overlay new data on an existing set of scatterplots (when overplot=True). In the latter case the style, margin, gap, and ticks arguments are ignored.
Data to display
The data to be displayed can be given in one of the following forms:
- a filename, which can contain CIAO Data Model syntax, and represents a table or a 2D image;
- a Crate, containing a table or 2D image;
- a dictionary, where the keys are the variable names and the values are the variable values;
- a sequence - e.g. list or tuple - of arrays, one for each variable (names are set to "x0", "x1", ...);
- or a 2D image, which is treated as a selection of columns (so that a nx by ny image has nx variables, labelled "x0", "x1", ...).
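The naming rules for the dictionary and sequence forms can be sketched in plain Python. The helper name name_variables is hypothetical and is not part of chips_contrib; it only illustrates how variable names are assigned, and does not cover files, Crates, or images.

```python
def name_variables(data):
    """Return (name, values) pairs for a dict or a sequence of arrays.

    Hypothetical helper for illustration only; not part of chips_contrib.
    """
    if isinstance(data, dict):
        # Keys are the variable names, values are the variable values.
        return list(data.items())
    # A sequence of arrays: names are generated as "x0", "x1", ...
    return [("x{}".format(i), values) for i, values in enumerate(data)]


pairs = name_variables([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
names = [name for name, _ in pairs]
print(names)
```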
Plot arrangement and layout
When overplot=False, which is the default, the scatterplots() command creates a grid of plots, with separation given by the gap argument and the spacing between the edge of the grid and the frame by the margin argument. The style argument determines whether a grid of n by n plots is created (style='all'), or if only the upper-right triangle (style='upper') or lower-left triangle (style='lower') is created. In all three cases the diagonal values are displayed, since they label the columns and rows.
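As a rough illustration of the three style settings, the sketch below lists which cells of an n by n grid would contain a plot. The function grid_cells is hypothetical, and the convention that row 0 is the top row and column 0 the left column is an assumption made for this example, not something taken from the routine.

```python
def grid_cells(n, style="all"):
    """List the (row, column) cells drawn for n variables.

    Assumes row 0 is the top row and column 0 the left column; this
    convention is an illustration, not taken from the routine itself.
    """
    if style == "all":
        keep = lambda row, col: True
    elif style == "upper":
        keep = lambda row, col: col >= row   # upper-right triangle plus diagonal
    elif style == "lower":
        keep = lambda row, col: col <= row   # lower-left triangle plus diagonal
    else:
        raise ValueError("style must be 'all', 'upper', or 'lower'")
    return [(r, c) for r in range(n) for c in range(n) if keep(r, c)]


for style in ("all", "upper", "lower"):
    print(style, len(grid_cells(3, style)))
```

For n variables the triangular styles therefore use n(n+1)/2 cells rather than n*n, with the diagonal present in every case.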
The ticks argument controls how many major tick marks there are on each axis. If set to None then each axis will be displayed using the "limits" majortick.mode setting. When given an integer greater than 1, the ticks argument is used to change each axis to use the "counts" majortick.mode, where ticks controls the number of major tick marks displayed. This can make the axis much more readable, but at the expense of using a larger range than covered by the variable. The limits for a variable can be changed by using the scatterlimits() routine. Each variable is displayed using a linear scale; the scatterlog() command can be used to switch to a logarithmic scale.
Once scatterplots() has been called, the grid spacing and size can be modified using the adjust_grid_gaps, adjust_grid_xrelsize, adjust_grid_xrelsizes, adjust_grid_yrelsize, and adjust_grid_yrelsizes routines.
Curve properties
The symbol, color, size, and fill arguments correspond to the symbol.style, symbol.color, symbol.size, and symbol.fill attributes of the curve.
Overplotting
When overplot=True, the gap, margin, style, and ticks arguments are ignored. Data are added to matching plots, with unknown variables ignored. The overplot call does not have to include data for all the variables in the original grid.
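The matching rule can be pictured with a small sketch. The helper overplot_pairs is hypothetical, and it assumes that a plot receives new points only when both of its variables appear in the overplot data; the routine's actual bookkeeping may differ.

```python
def overplot_pairs(grid_names, new_data):
    """Return the (x, y) variable pairs that would receive new points.

    Hypothetical helper: assumes a plot is updated only when both of its
    variables are present in the overplot data.
    """
    # Variables in the new data that the grid does not know are dropped.
    known = [name for name in grid_names if name in new_data]
    return [(a, b) for a in known for b in known if a != b]


grid = ["x", "y", "z"]
new = {"x": [1, 2], "y": [3, 4], "w": [5, 6]}   # "w" is not in the grid
print(overplot_pairs(grid, new))
```

Here "w" is ignored because the grid has no such variable, and the plots involving "z" are left unchanged because the new data omit it.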
Loading the routine
The routine can be loaded into a ChIPS, Sherpa or Python session by saying:
from chips_contrib.scatter import *
or, to access the qualified version of the name,
from chips_contrib import scatter
after which the scatter.splots() routine can be used.
Examples
Example 1
chips> from chips_contrib.scatter import *
chips> add_window(9, 9, 'inches')
chips> scatterplots('iris.fits')
The first line - "from chips_contrib.scatter import *" - loads the scatterplot routines; it only needs to be run once per ChIPS session. The numeric columns in iris.fits are plotted against each other in an n by n grid of plots.
Example 2
chips> scatterplots('iris.fits', style='upper', gap=0.02, margin=0.05)
The grid now includes only the upper-right plots, and both the spacing between the plots and the gap between the grid and the frame edge have been decreased.
Example 3
chips> add_window(12, 12, 'inches')
chips> scatterplots('iris.fits[species=setosa]', color='red', margin=0.05)
chips> scatterplots('iris.fits[species=versicolor]', color='green', overplot=True)
chips> scatterplots('iris.fits[species=virginica]', color='blue', overplot=True)
chips> scatterlimits('petal_length', 0, 8)
chips> scatterlimits('sepal_length', 4, 8)
chips> adjust_grid_gaps(0.02, 0.02)
If iris.fits contains the Fisher Iris data set then this creates a plot similar to that shown on the Wikipedia page. The display ranges for the petal_length and sepal_length variables are changed, and the spacing between the plots is reduced.
Example 4
chips> scatterplots([x, y, z])
Create scatter plots for the data stored in the x, y, and z variables: they are assumed to be 1D lists or NumPy arrays. The variable names are set to "x0", "x1", and "x2".
Example 5
chips> scatterplots({'x': x, 'y': y, 'z': z})
chips> idx = z > 0.2
chips> scatterplots({'x': x[idx], 'y': y[idx]}, color='orange', overplot=True)
The x and y points with z > 0.2 are drawn in orange. Since the values are given in a dictionary, the order of the plots is not guaranteed. An ordered dictionary could be used if the order is important.
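A minimal sketch of the ordered-dictionary approach, using collections.OrderedDict from the Python standard library. The data values are made up for illustration, and the sketch assumes that scatterplots() treats an OrderedDict like any other dictionary; the ChIPS calls are left as comments since they need a ChIPS session.

```python
from collections import OrderedDict

# Made-up values, for illustration only.
x = [0.1, 0.5, 0.9]
y = [1.0, 2.0, 3.0]
z = [0.05, 0.3, 0.7]

# Insertion order is preserved, so the variables are listed as x, y, z.
data = OrderedDict([("x", x), ("y", y), ("z", z)])

# Select the points with z > 0.2 (plain lists, so no NumPy mask is used).
keep = [i for i, zval in enumerate(z) if zval > 0.2]
subset = OrderedDict([("x", [x[i] for i in keep]),
                      ("y", [y[i] for i in keep])])

print(list(data.keys()), list(subset.keys()))

# In a ChIPS session (not run here) the calls would then be:
#   scatterplots(data)
#   scatterplots(subset, color='orange', overplot=True)
```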
Changes in the scripts 4.7.1 (December 2014) release
This routine is new in this release.
Bugs
See the bug pages on the CIAO website for an up-to-date listing of known bugs.
See Also
- contrib
- scatterlimits, scatterscale