Starlink User Note 253
Mark Taylor
19 August 2008
$Id: sun253.xml,v 1.242 2008-08-19 14:26:02 mbt Exp $
TOPCAT is an interactive graphical viewer and editor for tabular data. It has been designed for use with astronomical tables such as object catalogues, but is not restricted to astronomical applications. It understands a number of different astronomically important formats, and more formats can be added. It is designed to cope well with large tables; a million rows by a hundred columns should not present a problem even with modest memory and CPU resources.
It offers a variety of ways to view and analyse the data, including a browser for the cell data themselves, viewers for information about table and column metadata, tools for joining tables using flexible matching algorithms, and visualisation facilities including histograms, 2- and 3-dimensional scatter plots, and density maps. Using a powerful and extensible Java-based expression language, new columns can be defined and row subsets selected for separate analysis. Selecting a row can be configured to trigger an action, for instance displaying an image of the catalogue object in an external viewer. Table data and metadata can be edited and the resulting modified table can be written out in a wide range of output formats.
TOPCAT is written in pure Java and is available under the GNU General Public Licence. Its underlying table processing facilities are provided by STIL, the Starlink Tables Infrastructure Library.
TOPCAT is an interactive graphical program which can examine, analyse, combine, edit and write out tables. A table is, roughly, something with columns and rows; each column contains objects of the same type (for instance floating point numbers) and each row has an entry for each of the columns (though some entries might be blank). A common astronomical example of a table is an object catalogue.
TOPCAT can read in tables in a number of formats from various sources, allow you to inspect and manipulate them in various ways, and if you have edited them optionally write them out in the modified state for later use, again in a variety of formats. Here is a summary of its main capabilities:
The general idea of the program is quite straightforward. At any time, it has a list of tables it knows about - these are displayed in the Control Window which is the first thing you see when you start up the program. You can add to the list by loading tables in, or by some actions which create new tables from the existing ones. When you select a table in the list by clicking on it, you can see general information about it in the control window, and you can also open more specialised view windows which allow you to inspect it in more detail or edit it. Some of the actions you can take, such as changing the current Sort Order, Row Subset or Column Set change the Apparent Table, which is a view of the table used for things such as saving it and performing row matches. Changes that you make do not directly modify the tables on disk (or wherever they came from), but if you want to save the changes you have made, you can write the modified table(s) to a new location.
The main body of this document explains these ideas and capabilities
in more detail, and
Appendix A gives a full description of all the windows which
form the application.
While the program is running, this document is available via the
online help system - clicking the Help
toolbar button in any window will pop up a help browser open at
the page which describes that window.
This document is heavily hyperlinked, so you may find it easier to
read in its HTML form than on paper.
Recent news about the program can be found on the TOPCAT web page. It was initially developed within the now-terminated Starlink project, and has subsequently been supported by PPARC grant PP/D002486/1, VOTech and AstroGrid. The underlying table handling facilities are supplied by the Starlink Tables Infrastructure Library STIL, which is documented more fully in SUN/252. The software is written in pure Java, and should run on any J2SE 1.4 or 1.5 platform. This makes it highly portable, since it can run on any machine which has a suitable Java installation, which is available for MS Windows, Mac OS X and most flavours of Unix amongst others. Some of the external viewer applications it talks to rely on non-Java code however so one or two facilities, such as displaying spectra, may be absent in some cases. TOPCAT is available under the terms of the GNU General Public License.
This manual aims to give detailed tutorial and reference documentation on most aspects of TOPCAT's capabilities, and reading it is an excellent way to learn about the program. However, it's quite a fat document, and if you feel you've got better things to do with your time than read it all, you should be able to do most things by playing around with the software and dipping into the manual (or equivalently the online help) when you can't see how to do something or the program isn't behaving as expected. This section provides a short introduction for the impatient, explaining how to get started.
To start the program, you will probably type "topcat" or
something like "java -jar topcat-lite.jar" (see Section 9 for
more detail). To view a table that you have on disk, you can either
give its name on the command line or load it using the Load
button from the GUI. FITS and VOTable files are recognised automatically;
if your data is in another format such as ASCII (see Section 4.1.1)
you need to tell the program (e.g. "-f ascii" on the command line).
If you just want to try the program out, "topcat -demo" will
start with a couple of small tables for demonstration purposes.
The first thing that you see is the Control Window. This has a list of the loaded table(s) on the left. If one of these is highlighted by clicking on it, information about it will be shown on the right; some of this (table name, sort order) you can change here. Along the top is a toolbar with a number of buttons, most of which open up new windows. These fall into a few groups:
Some of the windows allow you to make changes of various sorts to the tables, such as performing sorts, selecting rows, modifying data or metadata. None of these affect the table on disk (or database, or wherever), but if you subsequently save the table the changes will be reflected in the table that you save.
A notable point to bear in mind concerns memory.
TOPCAT is fairly efficient in use of memory, but in some cases when
dealing with large tables you might see an OutOfMemoryError.
It is usually possible to work round this by using either or both
of the -disk or -XmxNNNM flags on startup - see Section 9.2.2.
Finally, if you have queries, comments or requests about the software, and they don't appear to be addressed in the manual, consult the TOPCAT web page and by all means contact the author - user feedback is always welcome.
The Apparent Table is a particular view of a table which can be influenced by some of the viewing controls.
When you load a table into TOPCAT it has a number of characteristics like the number of columns and rows it contains, the order of the rows that make up the data, the data and metadata themselves, and so on. While manipulating it you can modify the way that the table appears to the program, by changing or adding data or metadata, or changing the order or selection of columns or rows that are visible. For each table its "apparent table" is a table which corresponds to the current state of the table according to the changes that you have made.
In detail, the apparent table consists of the table as it was originally imported into the program plus any of the following changes that you have made:
The apparent table is used in the following contexts:
An important feature of TOPCAT is the ability to define and use Row Subsets. A Row Subset is a selection of the rows within a whole table being viewed within the application, or equivalently a new table composed from some subset of its rows. You can define these and use them in several different ways; the usefulness comes from defining them in one context and using them in another. The Subset Window displays the currently defined Row Subsets and permits some operations on them.
At any time each table has a current row subset, and this affects the Apparent Table. You can always see what it is by looking at the "Row Subset" selector in the Control Window when that table is selected; by default it is one containing all the rows. You can change it by choosing from this selector or as a result of some other actions.
Other contexts in which subsets can be used are picking a selection of rows from which to calculate in the Statistics Window and marking groups of rows to plot using different markers in the various plotting windows.
You can define a Row Subset in one of the following ways:
Combining this with sorting the rows in the table can be useful; if you do a Sort Up on a given column and then drag out the top few rows of the table you can easily create a subset consisting of the highest values of a given column.
In all these cases you will be asked to assign a name for the subset. As with column names, it is a good idea to follow a few rules for these names so that they can be used in algebraic expressions. They should be:
In the first subset definition method above, the current subset will be set immediately to the newly created one. In other cases the new subset may be highlighted appropriately in other windows, for instance by being plotted in scatter plot windows.
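As a rough illustration of the sort-then-select idea described above, a Row Subset can be thought of as a set of row indices into the unchanged table. The following Python sketch is not TOPCAT code; the table contents and the helper name top_n_subset are invented purely for illustration.

```python
# Illustrative model only: a table is a list of rows and a Row Subset is a
# set of row indices. None of these names come from TOPCAT itself.
rows = [
    {"name": "a", "vmag": 12.1},
    {"name": "b", "vmag": 17.4},
    {"name": "c", "vmag": 9.8},
    {"name": "d", "vmag": 15.0},
]


def top_n_subset(rows, column, n):
    """Return the indices of the n rows with the highest values in column."""
    # Sort row indices by the column value, largest first, then keep n.
    order = sorted(range(len(rows)), key=lambda i: rows[i][column], reverse=True)
    return set(order[:n])


highest = top_n_subset(rows, "vmag", 2)
```

Because the subset only records which rows belong to it, the same subset can then be reused in another context (a plot, a statistics calculation) without copying any data.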
You can sort the rows of each table according to the values in a selected column. Normally you will want to sort on a numeric column, but other values may be sortable too, for instance a String column will sort alphabetically. Some kinds of columns (e.g. array ones) don't have any well-defined order, and it is not possible to select these for sorting on.
At any time, each table has a current row order,
and this affects the Apparent Table.
You can always see what it is by looking under the "Sort Order" item
in the Control Window when that table
is selected; by default it is "(none)", which means the rows have the
same order as that of the table they were loaded in from.
The little arrow indicates whether
the sense of the sort is up or down. You can change the sort order
by selecting a column name from this control, and change the sense
by clicking on the arrow. The sort order can also be changed
by using menu items in the
Columns Window or right-clicking
popup menus in the Data Window.
Selecting a column to sort by calculates the new row order by performing a sort on the cell values there and then. If the table data change somehow (e.g. because you edit cells in the table) then it is possible for the sort order to become out of date.
The current row order affects the Apparent Table, and hence determines the order of rows in tables which are exported in any way (e.g. written out) from TOPCAT. You can always see the rows in their currently sorted order in the Data Window.
When each table is imported it has a list of columns. Each column has header information which determines the kind of data which can fill the cells of that column as well as a name, and maybe some additional information like units and Unified Content Descriptor. All this information can be viewed, and in some cases modified, in the Columns Window.
During the lifetime of the table within TOPCAT, this list of columns can be changed by adding new columns, hiding (and perhaps subsequently revealing) existing columns, and changing their order. The current state of which columns are present and visible and what order they are in is collectively known as the Column Set, and affects the Apparent Table. The current Column Set is always reflected in the order in which columns are displayed in the Data Window and Statistics Window. The Columns Window shows all the known columns, including hidden ones, in Column Set order; whether they are currently visible is indicated by the (leftmost) "Visible" column.
You can affect the current Column Set in the following ways:
You can also hide a column by right-clicking on it in the Data Window, which brings up a popup menu - select the Hide option. To make it visible again you have to go to the Columns Window as above.
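The way the current Row Subset, Sort Order and Column Set combine into the Apparent Table can be sketched as a view computed over unchanged base data. This is a minimal Python model under that assumption, not a description of TOPCAT's internals; the function name apparent_table and its parameters are hypothetical.

```python
def apparent_table(columns, rows, subset=None, sort_column=None,
                   descending=False, visible=None):
    """Build a view of (columns, rows) without modifying the originals.

    subset      -- set of row indices to keep, or None for all rows
    sort_column -- column name to sort by, or None for the original order
    visible     -- column names to show, in display order, or None for all
    """
    # Row Subset: choose which rows take part.
    indices = [i for i in range(len(rows)) if subset is None or i in subset]
    # Sort Order: reorder the chosen rows by one column's values.
    if sort_column is not None:
        k = columns.index(sort_column)
        indices.sort(key=lambda i: rows[i][k], reverse=descending)
    # Column Set: choose which columns are shown and in what order.
    shown = list(columns) if visible is None else list(visible)
    picks = [columns.index(name) for name in shown]
    return shown, [[rows[i][k] for k in picks] for i in indices]


columns = ["id", "ra", "vmag"]
rows = [[1, 10.5, 14.2], [2, 11.0, 9.9], [3, 9.7, 12.0]]
view = apparent_table(columns, rows, subset={0, 2},
                      sort_column="vmag", visible=["vmag", "id"])
```

Here view holds two rows, ordered by vmag, with only the vmag and id columns; the original rows list is left untouched, which mirrors the point that changes do not modify the table on disk until it is saved.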
TOPCAT supports a wide variety of tabular data formats. In most cases these are file formats for tables stored as single files on a disk or at the end of a URL, but there are other possibilities, for instance a table you have opened could be the result of an SQL query on a database.
Since you can load a table from one format and save it in a different
one, TOPCAT can be used to convert a table from one format to another.
If this is all you want to do however, you may find it more
convenient to use the tcopy command line utility in the
STILTS package.
The format handling is extensible, so new formats can be added fairly easily. All the table input/output is handled by STIL, the Starlink Tables Infrastructure Library; more detailed descriptions of the I/O capabilities can be found in its documentation.
The following subsections describe the available formats for reading and writing tables. The two operations are separate, so not all the supported input formats have matching output formats and vice versa.
Loading tables into TOPCAT is done either from the command line
when you start the program up or
using the Load Table dialogue.
For FITS and VOTable formats
the file format can be detected automatically
(note this is done by looking at the file content, it has nothing
to do with filename extensions).
For other formats though, for instance ASCII or Comma-Separated Values,
you will have to specify the format that the file is in.
In the Load Window, there is a selection box from which you can
choose the format, and from the command line you use the
-f flag - see Section 9 for details.
You can always specify the format rather than using automatic detection
if you prefer - this can be a good idea if a table appears to
be failing to load in a surprising way, since it may give you
a more detailed error message.
In either case, table locations may be given as filenames or as URLs, and any data compression (gzip, unix compress and bzip2) will be automatically detected and dealt with - see Section 4.2.
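Content-based detection of compression is commonly done by inspecting the first few bytes of the stream. The Python sketch below shows that general technique for the three schemes mentioned; it is an assumption about how such detection can work, not a description of the code TOPCAT actually uses, and the function name sniff_compression is invented.

```python
import bz2
import gzip

# Leading "magic" bytes that identify each compression format.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"\x1f\x9d": "compress",
    b"BZh": "bzip2",
}


def sniff_compression(head):
    """Return the compression name for the first bytes of a stream, or None."""
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return None


gz_kind = sniff_compression(gzip.compress(b"some table data"))
bz_kind = sniff_compression(bz2.compress(b"some table data"))
plain_kind = sniff_compression(b"SIMPLE  =                    T")
```

Since the decision depends only on the bytes themselves, it works the same way for a local file and for a stream read from a URL, and the filename extension plays no part.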
Note: in some earlier versions of TOPCAT, ASCII
format tables could be detected automatically, so you could load
them by typing something like "topcat table.txt".
In the current version, you have to signal that this is an
ASCII table, for instance by typing "topcat -f ascii table.txt".
The following sections describe the table formats which TOPCAT can read.
FITS binary and ASCII table extensions can be read. Unless told otherwise, TOPCAT will display the first TABLE or BINTABLE extension in a given FITS file. If a later extension is required, this is indicated by giving the extension number after a '#' at the end of the table location. The first extension (first HDU after the primary HDU) is numbered 1. Thus in a compressed FITS table named "spec23.fits.gz" with one primary HDU and two BINTABLE extensions, you would view the first one using the name "spec23.fits.gz" or "spec23.fits.gz#1" and the second one using the name "spec23.fits.gz#2". The suffix "#0" is never used for a legal FITS file, since the primary HDU cannot contain a table.
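The '#' convention for FITS table locations can be illustrated with a small Python helper. This is only a sketch of the naming rule described above; the function name split_fits_location is hypothetical, and returning None to mean "first table extension found" is a choice made here for illustration.

```python
def split_fits_location(location):
    """Split 'file#N' into (file, N).

    N is None when no extension number is given, meaning "use the first
    TABLE or BINTABLE extension found".
    """
    path, sep, suffix = location.rpartition("#")
    if not sep:
        return location, None
    hdu = int(suffix)  # raises ValueError if the suffix is not a number
    if hdu < 1:
        # HDU 0 is the primary HDU, which cannot contain a table.
        raise ValueError("extension numbers start at 1")
    return path, hdu


first = split_fits_location("spec23.fits.gz")
second = split_fits_location("spec23.fits.gz#2")
```

With the example file from the text, first is ("spec23.fits.gz", None) and second is ("spec23.fits.gz", 2).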
You can select which extension to use more conveniently than by specifying the HDU numbers if you use the Hierarchy Browser to load the table.
If the table has been written using TOPCAT's "fits-plus"
output format (see Section 4.1.2.1) then the metadata will be
read in from the primary HDU as well.
For normal FITS files, header cards in the table's HDU header will be made available as table parameters (see Appendix A.3.2). Only header cards which are not used to specify the table format itself are visible as parameters (e.g. NAXIS, TTYPE* etc cards are not). HISTORY and COMMENT cards are run together as one multi-line value.
If the table is stored in a FITS binary table extension in a file
on local disk in uncompressed form, then the table is 'mapped' into
memory - this generally means fast loading and low memory use,
even in the absence of TOPCAT's -disk flag (Section 9.1).
As well as normal binary and ASCII FITS tables, STIL supports FITS files which contain tabular data stored in column-oriented format. This means that the table is stored in a BINTABLE extension HDU, but that BINTABLE has a single row, with each cell of that row holding a whole column's worth of data. The final (slowest-varying) dimension of each of these cells (declared via the TDIMn header cards) is the same, namely, the number of rows in the table that is represented. The point of this is that all the cells of a given column are stored contiguously, which for very large, and especially very wide tables means that certain access patterns (basically, ones which access only a small proportion of the columns in a table) can be much more efficient since they require less I/O overhead in reading data blocks.
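A back-of-envelope calculation shows why the column-oriented layout can help. The Python sketch below uses a deliberately simplified model, assumed here for illustration only: a scan of a row-oriented table is taken to pull in every full row, while a scan of a column-oriented table reads only the contiguous blocks for the columns it needs. Real I/O behaviour depends on block sizes and caching.

```python
def bytes_touched(n_rows, col_widths, wanted, column_oriented):
    """Estimate bytes read to scan the columns in `wanted` (simplified model).

    col_widths -- size in bytes of one cell of each column
    wanted     -- indices of the columns the operation actually uses
    """
    if column_oriented:
        # Only the contiguous blocks for the wanted columns are read.
        return n_rows * sum(col_widths[i] for i in wanted)
    # Every row is read in full, whichever columns are wanted.
    return n_rows * sum(col_widths)


widths = [8] * 100  # one hundred double-precision columns
row_cost = bytes_touched(1_000_000, widths, [0, 1], column_oriented=False)
col_cost = bytes_touched(1_000_000, widths, [0, 1], column_oriented=True)
ratio = row_cost // col_cost
```

Under these assumptions, plotting two columns of a million-row, hundred-column table touches 800 MB in the row-oriented case and 16 MB in the column-oriented case, a factor of 50.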
Such tables are perfectly legal FITS files, but most non-STIL software will probably not recognise them as tables in the usual way. This format is mostly intended for the case where you have a large table in some other format (possibly the result of an SQL query) and you wish to cache it in a way which can be read efficiently by a STIL-based application.
Like normal FITS, two variants are supported:
with (colfits-plus) and without (colfits-basic)
metadata stored as a VOTable byte array in the primary HDU.
Colfits format is only available for data which is stored as an uncompressed file in the file system (not, for instance, from a URL).
VOTable is an XML-based format for tabular data endorsed by the International Virtual Observatory Alliance; while the tabular data which can be encoded is by design close to what FITS allows, it provides for much richer encoding of structure and metadata. TOPCAT is believed to read any table which conforms to the VOTable 1.0 or VOTable 1.1 specification. This includes tables in which the cell data are included in-line as XML elements (VOTable/TABLEDATA format), or included/referenced as a FITS table (VOTable/FITS) or included/referenced as a raw binary stream (VOTable/BINARY). TOPCAT does not attempt to be fussy about input VOTable documents, and it will have a good go at reading VOTables which violate the standards in various ways.
VOTable documents can have a complicated hierarchical structure, and may contain more than one actual table. Unless told otherwise, TOPCAT will load the first table it finds in the document, so in the (common) case that the document holds exactly one table, giving the filename will load that sole table. To display a table other than the first, you must indicate the zero-based index of the TABLE element in a breadth-first search after a '#' character at the end of the table specification. Here is an example VOTable document:
    <VOTABLE>
      <RESOURCE>
        <TABLE name="Star Catalogue"> ... </TABLE>
        <TABLE name="Galaxy Catalogue"> ... </TABLE>
      </RESOURCE>
    </VOTABLE>

If this is available in a file named "cats.xml" then open the Star Catalogue using the name "cats.xml" or "cats.xml#0", and the Galaxy Catalogue using the name "cats.xml#1".
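The zero-based, breadth-first numbering of TABLE elements can be made concrete with a short Python sketch using the standard xml.etree.ElementTree module. This illustrates the indexing rule only and is not how TOPCAT itself reads VOTables; the function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import deque


def tables_breadth_first(root):
    """Return the TABLE elements under root in breadth-first order."""
    found, queue = [], deque([root])
    while queue:
        element = queue.popleft()
        # Strip any XML namespace prefix such as '{http://...}TABLE'.
        if element.tag.rsplit("}", 1)[-1] == "TABLE":
            found.append(element)
        queue.extend(element)  # iterating an element yields its children
    return found


def select_table(xml_text, location):
    """Pick the TABLE named by a location such as 'cats.xml#1'."""
    _, sep, suffix = location.rpartition("#")
    index = int(suffix) if sep else 0  # no '#' means the first table
    return tables_breadth_first(ET.fromstring(xml_text))[index]


doc = """<VOTABLE><RESOURCE>
  <TABLE name="Star Catalogue"/>
  <TABLE name="Galaxy Catalogue"/>
</RESOURCE></VOTABLE>"""

star = select_table(doc, "cats.xml").get("name")
galaxy = select_table(doc, "cats.xml#1").get("name")
```

Breadth-first order matters when RESOURCE elements are nested: a TABLE nearer the top of the document hierarchy is counted before one buried more deeply, even if the deeper one appears earlier in the file.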
In many cases tables are stored in some sort of unstructured plain text format, with cells separated by spaces or some other delimiters. There is a wide variety of such formats depending on what delimiters are used, how columns are identified, whether blank values are permitted and so on. It is impossible to cope with them all, but TOPCAT attempts to make a good guess about how to interpret a given ASCII file as a table, which in many cases is successful. In particular, if you just have columns of numbers separated by something that looks like spaces, you should be just fine.
Here are the detailed rules for how the ASCII-format tables are interpreted:
"null" (unquoted) represents the null value
Column types: Boolean, Short, Integer, Long, Float, Double, String
If the list of rules above looks frightening, don't worry, in many cases it ought to make sense of a table without you having to read the small print. Here is an example of a suitable ASCII-format table:
    #
    # Here is a list of some animals.
    #
    # RECNO  SPECIES      NAME             LEGS  HEIGHT/m
      1      pig          "Pigling Bland"  4     0.8
      2      cow          Daisy            4     2
      3      goldfish     Dobbin           ""    0.05
      4      ant          ""               6     0.001
      5      ant          ""               6     0.001
      6      ant          ''               6     0.001
      7      "queen ant"  'Ma\'am'         6     2e-3
      8      human        "Mark"           2     1.8

In this case it will identify the following columns:

    Name      Type
    ----      ----
    RECNO     Short
    SPECIES   String
    NAME      String
    LEGS      Short
    HEIGHT/m  Float

It will also use the text "Here is a list of some animals"
as the Description parameter of the table.
Without any of the comment lines, it would still interpret the table,
but the columns would be given the names col1..col5.
If you understand the format of your files but they don't exactly match the criteria above, the best thing is probably to write a simple free-standing program or script which will convert them into the format described here. You may find Perl or awk suitable languages for this sort of thing.
This format is not detected automatically - you must specify that
you wish to load a table in ascii format.
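A conversion script of the kind suggested above need not be long. The Python sketch below turns delimiter-separated text into a layout modelled on the example table in this section: a '#' comment line naming the columns, cells separated by spaces, and double quotes around cells that are empty or contain spaces. The quoting and escaping choices are assumptions based on that example rather than on the full rule list, and the function names are invented.

```python
def to_ascii_table(names, rows):
    """Render rows as space-separated text with a '#' header line."""
    def cell(value):
        text = "" if value is None else str(value)
        # Quote anything that would otherwise be split or misread.
        if text == "" or any(c.isspace() or c in "\"'#" for c in text):
            return '"' + text.replace('"', '\\"') + '"'
        return text

    lines = ["# " + " ".join(names)]
    lines += [" ".join(cell(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


def convert_delimited(text, delimiter=";"):
    """Convert delimiter-separated text whose first line holds column names."""
    records = [line.split(delimiter) for line in text.splitlines() if line]
    return to_ascii_table(records[0], records[1:])


converted = convert_delimited("RECNO;NAME\n1;Pigling Bland\n2;\n")
```

This sketch assumes the column names themselves contain no spaces; names that do would need renaming before the result could be read back as a table.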
CalTech's Infrared Processing and Analysis Center
uses a text-based format for storage of tabular data,
defined at
http://irsa.ipac.caltech.edu/applications/DDGEN/Doc/ipac_tbl.html.
Tables can store column name, type, units and null values, as well
as table parameters. They typically have a filename extension
".tbl" and are used for Spitzer data amongst other
things. An example looks like this:
    \title='Animals'
    \ This is a table with some animals in it.
    | RECNO | SPECIES  | NAME          | LEGS | HEIGHT |
    | char  | char     | char          | int  | double |
    |       |          |               |      | m      |
    |       |          | null          |      |        |
      1       pig        Pigling Bland   4      0.8
      2       cow        Daisy           4      2
      3       goldfish   Dobbin          0      0.05
      4       ant        null            6      0.001
Comma-separated value ("CSV") format is a common semi-standard text-based format in which fields are delimited by commas. Spreadsheets and databases are often able to export data in some variant of it. The intention is that TOPCAT can read tables in the version of the format spoken by MS Excel amongst other applications, though the documentation on which it was based was not obtained directly from Microsoft.
The rules for data which it understands are as follows:
This format is not detected automatically - you must specify that
you wish to load a table in csv format.
Tab-Separated Table, or TST, is a text-based table format used
by a number of astronomical tools including Starlink's
GAIA and
ESO's SkyCat
on which it is based.
A definition of the format can be found in
Starlink Software Note 75.
The implementation here ignores all comment lines: special comments
such as the "#column-units:" are not processed.
An example looks like this:
    Simple TST example; stellar photometry catalogue.
    A.C. Davenhall (Edinburgh) 26/7/00.
    Catalogue of U,B,V colours.
    UBV photometry from Mount Pumpkin Observatory,
    see Sage, Rosemary and Thyme (1988).
    # Start of parameter definitions.
    EQUINOX: J2000.0
    EPOCH: J1996.35
    id_col: -1
    ra_col: 0
    dec_col: 1
    # End of parameter definitions.
    ra<tab>dec<tab>V<tab>B_V<tab>U_B
    --<tab>---<tab>-<tab>---<tab>---
    5:09:08.7<tab> -8:45:15<tab> 4.27<tab> -0.19<tab> -0.90
    5:07:50.9<tab> -5:05:11<tab> 2.79<tab> +0.13<tab> +0.10
    5:01:26.3<tab> -7:10:26<tab> 4.81<tab> -0.19<tab> -0.74
    5:17:36.3<tab> -6:50:40<tab> 3.60<tab> -0.11<tab> -0.47
    [EOD]
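The overall shape of a TST file (free text and parameters, a tab-separated heading line, a line of dashes, then data rows up to "[EOD]") can be read with a few lines of Python. This is a simplified sketch written from the example above, not the reader used by TOPCAT; it ignores parameters and comments, and the function name parse_tst is invented. In the example, <tab> stands for a literal tab character.

```python
def parse_tst(text):
    """Parse a Tab-Separated Table into (column_names, rows).

    The column heading is taken to be the line just before the row of
    dashes; data rows follow until '[EOD]' or the end of the input.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        fields = line.split("\t")
        # The separator line consists solely of dashes in every field.
        if i > 0 and all(f.strip() and set(f.strip()) == {"-"} for f in fields):
            names = [n.strip() for n in lines[i - 1].split("\t")]
            break
    else:
        raise ValueError("no dashed separator line found")
    rows = []
    for line in lines[i + 1:]:
        if line.strip() == "[EOD]":
            break
        if line.strip() and not line.startswith("#"):
            rows.append([cell.strip() for cell in line.split("\t")])
    return names, rows


sample = (
    "Simple TST example\n"
    "EQUINOX: J2000.0\n"
    "ra\tdec\tV\n"
    "--\t---\t-\n"
    "5:09:08.7\t -8:45:15\t 4.27\n"
    "[EOD]\n"
)
names, rows = parse_tst(sample)
```

All cell values are returned as strings; deciding which columns are numeric or sexagesimal is left to the caller.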
With appropriate configuration, TOPCAT can be used to examine the results of queries on an SQL-compatible relational database.
Database queries can be specified as a string in the form:
    jdbc:driver-specific-url#sql-query

The exact form is dependent on the driver. Here is an example for MySQL:

    jdbc:mysql://localhost/astro1?user=mbt#SELECT ra, dec FROM swaa WHERE vmag<18

which would get a two-column table (the columns being "ra" and "dec"), constructed from certain rows from the table "swaa" in the database "astro1" on the local host, using the access privileges of user mbt.
Fortunately you don't have to construct this by hand; there is an SQL Query Dialogue to assist in putting it together.
Note that TOPCAT does not view a table in the database directly, but the result of an SQL query on that table. If you want to view the whole table you can use the query
    SELECT * FROM table-name

but be aware that such a query might be expensive on a large table.
Use of SQL queries requires some additional configuration of TOPCAT; see Section 9.3.
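The query string form described in this section is simply a driver URL and an SQL query joined by a '#'. The Python sketch below assembles and splits such a string; it handles text only and does not connect to any database. Splitting at the first '#' is an assumption made here (it presumes the driver-specific URL itself contains no '#'), and the function names are invented.

```python
def jdbc_query_location(driver_url, sql):
    """Join a driver-specific JDBC URL and an SQL query with '#'."""
    if not driver_url.startswith("jdbc:"):
        raise ValueError("expected a URL beginning with 'jdbc:'")
    return driver_url + "#" + sql


def split_jdbc_location(location):
    """Split at the first '#' into (driver_url, sql_query)."""
    driver_url, _, sql = location.partition("#")
    return driver_url, sql


location = jdbc_query_location(
    "jdbc:mysql://localhost/astro1?user=mbt",
    "SELECT ra, dec FROM swaa WHERE vmag<18",
)
parts = split_jdbc_location(location)
```

The resulting location is the same string as the MySQL example given earlier in this section.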
Some support is provided for files produced by the World Data Centre for Solar Terrestrial Physics. The format itself apparently has no name, but files in this format look something like the following:
    Column formats and units - (Fixed format columns which are single space seperated.)
    ------------------------
    Datetime (YYYY mm dd HHMMSS)            %4d %2d %2d %6d      -
                                            %1s
    aa index - 3-HOURLY (Provisional)       %3d                  nT
    2000 01 01 000000  67
    2000 01 01 030000  32
    ...

Support for WDC tables is experimental - it may not be very robust.
This format is not detected automatically - you must specify that
you wish to load a table in wdc format.
Writing out tables from TOPCAT is done using the Save Table Window. In general you have to specify the format in which you want the table to be output by selecting from the Save Window's Table Output Format selector; the following sections describe the possible choices. In some cases there are variants within each format - these are described as well.
The program has no "native" file format, but if you have no particular
preference about which format to save tables to,
FITS is a good choice.
Uncompressed FITS tables do not in most cases have to be read all the
way through (they are 'mapped' into memory), which makes them very
fast to load up.
The FITS format which is written by default
(also known as "FITS-plus") also uses a trick to
store extra metadata, such as table parameters and UCDs
in a way TOPCAT can read in again later (see Section 4.1.2.1).
These files are quite usable as normal FITS tables by other applications,
but they will only be able to see the limited metadata stored in the
FITS headers.
For very large files, in some circumstances column-oriented FITS
("colfits") format can be more efficient for some applications,
though this is unlikely to be understood except by STIL-based code
(TOPCAT and STILTS).
If you want to write to a format which retains all metadata in a portable
format, then one of the VOTable formats (Section 4.1.2.3) might be better.
When saving in FITS format a new file is written consisting of two HDUs (Header+Data Units): a primary one (required by the FITS standard), and a single extension of type BINTABLE containing the table data.
There are two variants of this format:
votable-fits-inline
is hard to process efficiently
(in particular the data cannot easily be mapped into memory) and
votable-fits-href
requires that you keep your data in
two separate files, which can get separated from each other.
If you want to ensure that the metadata are available to other VOTable-aware
programs, you should use one of the normal
VOTable formats.
fits-plus
is being used you just get some hidden benefits.
When saving in column-oriented FITS format a new file is written consisting of two HDUs (Header+Data Units): a primary one (required by the FITS standard) and a single extension of type BINTABLE containing the table data. Unlike normal FITS format however, this table consists of a single row in which each cell holds the data for an entire column. This can be a more efficient format to work with when dealing with very large, and especially very wide, tables. The benefits are greatest when the file size exceeds the amount of available physical memory and operations are required which scan through the table using only a few of the columns (many of TOPCAT's operations, for instance plotting two columns against each other, fit into this category). The overhead for reading and writing this format is somewhat higher than for normal FITS however, and other applications may not be able to work with it (though it is a legal FITS file), so in most cases normal FITS is a more suitable choice.
Like normal (row-oriented) FITS (see Section 4.1.2.1), there are two variants:
When a table is saved to VOTable format, a document conforming to the VOTable 1.0 specification containing a single TABLE element within a single RESOURCE element is written.
There are a number of variants which determine the form in which the table data (DATA element) is written:
Tables can be written using a format which is compatible with the ASCII input format. It writes as plainly as possible, so should stand a good chance of being comprehensible to other programs which require some sort of plain text rendition of a table.
The first line is a comment (starting with a "#" character)
which names the columns, and
an attempt is made to line up data in columns using spaces.
Here is an example of a short table written in this format:
    # index Species  Name   Legs Height Mammal
      1     pig      Bland  4    0.8    true
      2     cow      Daisy  4    2.0    true
      3     goldfish Dobbin 0    0.05   false
      4     ant      ""     6    0.0010 false
      5     ant      ""     6    0.0010 false
      6     human    Mark   2    1.9    true
Tables can be written to a simple text-based format which is designed to be read by humans. No reader exists for this format.
Here is an example of a short table written in this format:
    +-------+----------+--------+------+--------+--------+
    | index | Species  | Name   | Legs | Height | Mammal |
    +-------+----------+--------+------+--------+--------+
    | 1     | pig      | Bland  | 4    | 0.8    | true   |
    | 2     | cow      | Daisy  | 4    | 2.0    | true   |
    | 3     | goldfish | Dobbin | 0    | 0.05   | false  |
    | 4     | ant      |        | 6    | 0.0010 | false  |
    | 5     | ant      |        | 6    | 0.0010 | false  |
    | 6     | human    | Mark   | 2    | 1.9    | true   |
    +-------+----------+--------+------+--------+--------+
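A layout of this kind is straightforward to produce: each column is padded to the width of its longest entry and the cells are joined with "|" characters between "+---+" rules. The Python sketch below reproduces the general style of the example above; it is an illustration written for this purpose, not the TOPCAT output handler, and the function name format_text_table is invented.

```python
def format_text_table(names, rows):
    """Render a table with '+---+' rules and '|' separated cells."""
    cells = [[str(v) for v in row] for row in rows]
    # Each column is as wide as its longest heading or cell.
    widths = [max([len(names[k])] + [len(r[k]) for r in cells])
              for k in range(len(names))]
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(values):
        padded = (v.ljust(w) for v, w in zip(values, widths))
        return "| " + " | ".join(padded) + " |"

    return "\n".join([rule, line(names), rule]
                     + [line(r) for r in cells] + [rule])


text = format_text_table(["index", "Name"], [[1, "Bland"], [2, ""]])
```

Blank cells are simply padded with spaces, which is one reason a layout like this is easy for people to read but awkward to parse back reliably.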
Tables can be written to the semi-standard comma-separated value (CSV) format, described in more detail in Section 4.1.1.6. This can be useful for importing into certain external applications, such as some spreadsheets or databases.
There are two variants:
Tables can be written to TST format, which is described in more detail in Section 4.1.1.7. This can be useful for communicating with some other astronomical tools such as GAIA.
With appropriate configuration, TOPCAT can write out tables as new tables in an SQL-compatible relational database.
For writing, the location is specified as the following URL:
    jdbc:driver-specific-url#new-table-name

The exact form is dependent on the driver. Here is an example for MySQL:

    jdbc:mysql://localhost/astro1?user=mbt#newtab

which would write the current contents of the browser into a new table named "newtab" in the database "astro1" on the local host with the access privileges of user mbt.
Fortunately you do not have to construct this URL by hand; there is an SQL dialogue box to assist in putting it together.
Use of SQL queries requires some additional configuration of TOPCAT; see Section 9.3.
A table can be written out as an HTML 3.2 TABLE element, suitable for use as a web page or insertion into one.
There are two variants:
A table can be written out as a LaTeX tabular environment,
suitable for insertion into a document intended for publication.
There are two variants:
tabular element alone is output;
this will have to be embedded in a larger LaTeX document before use.
tabular within a table within a document is output.
Obviously, this isn't so suitable for very large tables.
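For readers who want to see what a bare tabular environment of this sort contains, the following Python sketch generates one from a list of column names and rows. The exact layout TOPCAT writes is not reproduced here; the column specification, the \hline placement and the small set of escaped characters are all choices made for this illustration, and the function name latex_tabular is invented.

```python
def latex_tabular(names, rows):
    """Return a LaTeX tabular environment for the given table."""
    # Only a few common special characters are escaped in this sketch.
    special = {"&": r"\&", "%": r"\%", "_": r"\_", "#": r"\#", "$": r"\$"}

    def esc(value):
        return "".join(special.get(ch, ch) for ch in str(value))

    lines = [r"\begin{tabular}{|" + "l|" * len(names) + "}", r"\hline"]
    lines.append(" & ".join(esc(n) for n in names) + r" \\")
    lines.append(r"\hline")
    lines += [" & ".join(esc(v) for v in row) + r" \\" for row in rows]
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)


snippet = latex_tabular(["V", "B_V"], [[4.27, -0.19], [2.79, 0.13]])
```

The returned text is the tabular element alone, so it would still have to be embedded in a complete LaTeX document before it could be typeset.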
Mirage is a powerful standalone Java tool developed at Bell Labs for analysis of multidimensional data. It uses its own file format for input. TOPCAT can write tables in the input format which Mirage uses, so that you can prepare tables in TOPCAT and write them out for subsequent use by Mirage.
It is also possible in principle to launch Mirage directly from within TOPCAT, using the Export To Mirage item on the Control Window's File menu; this will cause Mirage to start up viewing the currently selected Apparent Table. In order for this to work the Mirage classes must be on your classpath (see Section 9.2.1) when TOPCAT is run.
There appears to be a bug in Mirage which means this does not always work - sometimes Mirage starts up with no data loaded into it. In this case you will have to save the data to disk in Mirage format, start up Mirage separately, and load the data in using the New Dataset item in Mirage's Console menu.
Note that when Mirage has been launched from TOPCAT, exiting Mirage or closing its window will exit TOPCAT as well.
It is in principle possible to configure TOPCAT to work with table file formats other than the ones listed in this section. It does not require any upgrade of TOPCAT itself, but you have to write or otherwise acquire an input and/or output handler for the table format in question.
The steps that you need to take are:
Set the startable.readers and/or startable.writers system property to the name of the handler classes (see Section 9.2.3)
Explaining how to write such handlers is beyond the scope of this document - see the user document and javadocs for STIL.
In many cases loading and saving tables will be done using GUI dialogues such as the filestore load and save windows, where you just need to click on a filename or directory to indicate the load/save location. However in some cases, for instance specifying tables on the command line (Section 9.1) or typing pathnames directly into the load/save dialogue windows, you may want to give the location of a table for input or output using only a single string.
Most of the time you will just want to type in a filename; either an absolute or relative pathname can be used. However, TOPCAT also supports direct use of URLs, including ones using some specialised protocols. Here is the list of URL types allowed:
http:
ftp:
file:
jar:
myspace:
For example "myspace:/survey/iras_psc.xml"; such URLs can access files in the myspace area that the user is currently logged into. These URLs can be used for both input and output of tables. To use them you must have an AstroGrid account and the AstroGrid WorkBench or similar must be running; if you're not currently logged in a dialogue will pop up to ask you for name and password.
ivo:
For example "ivo://uk.ac.le.star/filemanager#node-2583". These URLs can be used for both input and output of tables. To use them you must have an AstroGrid account and the AstroGrid WorkBench or similar must be running; if you're not currently logged in a dialogue will pop up to ask you for name and password.
jdbc:
As with the GUI-based load dialogues, data compression in any of the supported formats (gzip, bzip2, Unix compress) is detected and dealt with automatically for input locations.
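One common way to recognise these compression formats is to inspect the first few bytes of the stream. The Java sketch below does this using the well-known magic numbers for gzip, bzip2 and Unix compress. It is an assumption that detection works along these lines; the class and method names are hypothetical and the code is not taken from TOPCAT.

```java
public class CompressionSniffer {

    /** Returns "gzip", "bzip2", "compress" or "none" according to the leading bytes. */
    public static String detect(byte[] head) {
        if (head.length >= 2 && (head[0] & 0xff) == 0x1f && (head[1] & 0xff) == 0x8b) {
            return "gzip";       // gzip streams start with 1f 8b
        }
        if (head.length >= 3 && head[0] == 'B' && head[1] == 'Z' && head[2] == 'h') {
            return "bzip2";      // bzip2 streams start with the characters "BZh"
        }
        if (head.length >= 2 && (head[0] & 0xff) == 0x1f && (head[1] & 0xff) == 0x9d) {
            return "compress";   // Unix compress (.Z) streams start with 1f 9d
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(detect(new byte[] { 0x1f, (byte) 0x8b, 8 }));
        System.out.println(detect(new byte[] { 'B', 'Z', 'h', '9' }));
        System.out.println(detect(new byte[] { 'S', 'I', 'M', 'P' }));
    }
}
```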
TOPCAT allows you to join two or more tables together to produce a new one in a variety of ways, and also to identify "similar" rows within a single table according to their cell contents. This section describes the facilities for performing these related operations.
There are two basic ways to join tables together: top-to-bottom and side-by-side. A top-to-bottom join (which here I call concatenation) is fairly straightforward in that it just requires you to decide which columns in one table correspond to which columns in the other. A side-by-side join is more complicated - it is rarely the case that row i in the first table should correspond to row i in the second one, so it is necessary to provide some criteria for deciding which (if any) row in the second table corresponds to a given row in the first. In other words, some sort of matching between rows in different tables needs to take place. This corresponds to what is called a join in database technology. Matching rows within a single table is a useful operation which involves many of the same issues, so that is described here too.
Two tables can be concatenated using the Concatenation Window, which just requires you to specify the two tables to be joined, and for each column in the first ("Base") table, which column in the second ("Appended") table (if any) corresponds to it. The Apparent Table is used in each case. The resulting table, which is added to the list of known tables in the Control Window, has the same columns as the Base table, and a number of rows equal to the sum of the number of rows in the Base and Appended tables.
As a very simple example, concatenating these two tables:
   Messier  RA      Dec     Name
   -------  --      ---     ----
   97       168.63  55.03   Owl Nebula
   101      210.75  54.375  Pinwheel Galaxy
   64       194.13  21.700  Black Eye Galaxy
and
   RA2000  DEC2000  ID
   ------  -------  --
   185.6   58.08    M40
   186.3   18.20    M85
with the assignments RA->RA2000, Dec->DEC2000 and Messier->ID would give:
   Messier  RA      Dec     Name
   -------  --      ---     ----
   97       168.63  55.03   Owl Nebula
   101      210.75  54.375  Pinwheel Galaxy
   64       194.13  21.700  Black Eye Galaxy
   M40      185.6   58.08
   M85      186.3   18.20
Of course it is the user's responsibility to ensure that the correspondence of columns is sensible (that the two corresponding columns mean the same thing).
You can perform a concatenation using the Concatenation Window; obtain this using the Concatenate Tables button in the Control Window.
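The concatenation described in this section can be sketched in a few lines of Java. In this sketch each row is an Object array, and an int array records, for each Base column, the index of the corresponding Appended column (or -1 for none). These representations and all names are assumptions made for illustration; TOPCAT's own implementation is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConcatSketch {

    /**
     * Appends the rows of 'appended' to those of 'base'.
     * colMap[i] is the index of the appended-table column corresponding
     * to base column i, or -1 if there is none (the cell is left blank).
     */
    public static List<Object[]> concatenate(List<Object[]> base,
                                             List<Object[]> appended,
                                             int[] colMap) {
        List<Object[]> out = new ArrayList<>(base);
        for (Object[] row : appended) {
            Object[] newRow = new Object[colMap.length];
            for (int i = 0; i < colMap.length; i++) {
                newRow[i] = colMap[i] >= 0 ? row[colMap[i]] : null;
            }
            out.add(newRow);
        }
        return out;
    }

    public static void main(String[] args) {
        // Base columns: Messier, RA, Dec, Name
        List<Object[]> base = new ArrayList<>();
        base.add(new Object[] { 97, 168.63, 55.03, "Owl Nebula" });
        base.add(new Object[] { 101, 210.75, 54.375, "Pinwheel Galaxy" });
        base.add(new Object[] { 64, 194.13, 21.700, "Black Eye Galaxy" });
        // Appended columns: RA2000, DEC2000, ID
        List<Object[]> appended = new ArrayList<>();
        appended.add(new Object[] { 185.6, 58.08, "M40" });
        appended.add(new Object[] { 186.3, 18.20, "M85" });
        // Messier->ID, RA->RA2000, Dec->DEC2000, Name->(none)
        int[] colMap = { 2, 0, 1, -1 };
        for (Object[] row : concatenate(base, appended, colMap)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

The result has the Base table's columns and a row count equal to the sum of the two inputs, with blanks where the Appended table has no corresponding column.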
When joining two tables side-by-side you need to identify which row(s) in one correspond to which row(s) in the other. Conceptually, this is done by looking at each row in the first table, somehow identifying in the second table which row "refers to the same thing", and putting a new row in the joined table which consists of all the fields of the row in the first table, followed by all the fields of its matched row in the second table. The resulting table then has a number of columns equal to the sum of the number of columns in both input tables.
In practice, there are a number of complications. For one thing, each row in one table may be matched by zero, one or many rows in the other. For another, defining what is meant by "referring to the same thing" may not be straightforward. There is also the problem of actually identifying these matches in a relatively efficient way (without explicitly comparing each row in one table with each row in the other, which would be far too slow for large tables).
A common example is the case of matching two object catalogues - suppose we have the following catalogues:
   Xpos      Ypos      Vmag
   ----      ----      ----
   1134.822  599.247   13.8
   659.68    1046.874  17.2
   909.613   543.293   9.3
and
   x         y         Bmag
   -         -         ----
   909.523   543.800   10.1
   1832.114  409.567   12.3
   1135.201  600.100   14.6
   702.622   1004.972  19.0
and we wish to combine them to create one new catalogue with a row for each object which appears in both tables. To do this, you have to specify what counts as a match - in this case let's say that a row in one table matches (refers to the same object as) a row in the other if the distance between the positions indicated by their X and Y coordinates matches to within one unit (sqrt((Xpos-x)^2 + (Ypos-y)^2) <= 1). Then the catalogue we will end up with is:
   Xpos      Ypos     Vmag  x         y        Bmag
   ----      ----     ----  -         -        ----
   1134.822  599.247  13.8  1135.201  600.100  14.6
   909.613   543.293  9.3   909.523   543.800  10.1
There are a number of variations on this however - your match criteria might involve sky coordinates instead of Cartesian ones (or not be physical coordinates at all), you might want to match more than two tables, you might want to identify groups of matching objects in a single table, you might want the output to include rows which don't match as well...
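To make the match criterion concrete, here is a minimal Java sketch that applies it to the two example catalogues by comparing every row of one with every row of the other. This is the naive approach, which is fine for a handful of rows but is not the algorithm TOPCAT uses for large tables; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PairMatchSketch {

    /** Returns {i, j} index pairs for which a[i] and b[j] lie within maxDist of each other. */
    public static List<int[]> match(double[][] a, double[][] b, double maxDist) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < b.length; j++) {
                double dx = a[i][0] - b[j][0];
                double dy = a[i][1] - b[j][1];
                // sqrt(dx^2 + dy^2) <= maxDist
                if (Math.hypot(dx, dy) <= maxDist) {
                    pairs.add(new int[] { i, j });
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        double[][] first = { { 1134.822, 599.247 }, { 659.68, 1046.874 }, { 909.613, 543.293 } };
        double[][] second = { { 909.523, 543.800 }, { 1832.114, 409.567 },
                              { 1135.201, 600.100 }, { 702.622, 1004.972 } };
        for (int[] p : match(first, second, 1.0)) {
            System.out.println("row " + (p[0] + 1) + " of first table matches row "
                               + (p[1] + 1) + " of second table");
        }
    }
}
```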
The Match Window allows you to specify
To match two tables, use the Pair Match button in the Control Window; to match more tables than two at once, use the other options on the Control Window's Join menu.
Although the effect is rather different, searching through a single table for rows which match each other (refer to the same object, as explained above) is a similar process and requires much of the same information to be specified, mainly, what counts as a match. You can do this using the Internal Match Window, obtained by using the Internal Match button in the Control Window.
This section provides a bit more detail on how the row matching is done. It is designed to give a rough idea to interested parties; it is not a tutorial description from first principles of how it all works.
The basic algorithm for matching is based on dividing up the space of possibly-matching rows into an (indeterminate) number of bins. These bins will typically correspond to disjoint cells of a physical or notional coordinate space, but need not do so. In the first step, each row of each table is assessed to determine which bins might contain matches to it - this will generally be the bin that it falls into and any "adjacent" bins within a distance corresponding to the matching tolerance. A reference to the row is associated with each such bin. In the second step, each bin is examined, and if two or more rows are associated with it every possible pair of rows in the associated set is assessed to see whether it does in fact constitute a matched pair. This will identify all and only those row pairs which are related according to the selected match criteria. During this process a number of optimisations may be applied depending on the details of the data and the requested match.
This means that the matching algorithm is basically an O(N log(N)) process, where N is the total number of rows in all the tables participating in a match. This is good news, since the naive interpretation would be O(N^2). This can break down however if the matching tolerance is such that the number of rows associated with some or most bins gets large, in which case an O(M^2) component can come to dominate, where M is the number of rows per bin. The average number of rows per bin is reported in the logging while a match is proceeding, so you can keep an eye on this.
For more detail on the matching algorithms, see the javadocs for the uk.ac.starlink.table.join package, or contact the author.
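The two-step binning scheme outlined above can be illustrated with a small Java sketch for points in a plane. It assumes square bins whose side equals the matching tolerance, registers each row in its own bin and the eight adjacent ones, and then tests every pair of rows sharing a bin. The bin size, the string bin keys and all names are assumptions made for this illustration; the real implementation in the uk.ac.starlink.table.join package differs in detail and applies further optimisations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class BinMatchSketch {

    /** Returns {i, j} pairs (i < j) of rows in pts lying within tol of each other. */
    public static List<int[]> match(double[][] pts, double tol) {
        // Step 1: associate each row with every bin that might contain a match to it.
        Map<String, List<Integer>> bins = new HashMap<>();
        for (int i = 0; i < pts.length; i++) {
            long cx = (long) Math.floor(pts[i][0] / tol);
            long cy = (long) Math.floor(pts[i][1] / tol);
            for (long bx = cx - 1; bx <= cx + 1; bx++) {
                for (long by = cy - 1; by <= cy + 1; by++) {
                    bins.computeIfAbsent(bx + ":" + by, k -> new ArrayList<>()).add(i);
                }
            }
        }
        // Step 2: within each bin, test every pair; the sorted set removes
        // duplicates, since a close pair usually shares several bins.
        TreeSet<Long> found = new TreeSet<>();
        for (List<Integer> rows : bins.values()) {
            for (int a = 0; a < rows.size(); a++) {
                for (int b = a + 1; b < rows.size(); b++) {
                    int i = rows.get(a);
                    int j = rows.get(b);
                    double d = Math.hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]);
                    if (d <= tol) {
                        found.add((long) i * pts.length + j);
                    }
                }
            }
        }
        List<int[]> pairs = new ArrayList<>();
        for (long key : found) {
            pairs.add(new int[] { (int) (key / pts.length), (int) (key % pts.length) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        double[][] pts = { { 1134.822, 599.247 }, { 659.68, 1046.874 }, { 909.613, 543.293 },
                           { 909.523, 543.800 }, { 1832.114, 409.567 },
                           { 1135.201, 600.100 }, { 702.622, 1004.972 } };
        for (int[] p : match(pts, 1.0)) {
            System.out.println(p[0] + " <-> " + p[1]);
        }
    }
}
```

Registering both rows of a pair in all neighbouring bins is more than strictly necessary, but it keeps the sketch short. The per-bin pair loop is where the O(M^2) cost mentioned above arises when bins become crowded.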
TOPCAT allows you to enter algebraic expressions in a number of contexts:
What you write are actually expressions in the Java language, which are compiled into Java bytecode before evaluation. However, this does not mean that you need to be a Java programmer to write them. The syntax is pretty similar to C, but even if you've never programmed in C most simple things, and some complicated ones, are quite intuitive.
The following explanation gives some guidance and examples for writing these expressions. Unfortunately a complete tutorial on writing Java is beyond the scope of this document, but it should provide enough information for even a novice to write useful expressions.
The expressions that you can write are basically any function of all the column values and subset inclusion flags which apply to a given row; the function result can then define the per-row value of a new column, or the inclusion flag for a new subset, or the action to be performed when a row is activated by clicking on it. If the built-in operators and functions are not sufficient, or it's unwieldy to express your function in one line of code, you can add new functions by writing your own classes - see Section 6.9.
Note: if Java is running in an environment with certain security restrictions (a security manager which does not permit creation of custom class loaders) then algebraic expressions won't work at all, and the buttons which allow you to enter them will be disabled.
To create a useful expression for a cell in a column, you will have to refer to other cells in different columns of the same table row. You can do this in three ways:
By UCD, using an identifier of the form "ucd$<ucd-spec>". Depending on the version of UCD scheme used, UCDs can contain various punctuation marks such as underscores, semicolons and dots; for the purpose of this syntax these should all be represented as underscores ("_"). So to identify a column which has the UCD "phot.mag;em.opt.R", you should use the identifier "ucd$phot_mag_em_opt_r". Matching is not case-sensitive. Furthermore, a trailing underscore acts as a wildcard, so that the above column could also be referenced using the identifier "ucd$phot_mag_". If multiple columns have UCDs which match the given identifier, the first one will be used.
Note that the same syntax can be used for referencing table parameters (see Section 6.3); columns take preference so if a column and a parameter both match the requested UCD, the column value will be used.
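The UCD matching rules given above (punctuation replaced by underscores, case-insensitive comparison, trailing underscore as wildcard, first matching column wins) can be sketched in Java as follows. The class and method names are hypothetical, and treating the wildcard as a prefix match on the text before the final underscore is an assumption; the code is not taken from TOPCAT.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class UcdIdentifier {

    /** Replaces every non-alphanumeric character by '_' and lower-cases the result. */
    public static String normalise(String ucd) {
        return ucd.replaceAll("[^A-Za-z0-9]", "_").toLowerCase(Locale.ROOT);
    }

    /** Tests whether the text following "ucd$" in an identifier refers to the given UCD. */
    public static boolean matches(String spec, String ucd) {
        String s = spec.toLowerCase(Locale.ROOT);
        String u = normalise(ucd);
        if (s.endsWith("_")) {
            // Assumption: the wildcard is a prefix match on the text before the underscore.
            return u.startsWith(s.substring(0, s.length() - 1));
        }
        return u.equals(s);
    }

    /** Returns the index of the first column whose UCD matches, or -1 if none does. */
    public static int firstMatch(String spec, List<String> columnUcds) {
        for (int i = 0; i < columnUcds.size(); i++) {
            String ucd = columnUcds.get(i);
            if (ucd != null && matches(spec, ucd)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> ucds = Arrays.asList("pos.eq.ra", "phot.mag;em.opt.B", "phot.mag;em.opt.R");
        System.out.println(firstMatch("phot_mag_em_opt_r", ucds));
        System.out.println(firstMatch("PHOT_MAG_", ucds));
    }
}
```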
There is a special column whose name is "Index" and whose $ID is "$0". The value of this is the same as the row number in the unsorted table (the grey numbers on the left of the grid in the Data Window), so for the first row in the unsorted table it's 1, for the second it's 2, and so on.
The value of the variables so referenced will be a primitive (boolean, byte, short, char, int, long, float, double) if the column contains one of the corresponding types. Otherwise it will be an Object of the type held by the column, for instance a String. In practice this means: you can write the name of a column, and it will evaluate to the numeric (or string) value that that column contains in each row. You can then use this in normal algebraic expressions such as "B_MAG-U_MAG" as you'd expect.
If you have any Row Subsets defined you can also access the value of the boolean (true/false) flag indicating whether the current row is in each subset. Again there are two ways of doing this:
Note: in early versions of TOPCAT the hash sign ("#") was used instead of the underscore for this purpose; the hash sign no longer has this meaning.
"? :" operator or when combining existing subsets using logical operators to create a new subset.
Some tables have constant values associated with them; these may represent such things as the epoch at which observations were taken, the name of the catalogue, an angular resolution associated with all observations, or any number of other things. Such constants are known as table parameters and can be viewed and modified in the Parameter Window. The values of such parameters can be referenced in algebraic expressions as follows:
The prefix "param$".
The prefix "ucd$". Any punctuation marks in the UCD should be replaced by underscores, and a trailing underscore is interpreted as a wildcard. See Section 6.1 for more discussion.
When no special steps are taken, if a null value (blank cell) is encountered in evaluating an expression (usually because one of the columns it relies on has a null value in the row in question) then the result of the expression is also null.
It is possible to exercise more control than this, but it requires a little bit of care, because the expressions work in terms of primitive values (numeric or boolean ones) which don't in general have a defined null value. The name "null" in expressions gives you the Java null reference, but this cannot be matched against a primitive value or used as the return value of a primitive expression.
For most purposes, the following two tips should enable you to work with null values:
To test whether a column contains a null value, prepend "NULL_" (use upper case) to the column name or $ID. This will yield a boolean value which is true if the column contains a blank or a floating point NaN (not-a-number) value, and false otherwise.
To return a null value from a numeric expression, use the name "NULL" (upper case). To return a null value from a non-numeric expression (e.g. a String column) use the name "null" (lower case).
Null values are often used in conjunction with the conditional operator, "? :"; the expression
   test ? tval : fval
returns the value tval if the boolean expression test evaluates true, or fval if test evaluates false. So for instance the following expression:
   Vmag == -99 ? NULL : Vmag
can be used to define a new column which has the same value as the Vmag column for most values, but if Vmag has the "magic" value -99 the new column will contain a blank. The opposite trick (substituting a blank value with a magic one) can be done like this:
   NULL_Vmag ? -99 : Vmag
Some more examples are given in Section 6.8.
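For readers who think in plain Java, the two tricks above behave roughly like the following sketch. Here a blank cell is modelled as a null java.lang.Double, which is an assumption made for illustration and not a description of how TOPCAT represents blanks internally; the class and method names are hypothetical.

```java
public class NullSketch {

    /** Rough analogue of "Vmag == -99 ? NULL : Vmag": the magic value becomes a blank. */
    public static Double magicToBlank(Double vmag) {
        return (vmag == null || vmag == -99) ? null : vmag;
    }

    /** Rough analogue of "NULL_Vmag ? -99 : Vmag": a blank or NaN becomes the magic value. */
    public static double blankToMagic(Double vmag) {
        return (vmag == null || vmag.isNaN()) ? -99 : vmag;
    }

    public static void main(String[] args) {
        System.out.println(magicToBlank(-99.0));
        System.out.println(magicToBlank(13.8));
        System.out.println(blankToMagic(null));
        System.out.println(blankToMagic(9.3));
    }
}
```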
The operators are pretty much the same as in the C language. The common ones are:
+ (add)
- (subtract)
* (multiply)
/ (divide)
% (modulus)
! (not)
&& (and)
|| (or)
^ (exclusive-or)
== (numeric identity)
!= (numeric non-identity)
< (less than)
> (greater than)
<= (less than or equal)
>= (greater than or equal)
(byte) (numeric -> signed byte)
(short) (numeric -> 2-byte integer)
(int) (numeric -> 4-byte integer)
(long) (numeric -> 8-byte integer)
(float) (numeric -> 4-byte floating point)
(double) (numeric -> 8-byte floating point)
+ (string concatenation)
[] (array dereferencing)
?: (conditional switch)
instanceof (class membership)
Many functions are available for use within your expressions, covering standard mathematical and trigonometric functions, arithmetic utility functions, type conversions, and some more specialised astronomical ones. You can use them in just the way you'd expect, by using the function name (unlike column names, this is case-sensitive) followed by comma-separated arguments in brackets, so
   max(IMAG,JMAG)
will give you the larger of the values in the columns IMAG and JMAG, and so on.
The functions are grouped into the following classes:
Some constants for approximate conversions between different magnitude scales are also provided:
JOHNSON_AB_*, for Johnson <-> AB magnitude conversions (http://www.astro.utoronto.ca/~patton/astro/mags.html, citing Frei and Gunn 1995).
VEGA_AB_*, for Vega <-> AB magnitude conversions (Blanton et al., Astronomical Journal 127, 2562-2578 (2005), eqs.(5)).
yyyy-mm-ddThh:mm:ss.s, where the T is a literal character (a space character may be used instead). Based on UTC.
Therefore midday on the 25th of October 2004 is 2004-10-25T12:00:00 in ISO 8601 format, 53303.5 as an MJD value, 2004.81588 as a Julian Epoch and 2004.81726 as a Besselian Epoch.
Currently this implementation cannot be relied upon to better than a millisecond.
The following parameters are used:
For a flat universe, omegaM+omegaLambda=1
The terms and formulae used here are taken from the paper by D.W.Hogg, Distance measures in cosmology, astro-ph/9905116 v4 (2000).
A listing of the functions in these classes is given in Appendix B.1, and complete documentation on them is available within TOPCAT from the Available Functions Window.
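The time values quoted above for midday on 25 October 2004 can be reproduced with a short Java sketch. It uses the conventional definitions (MJD zero at 1858-11-17T00:00, J2000.0 at MJD 51544.5 with a 365.25-day Julian year, B1900.0 at MJD 15019.81352 with a tropical year of 365.242198781 days). These constants are assumed here from the standard conventions, not taken from the TOPCAT source, and the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;

public class EpochSketch {

    // Modified Julian Date zero point.
    private static final LocalDateTime MJD_ZERO = LocalDateTime.of(1858, 11, 17, 0, 0);

    /** Converts an ISO 8601 string (with 'T' or a space as separator) to MJD. */
    public static double isoToMjd(String iso) {
        LocalDateTime t = LocalDateTime.parse(iso.replace(' ', 'T'));
        return Duration.between(MJD_ZERO, t).toMillis() / 86_400_000.0;
    }

    /** Converts MJD to a Julian Epoch in years. */
    public static double mjdToJulian(double mjd) {
        return 2000.0 + (mjd - 51544.5) / 365.25;
    }

    /** Converts MJD to a Besselian Epoch in years. */
    public static double mjdToBesselian(double mjd) {
        return 1900.0 + (mjd - 15019.81352) / 365.242198781;
    }

    public static void main(String[] args) {
        double mjd = isoToMjd("2004-10-25T12:00:00");
        System.out.println("MJD       = " + mjd);
        System.out.println("Julian    = " + mjdToJulian(mjd));
        System.out.println("Besselian = " + mjdToBesselian(mjd));
    }
}
```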
This note provides a bit more detail for Java programmers on what is going on here; only read on if you want to understand how the use of functions in TOPCAT algebraic expressions relates to normal Java code.
The expressions which you write are compiled to Java bytecode when you enter them (if there is a 'compilation error' it will be reported straight away). The functions listed in the previous subsections are all the public static methods of the classes which are made available by default. The classes listed are all in the packages uk.ac.starlink.ttools.func and uk.ac.starlink.topcat.func (uk.ac.starlink.topcat.func.Strings etc). However, the public static methods are all imported into an anonymous namespace for bytecode compilation, so that you write sqrt(x) and not Maths.sqrt(x). The same happens to other classes that are imported (which can be in any package or none) - their public static methods all go into the anonymous namespace. Thus, method name clashes are a possibility.
This cleverness is all made possible by the rather wonderful JEL.
There is another category of functions which can be used apart from those listed in the previous section. These are called, in Java/object-oriented parlance, "instance methods" and represent functions that can be executed on an object.
It is possible to invoke any of its public instance methods on any object (though not on primitive values - numeric and boolean ones). The syntax is that you place a "." followed by the method invocation after the object you want to invoke the method on, hence NAME.substring(3) instead of substring(NAME,3). If you know what you're doing, feel free to go ahead and do this. However, most of the instance methods you're likely to want to use have equivalents in the normal functions listed in the previous section, so unless you're a Java programmer or feeling adventurous, you are probably best off ignoring this feature.
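The difference between the two styles can be seen in ordinary Java. In the sketch below, a hypothetical static method substring stands in for the function-style call, and is compared with the instance method of java.lang.String; both give the same result for a String value.

```java
public class InstanceMethodSketch {

    /** Function style: a static method taking the string as its first argument. */
    public static String substring(String str, int startIndex) {
        return str.substring(startIndex);
    }

    public static void main(String[] args) {
        String name = "NGC1234";               // hypothetical cell value
        System.out.println(substring(name, 3)); // function style
        System.out.println(name.substring(3));  // instance-method style
    }
}
```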
Here are some general examples. They could be used to define synthetic columns or (where numeric) to define values for one of the axes in a plot.
(first + second) * 0.5
sqrt(variance)
radiansToDegrees(DEC_radians)
degreesToRadians(RA_degrees)
parseInt($12)
parseDouble(ident)
toString(index)
toShort(obs_type)
toDouble(range)
or
(short) obs_type
(double) range
hmsToRadians(RA1950)
dmsToRadians(decDeg,decMin,decSec)
radiansToDms($3)
radiansToHms(RA,2)
min(1000, max(value, 0))
jmag == 9999 ? NULL : jmag
NULL_jmag ? 9999 : jmag
psfCounts[2]
RA > 100 && RA < 120 && Dec > 75 && Dec < 85
$2*$2 + $3*$3 < 1
skyDistance(ra0,dec0,degreesToRadians(RA),degreesToRadians(DEC))<15*ARC_MINUTE
index <= 100
index % 10 == 0
equals(SECTOR, "ZZ9 Plural Z Alpha")
equalsIgnoreCase(SECTOR, "zz9 plural z alpha")
startsWith(SECTOR, "ZZ")
contains(ph_qual, "U")
matches(SECTOR, "[XYZ] Alpha")
(_1 && _2) && ! _3
! NULL_ellipticity
The functions provided by default for use with algebraic expressions, while powerful, may not provide all the operations you need. For this reason, it is possible to write your own extensions to the expression language. In this way you can specify arbitrarily complicated functions. Note however that this will only allow you to define new columns or subsets where each cell is a function only of the other cells in the same row - there is no way to define a value in one row as a function of values in other rows.
In order to do this, you have to write and compile a (probably short) program in the Java language. A full discussion of how to go about this is beyond the scope of this document, so if you are new to Java and/or programming you may need to find a friendly local programmer to assist (or mail the author). The following explanation is aimed at Java programmers, but may not be incomprehensible to non-specialists.
The steps you need to follow are:
Name the classes in the jel.classes or jel.classes.activation system properties (colon-separated if there are several) as described in Section 9.2.3, or during a run using Add Class in the Available Function Window.
Any public static methods defined in the classes thus specified will be available for use in the Synthetic Column, Algebraic Subset or (in the case of activation functions only) Activation Window windows. They should be defined to take and return the relevant primitive or Object types for the function required (in the case of activation functions the return value should normally be a short log string).
For example, a class written as follows would define a three-value average:
   public class AuxFuncs {
       public static double average3( double x, double y, double z ) {
           return ( x + y + z ) / 3.0;
       }
   }
and the expression "average3($1,$2,$3)" could then be used to define a new synthetic column, giving the average of the first three existing columns.
Exactly how you would build this is dependent on your system, but it might involve doing something like the following:
   javac AuxFuncs.java
   topcat -Djel.classes=AuxFuncs -classpath .
As well as seeing the overview of table data provided by a plot or statistics summary, it is often necessary to focus on a particular row of the table, which according to the nature of the table may represent an astronomical object, an event or some other entity. In the Data Window a table row is simply a row of the displayed JTable, and in a plot it corresponds to one plotted point.
If you click on a plotted point in one of the graphics windows, or on a row in the Data Window, the corresponding table row will be activated. When a row is activated, three things happen:
The third one can be more complicated. By default, no activation action is set, so nothing else happens, and this may very well be what you want. However, by clicking on the Activation Action selector in the Control Window you can bring up the Activation Window which enables you to choose an additional action to take place. There are various options here and various ways to achieve them (see Appendix A.7.4 for more details) but the kinds of actions which are envisaged are to display one or more images or spectra relating to the row you have identified. One of the options available for instance retrieves a postage-stamp image of a few arcminutes around the sky position defined by the row from a SuperCOSMOS all-sky image survey and pops it up in a viewer window. So for instance having spotted an interesting point in a plot of a galaxy catalogue you can click on it, and immediately see a picture to identify its morphological type.
The exact actions you want to perform may be closely tailored to the data you have, for instance you may have a set of spectra on disk named by object ID. It's impossible to cater for such possibilities with a set of pre-packaged options, so you are able to define your own custom actions here. This is done by writing an expression using the syntax described in Section 6. A number of special functions (described in the following subsection) are provided to do things like display an image or a spectrum in a browser (given its filename or URL), or access data from certain data servers on the web, but there is nothing to stop the adventurous plugging in their own external programs so in principle you can configure pretty much anything to happen on the basis of the values in the row that you have activated.
When defining custom activation actions in the Activation Window, you enter an expression to be invoked on a row when it is activated. This expression uses the syntax defined in Section 6 and can make use of the functions listed in Appendix B.1. However in this case there is an additional list of functions you can use which cause something to happen (for instance displaying an image) rather than just returning a value. The following classes of functions are available:
ImageWindow). Supported image formats include GIF, JPEG, PNG and FITS, which may be compressed.
A listing of the functions in these classes is given in Appendix B.2, and complete documentation on them is available within TOPCAT from the Available Functions Window.
TOPCAT makes use of a tool interoperability protocol called PLASTIC (PLatform for AStronomical InterConnection). This can be used to exchange messages between TOPCAT and other PLASTIC-aware tools such as Aladin, VisIVO and STILTS. The messages which are relevant to TOPCAT are things like "load this table" or "highlight this set of rows".
The communication works by all tools communicating with a central 'hub' process, so a hub must be running in order for the messaging to operate. If a hub is running when TOPCAT starts, it will connect to it automatically, listening for messages sent by other tools. If not, you can start a hub from the Interop menu in the Control Window or start one externally. Other tools will have their own policy for connecting to the hub, but in general it's best to start a hub first before starting up the tools which you want to talk to it.
The interoperability has two aspects to it: on the one hand TOPCAT can send messages to other applications which cause them to do things, and on the other hand TOPCAT can receive and act on such messages sent by other applications. These are described separately in the subsections below. There is also a section on the Interop menu in the Control Window used to control hub operations.
The PLASTIC protocol itself is still under development and may undergo changes in the future. It is therefore possible that there may be version compatibility issues which arise between TOPCAT and other PLASTIC-compliant tools and services, so these facilities should probably be regarded as experimental at present.
You can view and control operations relating to the PLASTIC hub from the Interop menu in the Control Window. It contains the following options:
kill command). Because this has some system-dependent features, it's not guaranteed to work, especially in non-Unix environments.
The Interop menu may be replaced by a PLASTIC window in a future release.
This section describes the messages which TOPCAT can transmit to other tools which understand the PLASTIC protocol, and how to cause them to be sent.
In most cases you can choose two ways to transmit a PLASTIC message within TOPCAT:
Below is a list of places you can cause TOPCAT to transmit PLASTIC messages. The PLASTIC message IDs are listed along with the descriptions; unless you are a tool developer you can probably ignore these.
Message ID: ivo://votech.org/votable/load or ivo://votech.org/votable/loadFromURL
Message ID: ivo://votech.org/votable/showObjects
Message ID: ivo://votech.org/votable/highlightObject
Message ID: ivo://votech.org/sky/pointAtCoords
Message ID: ivo://votech.org/fits/image/loadFromURL
This section describes the messages which TOPCAT will respond to when it receives them from other applications via the PLASTIC hub. The PLASTIC message IDs are listed along with the descriptions; unless you are a tool developer you can probably ignore these.
Message ID: ivo://votech.org/votable/load or ivo://votech.org/votable/loadFromURL
Note: this behaviour differs from the behaviour in TOPCAT versions prior to v3.0.
Message ID: ivo://votech.org/votable/showObjects
Message ID: ivo://votech.org/votable/highlightObject
The following system-level messages are also responded to:
ivo://votech.org/test/echo
ivo://votech.org/info/getName
ivo://votech.org/info/getIconURL
ivo://votech.org/hub/event/ApplicationRegistered
ivo://votech.org/hub/event/ApplicationUnregistered
ivo://votech.org/hub/event/HubStopping
Starting up TOPCAT may just be a case of typing "topcat" or clicking on an appropriate icon and watching the Control Window pop up. If that is the case, and it's running happily for you, you can probably ignore this section. What follows is a description of how to start the program up, and various command line arguments and configuration options which can't be changed from within the program. Some examples are given in Section 9.5. Actually obtaining the program is not covered here; please see the TOPCAT web page http://www.starlink.ac.uk/topcat/.
There are various ways of starting up TOPCAT depending on how (and whether) it has been installed on your system; some of these are described below.
There may be some sort of short-cut icon on your desktop which starts up the program - in this case just clicking on it will probably work. Failing that you may be able to locate the jar file (probably named topcat.jar, topcat-full.jar or topcat-lite.jar) and click on that. These files would be located in the java/lib/topcat/ directory in a standard Starjava installation. Note that when you start by clicking on something you may not have the option of entering any of the command line options described below - to use these options, which you may need to do for serious use of the program, you will have to run the program from the command line.
Alternatively you will have to invoke the program from the command line. On Unix-like operating systems, you can use the topcat script. If you have the full starjava installation, this is probably in the starjava/java/bin directory. If you have one of the standalone jar files (topcat-full.jar or topcat-lite.jar), it should be in the same directory as it/them. If it's not there, you can unpack it from the jar file itself, using a command like unzip topcat-lite.jar topcat. If that directory (and java) is on your path then you can write:
   topcat [java-args] [topcat-args]
In this case any arguments which start -D or -X are assumed to be arguments to the java command, any arguments which start -J are stripped of the -J and then passed as arguments to the java command, a -classpath path defines a class path to be used in addition to the TOPCAT classes, and any remaining arguments are used by TOPCAT.
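The argument-sorting rules just described can be expressed as a short Java sketch. The real topcat launcher is a script, so this is only an illustration of the rules as stated in the text; the class, field and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class LauncherArgs {

    public final List<String> javaArgs = new ArrayList<>();
    public final List<String> topcatArgs = new ArrayList<>();
    /** Additional class path given with -classpath, or null if none was given. */
    public String extraClasspath;

    /** Sorts command-line arguments into java arguments, class path and TOPCAT arguments. */
    public static LauncherArgs split(String[] args) {
        LauncherArgs out = new LauncherArgs();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            if (a.equals("-classpath") && i + 1 < args.length) {
                out.extraClasspath = args[++i];   // used in addition to the TOPCAT classes
            } else if (a.startsWith("-D") || a.startsWith("-X")) {
                out.javaArgs.add(a);              // passed to java unchanged
            } else if (a.startsWith("-J")) {
                out.javaArgs.add(a.substring(2)); // -J stripped, remainder passed to java
            } else {
                out.topcatArgs.add(a);            // everything else is for TOPCAT
            }
        }
        return out;
    }

    public static void main(String[] args) {
        LauncherArgs r = split(new String[] {
            "-Xmx256M", "-J-verbose:gc", "-classpath", "lib.jar", "-demo", "t.fits" });
        System.out.println("java args:   " + r.javaArgs);
        System.out.println("class path:  " + r.extraClasspath);
        System.out.println("topcat args: " + r.topcatArgs);
    }
}
```

The file names lib.jar and t.fits in the usage example are placeholders.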
If you're not running Unix then to start from the command line you will have to use the java command itself. The most straightforward way of doing this will look like:
   java [java-args] -jar path/to/topcat.jar [topcat-args]
(or the same for topcat-full.jar etc). However NOTE: using java's -jar flag ignores any other class path information, such as the CLASSPATH environment variable or java's -classpath flag - see Section 9.2.1.
Note that Java Web Start can also be used to invoke the program without requiring any prior download/installation - sorry, this isn't documented properly here yet.
The meanings of the optional
[topcat-args]
and
[java-args]
sequences are described in
Section 9.1 and
Section 9.2 below respectively.
You can start TOPCAT from the command line with no arguments - in this case it will just pop up the command window from which you can load in tables. However you may specify flags and/or table locations and formats.
If you invoke the program with the "-help
" flag you
will see the following usage message:
    Usage: topcat <flags> [[-f <format>] <table> ...]

    General flags:
       -help          print this message and exit
       -version       print component versions etc and exit
       -verbose       increase verbosity of reports to console
       -demo          start with demo data
       -disk          use disk backing store for large tables
       -hub           run internal PLASTIC hub
       -exthub        run external PLASTIC hub
       -[no]plastic   do [not] connect to running PLASTIC hub
       -[no]soap      do [not] start SOAP services
       -noserv        don't run any services
       -stilts <args> run STILTS not TOPCAT

    Optional load dialogue flags:
       -tree          hierarchy browser
       -file          basic file browser
       -sql           SQL query on relational database
       -cone          cone search dialogue
       -gavo          GAVO Millennium run database query
       -registry      VO registry query
       -siap          Simple Image Access Protocol queries

    Useful Java flags:
       -classpath jar1:jar2..  specify additional classes
       -XmxnnnM                use nnn megabytes of memory
       -Dname=value            set system property

    Auto-detected formats:
       fits-plus, colfits-plus, colfits-basic, fits, votable

    All known formats:
       fits-plus, colfits-plus, colfits-basic, fits, votable, ascii, csv, tst, ipac, wdc

    Useful system properties (-Dname=value - lists are colon-separated):
       java.io.tmpdir          temporary filespace directory
       jdbc.drivers            JDBC driver classes
       jel.classes             custom algebraic function classes
       jel.classes.activation  custom action function classes
       star.connectors         custom remote filestore classes
       startable.load.dialogs  custom load dialogue classes
       startable.readers       custom table input handlers
       startable.writers       custom table output handlers
       startable.storage       default storage policy
       mark.workaround         work around mark/reset bug
       myspace.cache           MySpace performance workaround

The meaning of the flags is as follows:
The -f
flag specifies what format the named files are in.
Any table file on the command line following a
-f <format>
sequence must be in the named format until the next -f
flag.
The names of both the auto-detected formats (ones which don't need
a -f
) and the non-auto-detected formats (ones which do)
are given in the usage message you can see by giving the
-help
flag (this message is shown above).
You may also use the classname of a class on the classpath which
implements the TableBuilder
interface -
see SUN/252.
If the -help (or -h) flag is given, TOPCAT will write a usage
message and exit straight away.
If the -version flag is given, TOPCAT will print
a summary of its version and the versions and availability of some
of its components, and exit straight away.
The -demo flag causes the program to start up with
a few demonstration tables loaded in. You can use these to play
around with its facilities. Note these demo tables are quite small
to avoid taking up a lot of space in the installation, and don't
contain particularly sensible data; they are just to give an idea.
If the -disk flag is given then the program will use
disk backing storage for caching table data that is read in, rather
than keeping it in memory. This means that tables much larger than
the heap memory assigned to Java can be used. It may lead to slower
processing, but usually the performance is not greatly reduced.
If you find TOPCAT running out of memory (you see
OutOfMemoryError
s popping up in windows or on the console)
then re-running with the -disk
flag is a good idea.
The temporary data files are written in the default temporary
directory (defined by the java.io.tmpdir system property,
often /tmp) and deleted when the program exits, unless
it exits in an unusual way.
Note however that uncompressed FITS binary tables on disk are not
read into memory in any case (they are mapped)
so the -disk
flag may not make much difference with FITS.
The -verbose (or -v) flag increases
the level of verbosity of messages which TOPCAT writes to standard
output (the console).
It may be repeated to increase the verbosity further.
The messages it controls are currently those written through
java's standard logging system - see the description of the
Log Window for more
information about this.
topcat -stilts -help
)
for the form of the <stilts-args>
.
Some of the flags control what load dialogues are visible in the Load Window. In fact all of these load dialogues can be accessed from the Load Window's DataSources menu as long as the classes are available, but if you specify these flags on the command line, the corresponding button will appear in the main part of the window, making the option more obvious. The load dialogue flags are:
startable.load.dialogs
system property (see Section 9.2.3).
Other arguments on the command line are taken to be the locations of tables. Any tables so specified will be loaded into TOPCAT at startup. These locations are typically filenames, but could also be URLs or SQL queries, or perhaps something else. In addition they may contain "fragment identifiers" (with a "#") to locate a table within a given resource, so that for instance the location
    /my/data/cat1.fits#2
means the second extension in the multi-extension FITS file
/my/data/cat1.fits.
Section 4.2 describes in more detail the
kinds of URLs which can be used here.
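To make the fragment identifier convention concrete, here is a small shell sketch that separates a location into its resource and fragment parts. It is an illustration only: split_location is a made-up helper, and splitting at the first "#" is an assumption about the convention rather than a description of TOPCAT's own parser.

```shell
# Hypothetical helper: split a table location at the first "#".
# Everything before it is the resource, everything after it the fragment.
split_location() {
    loc=$1
    case "$loc" in
        *'#'*) echo "resource=${loc%%\#*} fragment=${loc#*\#}" ;;
        *)     echo "resource=$loc fragment=" ;;
    esac
}

split_location '/my/data/cat1.fits#2'
# prints: resource=/my/data/cat1.fits fragment=2
```

Quoting the location on the command line is a sensible habit, since "#" and other characters in URLs or SQL queries can be special to the shell.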
Note that options to Java itself may also be specified on the command-line, as described in the next section.
As described above, depending on how you invoke TOPCAT you may be able to specify arguments to Java itself (the "Java Virtual Machine") which affect how it runs. These may be defined on the command line or in some other way. The following subsections describe how to control Java in ways which may be relevant to TOPCAT; they are however somewhat dependent on the environment you are running in, so you may experience OS-dependent variations.
The classpath is the list of places that Java looks to find the bits of compiled code that it uses to run an application. When running TOPCAT this always has to include the TOPCAT classes themselves - this is part of the usual invocation and is described in Section 9. However, for certain things Java might need to find some other classes, in particular for:
If you are going to use these facilities you will need to start the
program with additional class path elements that point to the location
of the classes required. How you do this depends on how you
are invoking TOPCAT.
If you are using the topcat
startup script, you can write:
    topcat -classpath other-paths ...
(this adds the given paths to the standard ones required for TOPCAT itself). If you are invoking java directly, then you can either write on the command line:
    java -classpath path/to/topcat.jar:other-paths uk.ac.starlink.topcat.Driver ...
or set the CLASSPATH environment variable something like this:
    setenv CLASSPATH path/to/topcat.jar:other-paths
In any case, multiple (extra) paths should be separated by colons in the other-paths string.
Note that if you are running TOPCAT
using java's -jar
flag,
any attempt you make to specify the classpath will be ignored!
This is to do with Java's security model.
If you need to specify a classpath which includes more than the
TOPCAT classes themselves, you can't use java -jar
(use
java -classpath topcat-lite.jar:... uk.ac.starlink.topcat.Driver
instead).
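If you have several extra jar files, assembling the colon-separated class path by hand gets tedious. The following sketch joins them for you and prints the resulting command instead of running it, so you can check it first. The function name topcat_java_cmd and the jar paths are placeholders, not part of TOPCAT.

```shell
# Hypothetical helper: build a "java -classpath ..." command for TOPCAT.
# First argument is the TOPCAT jar; any further arguments are extra jars.
# The command is printed, not executed.
topcat_java_cmd() {
    cp=$1
    shift
    for extra in "$@"; do
        cp="$cp:$extra"      # paths are separated by colons
    done
    echo "java -classpath $cp uk.ac.starlink.topcat.Driver"
}

topcat_java_cmd path/to/topcat-lite.jar my/funcdir/funcs.jar my/jdbcdir/pg73jdbc3.jar
```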
If TOPCAT fails during operation with a message that says something
about a java.lang.OutOfMemoryError
, then your heap
size is too small for what you are trying to do. You will have to
run java with a bigger heap size using the -Xmx
flag.
Invoking TOPCAT from the topcat
script you would write
something like:
    topcat -Xmx256M ...
or using java directly:
    java -Xmx256M ...
which means use up to 256 megabytes of memory (don't forget the "M" for megabyte). JVMs typically default to a heap size of 64M. You probably don't want to specify a heap size larger than the physical memory of the machine that you are running on.
There are other types of memory and tuning options controlled
using options of the form -X<something-or-other>
;
if you're feeling adventurous you may be able to find out about these
by typing "java -X
".
Note however: using the -disk
flag
described in Section 9.1 may be a better solution; this
makes the program store data from large tables on disk rather than in memory.
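When choosing a value for -Xmx it can help to cap the request at the machine's physical memory, as advised above. The helper below is a hypothetical convenience (heap_flag is not part of TOPCAT). Both figures are supplied by the caller in megabytes, so nothing is assumed about how the physical memory size is obtained on your system.

```shell
# Hypothetical helper: print an -Xmx flag for the requested heap size,
# capped at the physical memory figure given as the second argument.
# Both arguments are whole numbers of megabytes.
heap_flag() {
    want=$1
    phys=$2
    if [ "$want" -gt "$phys" ]; then
        want=$phys
    fi
    echo "-Xmx${want}M"
}

heap_flag 256 2048     # prints -Xmx256M
heap_flag 4096 2048    # prints -Xmx2048M
```

The output can be passed straight to the startup script, for instance as the first argument of topcat.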
System properties are a way of getting information into Java (they are the Java equivalent of environment variables). The following ones have special significance within TOPCAT:
apple.laf.useScreenMenuBar
true
for TOPCAT,
so menus mostly appear at the top of the screen (though it's not
true to say that TOPCAT obeys the Mac look and feel completely);
if you prefer the more Java-like look and feel, set it to
false
.
There are bugs with this feature in Apple's Java 1.4 JRE, so it's
set false by default in that case, but you can try setting it true
at your own risk if you like.
java.io.tmpdir
-disk
flag has been
specified (see Section 9.1).
jdbc.drivers
jel.classes
jel.classes.activation
lut.files
    1.0 1.0 0.0
    1.0 0.0 1.0
would give a colour map that fades from yellow to magenta. Any number of samples may be given; the scale is interpolated.
mark.workaround
mark()
/reset()
methods of some java
InputStream
classes. These are rather common,
including in Sun's J2SE system libraries.
Use this if you are seeing errors that say something like
"Resetting to invalid mark
".
Currently defaults to "false".
myspace.cache
star.connectors
uk.ac.starlink.connect.Connector
interface which
specifies how you can log on to such a service and provides a
hierarchical view of the filespace it contains.
startable.load.dialogs
uk.ac.starlink.table.gui.TableLoadDialog
interface and
naming them in this property.
See STIL
documentation for more detail.
startable.readers
startable.storage
disk
" has basically the same effect as
supplying the "-disk
" argument on the TOPCAT command line
(see Section 9.1).
Other possible values are "memory
" (the default),
"sideways
" and "discard
".
startable.writers
user.dir
votable.strict
true
for strict enforcement of the VOTable standard
when parsing VOTables. This prevents the parser from working round
certain common errors, such as missing arraysize
attributes on FIELD/PARAM elements with datatype="char"
.
False by default.
To define these properties on the command line
you use the -D
flag, which has the form
    -D<property-name>=<value>
If you're using the TOPCAT startup script, you can write something like:
    topcat -Djdbc.drivers=org.postgresql.Driver ...
or if you're using the
java
command directly:
    java -Djdbc.drivers=org.postgresql.Driver ...
Alternatively you may find it more convenient to
write these definitions in a file named
.starjava.properties
in your home directory; the above
command-line flag would be equivalent to inserting the line:
    jdbc.drivers=org.postgresql.Driver
in your
.starjava.properties
file.
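The correspondence between -D flags and properties file lines is mechanical: drop the leading -D and keep the rest. As a sketch, the made-up function below does this for any -Dname=value arguments it is given and ignores everything else.

```shell
# Hypothetical helper: turn -Dname=value flags into name=value lines
# suitable for a properties file. Other arguments are ignored.
d_flags_to_properties() {
    for arg in "$@"; do
        case "$arg" in
            -D*=*) echo "${arg#-D}" ;;
        esac
    done
}

d_flags_to_properties -Djdbc.drivers=org.postgresql.Driver -Dstartable.storage=disk
# prints:
#   jdbc.drivers=org.postgresql.Driver
#   startable.storage=disk
```

Redirecting the output with >> onto the end of .starjava.properties in your home directory would make the settings permanent; check the file afterwards for duplicate entries.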
This section describes additional configuration which must be done to allow TOPCAT to access SQL-compatible relational databases for reading (see Section 4.1.1.8) or writing (see Section 4.1.2.8) tables. If you don't need to talk to SQL-type databases, you can ignore the rest of this section. The steps described here are the standard ones for configuring JDBC (which sort-of stands for Java Database Connectivity), described in more detail on Sun's JDBC web page.
To use TOPCAT with SQL-compatible databases you must:
jdbc.drivers
system property to the name of the
driver class as described in Section 9.2.3
These steps are all standard for use of the JDBC system.
To the author's knowledge, TOPCAT has so far successfully been used with the following RDBMSs and corresponding JDBC drivers:
    jdbc:oracle:thin:@//hostname:1521/database#SELECT ...
for querying an existing database (loading) and
    jdbc:oracle:thin:@//hostname:1521/database#new-table-name
for writing a new table (saving).
Here are example command lines to start up TOPCAT using databases known to work.
    java -classpath topcat-full.jar:pg73jdbc3.jar \
         -Djdbc.drivers=org.postgresql.Driver \
         uk.ac.starlink.topcat.Driver

    java -classpath topcat-full.jar:mysql-connector-java-3.0.8-bin.jar \
         -Djdbc.drivers=com.mysql.jdbc.Driver \
         uk.ac.starlink.topcat.Driver

    java -classpath topcat-full.jar:ojdbc14.jar \
         -Djdbc.drivers=oracle.jdbc.driver.OracleDriver \
         uk.ac.starlink.topcat.Driver

    java -classpath topcat-full.jar:jtds-1.1.jar \
         -Djdbc.drivers=net.sourceforge.jtds.jdbc.Driver \
         uk.ac.starlink.topcat.Driver
Considerable effort has gone into making TOPCAT capable of dealing with large datasets. In particular it does not in general have to read entire files into memory in order to do its work, so it's not restricted to using files which fit into the java virtual machine's 'heap memory' or into the physical memory of the machine. As a rule of thumb, the program will work with tables up to about a million rows at a reasonable speed; the number of columns is less of an issue (though see below concerning performance). However, the way you invoke the program affects how well it can cope with large tables; you may in some circumstances get a message that TOPCAT has run out of memory (either a popup or a terse "OutOfMemoryError" report on the console), and there are some things you can do about this:
-disk
flag on the command line,
you can tell it to use temporary files for storing this data instead.
This makes it much less likely to run out of memory.
You can achieve the same effect by adding the line
startable.storage=disk
in the
.starjava.properties
in your home directory.
This is the best thing to do if you're seeing an out of memory
error when loading tables.
See Section 9.1, Section 9.2.3.
-Xmx
flag, followed by the maximum heap memory,
for instance "topcat -Xmx256M
" or
"java -Xmx256M -jar topcat-full.jar
".
Don't forget the "M
" to indicate megabytes.
It's generally reasonable to increase this value up to nearly the
amount of free physical memory in your machine if you need to
(taking account of the needs of other processes running at the same time)
but attempting any more will usually result in abysmal performance.
See Section 9.2.2.
-disk
or
-Xmx
flags as above.
It is also possible to use column-oriented storage for non-FITS
files by specifying the flag -Dstartable.storage=sideways
.
This is like using the -disk
flag but uses column-oriented
rather than row-oriented temporary files. However, using it for
such large files means that the conversion is likely to be rather
slow, so you may be better off converting the original file to
colfits
format in a separate step and using that.
As far as performance goes, the memory size of the machine you're using does make a difference. If the size of the dataset you're dealing with (this is the size of the FITS HDU if it's in FITS format but may be greater or less than the file size for other formats) will fit into unused physical memory then in general everything will run very quickly because the operating system can cache the data in memory; if it's larger than physical memory then the data has to keep being re-read from disk and most operations will be much slower, though use of column-oriented storage can help a lot in that case.
Here are some examples of invoking TOPCAT from the command line.
In each case two forms are shown: one using the topcat
script, and one using the jar file directly. In the latter case,
the java
command is assumed to be on your path, and
the jar file itself, assumed in directory my/tcdir
,
might be named topcat.jar
,
topcat-full.jar
, or something else, but the form
of the command is the same.
    topcat
    java -jar topcat.jar

    topcat -h
    java -jar topcat.jar -h

    topcat testcat.fits
    java -jar my/tcdir/topcat.jar testcat.fits

    topcat t1.fits -f ascii t2.txt t3.txt -f votable t4.xml
    java -jar my/tcdir/topcat.jar t1.fits -f ascii t2.txt t3.txt -f votable t4.xml

    topcat -Xmx256M -disk
    java -Xmx256M -jar my/tcdir/topcat.jar -disk

    topcat -classpath my/funcdir/funcs.jar -Djel.classes=my.ExtraFuncs t1.fits
    java -classpath my/tcdir/topcat.jar:my/funcdir/funcs.jar \
         -Djel.classes=my.ExtraFuncs \
         uk.ac.starlink.topcat.Driver t1.fits

    topcat -classpath my/jdbcdir/pg73jdbc3.jar -Djdbc.drivers=org.postgresql.Driver
    java -classpath my/tcdir/topcat.jar:my/jdbcdir/pg73jdbc3.jar \
         -Djdbc.drivers=org.postgresql.Driver uk.ac.starlink.topcat.Driver

    topcat -classpath my/driverdir/drivers.jar \
           -Dstartable.readers=my.MyTableBuilder \
           -Dstartable.writers=my.MyTableWriter
    java -classpath my/tcdir/topcat.jar:my/driverdir/drivers.jar \
         -Dstartable.readers=my.MyTableBuilder \
         -Dstartable.writers=my.MyTableWriter \
         uk.ac.starlink.topcat.Driver
-Dx=y
definitions can be avoided by putting equivalent
x=y
lines into the .starjava.properties
in your
home directory.
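The scope of the -f flag in an example such as "t1.fits -f ascii t2.txt t3.txt -f votable t4.xml" can be traced with a few lines of shell. This is a sketch for illustration only: trace_formats is a made-up name, it handles only -f and table arguments, and the word "auto" simply stands for reliance on automatic format detection.

```shell
# Hypothetical tracer, not part of TOPCAT: prints each table argument
# followed by the format that applies to it.
trace_formats() {
    fmt=auto                      # no -f seen yet: rely on auto-detection
    while [ $# -gt 0 ]; do
        case "$1" in
            -f)
                # the next argument names the format for later tables
                if [ $# -ge 2 ]; then
                    fmt=$2
                    shift
                fi ;;
            *)
                echo "$1 $fmt" ;;
        esac
        shift
    done
}

trace_formats t1.fits -f ascii t2.txt t3.txt -f votable t4.xml
# prints:
#   t1.fits auto
#   t2.txt ascii
#   t3.txt ascii
#   t4.xml votable
```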
This appendix gives a tour of all the windows that form the TOPCAT application, explaining the anatomy of the windows and the various tools, menus and other controls. Attributes common to many or all windows are described in Appendix A.1, and the subsequent sections describe each of the windows individually.
When the application is running, the Help For Window
toolbar button will display the appropriate description
for the window on which it is invoked.
This section describes some features which are common to many or all of the windows used in the TOPCAT program.
Each window has a toolbar at the top containing various buttons representing actions that can be invoked from the window. Most of them contain the following buttons:
Buttons in the toolbar often appear in menus of the same window as well; you can identify them because they have the same icon. This is a convenience; invoking the action from the toolbar or from the menu will have the same effect.
Often an action will only be possible in certain circumstances, for instance if some rows in the associated JTable have been selected. If the action is not possible (i.e. it would make no sense to invoke it) then the button in the toolbar and the menu option will be greyed out, indicating that it cannot be invoked in the current state.
Most windows have a menu bar at the top containing one or more menus. These menus will usually provide the actions available from the toolbar (identifiable because they have the same icons), and may provide some other less-commonly-required actions too.
Here are some of the menus common to several windows:
An example JTable
Many of the windows, including the Data Window, display their data in a Java widget called a JTable. This displays a grid of values, with headings for each column, in a window which you can scroll around. Although JTables are used for a number of different things (for instance, showing the table data themselves in the Data Window and showing the column metadata in the Columns Window), the fact that the same widget is used provides a common look and feel.
Here are some of the things you can do with a JTable:
In some cases where a JTable is displayed, there will be a menu on the menu bar named Display. This permits you to select which columns are visible and which are hidden. Clicking on the menu will give you a list of all the available columns in the table, each with a checkbox by it; click the box to toggle whether the column is displayed or not.
The Control Window
The Control Window is the main window from which all of TOPCAT's activities are controlled. It lists the known tables, summarises their characteristics, and allows you to open other windows for more specialised tasks. When TOPCAT starts up you will see this window - it may or may not have some tables loaded into it according to how you invoked the program.
The window consists of two main parts: the Table List panel on the left, and the Current Table Properties panel on the right. Tables loaded into TOPCAT are shown in the Table List, each identified by an index number which never changes for a given table, and a label which is initially set from its location, but can be changed for convenience.
One of the tables in the list is highlighted, which means it is the currently selected table; you can change the selection by clicking on an item in the list. Information about the selected table is shown in the properties panel on the right. This shows such things as the number of rows and columns, current sort order, current row subset selection and so on. It contains some controls which allow you to change these properties. Additionally, many of the buttons in the toolbar relate to the currently selected table.
The Table List, Current Table Properties panel, and actions available from the Control Window's toolbar and menus are described in the following subsections.
The Table List panel on the left of the Control Window is pretty straightforward - it lists all the tables currently known to TOPCAT. If a new table is introduced by loading it from the Load Window or as a result of some action such as table joining then its name and number will appear in this list. The currently selected table is highlighted - clicking on a different table name (or using the arrow keys if the list has keyboard focus) will change the selection. The properties of the selected table are displayed in the Current Table Properties panel to its right, and a number of the toolbar buttons and menu items refer to it.
If you double-click on a table in the list, or press Return while it is selected, that table's Data Window will appear.
Certain other applications (Treeview or even another instance of TOPCAT) can interoperate with TOPCAT using drag-and-drop, and for these the table list is a good place to drag/drop tables. For instance you can drag a table node off of the main panel of Treeview and drop it onto the table list of TOPCAT, and you will be able to use that table as if it had been loaded from disk. You can also paste the filename or URL of a table onto the table list, and it will be loaded.
The Current Table Properties panel on the right hand side of the Control Window contains a number of controls which relate to the currently selected table and its Apparent properties; they will be blank if no table is selected. Here is what the individual items mean:
There are a number of buttons on the Control Window's toolbar; these are the most convenient way of invoking most of TOPCAT's functions. They are grouped as follows:
This section describes actions available from the Control Window menus additional to those also available from the toolbar (described in the previous section) and those common to other windows (described in Appendix A.1.2).
The File menu contains the following additional actions:
The Views menu contains actions for launching the windows which give certain views of the table metadata. These are all provided as toolbar buttons as well.
The Graphics menu contains actions for launching the windows which give various plotting and visualisation options. These are all provided as toolbar buttons as well.
The Interop menu contains options concerned with interoperating with other tools using the PLASTIC protocol. It is described in Section 8.1.
The Windows menu contains actions for controlling which table view windows are currently visible on the screen. If you have lots of tables and are using various different views of several of them, the number of windows on the screen can get out of hand and it's easy to lose track of what window is where. The actions on this menu do some combination of either hiding or revealing all the various view windows associated with either the selected table or all the other ones. Windows hidden are removed from the screen but if reactivated (e.g. by using the appropriate toolbar button) will come back in the same place and the same state. Revealing all the windows associated with a given table means showing all the view windows which have been opened before (it won't display windows which have never explicitly been opened).
The Joins menu, as well as containing the actions for table concatenation, internal matching and pair matching which are available from the toolbar, also gives you the option to join three or four tables at once by matching rows. The multi-table match windows work pretty much the same as the Pair Matching Window, but with more tables.
Many of the windows you will see within TOPCAT display information about a single table. There are several of these, each displaying a different aspect of the table data - cell contents, statistics, column metadata etc. There is one of each type for each of the tables currently loaded, though they won't necessarily all be displayed at once. The title bar of these windows will say something like TOPCAT(3): Table Columns, which indicates that it is displaying information about the column metadata for the table labelled "3:" in the Control Window.
To open any of these windows, select the table of interest in the Control Window and click the appropriate toolbar button (or the equivalent item in the Table Views menu). This will either open up a new window of the sort you have requested, or if you have opened it before, will make sure it's visible.
If you have lots of tables and are using various different views of several of them, the number of windows on the screen can get out of hand and it's easy to lose track of what window is where. In this case the Control Window's Windows menu (described in Appendix A.2.4), or the File|Control Window menu item in any of the view windows can be handy to keep them under control.
The following sections describe each of these table view windows in turn.
Data Window
The Data Window presents a JTable
containing the actual cells of the
Apparent Table.
You can display it using the Table Data
button when the chosen table is selected in the
Control Window's Table List.
You can scroll around the table in the usual way. In most cases you can edit cells by double-clicking in them, though some cells (e.g. ones containing arrays rather than scalars) cannot currently be edited. If it looks like an edit has taken place, it has.
There is a grey column of numbers on the left of the JTable which gives the row index of each row. This is the value of the special Index column, which numbers each row of the original (not apparent) table starting at 1. If the table has been sorted these numbers may not be in order.
Note that reordering the columns by dragging their headings around will change the order of columns in the table's Column Set and hence the Apparent Table.
If you have a table with very many columns it can be difficult to scroll the display sideways so that a column you are interested in is in view. In this case, you can go to the Columns Window and click on the description of the column you are after in the display there. This will have the effect of scrolling the Data Window sideways so that your selected column is visible in the centre of the display here.
The following buttons are available in the toolbar:
As well as the normal menu, right-clicking over one of the columns in the displayed table will present a Column Popup Menu, which provides a convenient way to do some things with the column in question:
.*XYZ.*
" to find all rows which contain
the string "XYZ".
float[]
representing magnitudes in 5 different bands,
selecting this option will hide PMAG and insert 5 new
Float
-type columns PMAG_1...PMAG_5 in its place,
each containing one of the magnitudes.
Parameters Window
The Parameters Window displays metadata which applies to the whole table
(rather than that for each column).
You can display it using the Table Parameters
button when the chosen table is selected in the
Control Window's Table List.
In table/database parlance, an item of per-table metadata is often known as a "parameter" of the table. At least the number of rows and columns will be listed. Some table file formats (for instance VOTable and FITS) have provision for storing other table parameters, while others (for instance CSV) do not. In the latter case there may not be much of interest displayed in this window.
The top part of the display is a JTable with one row for each parameter. It indicates the parameter's name, its value, the type of item it is (integer, string etc) and other items of interest such as units, dimensionality or UCD if they are defined. If a column of the table has no entries (for instance, the Units column might be empty because none of the parameters has had units defined for it) then that column may be absent from the display - in this case the Display menu can be used to reveal it.
You can edit some parameter values and descriptions by double-clicking on them as usual.
The bottom part of the display gives an expanded view of a selected parameter (click on a row in the top part to select one). This is especially useful if the parameter value is too long to show fully in the table display. In most cases you can edit the fields here to change the value and other characteristics of a parameter.
The following items are available in the toolbar:
Columns Window
The Columns Window displays a JTable
giving all the information (metadata)
known about each column in the table.
You can display it using the Column Info
button when the chosen table is selected in the
Control Window's Table List.
The display may take a little bit of getting used to, since each column in the main data table is represented by a row in the JTable displayed here. The order and widths of the columns of the JTable widget can be changed in the same way as those for the Data Window JTable, but this has no effect on the data.
The leftmost column, labelled "Visible", contains a checkbox in
each row (one for each column of the data table).
Initially, these are all ticked.
By clicking on those boxes, you can toggle them between ticked and
unticked. When unticked, the column in question will become hidden.
The row can still be seen in this window, but the corresponding data
column is no longer a part of
the Apparent Table, so will not be seen
in the Data Window or appear in
exported versions of the table.
You can tick/untick multiple columns at once by highlighting a set of
rows by dragging the mouse over them and then using the
Hide Selected or Reveal Selected
toolbar buttons or menu items.
If you want to hide or reveal all the columns in the table, use the
Hide All or Reveal All buttons.
Each column in the displayed JTable corresponds to one piece of information for each of the columns in the data table - column name, description, UCD etc. Tables of different types (e.g. ones read from different input formats) can have different categories of metadata. By default a metadata category is displayed in this JTable if at least one table column has a non-blank value for that metadata category, so for instance if no table columns have a defined UCD then the UCD column will not appear. Categories can be made to appear and disappear however by using the Display menu. The metadata items are as follows:
You can edit column names and some other entries in this JTable by double-clicking on them as usual.
The order in which the rows are presented is determined by the table's current Column Set, so can be changed by dragging the column headers around in the Data Window.
The following buttons are available in the toolbar:
float[]
representing magnitudes in 5 different bands,
then selecting it and hitting this button will hide PMAG and
insert 5 new Float
-type columns PMAG_1...PMAG_5
in its place each containing one of the magnitudes.
Several of these actions operate on the currently selected column or columns. You can select columns by clicking on the corresponding row in the displayed JTable as usual. A side effect of selecting a single column is that the table view in the Data Window will be scrolled sideways so that the selected column is visible in (approximately) the middle of the screen. This can be a boon if you are dealing with a table that contains a large number of columns.
Subsets Window
The Subsets Window displays the
Row Subsets
which have been defined.
You can display it using the Row Subsets
button when the chosen table is selected in the
Control Window's Table List.
The subsets are displayed in a JTable widget with a row for each subset. The columns of the JTable are as follows:
Note: in previous versions of TOPCAT the hash sign ("#") was used instead of the underscore for this purpose; the hash sign no longer has this meaning.
Entries in the Name and Expression columns can be edited by double-clicking on them in the normal way.
The following toolbar buttons are available in this window:
The following additional menu items are available:
Statistics Window
The Statistics Window shows statistics for the values in each
of the table's columns.
You can display it using the Column Statistics
button when the chosen table is selected in the
Control Window's Table List.
The calculated values are displayed in a JTable widget with a row for each column in the main table, and a column for each of a number of statistical quantities calculated on some or all of the values in the data table column corresponding to that grid row. The following columns are shown by default:
In addition, some quantile values can be calculated on demand (by selecting their values in the Display menu, as for the previous list). The available values are:
The quantities displayed in this window are not necessarily those for the entire table; they are those for a particular Row Subset. At the bottom of the window is the Subset For Calculations selector, which allows you to choose which subset you want the calculations to be done for. By clicking on this you can calculate the statistics for different subsets. When the window is first opened, or when it is invoked from a menu or the toolbar in the Control Window, the subset will correspond to the current row subset.
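The idea of calculating statistics for one row subset only can be sketched in Python as follows. This is an illustration under stated assumptions, not the code TOPCAT runs: the function name `column_stats`, the use of a list of booleans to represent a subset, and the particular quantities computed (count of non-blank values, mean, population standard deviation, median) are all chosen for the example.

```python
import math
import statistics


def column_stats(values, subset):
    """Summarise the non-blank values of one column for the rows in `subset`.

    `values` is a list of numbers (None or NaN marks a blank cell) and
    `subset` is a list of booleans of the same length, True for included rows.
    """
    good = [
        v for v, included in zip(values, subset)
        if included and v is not None and not math.isnan(v)
    ]
    if not good:
        return {"n_good": 0}
    return {
        "n_good": len(good),
        "mean": statistics.fmean(good),
        "sd": statistics.pstdev(good),
        "median": statistics.median(good),
    }


mags = [10.0, 12.0, None, 14.0, float("nan"), 20.0]
bright = [True, True, True, True, True, False]  # hypothetical row subset
stats = column_stats(mags, bright)
```

Choosing a different subset just means passing a different list of flags; the blank cells and the excluded last row do not contribute to the result.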
The toolbar contains the following extra buttons:
For a large table the calculations may take a little while. While they are being performed you can interact with the window as normal, but a progress bar is shown at the bottom of the window. If you initiate a new calculation (by pushing the Recalculate button or selecting a new subset) or close the window during a calculation, the superseded calculation will be stopped.
TOPCAT has a number of windows for performing data visualisation of various kinds. These share various characteristics which are described in the first subsection below; the specific windows themselves are described in the later subsections.
These visualisation windows are fairly sophisticated, and the plots can be exported to vector (EPS) or image (GIF, JPEG, PNG) files for later presentation (see Appendix A.4.1.7). However, at least at present, TOPCAT does not claim to be a full end-to-end system for generating publication quality graphics, and hence lacks facilities for detailed configuration of axis labelling, font control, data annotation and so on. You may well find that you can use it to generate publication quality graphics, but if you need features which are not currently provided you may find it best to use TOPCAT to investigate your data and decide exactly what data you want to present, and then export the data in a form which can be used by a more output-oriented package.
The various types of graphics windows have different characteristics to fulfil their different functions, but they share a common way of doing things. Each window contains a number of controls including toolbar buttons, menu items, column selectors and others. In general any change that you make to any of the controls will cause the plot to be redrawn to take account of that change. If this requires a read or re-read of the data, a progress bar at the bottom of the window may show this progressing - except for very large tables it is usually pretty fast.
Each of the graphics windows is displayed by clicking its toolbar button in the Control Window. If one of the tables in the list is selected at the time (the Current Table) the new plot window will initially be displayed showing data from some of its columns (generally the first few numeric columns) by way of illustration. You will usually want to change the controls so it displays the quantities you are interested in.
The following subsections describe some of the features which work the same for most or all of the graphics windows.
All the graphics windows provide one or more axes on which to plot - the histogram has 1, the 2d scatter and density plots have 2, the 3d scatter plot has 3 and the spherical plot has 2 or 3. In each case you select one or more datasets to plot on these axes, and select what plotting style to use for each set. A dataset is typically a number of columns from a table (the number matching the dimensionality of the plot) and a selection of row subsets associated with that table. You select this and the plotting style(s) using the panel at the bottom of each plot window. Here is the dataset selector for the 2d scatter plot:
Default dataset selector from 2d scatter plot window
The different parts of this control work as follows:
The Axis selectors (here X Axis and Y Axis) give the quantities to be plotted. If you click the little down arrow at the right of each selector you get a list of all the numeric columns in the chosen table, from which you can select one. If you click the little left and right arrows to the right of the selector it will cycle through all the columns in the table. However, if you prefer you can type in an expression to use here. This may be more convenient if there's a very long list of columns (another way to deal with this is to hide most of the columns using the Column Window). However, what you type in doesn't have to be a column name, it can be an algebraic expression based on columns in the table, or even a constant. In the example, the X axis is a straight column name, and the Y axis is an expression. The expression language syntax is explained in Section 6.
The Log checkbox for each axis is used to select whether the scale should be logarithmic or linear.
The Flip checkbox for each axis is used to select whether the axis values increase in the conventional direction (left to right, bottom to top) or its opposite.
Some of the buttons in the toolbar shown will modify what is visible in this panel, for instance inserting new selectors to allow selection of error values. All the selectors work in the same way however.
The buttons to the right of each subset name show the symbol that is used in the plot to display the data from that subset, in this case a red cross and a blue circle. These are selected automatically when the subset is first selected for viewing (the initial default style set depends mainly on how many rows there are in the selected table - many rows gives small dots, few gives big ones). However, you have a lot of freedom to configure how they appear. If you click the button with the symbol on it a dialogue will pop up which allows you to select colour, shape, transparency and so on, as well as error bar style if appropriate and things like whether fitted lines will be plotted for that subset. The options available differ according to the kind of plot, and are described along with the different graphics windows in the following subsections. The style window stays visible until you dismiss it, but if you click on another of the buttons in the Row Subsets panel its contents will change to allow you to select values for the corresponding subset. Most graphics windows have a Marker Style menu. This allows you to change all the styles in the plot at once to be members of a uniform set, for instance different coloured pixels, or black open shapes. If you select one of these it will overwrite the existing style selections, but they can be edited individually afterwards.
The Add Dataset and Remove Dataset
buttons in the toolbar add a new tab or remove the selected one
respectively.
Initially only the Main tab is present, and this one cannot be removed.
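To make the effect of the Log and Flip checkboxes described above concrete, here is a minimal Python sketch of how a data value might be turned into a position along an axis. The function name `axis_fraction` and the convention of returning a fraction between 0 and 1 are assumptions for the example, not a description of TOPCAT's internals.

```python
import math


def axis_fraction(value, lo, hi, log=False, flip=False):
    """Return the position of `value` along an axis as a fraction in [0, 1].

    0 is the conventional start of the axis (left or bottom) and 1 the end.
    Returns None when the value cannot be placed (non-positive on a log axis).
    """
    if log:
        if value <= 0 or lo <= 0 or hi <= 0:
            return None
        value, lo, hi = math.log10(value), math.log10(lo), math.log10(hi)
    frac = (value - lo) / (hi - lo)
    return 1.0 - frac if flip else frac
```

For example, on a logarithmic axis running from 1 to 100 the value 10 falls halfway along, and flipping an axis simply mirrors every position.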
Sometimes (high-dimensional plots, auxiliary axes, error bars) a lot of information needs to be entered into the data panel, and the bottom part of the window can get quite large. Normally, the plot in the upper part of the window shrinks to accommodate it. You can of course resize the window to gain more space, but if your screen is small you may still end up with an uncomfortably small plot. If this happens, you can use the following button from the main toolbar:
In general terms the axes on which the graphics are plotted are defined by the datasets you have selected. The axis labels are set from the column names or expressions chosen for the Main dataset, and the ranges are determined so that all the data from the chosen datasets can be seen. However, these things can be adjusted manually.
The following features are available directly from the window for configuring axis range:
For more control over axis range and labelling, use the
Configure Axes toolbar button,
which will pop up a dialogue like the following:
Axis Configuration Dialogue for 2-d axes
You can fill in these values as follows:
TOPCAT provides quite flexible graphical representation of symmetric or asymmetric errors in 1, 2 and 3 dimensions. The plots with error bar support are the 2D, 3D and spherical scatter plots and the stacked lines plot.
By default, error bar drawing is switched off. The simplest way to activate it is to use the relevant error bar button(s) in the data selector tool bar (the one below the plot). For the Cartesian (2D, 3D, lines) plots, some or all of the following buttons are present:
Here is a 2D plot in which symmetric X and asymmetric Y errors are being used:
Plot window with symmetric X and asymmetric Y errors
You can see that with the error column selector, the panel has become too wide for the window so a scrollbar has appeared at the bottom - you can scroll this left and right or enlarge the window to see the parts that you need to.
For the spherical plot the following error toggle buttons are present:
If you want to use asymmetric or one-sided errors, use the options in the Errors menu instead of the toolbar buttons. For instance the options for X axis error bars in the 2D scatter plot are:
There are many options for the plotting style of one, two and three dimensional error bars, including capped and uncapped bars, crosshairs, ellipses and rectangles. This plotting style is controlled from the plot window's Style Editor window (see e.g. Appendix A.4.3.1), which can be viewed by clicking on the marker icon in the Row Subsets panel at the bottom right of the window. The available error bar styles will depend on which axes currently have errors; if none do, then the error bar selector will be disabled. You can also use the Error Style menu to change the error style for all the visible datasets at once.
On the 2-d and 3-d scatter plots you can write text labels adjacent
to plotted points. To do this click the Draw Labels
button in the dataset toolbar (below the plotting area
in the plot window). This will reveal a new Point Labels
selector below the existing spatial ones.
Using this you can select any of the table columns (not just the
numeric ones as for the other selectors), or give a string or
numeric expression involving them. When this selector is filled
in, every point in the dataset which has a non-blank value for
this quantity will have it written next to the point on the display.
Point Labelling for Messier objects in the spherical plot
In this example the NAME column has been selected, so that each point plotted (in this case all the Messier objects) is labelled with its name. As you can see, where many labels are plotted near to each other they can get in each other's way. In some cases TOPCAT will omit plotting labels in crowded regions, in others not - but in any case if you have labels too tightly grouped they are unlikely to be legible.
TOPCAT can plot data in one, two or three spatial dimensions, but sometimes the data which you need to visualise is of higher dimensionality. For this purpose, some of the plotting windows (2D and 3D scatter plots) allow you to control the colouring of plotted points according to values from one or more additional columns (or calculated expressions), which gives you more visual information about the data you are examining.
To use this facility, click the Add auxiliary axis
button in the dataset toolbar (below the plot area
in a plot window).
A new axis selector will appear below the existing spatial ones,
labelled Aux 1 Axis. It has log and flip checkboxes
like the spatial axes, and to the right (you may need to widen the
window or use the scrollbar at the bottom to see it) is a selector depicting a
number of colourmaps to choose from - the default one resembling a
rainbow is usually quite suitable, but you can pick others.
If you enter a column name or expression into the selector, each
plotted point will be coloured according to the value of that quantity
in the corresponding row of data. If that quantity is null for a row,
the corresponding point will not be plotted.
A scale on the right of the plot indicates how the colour map
corresponds to numeric values.
To remove the auxiliary axis and go back to normally-coloured points,
simply click the Remove auxiliary axis
button.
3D plot of simulation data showing X, Y, Z spatial position with the auxiliary axis indicating timestep.
There are two types of colour maps you can choose from: colour fixing and colour modifying. The fixing ones are easiest to understand: the original colour of the point (as drawn in the legend) is ignored, and it is coloured according to the relevant value on the selected auxiliary axis. The colour modifying maps take the original colour and affect it somehow, for instance by changing its transparency or its blue component. These are marked with an asterisk ("*") in the colour map selector. They can be used to convey more information but are often harder to interpret visually - for one thing the shading of the colour bar in the legend will not correspond exactly to the colours of the plotted points.
By using modifying colour maps it is possible to perform plots with more than one auxiliary axis - typically the first one will be a fixing map and subsequent ones will be modifying. So the first auxiliary axis could have the (fixing) Rainbow map, and the second could have the (modifying) Transparency map. The colour alterations are applied in order. It is possible, but pointless, to have multiple fixing maps applied to the same points - the last-numbered one will determine the colour and earlier ones will get ignored. Multiple aux axes can be obtained by clicking the Add auxiliary axis button more than once. When combining several maps some thought has to be given to which ones to use - some good combinations are the three RGB ones or the three YUV ones.
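The difference between fixing and modifying maps, and the way several are applied in order, can be sketched in Python like this. The two map functions are simplified stand-ins invented for the example (a blue-to-red ramp in place of Rainbow, and an alpha-only map in place of Transparency), and the direction in which transparency varies is an assumption.

```python
def rainbow_like(colour, frac):
    """Colour-fixing map (hypothetical): ignores the original RGB colour."""
    # Simple blue-to-red ramp standing in for a real rainbow colour map.
    return (frac, 0.0, 1.0 - frac, colour[3])


def transparency(colour, frac):
    """Colour-modifying map (hypothetical): keeps RGB, alters only alpha."""
    r, g, b, _ = colour
    return (r, g, b, 1.0 - frac)


def shade(colour, aux_maps, aux_fracs):
    """Apply each auxiliary-axis map in order; a null value hides the point."""
    for cmap, frac in zip(aux_maps, aux_fracs):
        if frac is None:
            return None  # point is not plotted
        colour = cmap(colour, frac)
    return colour


red = (1.0, 0.0, 0.0, 1.0)  # RGBA, components in [0, 1]
shaded = shade(red, [rainbow_like, transparency], [0.25, 0.5])
```

Here each `frac` is the auxiliary value already scaled to the range 0 to 1. Passing two fixing maps shows why that combination is pointless: the second one overwrites whatever the first produced.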
A fairly wide range of colour maps of both kinds is provided by default. If these do not suit your needs, it is possible to provide your own custom colour fixing maps using the lut.files system property - see Section 9.2.3.
It is easy to generate attractive screenshots using auxiliary axes. Making visual sense of the results is a different matter. One visualisation expert tried to dissuade their introduction in TOPCAT on the grounds that the graphics they produce are too hard for humans to interpret - I hope that these plots can assist with some analysis, but it is a somewhat experimental feature which may or may not end up being widely useful. The maximum number of auxiliary axes which can be used together is currently three. This could be increased on request, but if you feel you can generate an intelligible plot using more than this then you're considerably smarter than me.
When quantities are plotted in one of the graphics windows it becomes easy to see groupings of the data which might not otherwise be apparent; a cluster of (X,Y) points representing a group of rows may correspond to a physically meaningful grouping of objects which you would like to treat separately elsewhere in the program, for instance by calculating statistics on just these rows, writing them out to a new table, or plotting them in a different colour on graphs with different coordinates. This is easily accomplished by creating a new Row Subset containing the grouped points, and the graphics windows provide ways of doing this.
In some of the plots
(Histogram,
2d Scatter plot,
Density map and
Spherical plot)
you can set the axis ranges (either manually or by zooming with the
mouse - see Appendix A.4.1.2)
so that only the points you want to identify are visible,
and then click the
New Subset From Visible toolbar button
(the icon differs depending on the plot type).
This defines a subset consisting of all the
points that are visible on the current plot.
This is only useful if the group you are interested in
corresponds to a rectangular region in the plotting space.
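In effect, a subset defined this way is a test of each point against the current axis ranges. A minimal Python sketch, assuming a 2d plot and a hypothetical helper name `visible_subset`:

```python
def visible_subset(xs, ys, x_range, y_range):
    """Return one boolean per row: True if the point lies inside both ranges."""
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    flags = []
    for x, y in zip(xs, ys):
        if x is None or y is None:
            flags.append(False)  # blank coordinates are never plotted
        else:
            flags.append(x_lo <= x <= x_hi and y_lo <= y <= y_hi)
    return flags


flags = visible_subset([0, 1, 2, None], [0, 1, 5, 1], (0.5, 2.5), (0, 2))
```

Only the second row ends up in the subset here: the first is outside the X range, the third is outside the Y range and the fourth has a blank X value.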
A more flexible way is to draw a region or regions
on the plot which identify the points you are interested in.
To do this, hit the
Draw Subset Region
toolbar button. Having done this, you can drag the mouse around
on the plot (keep the left mouse button down while you move)
to encircle the points that you're interested in.
As you do so, a translucent grey blob will be left behind -
anything inside the
blob will end up in the subset. You can draw one or many blobs,
which may be overlapping or not. If you make a mistake while
drawing a sequence of blobs, you can click the right mouse button,
and the most recently added blob will disappear.
When you're in this region-drawing mode,
you can't zoom or resize the window or change the characteristics
of the plot, and the Draw Subset Region button
appears with a tick over it to remind you
you're in it. Here's what the plot looks like while you're drawing:
Region-Drawing Mode
When you're happy with the region you've defined, click the
toolbar button again.
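The logic of region drawing can be sketched in Python as below. This is an illustration only: it assumes each blob can be approximated as a closed polygon of mouse positions and uses a standard even-odd ray-casting test, which may not be how TOPCAT represents its blobs. The names `inside` and `region_subset` are made up for the example.

```python
def inside(polygon, x, y):
    """Even-odd ray-casting test: is (x, y) inside the closed polygon?"""
    hit = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        # Count edges that a horizontal ray from (x, y) to +infinity crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def region_subset(points, blobs):
    """Flag each point that falls inside at least one drawn blob."""
    return [any(inside(blob, x, y) for blob in blobs) for x, y in points]


blobs = []
blobs.append([(0, 0), (4, 0), (4, 4), (0, 4)])    # first dragged outline
blobs.append([(10, 10), (12, 10), (11, 12)])      # second dragged outline
blobs.append([(-5, -5), (-4, -5), (-4, -4)])      # drawn by mistake...
blobs.pop()                                       # ...right click removes it

points = [(2, 2), (11, 11), (-4.5, -4.8), (6, 6)]
flags = region_subset(points, blobs)
```

The first two points fall inside the remaining blobs and so join the subset; the third would have been caught by the blob that was removed, and the fourth was never enclosed.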
In either case, when you have indicated that you want to define a new row subset, a dialogue box will pop up to ask you its name. As described in Section 3.1.1, it's a good idea to use a name which is just composed of letters, numbers and underscores. You can optionally select a subset name which has been used before from the list, which will overwrite the former contents of that subset. When you enter a name and hit the OK button, the new subset will be created and the points in it will be shown straight away on the plot using a new symbol. As usual, you can toggle whether the points in this subset are displayed using the Row Subsets box at the bottom of the Plot Window.
All the graphics windows have the following export options in the toolbar:
Exporting to the pixel-based formats (GIF, JPEG, PNG) is fairly straightforward: each pixel on the screen appears as one pixel in the output file. PNG is the most capable, but it is not supported by all image viewers. GIF works well in most cases, but if there are more than 255 colours some of the colour resolution will be lost. JPEG can preserve a wide range of colours, but does not support transparency and is lossy, so close inspection of image features will reveal blurring.
When exporting to Encapsulated PostScript (EPS), which is a vector graphics format, there are a few things to consider:
Histogram Window
The histogram window lets you plot histograms of one or more
columns or derived quantities.
You can display it using the Histogram
button in the Control Window's toolbar.
You select the quantity or quantities to plot using the dataset selector at the bottom of the window. You can configure the axes, including zooming in and out, with the mouse (drag on the plot or the axes) or manually as described in Appendix A.4.1.2.
The Bin Placement box below the main plot controls where the bars are drawn. Select the horizontal range of each bar using the Width entry box - either type in the value you want or use the tiny up/down arrows at the right to increase/decrease the bin size. The Offset checkbox on the left determines where the zero point on the horizontal axis falls in relation to the bins; if the box is checked then zero (or one for logarithmic X axis) is in the centre of a bin, and if it's unchecked then zero (or one) is on a bin boundary.
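For a linear X axis, the relationship between Width, Offset and the bin boundaries can be written down as a short Python sketch. The function name `bin_bounds` is hypothetical, and the logarithmic case (where one rather than zero is the reference point) is not covered.

```python
import math


def bin_bounds(value, width, offset=False):
    """Return (low, high) edges of the bin containing `value` on a linear axis.

    offset=False puts zero on a bin boundary; offset=True puts zero at the
    centre of a bin, i.e. shifts every boundary by half a bin width.
    """
    shift = 0.5 * width if offset else 0.0
    index = math.floor((value + shift) / width)
    low = index * width - shift
    return (low, low + width)
```

With a width of 1, the value 0.3 falls in the bin from 0 to 1 when Offset is unchecked, and in the bin from -0.5 to 0.5 when it is checked.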
The following buttons are available on the toolbar:
The Dataset Toolbar contains the following options:
When weighted, bars can be of negative height. An anomaly of the plot as currently implemented is that the Y axis never descends below zero, so any such bars are currently invisible. This may be amended in a future release (contact the author to vote for such an amendment).
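To see how a negative bar can arise, consider this small Python sketch of a weighted histogram, in which each bar's height is the sum of the weights of the values in its bin rather than a simple count. The function name and the sample data are invented for illustration.

```python
import math
from collections import defaultdict


def weighted_histogram(values, weights, width):
    """Sum the weight of every value falling in each bin (zero on a boundary).

    Returns a dict mapping the low edge of each occupied bin to its bar height.
    """
    heights = defaultdict(float)
    for value, weight in zip(values, weights):
        low_edge = math.floor(value / width) * width
        heights[low_edge] += weight
    return dict(heights)


values = [0.2, 0.7, 1.1, 1.9]
weights = [1.0, 2.0, -4.0, 1.0]   # hypothetical weights, one of them negative
bars = weighted_histogram(values, weights, width=1.0)
```

The bin starting at 0 has height 3, but the bin starting at 1 has height -3, which is the kind of bar that is currently not visible in the plot.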
The Export menu contains additional image export options and the following options specific to the histogram:
You have considerable freedom to configure how the bars are drawn; controlling this is described in the following subsection.
The bins in a histogram can be represented in many different ways. A representation of how a bar will be displayed is shown on a button to the right of the name of each visible subset, at the bottom right of the histogram window. If you click this button the following dialogue will pop up which enables you to change the appearance.
Style editor dialogue for histogram bars
The Legend box defines how the selected set will
be identified in the legend which appears alongside the plot
(though the legend will only be visible if Show Legend
is on):
The Bars panel describes the form of the bars to be plotted for each data set.
Any changes you make in this window are reflected in the plot straight away. If you click the OK button at the bottom, the window will disappear and the changes remain. If you click Cancel the window will disappear and any changes you made will be discarded.
You can also change all the plotting styles at once by using the Bar Style menu in the histogram window. Here you can select a standard group of styles (e.g. all filled adjacent bars with different colours) for the plotted sets.
Plot Window
The plot window allows you to do 2-dimensional scatter plots of
one or more pairs of table columns (or derived quantities).
You can display it using the Plot button
in the Control Window's toolbar.
On the plotting surface a marker is plotted for each row in the selected dataset(s) at a position determined by the values in the table columns selected to provide the X and Y values. A marker will only be plotted if both the X and Y values are not blank. Select the quantities to plot and the plotting symbols with the dataset selector at the bottom. You can configure the axes, including zooming in and out, with the mouse (drag on the plot or the axes) or manually as described in Appendix A.4.1.2.
Clicking on any of the plotted points will activate it - see Section 7.
The following buttons are available on the toolbar:
The Dataset Toolbar contains the following options:
You have considerable freedom to configure how the points are plotted including the shape, colour and transparency of symbols, the type of lines which join them if any, and the representation of error bars if active. These options are described in the following subsection.
When plotting points in a scatter plot there are many different ways that each point can be displayed. By default, TOPCAT chooses a set of markers on the basis of how many points there are in the table and uses a different one for each plotted set. The marker for each set is displayed in a button to the right of its name in the dataset selector panel at the bottom of the plot window. If you click this button the following dialogue will pop up which enables you to change the appearance.
Style editor dialogue for 2d scatter plot
The Legend box defines how the selected set will
be identified in the legend which appears alongside the plot
(though the legend will only be visible if Show Legend
is on):
The Marker box defines how the markers plotted for each data point will appear:
The Line box determines if any lines are drawn associated with the current set and if so what their appearance will be.
Note that for both the plotted line and the quoted coefficients the data is taken only from the points which are currently visible - that means that if you've zoomed the axes to exclude some of the data points, they will not be contributing to the calculated statistics.
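The consequence of using only the visible points can be demonstrated with a short Python sketch. It assumes an ordinary least-squares straight-line fit, which is a guess at the kind of fit involved rather than something stated here, and the name `fit_visible` is hypothetical.

```python
def fit_visible(xs, ys, x_range, y_range):
    """Least-squares line y = m*x + c through the points inside both ranges.

    Returns (m, c, n_used). Points outside the visible ranges are ignored.
    """
    (x_lo, x_hi), (y_lo, y_hi) = x_range, y_range
    pts = [
        (x, y) for x, y in zip(xs, ys)
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    ]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two visible points to fit a line")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    if sxx == 0:
        raise ValueError("visible points share one X value; slope undefined")
    m = sxy / sxx
    return m, mean_y - m * mean_x, n


xs = [0.0, 1.0, 2.0, 3.0, 10.0]
ys = [1.0, 3.0, 5.0, 7.0, 0.0]     # last point is an outlier
m_all, c_all, n_all = fit_visible(xs, ys, (-1, 11), (-1, 8))
m_zoom, c_zoom, n_zoom = fit_visible(xs, ys, (-1, 4), (-1, 8))
```

Zooming in so that the outlier at X=10 is off the plot changes the fit from a negative slope to the line y = 2x + 1 through the four remaining points.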
Any changes you make in this window are reflected in the plot straight away. If you click the OK button at the bottom, the window will disappear and the changes remain. If you click Cancel the window will disappear and any changes you made will be discarded.
You can also change all the plotting styles at once by using the Marker Style menu in the plot window. Here you can select a standard group of styles (e.g. all open 2-pixel markers with different colours and shapes) for the plotted sets. Similarly, error styles can be changed all at once using the Error Style menu.
Stacked Lines Window
The stacked line plot window allows you to plot one or more ordinate (Y)
quantities against a monotonic abscissa (X) quantity.
For clarity, the different plots are displayed on vertically
displaced graphs which share the same X axis.
You can display this window using the Lines
button in the Control Window's toolbar.
The display initially holds a single X-Y graph, usually with lines connecting adjacent points. The points will be reordered before drawing if necessary so that the line is displayed as a function of X, rather than of an invisible third independent variable (in the Scatter Plot this isn't done which can lead to lines being scribbled all over the plot). If one of the columns in the table appears to represent a time value, this will be selected as the default X axis. Otherwise, the 'magic' index variable will be used, which represents the row number. Of course, these can be changed from their default values using the selectors in the usual way.
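The reordering step amounts to sorting the rows by their X value before joining them up. A minimal Python sketch, with a hypothetical function name and made-up data:

```python
def order_for_line(xs, ys):
    """Return (xs, ys) reordered so that X increases along the line.

    Rows with a blank (None) coordinate are dropped, since they cannot be
    plotted. The sort is stable, so rows with equal X keep their table order.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    pairs.sort(key=lambda pair: pair[0])
    return [x for x, _ in pairs], [y for _, y in pairs]


times = [3.0, 1.0, 2.0, 1.0]
flux = [30.0, 10.0, 20.0, 11.0]
sorted_t, sorted_f = order_for_line(times, flux)
```

Joining the points in table order would draw a line that doubles back on itself; after sorting it runs steadily from left to right.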
To add a new graph with a different Y axis, use the
Add Dataset button in the
Dataset Toolbar at the bottom of the window.
This has a slightly different effect from what it does in the other
plot windows, in that it inserts a new plotting region with its own
Y axis at the top of the plot on which the specified data is drawn,
rather than only causing a new set of points to be plotted on the
existing plot region.
Thus all the datasets appear in their own graphs with their own Y axes
(though if you have multiple row subsets plotted for the same
dataset they will appear on the same part of the plot as usual).
To remove one of the graphs, select its tab and use the
Remove Dataset button as usual.
Zooming can only be done on one axis at a time
rather than dragging out an X-Y region on the plot surface, since
there isn't a single Y axis to zoom on.
To zoom the X axis in/out, position the mouse just below the X axis
at the bottom of the plot and drag right/left.
To zoom one of the Y axes in or out, position the mouse just to the
left of the Y axis you're interested in and drag down/up.
To set the ranges manually, use the Configure Axes
button as usual, but note that there is one
label/range setting box for each of the Y axes.
These things work largely as described in Appendix A.4.1.2,
as long as you bear in mind that the range of each of the Y axes
is treated independently of the others.
Clicking on any of the points will activate it - see Section 7.
The following buttons are available on the toolbar:
The Dataset Toolbar contains the following options:
You can determine how the data are plotted using lines and/or markers as described in the following subsection.
The default plotting style for the stacked lines plot is a simple black line for each graph. Since the plots typically do not overlap each other, this is in many cases suitable as it stands. However, you can configure the plotting style so that the points are plotted with markers as well as or instead of lines, and change the colours, marker shapes, line styles etc. The style for each row subset is displayed in a button to the right of its name in the bottom right of the plotting window. If you click this button the following dialogue will pop up which enables you to configure the plotting style.
Stacked Line Plot Style Editor
The Legend box defines how the selected set will
be identified in the legend which appears alongside the plot
(though the legend will only be visible if Show Legend
is on):
The Display box defines how the markers plotted for each data point will appear:
The Line box defines how the lines joining the points will look. These controls will only be active if the Display selection is Line or Both.
The Markers box defines how markers at the data points will look. These controls will only be active if the Display selection is Markers or Both.
Any changes you make in this window are reflected in the plot straight away. If you click the OK button at the bottom, the window will disappear and the changes remain. If you click Cancel the window will disappear and any changes you made will be discarded.
You can also change all the plotting styles at once by using the Line Style menu in the stacked lines plot window. Here you can select a standard group of styles (e.g. dashed lines, coloured lines) for the plotted sets. Similarly, error styles can be changed all at once using the Error Style menu.
3D scatter plot window
The 3D plot window draws 3-dimensional scatter plots of one or more
triples of table columns (or derived quantities) on Cartesian axes.
You can display it using the 3D button
in the Control Window's toolbar.
On the display a marker is plotted for each row in the selected dataset(s) at a position determined by the values in the table columns selected to provide the X, Y and Z values. A marker will only be plotted if none of the X, Y and Z values are blank. Select the quantities to plot and the plotting symbols with the dataset selector at the bottom.
The 3D space can be rotated by dragging the mouse around on the surface - it will rotate around the point in the centre of the plotted cube. The axis labels try to display themselves the right way up and in a way which is readable from the viewing point if possible, which means they move around while the rotation is happening. By default the points are rendered as though the 3D space is filled with a 'fog', so that more distant points appear more washed out - this provides a visual cue which can help to distinguish the depth of plotted points. However, you can turn this off if you want. If there are many points, then you may find that they're not all plotted while a drag-to-rotate gesture is in progress. This is done to cut down on rendering time so that GUI response stays fast. When the drag is finished (i.e. when you release the mouse button) all the points will come back again.
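One simple way to picture the fog effect is as a blend of each point's colour towards white by an amount that grows with its distance from the viewer. The Python sketch below is a guess at the general idea, not a description of TOPCAT's rendering; the function name `fog`, the linear blend and the `thickness` parameter are all assumptions.

```python
def fog(colour, depth, thickness=1.0):
    """Blend an RGB colour towards white as `depth` increases.

    `depth` runs from 0 (nearest the viewer) to 1 (farthest); `thickness` is a
    hypothetical knob where 0 turns the fog off entirely.
    """
    amount = max(0.0, min(1.0, depth * thickness))
    return tuple(c + (1.0 - c) * amount for c in colour)
```

A pure red point at the front of the cube stays red, while the same point halfway back is drawn as a paler pink.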
Zooming is also possible. You can zoom in around the
centre of the plot so that the viewing window only covers the middle.
This resembles the Axis Zoom in some of the 2-d plots,
but in this case the active region is the space to the left of the
plot. Drag the mouse down to zoom in or up to zoom
out on this part of the window. Currently it is only possible
to zoom in/out around the centre of the plot.
When zoomed you can use the
Subset From Visible toolbar button
to define a new Row Subset consisting only of the
points which are currently visible.
See Appendix A.4.1.6 for more explanation.
Clicking on any of the plotted points will activate it - see Section 7.
The following buttons are available on the toolbar:
The following additional item is available as a menu item only:
The Dataset Toolbar contains the following options:
You have considerable freedom to configure how the points are plotted including the shape, colour and transparency of symbols and the representation of error bars if used. These options are described in the following subsection.
When plotting points in a 3D plot there are many different ways that each point can be displayed. By default, TOPCAT chooses a set of markers on the basis of how many points there are in the table and uses a different one for each plotted set. The marker for each set is displayed in a button to the right of its name in the dataset selector panel at the bottom of the plot window. If you click this button the following dialogue will pop up which enables you to change the appearance.
Style editor dialogue for 3d plots
The Legend box defines how the selected set will
be identified in the legend which appears alongside the plot
(though the legend will only be visible if Show Legend
is on):
The Marker box defines how the markers plotted for each data point will appear:
Any changes you make in this window are reflected in the plot straight away. If you click the OK button at the bottom, the window will disappear and the changes remain. If you click Cancel the window will disappear and any changes you made will be discarded.
You can also change all the plotting styles at once by using the Marker Style menu in the plot window. Here you can select a standard group of styles (e.g. all open 2-pixel markers with different colours and shapes) for the plotted sets. Similarly, error styles can be changed all at once using the Error Style menu.
Spherical plot window
The spherical plot window draws 3-dimensional scatter plots
of datasets from one or more tables on spherical polar axes,
so it's suitable for displaying the position of coordinates on
the sky or some other spherical coordinate system, such as the
surface of a planet or the sun.
You can display it using the Sphere button
in the Control Window's toolbar.
In most respects this window works like the 3D Plot window, but it uses spherical polar axes rather than Cartesian ones. You have to fill in the dataset selector at the bottom with longitude- and latitude-type coordinates from the table. Selectors are included to indicate the units of those coordinates. If TOPCAT can locate columns in the table which appear to represent Right Ascension and Declination, these will be filled in automatically. If only these two are filled in, then the points will be plotted on the surface of the unit sphere - this is suitable if you just want to inspect the positions of a set of objects in the sky.
If the Radial Coordinates button is
activated, you can optionally fill in a value in the
Radial Axis selector as well.
In this case points will be plotted in the interior of the sphere,
at a distance from the centre given by the value of the radial coordinate.
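The geometry described above can be sketched in a few lines of Python. This is only an illustration, not TOPCAT's own code: the function name `sphere_to_cartesian` and the choice of degrees as input units are assumptions made for the example. With no radial coordinate the point lands on the unit sphere; with one, it lands in the interior at that distance from the centre.

```python
import math

def sphere_to_cartesian(lon_deg, lat_deg, radius=1.0):
    """Convert longitude/latitude in degrees (plus optional radius) to (x, y, z)."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return (x, y, z)

# No radial coordinate: the point lies on the surface of the unit sphere.
on_surface = sphere_to_cartesian(45.0, 30.0)
# With a radial coordinate the point is placed inside the sphere.
inside = sphere_to_cartesian(45.0, 30.0, radius=0.5)
```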
The following buttons are available on the toolbar:
The following additional item is available as a menu item only:
The Dataset Toolbar contains the following options:
You have considerable freedom to configure how points are plotted including the shape, colour and transparency of symbols and the representation of errors if used. This works exactly as for the Cartesian 3D plot as described in Appendix A.4.5.1.
Density map window in RGB mode
The density map window plots a 2-dimensional density map of one or more pairs of table columns (or derived quantities); the colour of each pixel displayed is determined by the number of points in the data set which fall within its bounds. Another way to think of this is as a histogram on a 2-dimensional grid, rather than a 1-dimensional one as in the Histogram Window. You can optionally weight these binned counts with another value from the table.
Density maps are suitable when you have a very large number of points to plot, since in this case it's important to be able to see not just whether there is a point at a given pixel, but how many points fall on that pixel. To a large extent, the transparency features of the other 2d and 3d plotting windows address this issue, but the density map gives you a bit more control. It can also export the result as a FITS image, which can then be processed or viewed using image-specific software such as GAIA or Aladin.
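To make the "histogram on a 2-dimensional grid" idea concrete, here is a minimal Python sketch of binning points into pixels, with optional weighting. It is not how TOPCAT implements the window; the function name `density_map` and its parameters are invented for illustration.

```python
def density_map(xs, ys, x_range, y_range, nx, ny, weights=None):
    """Return an ny-by-nx grid of (optionally weighted) counts per pixel."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    grid = [[0.0] * nx for _ in range(ny)]
    if weights is None:
        weights = [1.0] * len(xs)  # unweighted: every point counts once
    for x, y, w in zip(xs, ys, weights):
        # Points outside the plotted region do not contribute.
        if not (x_lo <= x < x_hi and y_lo <= y < y_hi):
            continue
        ix = min(int((x - x_lo) / (x_hi - x_lo) * nx), nx - 1)
        iy = min(int((y - y_lo) / (y_hi - y_lo) * ny), ny - 1)
        grid[iy][ix] += w
    return grid

xs = [0.1, 0.2, 0.9, 0.15]
ys = [0.1, 0.1, 0.8, 0.2]
counts = density_map(xs, ys, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2)
weighted = density_map(xs, ys, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2,
                       weights=[2.0, 0.5, 1.0, 1.0])
```

Each grid cell plays the role of one pixel; the displayed colour would then be derived from the cell value.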
This window can be operated in two modes:
You can configure the axes, including zooming in and out, with the mouse (drag on the plot or the axes) or manually as described in Appendix A.4.1.2.
Two controls specific to this window are shown below the plot itself:
The following buttons are available on the toolbar:
The Dataset Toolbar contains the following options:
The Export menu provides a number of ways to export the displayed image for external viewing or analysis. As well as options to export as GIF, JPEG, EPS and FITS, there is also the option to transmit the FITS image to one or all of the applications which are listening via the PLASTIC tool interoperability protocol and which can receive images. In this way you can transmit the image directly to PLASTIC-aware image manipulation tools such as GAIA or Aladin. See Section 8 for more information about PLASTIC.
How to set the colour channel corresponding to each dataset is explained in the following subsection.
For a density map in RGB mode, each dataset is assigned a colour channel to which it contributes. A representation of this is displayed in a button to the right of its name in the dataset selector panel at the bottom of the density map window. If you click this button the following dialogue will pop up which enables you to change the colour channel.
Style editor dialogue for density map
The Legend box defines how the selected set will
be identified in the legend which appears alongside the plot
(though the legend will only be visible if Show Legend
is on):
The Channel selector allows you to select either the Red, Green or Blue channel for this dataset to contribute to. Note that this is only enabled in RGB mode; in indexed mode it has no effect and is disabled.
Load Window
The Load Window is used for loading tables from an external location
(e.g. disk or URL) into TOPCAT. It is obtained using the
Load Table button in the
Control Window toolbar or File menu.
This dialogue allows you to specify a new table to open in several
different ways, described below.
If you successfully load a table using any of these options,
a new entry will be added into the Table List in the Control Window,
which you can then use in the usual ways.
If you choose a location which can't be turned into a table
(for instance because the file doesn't exist),
a window will pop up telling you what went wrong.
If you get an OutOfMemoryError while loading a table,
you will have to run TOPCAT with more memory, as described in
Section 9.2.2, or use the -disk flag described in Section 9.1.
In the simplest case, you can type a name into the
Location field and hit return or the OK
button. This location can be a filename or a URL,
possibly followed by a '#' character and a
'fragment identifier' to indicate where in the file or URL the table is
located; the details of what such fragment identifiers mean can be
found in the relevant subsection within Section 4.1.1.
Allowed URL types are described in Section 4.2.
You should select the relevant table format from the
Format selector box - you can leave it on
(auto) for loading FITS tables or VOTables,
but for other formats such as ASCII or CSV you must select the right one
explicitly (again, see Section 4.1.1 for details).
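As a small illustration of the location syntax just described, the Python sketch below separates a location into its file-or-URL part and its optional fragment identifier. The helper name `split_location` and the example locations are hypothetical; what the fragment actually means depends on the table format, as explained in Section 4.1.1.

```python
def split_location(location):
    """Split 'file-or-url#fragment' into (base, fragment); fragment is None if absent."""
    base, sep, fragment = location.partition("#")
    return base, (fragment if sep else None)

# Hypothetical locations, used only to show the two cases.
with_fragment = split_location("catalogue.fits#2")
without_fragment = split_location("http://example.org/data/table.xml")
```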
There are many other ways of loading tables however, described in the following subsections. The Filestore Browser button is always visible below the location field. Depending on startup options, there may be other buttons here. In any case, you can look in the DataSources menu to see other table load dialogues. Exactly which ones are available will depend on your setup (some may be absent or greyed out, and additional ones may be available). The following subsections describe some of the options which may be available.
Filestore Browser window
By clicking the Filestore Browser button in the Load Window, you can obtain a file browser which will display the files in a given directory. The way this window works is almost certainly familiar to you from other applications.
Unlike a standard file browser however, it can also browse files in remote filestores: currently supported are MySpace and SRB. MySpace is a distributed storage system developed for use with the Virtual Observatory by the AstroGrid project, and SRB (Storage Resource Broker) is a similar general purpose system developed at SDSC. To make use of these facilities, select the relevant entry from the selector box at the top of the window as illustrated above; this will show you a Log In button which prompts you for username, password etc, and you will then be able to browse the remote filestore as if it were local. The same button can be used to log out when you are finished, but the session will be logged out automatically when TOPCAT ends in any case. Access to remote filesystems is dependent on certain optional components of TOPCAT, and it may not be available if you have the topcat-lite configuration.
The browser initially displays the current directory, but this can be
changed by typing a new directory into the File Name field,
or moving up the directory hierarchy using the selector box at the top,
or navigating the file system by clicking the up-directory button
or double-clicking on displayed directories.
The initial default directory can be changed by setting the
user.dir system property.
All files are shown, and there is no indication of which ones represent tables and which do not. To open one of the displayed files as a table, double-click on it or select it by clicking once and click the Open Table button. The Table Format selector must be set correctly: the "(auto)" setting will automatically detect the format of VOTable or FITS tables, otherwise you will need to select the option describing the format of the file you are attempting to load (see Section 4.1.1). If you pick a file which cannot be converted into a table an error window will pop up.
In most cases, selecting the file name and possibly the format is all you need to do. However, the Position in file field allows you to add information about where in the file the table you want is situated. The meaning of this varies according to the file format: for FITS files, it is the index of the HDU containing the table you're after (the first extension after the primary HDU is numbered 1), and for VOTables it is the index of the TABLE element (the first TABLE encountered is numbered 0). If you leave this blank, you will get the first table in the file in question - many file formats only allow one table per file in any case. For a more table-aware view of the file system, use the Hierarchy Browser instead.
File load Hierarchy Browser window
By selecting the Hierarchy Browser option from the Load Window's DataSources menu, you can obtain a browser which presents a table-aware hierarchical view of the file system. (Note that a freestanding version of this panel with additional functionality is available in the separate Treeview application).
This browser resembles the Filestore Browser in some ways, but with important differences:
The main part of the window shows a "tree" representation of the
hierarchy, initially rooted at the current directory
(the initial directory can be changed by setting the
user.dir system property).
Each line displayed represents a "node" which may be a file or
some other type of item (for instance an HDU in a FITS file or an
entry in a tar archive). The line contains a little icon
which indicates what kind of node it is and a short text string which
gives its name and maybe some description.
Nodes which represent tables are indicated by their own icon.
For nodes which have some internal structure there is also a
"handle" which indicates whether they are
collapsed or expanded.
You can examine remote filespaces (MySpace, SRB)
as well as local ones in the same way as with the
Filestore Browser.
If you select a node by clicking on it, it will be highlighted and some additional description will appear in the panel below the hierarchy display. The text is in bold if the node in question can be opened as a table, and non-bold if it is some non-table item.
Note: an important restriction of this browser is that it will only pick up tables which can be identified automatically - this includes FITS and VOTable files, but does not include text-based formats such as ASCII and Comma-Separated Values. If you want to load one of the latter types of table, you will need to use one of the other load methods and specify table format explicitly.
You can see how this browser works on an example directory of tables as described in Appendix A.5.6.
Note that this window requires certain optional components of the TOPCAT installation, and will not be available if you have the topcat-lite configuration.
Navigation is a bit different from navigation in the Filestore Browser window. To expand a node and see its contents, click on its handle (clicking on the handle when it is expanded will collapse it again). When you have identified the table you want to open, highlight it by clicking on it, and then click the Open Table button at the bottom.
To move to a different directory, i.e. to change the root of the tree which is displayed, use one of the buttons above the tree display:
(In fact the above navigation options are not restricted to changing the root to a new directory, they can move to any node in the tree, for instance a level in a Tar archive.)
There are two more buttons in the browser, Search Selected and Search Tree. These do a recursive search for tables in all the nodes starting at the currently selected one or the current root respectively. What this means is that the program will investigate the whole hierarchy looking for any items which can be used as tables. If it finds any it will open up the tree so that they are visible (note that this doesn't mean that the only nodes revealed will be tables, ancestors and siblings will be revealed too). This can be useful if you believe there are a few tables buried somewhere in a deep directory structure or Tar archive, but you're not sure where. Note that this may be time-consuming - a busy cursor is displayed while the search is going on. Changing the root of the tree will interrupt the search.
SQL Query Dialogue
If you want to read a table from an SQL database, you can use a specialised dialogue to specify the SQL query by selecting the SQL Query option from the Load Window's DataSources menu.
This provides you with a list of fields to fill in which make up the query, as follows:
"mysql" for MySQL's Connector/J driver or "postgresql" for PostgreSQL's JDBC driver.
"localhost" if the database is local).
"SELECT * from XXX".
In principle any SQL query on the database can be used here,
but the details of what SQL syntax is permitted will be defined
by the JDBC driver you are using.
There are a number of criteria which must be satisfied for SQL access to work within TOPCAT (installation of appropriate drivers and so on) - see Section 9.3. If you don't take these steps, this dialogue may be inaccessible.
Cone search table import dialogue
By selecting the Cone Search option from the Load Window's DataSources menu, you can obtain a dialogue which allows you to query one of a number of external web services for a catalogue of objects known in a given region of the sky.
When first displayed, this dialogue window will ask an external services registry for all the cone search services on the net which have advertised their existence. When it has got the result, you will see a list of their names and titles in a table. For more information about each one, use the Columns menu to select what information, such as publisher, reference URL etc is displayed in the table. You can scroll up and down this table and select the one which you want to query by clicking on it.
Having selected one of the cone search services from the table, you need to specify the sky region in which you are interested. If you enter the name of an astronomical object into the Object Name field and hit the Resolve button, the coordinates will be entered into the RA and Dec fields below. Alternatively you can type the coordinates in directly, choosing either degrees or sexagesimal coordinates using the unit selector boxes. Enter the search radius too.
Having done this, hit the OK button. This will send the query to the service you selected and, if successful, load into TOPCAT a table containing all the objects in the region of the sky you have specified. The exact format of the returned table will depend on the service you have selected, but it will contain at least columns representing Right Ascension and Declination.
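TOPCAT builds and sends the query for you, but it can help to see roughly what such a request looks like. Services following the IVOA Simple Cone Search convention take the position and radius as RA, DEC and SR parameters in decimal degrees. The Python sketch below converts sexagesimal input to degrees and assembles a query URL; the helper names and the base URL `http://example.org/scs` are hypothetical, and this is not a description of TOPCAT's internals.

```python
from urllib.parse import urlencode

def hms_to_degrees(hours, minutes, seconds):
    """Right Ascension as hours, minutes, seconds -> decimal degrees."""
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)

def dms_to_degrees(sign, degrees, minutes, seconds):
    """Declination as sign (+1 or -1), degrees, arcmin, arcsec -> decimal degrees."""
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)

def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Append RA, DEC and SR (all in degrees) to a cone search base URL."""
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    if base_url.endswith(("?", "&")):
        joiner = ""
    else:
        joiner = "&" if "?" in base_url else "?"
    return base_url + joiner + query

ra = hms_to_degrees(12, 30, 0.0)
dec = dms_to_degrees(-1, 45, 30, 0.0)
url = cone_search_url("http://example.org/scs", ra, dec, 0.1)
```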
Note that this window requires certain optional components of the TOPCAT installation, and will not be available if you have the topcat-lite configuration.
GAVO load dialogue with an example query on the milli-Millennium database
This dialogue permits direct queries to the services provided by GAVO, the German Astrophysical Virtual Observatory. The main databases of general interest available through these services are the Millennium Simulation results, documented at http://www.g-vo.org/Millennium/Help.
To make a query, fill in the fields as required:
http://www.g-vo.org/Millennium)
http://www.g-vo.org/MyMillennium)
The SampleQueries menu provides some examples of queries on the Milli-Millennium database (these have been copied from the GAVO query page). If you select one of these the SQL Query panel will be filled in accordingly.
Much more documentation, including tutorials and descriptions of the database schemas, is available on the GAVO website, at http://www.g-vo.org/Millennium/Help.
Provided with TOPCAT are some example tables,
which you can access in a number of ways.
The simplest thing is to start up TOPCAT with the
"-demo" flag on the command line, which will cause
the program to start up with a few demonstration tables already loaded in.
You can also load examples in from the Examples menu in the Load Window however. This contains the following options:
Note these examples are a bit of a mixed bag, and are not all that exemplary in nature. They are just present to allow you to play around with some of TOPCAT's features if you don't have any real data to hand.
Save Window
The Save Window is used to write tables out,
and it is accessed using the Save Table button
in the Control Window's toolbar or File menu.
Any table in the Control Window's table list can be
written at any time; what is written is the
Apparent Table corresponding to the currently
selected table, which takes into account any modifications you have
made to its data or appearance this session.
The current Row Subset and Row Order
are displayed in this window as a reminder of what you're about to
save; if you modify the values in these selectors you will be
modifying the Apparent Table in the usual way.
Any Row Subsets
which have been defined on the table in the current session
will not be saved themselves, but you can save information about
subset membership by creating new boolean columns based on subsets
using the "To Column" button from the
Subsets Window.
You can use the Table Output Format selector box to pick the format in which the table will be written from one of the supported output formats. There is no default format, and it won't automatically save to the same format it was loaded from, but if you leave it on "(auto)" it will try to guess the format based on the filename given; for instance if you specify the name "out.fits", a FITS binary table will be written.
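The "(auto)" behaviour amounts to a lookup on the filename extension. The Python sketch below shows the idea; it is not TOPCAT's actual logic. Only the ".fits" entry comes from the text above, and the other entries, the format labels and the function name `guess_output_format` are assumptions made for illustration.

```python
import os

# Hypothetical extension table; only the ".fits" entry is taken from the text.
EXTENSION_FORMATS = {
    ".fits": "fits",
    ".vot": "votable",
    ".csv": "csv",
}

def guess_output_format(filename, selected="(auto)"):
    """Return the explicitly selected format, or guess one from the file extension."""
    if selected != "(auto)":
        return selected
    ext = os.path.splitext(filename)[1].lower()
    if ext not in EXTENSION_FORMATS:
        raise ValueError("cannot guess output format for " + repr(filename))
    return EXTENSION_FORMATS[ext]

guessed = guess_output_format("out.fits")
explicit = guess_output_format("out.dat", selected="csv")
```

If the extension is not recognised, the safest course is to pick the format explicitly in the selector.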
You can specify the location of the output table in these ways, which are described in the following sections:
There is no option to compress files on output (though you can of course compress them yourself once they have been written).
If the table is large, a progress bar indicating how near the save is to completion will appear. It is not advisable to edit the table during a save operation.
In some cases, when saving a table to a format other than the one from which it was loaded, or if some new kinds of metadata have been added, it may not be possible to express all the data and metadata from the table in the new format. For instance a WDC table can contain data which represent epoch (date), and this cannot be stored in a FITS table. In this case the table may be written with such columns missing. Some message to this effect may be output in this case.
You can specify where to save a table by typing its location directly into the Output Location field of the Save Table window. This will usually be the name of a new file to write to, but could in principle be a URL or an SQL specifier.
Filestore Browser for table saving
By clicking the Browse Filestore button in the Save Table window, you can obtain a browser which will display the files in a given directory.
The browser initially displays the current directory, but this can be
changed by typing a new directory into the File Name field,
or moving up the directory hierarchy using the selector box at the top,
or navigating the file system by clicking the up-directory button
or double-clicking on displayed directories.
The initial default directory can be changed by setting the
user.dir system property.
The browser can display files in remote filestores such as on MySpace or SRB servers; see the section on the load filestore browser (Appendix A.5.1) for details.
To save to an existing file, select the file name and click the OK button at the bottom; this will overwrite that file. To save to a new file, type it into the File Name field; this will save the table under that name into the directory which is displayed. You can (re)set the format in which the file will be written using the Output Format selector box on the right (see Section 4.1.2 for discussion of output formats).
SQL table writing dialogue
If you want to write a table to an SQL database, you can use a specialised dialogue to specify the table destination by clicking the SQL Table button in the Save Table window.
This provides you with a list of fields to fill in which define the new table to write, as follows:
"mysql" for MySQL's Connector/J driver or "postgresql" for PostgreSQL's JDBC driver.
"localhost" if the database is local).
There are a number of criteria which must be satisfied for SQL access to work within TOPCAT (installation of appropriate drivers and so on) - see the section on JDBC configuration. If you don't take these steps, this dialogue may be inaccessible.
Concatenation Window
The Concatenation Window allows you to join two tables together
top-to-bottom. It can be obtained using the
Concatenate Tables button in the
Control Window toolbar or Joins menu.
When two tables are concatenated all the rows of the first ("base") table are followed by all the rows of the second ("appended") table. The result is a new table whose number of rows is the sum of the row counts of the two input tables. The columns in the resulting table are the same as those of the base table. To perform the concatenation, you have to specify which columns from the appended table correspond to which ones in the base table. Of course, this sort of operation only makes sense if at least some of the columns in both tables have the same meaning. This process is discussed in more detail in Section 5.1.
The concatenation window allows you to select the base and appended tables, and for each column in the base table to specify which column in the appended table corresponds to it. You may select a blank for this, in which case the column in question will have all null entries in the resulting table. In some cases these column selectors may have a value filled in automatically if the program thinks it can guess appropriate ones, but you should ensure that it has guessed correctly in this case. Only suitable columns are available for choosing from these column selectors; in most cases this means numeric ones.
When you have filled in the fields to your satisfaction, hit the Concatenate button at the bottom of the window, and a new table will be created and added to the table list in the Control Window (a popup window will inform you this has happened).
The result is created from the Apparent versions of the base and appended tables, so that any row subsets, hidden columns, or sorts currently in force will be reflected in the output.
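The effect of the column mapping can be sketched in Python as follows. This is an illustration under simplifying assumptions, not TOPCAT's implementation: rows are plain dictionaries, and the function name `concatenate` and the column names are invented for the example. A base column mapped to a blank ends up with null entries for all the appended rows.

```python
def concatenate(base_rows, appended_rows, base_columns, mapping):
    """Stack appended_rows below base_rows, keeping only the base table's columns.

    Rows are dicts keyed by column name. mapping gives, for each base column,
    the name of the corresponding appended-table column, or None for "blank".
    """
    result = [{col: row[col] for col in base_columns} for row in base_rows]
    for row in appended_rows:
        new_row = {}
        for col in base_columns:
            source = mapping.get(col)
            # A blank selection yields a null entry in the output.
            new_row[col] = row[source] if source is not None else None
        result.append(new_row)
    return result

base = [{"RA": 10.0, "DEC": -5.0, "Vmag": 13.8}]
appended = [{"ra2000": 11.0, "de2000": -6.0, "Bmag": 14.6},
            {"ra2000": 12.0, "de2000": -7.0, "Bmag": 15.1}]
mapping = {"RA": "ra2000", "DEC": "de2000", "Vmag": None}
joined = concatenate(base, appended, ["RA", "DEC", "Vmag"], mapping)
```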
Pair Match Window
The Pair Match Window allows you to join two tables together
side-by-side, aligning rows by matching values in some of their
columns between the tables. It can be obtained using the
Pair Match button in the
Control Window toolbar or
Joins menu.
In a typical scenario you might have two tables each representing a catalogue of astronomical objects, and you want a third table with one row for each object which has an entry in both of the original tables. An object is defined as being the same one in both tables if the co-ordinates in both rows are "similar", for instance if the positions indicated by the RA and Dec columns differ by no more than a specified angle on the sky. Matching rows to produce the join requires you to specify the criteria for rows in both tables to refer to the same object and what to do when a match is found - the options are discussed in more detail in Section 5.2.
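For a sky match, "similar" means that the angle between the two positions is below some limit. A minimal Python sketch of that test is given here, using the haversine formula; the function names and the arcsecond threshold are assumptions for the example, and nothing is implied about how TOPCAT computes the separation internally.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Angle in degrees between two sky positions given in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))

def is_sky_match(ra1, dec1, ra2, dec2, max_sep_arcsec):
    """True if the two positions lie within max_sep_arcsec of each other."""
    return angular_separation_deg(ra1, dec1, ra2, dec2) * 3600.0 <= max_sep_arcsec

# Two positions one arcsecond apart in declination.
close_enough = is_sky_match(150.0, 2.0, 150.0, 2.0 + 1.0 / 3600.0, 2.0)
too_far = is_sky_match(150.0, 2.0, 150.0, 2.0 + 1.0 / 3600.0, 0.5)
```

Note that the inputs must be in the units the function expects; mixing degrees and radians or hours gives meaningless matches.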
The result is created from the Apparent versions of the tables being joined, so that any row subsets, hidden columns, or sorts currently in force will be reflected in the output. Progress information on the match, which may take a little while, is provided in the logging window and by a progress bar at the bottom of the window. When it is completed, you will be informed by a popup window which indicates that a new table has been created. This table will be added to the list in the Control Window and can be examined, manipulated and saved like any other. In some cases, some additional columns will be added to the output table which give you more information about how it has progressed (see Appendix A.7.2.3).
The Match Window provides a set of controls which allow you to choose how the match is done and what the results will look like. It consists of these main parts:
The following sections describe some of these components in more detail.
The match criteria box allows you to specify what counts as a match between two rows. The selection you make in this box will determine which columns you have to fill in for the table(s) being matched in the rest of the window. In most cases what you are selecting here is the coordinate space in which rows will be compared against each other, and a numerical value or values to determine how close two rows have to be in terms of a metric on that space to count as a match.
The following match types are offered:
Depending on the match type, the units of the error value(s) you enter may be significant. In this case, there will be a unit selector displayed alongside the entry box. You must choose units which are correct for the number you enter.
The column selection boxes allow you to select which of the columns in the input tables will provide the data (the coordinates which have to match). For each table you must select the names of the required columns; the ones you need to select will depend on the match criteria you have chosen.
For some columns, such as Right Ascension and Declination in sky matches, units are important for the columns you select. In this case, there will be a selector box for the units alongside the selector box for the column itself. You must ensure that the correct units have been selected, or the results of the match will be rubbish.
In some cases these column and/or unit selectors may have a value filled in automatically (if the program thinks it can guess appropriate ones) but you should ensure that it has guessed correctly in this case. Only suitable columns are available for choosing from these column selectors; in most cases this means numeric ones.
When the match is complete a new table will be created which contains rows determined by the matches which have taken place. The Output Rows selector box allows you to choose on what basis the rows will be included in the output table as a function of the matches that were found.
In all cases each row will refer to only one matched (or possibly unmatched) "object", so that any non-blank columns in a given row come only from rows in the input tables which match according to the specified criteria. However, you have two (somewhat interlinked) choices to make about which rows are produced.
The Match Selection selector allows you to choose what happens when a given row in one table can be matched by more than one row in the other table. There are two choices:
The Join Type selector allows you to choose what output rows result from a match in the input tables.
In most cases (all the above except for 1 not 2 and
2 not 1), the set of columns in the output table contains
all the columns from the first table followed by all the columns
from the second table. If this causes a clash of column names,
offending columns will be renamed with a trailing "_1" or "_2".
Depending on the details of the match however,
some additional useful columns may be added:
Here is an example. If your input tables are these:
       X          Y       Vmag
       -          -       ----
    1134.822    599.247   13.8
     659.68    1046.874   17.2
     909.613    543.293    9.3

and

       X          Y       Bmag
       -          -       ----
     909.523    543.800   10.1
    1832.114    409.567   12.3
    1135.201    600.100   14.6
     702.622   1004.972   19.0

then a Cartesian match of the two sets of X and Y values with an error of 1.0 using the 1 and 2 option would give you a result like this:

      X_1        Y_1      Vmag     X_2        Y_2      Bmag   Separation
      ---        ---      ----     ---        ---      ----   ----------
    1134.822    599.247   13.8   1135.201    600.100   14.6     0.933
     909.613    543.293    9.3    909.523    543.800   10.1     0.515

using All from 1 would give you this:

      X_1        Y_1      Vmag     X_2        Y_2      Bmag   Separation
      ---        ---      ----     ---        ---      ----   ----------
    1134.822    599.247   13.8   1135.201    600.100   14.6     0.933
     659.68    1046.874   17.2
     909.613    543.293    9.3    909.523    543.800   10.1     0.515

and 1 not 2 would give you this:

       X          Y       Vmag
       -          -       ----
     659.68    1046.874   17.2
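The worked example above can be reproduced with a short brute-force Python sketch. It is only meant to show how the three join types differ, under the assumption that each row of the first table is paired with its closest counterpart within the error; the function names are invented, and comparing every pair of rows like this is adequate for a handful of rows but is not a claim about TOPCAT's matching algorithm.

```python
import math

table1 = [(1134.822, 599.247, 13.8), (659.68, 1046.874, 17.2), (909.613, 543.293, 9.3)]
table2 = [(909.523, 543.800, 10.1), (1832.114, 409.567, 12.3),
          (1135.201, 600.100, 14.6), (702.622, 1004.972, 19.0)]

def best_match(row, others, max_error):
    """Return (other_row, separation) for the closest row within max_error, else None."""
    best = None
    for other in others:
        sep = math.hypot(row[0] - other[0], row[1] - other[1])
        if sep <= max_error and (best is None or sep < best[1]):
            best = (other, sep)
    return best

def pair_match(rows1, rows2, max_error, join_type):
    """Join rows1 to rows2 on (X, Y); join_type is '1 and 2', 'All from 1' or '1 not 2'."""
    out = []
    for row in rows1:
        match = best_match(row, rows2, max_error)
        if match is not None and join_type in ("1 and 2", "All from 1"):
            other, sep = match
            out.append(row + other + (round(sep, 3),))
        elif match is None and join_type == "All from 1":
            # Unmatched row: columns from table 2 and Separation stay blank.
            out.append(row + (None, None, None, None))
        elif match is None and join_type == "1 not 2":
            out.append(row)
    return out

both = pair_match(table1, table2, 1.0, "1 and 2")
all_from_1 = pair_match(table1, table2, 1.0, "All from 1")
only_1 = pair_match(table1, table2, 1.0, "1 not 2")
```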
Internal Match Window
The Internal Match Window allows you to perform matching between
rows of the same table, grouping rows that have the same or similar
values in specified columns and producing a new table as a result.
It can be obtained by using the Internal Match
button in the Control Window
toolbar or Joins menu.
You might want to use this functionality to remove all rows which refer to the same object from an object catalogue, or to ensure that only one entry exists for each object, or to identify groups of several "nearby" objects in some way.
The result is created from the Apparent versions of the tables being joined, so that any row subsets, hidden columns, or sorts currently in force will be reflected in the output. Progress information on the match, which may take some time, is provided in the logging window and by a progress bar at the bottom of the window. When it is completed, you will be informed by a popup window which indicates that a new table has been created. This table will be added to the list in the Control Window and can be examined, manipulated and saved like any other.
The window has the following parts:
The Internal Match Action box gives a list of options for what will happen when an internal match calculation has completed. In each case a new table will be created as a result of the match. The options for what it will look like are these:
You can use this information in other ways, for instance if you
create a new Row Subset using the expression
"GroupSize == 5" you could select only those
rows which form part of 5-object clusters.
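To show what a group size per row means, the Python sketch below groups 2-d points that lie within a given error of each other and reports the size of each row's group. It assumes that grouping is transitive (if A matches B and B matches C, all three form one group) and uses a simple union-find structure; both the function name `group_sizes` and that linking rule are assumptions for illustration, not a statement of TOPCAT's algorithm.

```python
import math

def group_sizes(points, max_error):
    """For each point, return the size of its match group, or None if it is ungrouped."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if math.hypot(dx, dy) <= max_error:
                parent[find(i)] = find(j)  # merge the two groups

    counts = {}
    for i in range(n):
        root = find(i)
        counts[root] = counts.get(root, 0) + 1
    return [counts[find(i)] if counts[find(i)] > 1 else None for i in range(n)]

points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (10.0, 10.0), (20.0, 20.0), (20.3, 20.0)]
sizes = group_sizes(points, 0.6)
# Equivalent of a Row Subset defined by "GroupSize == 3".
triples = [i for i, size in enumerate(sizes) if size == 3]
```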
Activation Window
The Activation Window allows you to configure an action to perform when a table row is activated by clicking on a row in the Data Window or a point in the Plot Window. It can be obtained by clicking the Activation Action selector at the bottom of the properties panel in the Control Window.
You have various options for how to define the action. On the left of the window is a list of options; you have to choose one of these to determine what kind of action will take place. When you click on one of these options the corresponding controls on the right hand side will become enabled: use these to select the details of the action and then click the OK button so that subsequent activation events will cause the action you have defined (or Cancel so that they won't). When you click OK the Activation Action in the control window will indicate the action you have configured.
The available options are as follows:
ivo://votech.org/sky/pointAtCoords.
This will only work if TOPCAT is registered with a PLASTIC hub, and
so are one or more other applications which understand that message.
An example might be a sky viewing application such as Aladin which
can point to a particular region of sky whenever you activate a point.
You need to specify the columns which represent (J2000)
Right Ascension and Declination, and optionally a particular listener
to receive the messages (otherwise all registered ones will).
See Section 8 for more explanation about PLASTIC.
Functions which are expected to be useful for activation actions
are described in Appendix B.2 and include some general-purpose ones
(displayImage and displaySpectrum to display
an image or spectrum in an external viewer) as well as a few
which are relevant to particular survey data, for instance the
spectra2QZ() function, which will pop up a spectrum
viewer displaying all the spectra related to a given row of 2QZ
survey data based on the contents of its NAME column.
As the above list shows, most of the activation actions you can
define result in a viewer window of some kind popping up.
Exactly what kind of viewer is used depends on how TOPCAT is set up
and in some cases on your choices. More details of the viewer
programs available are given in the following subsections.
If these don't do what you want, you can use the
Execute Custom Code option, perhaps in conjunction with
user-defined functions or the System exec() functions
described in Appendix B.2, to invoke your own.
If you choose the Display Cutout Image or View URL as Image option in the Activation Window, then activating a row will display an image in an image viewer.
The default image viewer is SoG, an astronomical image viewer based on JSky, which offers colourmap manipulation, image zooming, graphics overlays, and other features. For this to work, JAI (Java Advanced Imaging) must be installed. JAI is a free component available from Sun, but is not part of the Java 2 Standard Edition by default. In operation, SoG looks like this:
SoG Image Viewer
If JAI or the SoG classes themselves are absent, a fallback viewer which just displays the given image in a basic graphics window with no manipulation facilities is used. The fallback image viewer looks like this:
Fallback Image Viewer
If you choose the View URL as Spectrum option in the Activation Window, then activating a row will display a spectrum in a spectrum viewer.
The default spectrum viewer is SPLAT, a sophisticated multi-spectrum analysis program. This requires the presence of a component named JNIAST, which may or may not have been installed with TOPCAT (it depends on some non-Java, i.e. platform-specific code). There is currently no fallback spectrum viewer, so if JNIAST is not present, then spectra cannot be displayed. In this case it will not be possible to select the Display Named Spectrum item in the Activation Window. An example of SPLAT display of multiple spectra is shown below.
SPLAT Spectrum Viewer
Full documentation for SPLAT is available on-line within the program, or in SUN/243.
If you choose the View URL as Web Page option in the Activation Window, then activating a row will display the web page whose URL is in one of the columns in a web browser. You are given the option of what browser you would like to use in this case.
The default basic browser option uses a simple browser which can view HTML or plain text pages and has forward and back buttons which work as you'd expect. In many cases this is fine for viewing HTML pages, and it is available regardless of the system that you are running TOPCAT on. It looks like this:
Basic HTML browser
In some circumstances, it's possible to use your normal web browser for web page display instead. The list of browsers currently includes Firefox, Mozilla and Netscape as well as the basic one. Selecting these will generally only work if (1) the browser you select is installed and on your path, (2) you're on some Unix-like operating system, (3) the browser is already running when the action is invoked. In this case, the selected URL should be displayed in an existing browser window rather than opening a new one. Doing it this way has the advantage that your browser can probably display many types of document (perhaps using plugins) as well as HTML.
Help Window
The help window is a browser for displaying help information on TOPCAT;
it can be obtained by clicking the Help button
that appears in the toolbar of most windows.
It views the text contained in this document, so it may be what you are
looking at now.
The panel on the left hand side gives a hierarchical
view of the available help topics, and the panel on the right hand
side displays the help text itself. The bar in between the two
can be dragged with the mouse to affect the relative sizes of
these windows.
The toolbar contains these extra buttons:
Although the printing buttons work, if you want to print out the
whole of this document rather than just a few sections you may be better off
printing the PDF version,
or printing the single-page HTML version through a web browser.
The most recent versions of these should be available on the web at
http://www.starlink.ac.uk/topcat/sun253/sun253.html and
http://www.starlink.ac.uk/topcat/sun253.pdf;
you can also find the HTML version in the topcat jar file at
uk/ac/starlink/topcat/help/sun253.html
or, if you have a full TOPCAT installation, in
docs/topcat/sun253/sun253.html and docs/topcat/sun253.pdf
(the single-page HTML version is available
here in the HTML version).
The help browser is an HTML browser and some of the hyperlinks in the help document point to locations outside of the help document itself. Selecting these links will go to the external documents. When the viewer is displaying an external document, its URL will be displayed in a line at the bottom of the window. You can cut and paste from this line using your platform's usual mechanisms.
New Parameter dialogue window
The New Parameter window allows you to enter a new table parameter
to be added to a table.
It can be obtained by clicking the New Parameter
button in the window described in Appendix A.3.2.
A parameter is simply a fixed value attached to a table and can contain
information which is a string, a scalar, an array... in fact exactly
the same sorts of values which can appear in table cells.
The window is pretty straightforward to use: fill in the fields and click OK to complete the addition. The Type selector allows you to select what kind of value you have input. The only compulsory field is Parameter Name; any of the others may be left blank, though you will usually want to fill in at least the Value field as well. Often, the parameter will have a string value, in which case the Units field is not very relevant.
Synthetic Column dialogue window
The Synthetic Column Window allows you to define a new "Synthetic" column,
that is one whose values are defined using an algebraic expression
based on the values of other columns in the same row.
The idea is that the value of the cells in a given row in this column
will be calculated on demand as a function of the values of cells
of other columns in that row. You can think of this as providing
functionality like that of a column-oriented spreadsheet.
You can activate the dialogue using the
Add Column or Replace Column buttons in the
Columns Window or from the
(right-click) popup menu in the Data Window.
The window consists of a number of fields you must fill in to define the new column:
Having filled in the form to your satisfaction, hit the OK button at the bottom and the new column will be added to the table. If you have made some mistake in filling in the fields, a popup window will give you a message describing the problem. This message may be a bit arcane - try not to panic and see if you can rephrase the expression in a way that the parser might be happier with. If you can't work out the problem, it's time to consult your friendly local Java programmer (failing that, your friendly local C programmer may be able to help) or, by all means, contact the author.
If you wish to add more metadata items you can edit the appropriate cells in the Columns Window. You can edit the expression of an existing synthetic column in the same way.
Once created, a synthetic column is added to the Apparent Table and behaves just like any other; it can be moved, hidden/revealed, used in expressions for other synthetic columns and so on. If the table is saved the new column and its contents will be written to the new output table.
Sky Coordinates Window
The Sky Coordinates Window allows you to add new columns to a table,
representing coordinates in a chosen sky coordinate system.
The table must already contain columns which represent sky coordinates;
by describing the systems of the existing and of the new coordinates,
you provide enough information to calculate the values in the new columns.
You can activate this dialogue using the
New Sky Coordinate Columns button
in the Columns Window.
The dialogue window has two halves; on the left you give the existing columns which represent sky coordinates, their coordinate system (ICRS, fk5, fk4, galactic, supergalactic or ecliptic) and the units (degrees, radians or sexagesimal) that they are in. Note that the columns available for selection will depend on the units you have selected; for degrees or radians only numeric columns will be selectable, while for sexagesimal (dms/hms) units only string columns will be selectable. On the right you make the coordinate system and units selections as before, but enter the names of the new columns in the text fields. Then just hit the OK button, and the new columns will be appended at the right of the table.
Algebraic Subset dialogue window
The Algebraic Subset Window allows you to define a new
Row Subset which uses an algebraic expression
to define which rows are included. The expression must be a
boolean one, i.e. its value is either true or false for each row of
the table.
You can activate this dialogue using the
Add Subset button in the
Subsets Window.
The window consists of two fields which must be filled in to define the new subset:
Having filled in the form to your satisfaction, hit the OK button at the bottom and the new subset will be added to the list that can be seen in the Subsets Window where it behaves like any other. If you have made some mistake in filling in the fields, a popup window will give you a message describing the problem.
Available Functions Window
This window displays all the functions (Java methods) which are
available for use when writing
algebraic expressions.
This includes both the built-in expressions and any
extended ones you might have added.
You can find this window by using the
Show Functions button in the
Synthetic Column or
Algebraic Subset
window toolbars.
On the left hand side of the window is a tree-like representation of the functions you can use. Each item in this tree is one of the following:
Of these, the Folder and Class items have a 'handle',
which means that they contain other items
(classes and functions/constants respectively).
By clicking on the handle (or equivalently double-clicking on the name)
you can toggle whether the item is open (so you can see its contents)
or closed (so you can't). So to see the functions in a class,
click on its handle and they will be revealed.
You can click on any of these items and information about it
will appear in the right hand panel. In the case of functions
this describes the function, its arguments, what it does, and
how to use it. The descriptions should be fairly self-explanatory;
for instance the description in the figure above indicates that
you could use the invocation atan2(X_POS,Y_POS)
as the expression for a new table column which gives the angle from
the X axis of a point whose position is given by columns with
the names X_POS and Y_POS.
Examples of a number of these functions are given in
Section 6.8.
Using the Add button
you can specify the name of a class to add to those available.
You should enter the fully-qualified class name (i.e. including the
dot-separated package path). The class that you specify must be
on the class path which was current when TOPCAT was started,
as explained in Section 9.2.1.
Note, however, that it would be more usual to specify these using
the system property jel.classes or jel.classes.activation at startup,
as described in Section 6.9.
Classes added in this way will be visible in the tree, but may
not have proper documentation (clicking on them may not reveal
a description in the right hand panel).
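As a rough illustration of the kind of class that might be added in this way, the sketch below defines a class with a couple of static methods. The class name MyFunctions and both method names are invented for this example. It also assumes that it is a class's public static methods that become available as functions, which is what the description of functions as "Java methods" suggests; Section 6.9 remains the authoritative description of the requirements.

```java
/**
 * Hypothetical user-defined function class (all names invented for illustration).
 * Assumption: the public static methods of a class on the class path are what
 * become usable as functions in algebraic expressions.
 * For real use the class would normally be declared public in its own file,
 * MyFunctions.java; it is left package-private here so the sketch compiles
 * wherever it is pasted.
 */
final class MyFunctions {

    private MyFunctions() {
    }

    /** Returns the quadrature sum of two values, e.g. for combining errors. */
    public static double quadSum(double a, double b) {
        return Math.sqrt(a * a + b * b);
    }

    /** Returns the difference of two magnitudes as a colour index. */
    public static double colour(double mag1, double mag2) {
        return mag1 - mag2;
    }

    public static void main(String[] args) {
        System.out.println(quadSum(3.0, 4.0));
        System.out.println(colour(14.6, 13.8));
    }
}
```

With the compiled class on the class path, its name could be entered using the Add button, or given at startup through the jel.classes system property (a Java system property is set with a -Dname=value argument to the java command). An expression such as quadSum(ERR_X, ERR_Y) could then be tried; the column names here are made up.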
Log Window
The log window can be obtained using the View Log option on the File menu of the Control Window.
This window displays any log messages which the application has
generated. Depending on whether the -verbose flag has
been specified, some or all of these messages may have been written
to the console as well (if there is a console - this depends on how you
have invoked TOPCAT).
Under some circumstances, older messages in the list may not be
displayed.
To clear the display of all the existing messages you can use
the Clear Log button.
The messages displayed here are those written through Java's
logging system - in general they are intended for
debugging purposes and not for users to read, but if something
unexpected is happening, or if you are filing a bug report,
they may provide some clues about what's going on.
Although it tries not to disturb things too much, TOPCAT's
manipulation of the logging infrastructure affects how it is
set up, so if you have customised your logging setup using,
e.g., the java.util.logging.config.* system
properties, you may find that it's not behaving exactly as
you expected. Sorry.
This appendix lists the functions which can be used in algebraic expressions (see Section 6). They are listed in two sections: the first gives the functions available for use anywhere an expression can be used, and the second gives those only for use in defining custom Activation Actions.
Note that although all the available functions are listed here
with short descriptions, their full explanation, including parameter
descriptions and examples, is only available from the
Available Functions Window,
obtained using the toolbar button.
The following functions can be used anywhere that you can write an algebraic expression in TOPCAT. They will typically be used for defining new synthetic columns or algebraically-defined row subsets. More complete documentation of them is available from within TOPCAT in the Available Functions Window.
Functions for angle transformations and manipulations. In particular, methods for translating between radians and HH:MM:SS.S or DDD:MM:SS.S type sexagesimal representations are provided.
dm[s], or some others.
Additional spaces and leading +/- are permitted.
hm[s], or some others.
Additional spaces and leading +/- are permitted.
In conversions of this type, one has to be careful to get the
sign right in converting angles which are between 0 and -1 degrees.
This routine uses the sign bit of the deg argument,
taking care to distinguish between +0 and -0 (their internal
representations are different for floating point values).
It is illegal for the min or sec arguments to be negative.
In conversions of this type, one has to be careful to get the
sign right in converting angles which are between 0 and -1 hours.
This routine uses the sign bit of the hour argument,
taking care to distinguish between +0 and -0 (their internal
representations are different for floating point values).
bepoch parameter is the epoch at which the position in
the FK4 frame was determined.
bepoch parameter is the epoch at which the position in
the FK4 frame was determined.
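The sign handling described above for angles between 0 and -1 degrees can be sketched in plain Java as follows. This is only an illustration of the technique, not TOPCAT's own code: the class and method names are invented, and the result is returned in degrees for simplicity.

```java
/** Sketch of sign handling in sexagesimal conversion (names are hypothetical). */
final class SexagesimalSketch {

    private SexagesimalSketch() {
    }

    /**
     * Converts degrees, minutes and seconds to decimal degrees.
     * The sign is taken from the sign bit of deg, so that -0.0 is
     * treated as negative; min and sec must not be negative.
     */
    static double dmsToDegrees(double deg, double min, double sec) {
        if (min < 0 || sec < 0) {
            throw new IllegalArgumentException("min and sec must not be negative");
        }
        // Math.copySign reads the sign bit, so it distinguishes -0.0 from +0.0,
        // whereas the comparison (deg < 0) is false for both.
        double sign = Math.copySign(1.0, deg);
        return sign * (Math.abs(deg) + min / 60.0 + sec / 3600.0);
    }

    public static void main(String[] args) {
        System.out.println(dmsToDegrees(-0.0, 30, 0));
        System.out.println(dmsToDegrees(0.0, 30, 0));
    }
}
```

In plain Java an integer zero carries no sign, so the degrees value has to be a floating point -0.0 for a position such as -0:30:00 to come out negative.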
Standard arithmetic functions including things like rounding, sign manipulation, and maximum/minimum functions.
float (32-bit floating point value),
so this is only suitable for relatively low-precision values.
It's intended for truncating the number of apparent significant
figures represented by a value which you know has been obtained
by combining other values of limited precision.
For more control, see the functions in the Formats class.
Functions for conversion between flux and magnitude values. Functions are provided for conversion between flux in Janskys and AB magnitudes.
Some constants for approximate conversions between different magnitude scales are also provided:
JOHNSON_AB_*, for Johnson <-> AB magnitude conversions
(http://www.astro.utoronto.ca/~patton/astro/mags.html,
citing Frei and Gunn 1995).
VEGA_AB_*, for Vega <-> AB magnitude conversions
(Blanton et al., Astronomical Journal 127, 2562-2578 (2005),
eqs.(5)).
JOHNSON_AB_V.
JOHNSON_AB_B.
JOHNSON_AB_Bj.
JOHNSON_AB_R.
JOHNSON_AB_I.
JOHNSON_AB_g.
JOHNSON_AB_r.
JOHNSON_AB_i.
JOHNSON_AB_Rc.
JOHNSON_AB_Ic.
JOHNSON_AB_uPrime=u'AB.
JOHNSON_AB_gPrime=g'AB.
JOHNSON_AB_rPrime=r'AB.
JOHNSON_AB_iPrime=i'AB.
JOHNSON_AB_zPrime=z'AB.
VEGA_AB_J.
VEGA_AB_H.
VEGA_AB_K.
F/Jy=10^(23-(AB+48.6)/2.5)
AB=2.5*(23-log10(F/Jy))-48.6
F=lumin/(4 x Pi x dist^2)
lumin=(4 x Pi x dist^2) F
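The four formulae above translate directly into Java. The sketch below is an illustration under stated assumptions rather than TOPCAT's implementation: the class name is invented, and the names abToJansky and janskyToAb are chosen for this example (fluxToLuminosity and luminosityToFlux follow the names used elsewhere in this document).

```java
/** Sketch of the flux/magnitude formulae given in the text (not TOPCAT's own code). */
final class FluxSketch {

    private FluxSketch() {
    }

    /** F/Jy = 10^(23 - (AB + 48.6) / 2.5) */
    static double abToJansky(double ab) {
        return Math.pow(10.0, 23.0 - (ab + 48.6) / 2.5);
    }

    /** AB = 2.5 * (23 - log10(F/Jy)) - 48.6 */
    static double janskyToAb(double fluxJy) {
        return 2.5 * (23.0 - Math.log10(fluxJy)) - 48.6;
    }

    /** F = lumin / (4 x Pi x dist^2); units follow those of the arguments. */
    static double luminosityToFlux(double lumin, double dist) {
        return lumin / (4.0 * Math.PI * dist * dist);
    }

    /** lumin = (4 x Pi x dist^2) F; units follow those of the arguments. */
    static double fluxToLuminosity(double flux, double dist) {
        return 4.0 * Math.PI * dist * dist * flux;
    }
}
```

The two members of each pair are inverses of one another, so converting a value one way and then back should reproduce it to within floating point rounding.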
Functions which operate on array-valued cells. You can only use these functions on values which are already arrays. In most cases that means on values in table columns which are declared as array-valued. FITS and VOTable tables can have columns which contain array values, but other formats such as CSV cannot.
array is not a numeric array, null is returned.
array is not a numeric array, null is returned.
array is not a numeric array, null is returned.
array is not a numeric array, null is returned.
array is not an array, zero is returned.
Pixel tiling functions for the celestial sphere.
k value is the logarithm to base 2 of the Nside parameter.
k. This k value is the logarithm to base 2 of the Nside parameter.
level parameter suitable for a given pixel size.
String manipulation and query functions.
s1+s2, but blank values can sometimes appear as
the string "null" if you do it like that.
s1+s2+s3, but blank values can sometimes appear as
the string "null" if you do it like that.
s1+s2+s3+s4, but blank values can sometimes appear as
the string "null" if you do it like that.
s1==s2, which can (for technical reasons) return false even if the
strings are the same.
startIndex and continues to the character at index endIndex-1.
Thus the length of the substring is endIndex-startIndex.
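The behaviour behind these remarks can be seen in plain Java, on which the expression language is based. The sketch below is illustrative only: the class and helper names are invented, and treating a blank value as an empty string is an assumption made for the example rather than a statement of how TOPCAT's own functions behave.

```java
/** Sketch of the string pitfalls mentioned in the text (names are hypothetical). */
final class StringSketch {

    private StringSketch() {
    }

    /** Concatenates two strings, treating null (blank) as an empty string. */
    static String concatBlankSafe(String s1, String s2) {
        return (s1 == null ? "" : s1) + (s2 == null ? "" : s2);
    }

    /** Compares string contents rather than object identity, allowing for nulls. */
    static boolean sameContent(String s1, String s2) {
        return s1 == null ? s2 == null : s1.equals(s2);
    }

    public static void main(String[] args) {
        String blank = null;
        System.out.println("abc" + blank);                  // plain + inserts the text "null"
        System.out.println(concatBlankSafe("abc", blank));  // blank contributes nothing
        String a = "NGC";
        String b = new String("NGC");                       // same characters, different object
        System.out.println(a == b);                         // compares object identity
        System.out.println(sameContent(a, b));              // compares characters
        System.out.println("catalogue".substring(0, 3));    // indices 0 to 2, length 3-0
    }
}
```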
Functions for formatting numeric values.
formatDecimal function.
format string is as defined by Java's java.text.DecimalFormat class.
formatDecimal function.
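For readers unfamiliar with java.text.DecimalFormat patterns, the following sketch shows a few of them in use. It calls the standard Java class directly rather than any TOPCAT function; the class and method names are invented, and the English locale is fixed here only so that the decimal point and grouping characters are predictable.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

/** Sketch of java.text.DecimalFormat pattern strings (names are hypothetical). */
final class FormatSketch {

    private FormatSketch() {
    }

    /** Formats a value using a DecimalFormat pattern with English symbols. */
    static String format(double value, String pattern) {
        DecimalFormat fmt =
            new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.ENGLISH));
        return fmt.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(3.14159, "0.00"));    // 0 marks a digit that is always shown
        System.out.println(format(0.5, "0.00"));
        System.out.println(format(42, "000"));          // leading zeros pad to three digits
        System.out.println(format(1234.5, "#,##0.0"));  // # marks an optional digit
    }
}
```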
Standard mathematical and trigonometric functions.
x,y) to polar (r,theta).
This method computes the phase theta by computing an arc tangent
of y/x in the range of -pi to pi.
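The rectangular to polar conversion described above can be sketched with the standard java.lang.Math methods. The class and method names below are invented for illustration. Note that Math.atan2 takes the y value as its first argument; the argument order of the corresponding TOPCAT function should be checked in the Available Functions Window.

```java
/** Sketch of rectangular-to-polar conversion using java.lang.Math (names are hypothetical). */
final class PolarSketch {

    private PolarSketch() {
    }

    /**
     * Returns {r, theta} for the point (x, y).
     * theta is in radians, in the range -pi to pi, measured from the X axis.
     */
    static double[] toPolar(double x, double y) {
        double r = Math.hypot(x, y);
        double theta = Math.atan2(y, x);  // y first, then x
        return new double[] { r, theta };
    }

    public static void main(String[] args) {
        double[] polar = toPolar(3.0, 4.0);
        System.out.println("r=" + polar[0] + " theta=" + polar[1]);
    }
}
```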
Functions for conversion of time values between various forms. The forms used are
yyyy-mm-ddThh:mm:ss.s, where the T
is a literal character (a space character may be used instead).
Based on UTC.
Therefore midday on the 25th of October 2004 is
2004-10-25T12:00:00 in ISO 8601 format,
53303.5 as an MJD value,
2004.81588 as a Julian Epoch and
2004.81726 as a Besselian Epoch.
Currently this implementation cannot be relied upon to better than a millisecond.
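The example values quoted above can be reproduced with a short sketch. It is not TOPCAT's implementation: the class and method names are chosen for illustration, the epoch formulae are the commonly used definitions of Julian and Besselian epoch (an assumption, since this document does not spell them out), and this version only handles whole seconds in the exact form yyyy-mm-ddThh:mm:ss.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

/** Sketch of conversions between ISO 8601, MJD and epochs (names are hypothetical). */
final class TimeSketch {

    /** MJD corresponding to the Unix epoch, 1970-01-01T00:00:00. */
    static final double MJD_UNIX_EPOCH = 40587.0;

    private TimeSketch() {
    }

    /** Converts an ISO 8601 date-time string, taken as UTC, to MJD (fractional seconds are dropped). */
    static double isoToMjd(String isoDate) {
        long unixSec = LocalDateTime.parse(isoDate).toEpochSecond(ZoneOffset.UTC);
        return unixSec / 86400.0 + MJD_UNIX_EPOCH;
    }

    /** Julian epoch: years of 365.25 days counted from J2000.0 (MJD 51544.5). */
    static double mjdToJulian(double mjd) {
        return 2000.0 + (mjd - 51544.5) / 365.25;
    }

    /** Besselian epoch: tropical years counted from B1900.0 (MJD 15019.81352). */
    static double mjdToBesselian(double mjd) {
        return 1900.0 + (mjd - 15019.81352) / 365.242198781;
    }

    public static void main(String[] args) {
        double mjd = isoToMjd("2004-10-25T12:00:00");
        System.out.println(mjd);
        System.out.println(mjdToJulian(mjd));
        System.out.println(mjdToBesselian(mjd));
    }
}
```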
isoDate argument is yyyy-mm-ddThh:mm:ss.s, though some deviations
from this form are permitted:
'T' which separates date from time can be replaced by a space
'Z' (which indicates UTC) may be appended to the time
"1994-12-21T14:18:23.2", "1968-01-14", and "2112-05-25 16:45Z".
yyyy-mm-ddThh:mm:ss.
yyyy-mm-dd.
hh:mm:ss.
java.text.SimpleDateFormat class.
The default output corresponds to the string "yyyy-MM-dd'T'HH:mm:ss"
Functions for converting between different measures of cosmological distance.
The following parameters are used:
For a flat universe, omegaM+omegaLambda=1
The terms and formulae used here are taken from the paper by D.W.Hogg, Distance measures in cosmology, astro-ph/9905116 v4 (2000).
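As a sketch of how such distance measures can be computed, the code below integrates the expression for the line-of-sight comoving distance given in Hogg's paper and derives a luminosity distance for the flat case. It is an illustration under stated assumptions, not TOPCAT's code: the class and method names and the parameter name h0 (Hubble constant in km/s/Mpc) are invented, radiation is ignored, and Simpson's rule is used simply as a convenient numerical integrator.

```java
/** Sketch of distance measures following Hogg (2000); names are hypothetical. */
final class CosmologySketch {

    /** Speed of light in km/s. */
    static final double C_KM_S = 299792.458;

    private CosmologySketch() {
    }

    /** Dimensionless Hubble parameter E(z); the curvature term is 1 - omegaM - omegaLambda. */
    static double e(double z, double omegaM, double omegaLambda) {
        double omegaK = 1.0 - omegaM - omegaLambda;
        double a = 1.0 + z;
        return Math.sqrt(omegaM * a * a * a + omegaK * a * a + omegaLambda);
    }

    /**
     * Line-of-sight comoving distance in Mpc, integrating 1/E(z) from 0 to z
     * with Simpson's rule; h0 is in km/s/Mpc.
     */
    static double comovingDistance(double z, double h0, double omegaM, double omegaLambda) {
        int n = 1000;                       // number of intervals, must be even
        double step = z / n;
        double sum = 1.0 / e(0.0, omegaM, omegaLambda) + 1.0 / e(z, omegaM, omegaLambda);
        for (int i = 1; i < n; i++) {
            double weight = (i % 2 == 1) ? 4.0 : 2.0;
            sum += weight / e(i * step, omegaM, omegaLambda);
        }
        return (C_KM_S / h0) * sum * step / 3.0;
    }

    /** Luminosity distance in Mpc for a flat universe (omegaLambda = 1 - omegaM). */
    static double luminosityDistanceFlat(double z, double h0, double omegaM) {
        return (1.0 + z) * comovingDistance(z, h0, omegaM, 1.0 - omegaM);
    }

    public static void main(String[] args) {
        System.out.println(luminosityDistanceFlat(1.0, 70.0, 0.3));
    }
}
```

A convenient check is the matter-only flat case (omegaM=1), where the integral has the closed form 2(1 - 1/sqrt(1+z)) in units of c/H0.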
Warning: this makes some reasonable assumptions about the cosmology and returns the luminosity distance. It is only intended for approximate use. If you care about the details, use one of the more specific functions here.
Warning: this makes some reasonable assumptions about the cosmology. It is only intended for approximate use. If you care about the details use one of the more specific functions here.
z were emitted.
z.
Functions for converting between strings and numeric values.
The following functions can be used only for defining custom Activation Actions - they mostly deal with causing something to happen, such as popping up an image display window. They generally return a short string, which will be logged to the user to give an indication of what happened (or didn't happen, or should have happened). More complete documentation of them is available from within TOPCAT in the Available Functions Window.
Displays URLs in web browsers.
-remote openURL(url)".
Probably only works on Unix-like operating systems, and only
if the browser is already running.
Functions for display of graphics-format images in a no-frills
viewing window (an ImageWindow).
Supported image formats include GIF, JPEG, PNG and FITS,
which may be compressed.
label may be any string which identifies the window
for display, so that multiple images may be displayed in different
windows without getting in each other's way.
loc should be a filename or URL, pointing to an image in
a format that this viewer understands.
Functions for display of spectra in the external viewer SPLAT.
label may be any string which identifies the window
for display, so that multiple (sets of) spectra may be displayed
in different windows without getting in each other's way.
loc should be a filename pointing to a spectrum
in a format that SPLAT understands (includes FITS, NDF).
In some cases, a URL can be used too.
Specialist display functions for use with the SuperCOSMOS survey. These functions display cutout images from the various archives hosted at the SuperCOSMOS Sky Surveys (http://www-wfau.roe.ac.uk/sss/). In most cases these cover the whole of the southern sky.
pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square. Sky coverage is complete.
pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square. Sky coverage is -90<Dec<+2.5 (degrees).
pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square. Sky coverage is -90<Dec<+2.5 (degrees).
pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square. Sky coverage is -90<Dec<+2.5 (degrees).
pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square. Sky coverage is -90<Dec<+2.5 (degrees).
pixels pixels in the X and Y dimensions. Pixels are approximately 0.67 arcsec square. Sky coverage is -20.5<Dec<+2.5 (degrees).
Specialist functions for use with data from the Millennium Galaxy Survey.
Functions for simple logging output.
Executes commands on the local operating system. These are executed as if typed in from the shell, or command line.
Functions for display of images in a window. Supported image formats include GIF, JPEG, PNG and FITS, which may be compressed. The SoG program (http://www.starlink.ac.uk/sog/) will be used if it is available, otherwise a no-frills image viewer will be used instead.
Specialist functions for use with data from the 2QZ survey. Spectral data are taken directly from the 2QZ web site at http://www.2dfquasar.org/.
Specialist display functions for use with the Sloan Digital Sky Server.
scale argument. The displayed image has pixels pixels along each side.
Functions for display of images in external viewer SOG (http://www.starlink.ac.uk/sog/).
label may be any string which identifies the window
for display, so that multiple images may be displayed in different
windows without getting in each other's way.
loc should be a filename or URL, pointing to an image in
a format that SOG understands (this includes FITS, compressed FITS,
and NDFs).
Functions for general display of spectra in a window. Display is currently done using the SPLAT program, if available (http://www.starlink.ac.uk/splat/). Recognised spectrum formats include 1-dimensional FITS arrays and NDF files.
This is TOPCAT, Tool for OPerations on Catalogues And Tables. It is a general purpose viewer and editor for astronomical tabular data.
Related software products are
See the TOPCAT web page, http://www.starlink.ac.uk/topcat/ for the latest news and releases.
TOPCAT was initially (2003-2005) developed under the UK Starlink project (1980-2005, R.I.P.). From July 2005 until June 2006, it was supported by grant PP/D002486/1 from the UK's Particle Physics and Astronomy Research Council. Maintenance and development were funded from July 2006 until December 2007 by the European VOTech project within the UK's AstroGrid, and directly from AstroGrid funding beyond that.
Inspiration for many of TOPCAT's features has been taken from the following pre-existing tools:
Apart from the excellent Java 2 Standard Edition itself, the following external libraries provide important parts of TOPCAT's functionality:
The following users, testers and programmers have supplied useful comments (apologies for any missed out):
Releases to date have been as follows:
tabular environment now available.
compress now work (as well as gzip and bzip2).
-demo starts up with demo data.
-disk" flag allows use of disk backing storage for large tables
In addition, the following incompatibilities and changes have been introduced since the last version:
-f" flag). FITS files and VOTables can
still be identified automatically (i.e. it's not necessary to
specify format in this case) but ASCII tables cannot:
you must now specify the format when loading ASCII tables.
This change allows better error messages and support for
more text-like formats.
jel.classes" and "jel.classes.activation", not "gnu.jel.static.classes".
Secondly, the provision of load dialogues has been modularised, and a number of new dialogues provided. The new ones are:
startable.load.dialogs system property.
The appearance of the Load Window has changed; now only the File Browser button is visible along with the Location field in the body of the window, but the DataSources menu can be used to display other available table import dialogues.
topcat-full.jar and topcat-lite.jar.
The former is much larger than before (11 Mbyte),
since it contains a number
of classes to support custom load dialogues such as the MySpace
browser and web service interaction, as well as the SoG classes.
The latter contains only the classes for the core functionality,
and is much smaller (3 Mbyte).
topcat -help is now more comprehensive,
describing briefly what each option does and listing system
properties as well as arguments/flags proper.
In addition, the save dialogue now displays the current row subset and sort order - this makes it easier to see and/or change the details of the table you're about to save.
exec functions which execute commands on the local operating system
-verbose (or -v) flag one or more times you can get those messages back.
The messages (in fact all logging messages at any level)
can also be viewed from the GUI by using the new
File|Show Log menu option from the Control Window.
tablecopy tool is no longer covered in this
document; it is replaced by the tcopy tool in
the separate STILTS package.
There has also been some reorganisation of this document, mainly
in the appendices.
-version flag
NULL_ test on the first column of a table.
Times class.
RANDOM special function.
null" interpreted as a blank value in ASCII tables.
roundDecimal and formatDecimal functions
introduced for more control over visual appearance of numeric values.
Some non graphics-related improvements have also been made as follows:
-soap flag on the command line. This facility may be withdrawn in
future versions, in view of the fact that the PLASTIC service
can provide similar functionality.
showObjects message,
it now checks if a matching subset exists rather than always
creating and adding a new one. If it does, it just sets current
the existing one. This can cut down (a bit) on proliferation
of Row Subsets.
file: scheme sent by TOPCAT in
PLASTIC messages now correctly conform to RFC 1738.
-Dmyspace.cache=true to speed it up at the
expense of accuracy.
ivo://votech.org/votable/highlightObject message,
see Section 8.2 and Section 8.3.
csv-noheader output format.
votable-fits-href and votable-binary-href format tables from the file
browser.
mark.workaround system property, see Section 9.2.3.
startable.storage policy "sideways") have been introduced.
These can provide considerable efficiency improvements for
certain tasks when working with very large (and especially wide)
tables.
ivo: or myspace: URLs is now provided - see new Section 4.2.
toHex and fromHex numeric conversion functions.
-J flag to topcat startup script
for passing flags directly to Java.
ivo://votech.org/votable/loadFromURL message.
sinh, cosh, tanh and inverses)
Maths class (sinh, cosh, tanh and inverses).
Graphics upgrades
param$ notation (Section 6.3),
and both columns and parameters can be referenced by UCD using
ucd$ notation (Section 6.1).
Receiving a row subset from PLASTIC in this way, and certain other actions, now cause the subset to be shown straight away (and updated if necessary) on any existing plots, which makes this kind of PLASTIC interaction more responsive.
The size of each subset, and also the corresponding percentage of
the table it represents, is now calculated automatically and
displayed in the Subset Window.
The old behaviour of only calculating sizes on request can be
reinstated using the Autocount rows
menu item if required.
formatDecimalLocal() functions in class Formats.
fluxToLuminosity and luminosityToFlux functions in class Fluxes.
gcj).
TNULLn header cards - write them as numeric not string values.
-exthub flag which starts a new external PLASTIC hub.
-stilts convenience flag so you can easily
run STILTS from a TOPCAT installation.
fluxToLuminosity function.
.starjava.properties file.
datatype attribute.
-disk flag is now honoured when loading
tables from JDBC, which makes it possible to input larger
datasets from RDBMS.