In many cases tables are stored in some sort of unstructured plain
text format, with cells separated by spaces or some other delimiters.
There is a wide variety of such formats depending on what delimiters
are used, how columns are identified, whether blank values are permitted
and so on. It is impossible to cope with them all, but TOPCAT
attempts to make a good guess about how to interpret a given ASCII file as
a table, which in many cases is successful. In particular, if you just
have columns of numbers separated by something that looks like spaces,
you should be just fine.
Here are the detailed rules for how the ASCII-format tables are
interpreted:
- Bytes in the file are interpreted as ASCII characters
- Each table row is represented by a single line of text
- Lines are terminated by one or more contiguous line termination
characters: line feed (0x0A) or carriage return (0x0D)
- Within a line, fields are separated by one or more whitespace
characters: space (" ") or tab (0x09)
- A field is either an unquoted sequence of non-whitespace characters,
or a sequence of non-newline characters between matching
single (') or double (") quote characters -
spaces are therefore allowed in quoted fields
- Within a quoted field, whitespace characters are permitted and are
treated literally
- Within a quoted field, any character preceded by a backslash character
("\") is treated literally. This allows quote characters to appear
within a quoted string.
- An empty quoted string (two adjacent quotes)
or the string "null" (unquoted) represents
the null value
- All data lines must contain the same number of fields (this is the
number of columns in the table)
- The data type of a column is guessed according to the fields that
appear in the table. If all the fields in one column can be parsed
as integers (or null values), then that column will turn into an
integer-type column. The types that are tried, in order of
preference, are:
Boolean, Short, Integer, Long, Float, Double, String
- Some special values are permitted for floating point columns:
NaN for not-a-number, which is treated the same as a
null value for most purposes, and Infinity or inf
for infinity (with or without a preceding +/- sign).
These values are matched case-insensitively.
- Empty lines are ignored
- Anything after a hash character "#" (except one in a quoted string)
on a line is ignored as far as table data goes;
any line which starts with a "!" is also ignored.
However, lines which start with a "#" or "!" at the start of the table
(before any data lines) will be interpreted as metadata as follows:
- The last "#"/"!"-starting line before the first data line may
contain the column names. If it has the same number of fields as
there are columns in the table, each field will be taken to be
the title of the corresponding column. Otherwise, it will be
taken as a normal comment line.
- Any comment lines before the first data line not covered by the
above will be concatenated to form the "description" parameter
of the table.
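The field-splitting rules above can be sketched in a few lines of Python. This is only an illustration of the rules as listed here, not TOPCAT's own parser: the function name split_fields is hypothetical, and several details the rules leave open are assumptions (an unquoted field is taken to end at a "#", a "!" only counts at the very first character of the line, and an unterminated quote is treated as an error).

```python
def split_fields(line):
    """Split one line of text into fields, following the rules listed above.

    Returns a list of strings, with None standing in for a null value.
    Comment-only and empty lines give an empty list.
    """
    if line.startswith("!"):            # assumption: "!" must be the first character
        return []
    fields = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch in " \t":                 # whitespace separates fields
            i += 1
        elif ch == "#":                 # unquoted hash: the rest is a comment
            break
        elif ch in "'\"":               # quoted field: read to the matching quote
            quote, chars = ch, []
            i += 1
            while i < n and line[i] != quote:
                if line[i] == "\\" and i + 1 < n:
                    i += 1              # backslash: take the next character literally
                chars.append(line[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated quoted field")
            i += 1                      # step over the closing quote
            text = "".join(chars)
            fields.append(text if text else None)   # empty quoted string is null
        else:                           # unquoted field: run of non-whitespace
            start = i
            while i < n and line[i] not in " \t#":
                i += 1
            text = line[start:i]
            fields.append(None if text == "null" else text)
    return fields
```

Calling split_fields on each line of a file (after stripping the line terminator) and discarding the empty results gives the data rows; a real reader would also check that every row has the same number of fields.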
If the list of rules above looks frightening, don't worry:
in many cases TOPCAT ought
to make sense of a table without you having to read the small print.
Here is an example of a suitable ASCII-format table:
#
# Here is a list of some animals.
#
# RECNO  SPECIES      NAME             LEGS  HEIGHT/m
  1      pig          "Pigling Bland"  4     0.8
  2      cow          Daisy            4     2
  3      goldfish     Dobbin           ""    0.05
  4      ant          ""               6     0.001
  5      ant          ""               6     0.001
  6      ant          ''               6     0.001
  7      "queen ant"  'Ma\'am'         6     2e-3
  8      human        "Mark"           2     1.8
In this case it will identify the following columns:
Name      Type
----      ----
RECNO     Short
SPECIES   String
NAME      String
LEGS      Short
HEIGHT/m  Float
It will also use the text "Here is a list of some animals"
as the Description parameter of the table.
Without any of the comment lines, it would still interpret the table,
but the columns would be given the names col1..col5.
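The way a column type is chosen for the example can be illustrated with a short Python sketch that tries the types in the stated order of preference. It is not TOPCAT's actual implementation. The integer ranges assume the usual 16-, 32- and 64-bit sizes for Short, Integer and Long; the accepted Boolean spellings ("true"/"false") and the rule used to tell Float from Double (at most 6 significant digits and within single-precision range) are assumptions made purely for illustration, and all names in the code are hypothetical.

```python
import re

INT_RE = re.compile(r"[+-]?\d+")
NUM_RE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")
# NaN, Infinity and inf, matched case-insensitively; sign allowed on infinity.
SPECIAL_RE = re.compile(r"[+-]?(infinity|inf)|nan", re.IGNORECASE)

def _int_in(bits):
    """Return a test for integers that fit in a signed type of the given size."""
    lo, hi = -2 ** (bits - 1), 2 ** (bits - 1) - 1
    return lambda s: bool(INT_RE.fullmatch(s)) and lo <= int(s) <= hi

def _is_boolean(s):
    return s.lower() in ("true", "false")       # assumed spellings

def _is_double(s):
    return bool(NUM_RE.fullmatch(s) or SPECIAL_RE.fullmatch(s))

def _is_float(s):
    if SPECIAL_RE.fullmatch(s):
        return True
    if not NUM_RE.fullmatch(s):
        return False
    # Assumed criterion: few significant digits and single-precision range.
    mantissa = re.split("[eE]", s)[0]
    digits = re.sub(r"\D", "", mantissa).lstrip("0")
    return len(digits) <= 6 and abs(float(s)) <= 3.4028235e38

CANDIDATES = [
    ("Boolean", _is_boolean),
    ("Short", _int_in(16)),
    ("Integer", _int_in(32)),
    ("Long", _int_in(64)),
    ("Float", _is_float),
    ("Double", _is_double),
    ("String", lambda s: True),
]

def guess_type(values):
    """Return the first type name that fits every non-null value in a column."""
    present = [v for v in values if v is not None]
    for name, fits in CANDIDATES:
        if all(fits(v) for v in present):
            return name
```

With the LEGS column of the example, guess_type(["4", "4", None, "6", "6", "6", "6", "2"]) gives "Short", and the HEIGHT/m values give "Float". One quirk of this sketch is that a column containing only nulls comes out as "Boolean", since every test passes trivially.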
If you understand the format of your files but they don't exactly
match the criteria above, the best thing is probably to write a
simple free-standing program or script which will convert them
into the format described here.
You may find Perl or awk suitable languages for this sort of thing.
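Python is another reasonable choice. As a minimal sketch, assuming the input is a comma-separated file whose first row holds the column names, the following converts it to the format described here. The function names are hypothetical, and the decision to quote any value containing whitespace, quotes, a backslash or a "#" is simply a cautious reading of the rules above.

```python
import csv
import io

def quote_field(value):
    """Return a value as one field of the whitespace-separated format."""
    if value == "":
        return '""'                     # empty quoted string represents null
    risky = any(c in value for c in " \t#'\"\\")
    if not (risky or value == "null" or value.startswith("!")):
        return value                    # safe to write unquoted
    # Escape backslashes first, then the double quotes used as delimiters.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'

def csv_to_ascii(src, dst):
    """Copy CSV rows from src to dst, writing the header as a '#' line."""
    rows = csv.reader(src)
    header = next(rows, None)
    if header is None:                  # empty input: nothing to write
        return
    dst.write("# " + " ".join(quote_field(h) for h in header) + "\n")
    for row in rows:
        dst.write(" ".join(quote_field(v) for v in row) + "\n")

# Small in-memory demonstration.
demo = io.StringIO()
csv_to_ascii(io.StringIO("RECNO,SPECIES,NAME\n1,pig,Pigling Bland\n4,ant,\n"), demo)
print(demo.getvalue(), end="")
```

For real files, open the input with newline="" as the csv module recommends. Values containing embedded line breaks are not handled, since each table row must fit on a single line.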
This format is not detected automatically - you must specify that
you wish to load a table in ascii format.
TOPCAT - Tool for OPerations on Catalogues And Tables
Starlink User Note 253
TOPCAT web page: http://www.starlink.ac.uk/topcat/
Author email: m.b.taylor@bristol.ac.uk
Mailing list: topcat-user@bristol.ac.uk