Next: Comma-Separated Values
Up: Supplied Input Handlers
Previous: VOTable
In many cases tables are stored in some sort of unstructured plain
text format, with cells separated by spaces or some other delimiters.
The AsciiTableBuilder class attempts to read such files and interpret
their contents sensibly, but because delimiters and value formats vary
so widely, it won't always succeed.
Here are the rules for how the ASCII-format table handler reads tables:
- Bytes in the file are interpreted as ASCII characters
- Each table row is represented by a single line of text
- Lines are terminated by one or more contiguous line termination
characters: line feed (0x0A) or carriage return (0x0D)
- Within a line, fields are separated by one or more whitespace
characters: space (" ") or tab (0x09)
- A field is either an unquoted sequence of non-whitespace characters,
or a sequence of non-newline characters between matching
single (') or double (") quote characters -
spaces are therefore allowed in quoted fields
- Within a quoted field, whitespace characters are permitted and are
treated literally
- Within a quoted field, any character preceded by a backslash character
("\") is treated literally. This allows quote characters to appear
within a quoted string.
- An empty quoted string (two adjacent quotes) or the string
"null" (unquoted) represents the null value
- All data lines must contain the same number of fields (this is the
number of columns in the table)
- The data type of a column is guessed according to the fields that
appear in the table. If all the fields in one column can be parsed
as integers (or null values), then that column will turn into an
integer-type column. The types that are tried, in order of
preference, are: Boolean, Short, Integer, Long, Float, Double, String
- Empty lines are ignored
- Anything after a hash character "#" (except one in a quoted string)
on a line is ignored as far as table data goes;
any line which starts with a "!" is also ignored.
However, lines which start with a "#" or "!" at the start of the table
(before any data lines) will be interpreted as metadata as follows:
- The last "#"/"!"-starting line before the first data line may
contain the column names. If it has the same number of fields as
there are columns in the table, each field will be taken to be
the title of the corresponding column. Otherwise, it will be
taken as a normal comment line.
- Any comment lines before the first data line not covered by the
above will be concatenated to form the "description" parameter
of the table.
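The field-splitting rules above can be sketched in a few lines of Python. This is an illustration only, not the AsciiTableBuilder implementation: the function names split_fields and read_rows are hypothetical, and the treatment of cases the rules leave open (for instance a quote character in the middle of an unquoted field, which is kept as ordinary text here) is an assumption.

```python
import re


def split_fields(line):
    """Split one line into fields; None stands for the null value."""
    fields = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch in " \t":                 # whitespace separates fields
            i += 1
        elif ch == "#":                 # unquoted hash: ignore the rest
            break
        elif ch in "'\"":               # quoted field
            quote = ch
            i += 1
            chars = []
            while i < n and line[i] != quote:
                if line[i] == "\\" and i + 1 < n:
                    i += 1              # backslash: next character is literal
                chars.append(line[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated quoted field")
            i += 1                      # skip the closing quote
            text = "".join(chars)
            fields.append(text if text else None)   # "" or '' is null
        else:                           # unquoted run of non-whitespace
            start = i
            while i < n and line[i] not in " \t#":
                i += 1
            text = line[start:i]
            fields.append(None if text == "null" else text)
    return fields


def read_rows(text):
    """Return the data rows of a whole file, skipping comments and blanks."""
    rows = []
    for line in re.split(r"[\r\n]+", text):   # runs of LF/CR end a line
        if line.startswith("!"):
            continue
        fields = split_fields(line)
        if not fields:                  # empty or comment-only line
            continue
        if rows and len(fields) != len(rows[0]):
            raise ValueError("rows have differing numbers of fields")
        rows.append(fields)
    return rows
```

Note that a quoted "null" stays an ordinary string in this sketch; only the unquoted word and the empty quoted string are mapped to the null value.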
If the list of rules above looks frightening, don't worry:
in many cases the handler ought to make sense of a table
without you having to read the small print.
Here is an example of a suitable ASCII-format table:
    #
    # Here is a list of some animals.
    #
    # RECNO  SPECIES      NAME             LEGS  HEIGHT/m
      1      pig          "Pigling Bland"  4     0.8
      2      cow          Daisy            4     2
      3      goldfish     Dobbin           ""    0.05
      4      ant          ""               6     0.001
      5      ant          ""               6     0.001
      6      ant          ''               6     0.001
      7      "queen ant"  'Ma\'am'         6     2e-3
      8      human        "Mark"           2     1.8
In this case it will identify the following columns:
    Name      Type
    ----      ----
    RECNO     Integer
    SPECIES   String
    NAME      String
    LEGS      Integer
    HEIGHT/m  Float
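The type-guessing rule can be illustrated with the following Python sketch, which tries each type in the listed order of preference and picks the first one that accepts every non-null cell in a column. It is not the handler's own code: the accepted spellings of booleans, the integer ranges (taken from the sizes of the corresponding Java types) and the test used to tell Float from Double (at most six significant digits and a magnitude within single-precision range) are all assumptions. Because it follows the preference list literally, small whole numbers come out as Short here, whereas the table above reports Integer; the handler's actual behaviour takes precedence.

```python
import re

INT_RE = re.compile(r"[+-]?[0-9]+")
NUM_RE = re.compile(r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")


def is_boolean(s):
    # Assumption: only "true"/"false", in any letter case, count as boolean.
    return s.lower() in ("true", "false")


def int_checker(bits):
    """Build a test for signed integers that fit in the given bit width."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return lambda s: bool(INT_RE.fullmatch(s)) and lo <= int(s) <= hi


def is_float(s):
    # Assumption: a number counts as Float if its mantissa has at most six
    # significant digits and its magnitude fits in single precision.
    if not NUM_RE.fullmatch(s):
        return False
    mantissa = re.split("[eE]", s)[0]
    digits = re.sub(r"[^0-9]", "", mantissa).lstrip("0")
    return len(digits) <= 6 and abs(float(s)) <= 3.4e38


def is_double(s):
    return bool(NUM_RE.fullmatch(s))


# Candidate types in order of preference; String is the fallback.
CANDIDATES = [
    ("Boolean", is_boolean),
    ("Short", int_checker(16)),
    ("Integer", int_checker(32)),
    ("Long", int_checker(64)),
    ("Float", is_float),
    ("Double", is_double),
]


def guess_type(cells):
    """Guess a column type from its cells; None cells are nulls."""
    values = [c for c in cells if c is not None]
    for name, accepts in CANDIDATES:
        if all(accepts(v) for v in values):
            return name
    return "String"
```

Applied to the HEIGHT/m cells of the example, guess_type returns "Float", and for the SPECIES cells it returns "String". A column consisting only of nulls matches the first candidate in this sketch, which may not be what the real handler does.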
It will also use the text "Here is a list of some animals"
as the Description parameter of the table.
Without any of the comment lines, it would still interpret the table,
but the columns would be given the names col1..col5.
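The way the leading comment lines turn into column names and a description can be sketched as follows. The function read_header is hypothetical, and two details are assumptions rather than documented behaviour: column names are split on plain whitespace (so quoted names are not handled), and the remaining comment lines are joined with single spaces after blank ones are dropped.

```python
def read_header(comment_lines, ncol):
    """Derive (column names, description) from the leading comment lines.

    comment_lines: the lines starting with "#" or "!" that precede the
    first data line; ncol: the number of fields in each data line.
    """
    # Drop the leading marker character and surrounding whitespace.
    texts = [line[1:].strip() for line in comment_lines]

    # Fallback names, used when no comment line supplies the titles.
    names = ["col%d" % (i + 1) for i in range(ncol)]

    if texts:
        candidate = texts[-1].split()
        if len(candidate) == ncol:      # last comment line holds the names
            names = candidate
            texts = texts[:-1]          # and is not part of the description

    description = " ".join(t for t in texts if t)
    return names, (description or None)
```

For the four comment lines of the animals example and ncol=5 this yields the names RECNO, SPECIES, NAME, LEGS, HEIGHT/m; with no comment lines at all it yields col1 to col5 and no description.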
If you understand the format of your files but they don't exactly
match the criteria above, the best thing is probably to write a
simple free-standing program or script which will convert them
into the format described here.
You may find Perl, awk or sed suitable languages for this sort of thing.
Alternatively, you could write a new input handler as explained in
Section 3.1 - you may find it easiest to subclass the
uk.ac.starlink.table.formats.StreamStarTable class in this case.
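As an illustration of the free-standing conversion script suggested above, here is a minimal Python sketch (Python is used here simply as another scripting language of the same kind). It assumes a hypothetical input format in which each line holds cells separated by a "|" character, and rewrites it following the rules of this section: cells are separated by spaces, empty cells become "" (null), and any cell that would otherwise be misread is wrapped in double quotes with backslash escapes.

```python
import re

# Characters that force quoting: whitespace, quotes, "#", "!" and backslash.
NEEDS_QUOTES = re.compile(r"""[\s'"#!\\]""")


def quote_field(value):
    """Render one cell as a field of the ASCII table format."""
    if "\n" in value or "\r" in value:
        raise ValueError("fields cannot contain line terminators")
    if value == "":
        return '""'                     # empty quoted string means null
    if value == "null" or NEEDS_QUOTES.search(value):
        # Quote so the text is kept literally; escape \ and " inside.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return value


def convert(text, delimiter="|"):
    """Convert delimiter-separated text into the ASCII table format."""
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue                    # blank lines carry no data
        cells = [c.strip() for c in line.split(delimiter)]
        out.append(" ".join(quote_field(c) for c in cells))
    return "\n".join(out) + "\n"
```

For example, convert("1|pig|Pigling Bland|4|0.8\n") gives the line 1 pig "Pigling Bland" 4 0.8. The sketch does not write a "#" header line with column names; one can be added by hand if the titles are known.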
STIL - Starlink Tables Infrastructure Library
Starlink User Note 252
STIL web page: http://www.starlink.ac.uk/stil/
Author email: m.b.taylor@bristol.ac.uk
Starlink: http://www.starlink.ac.uk/