
A.2.1.2 Filter Specifiers

Each filter specifier defines a processing step that transforms an input table into an output table. Any combination of them may be given, and they are applied in the order in which they appear on the command line, much like filter-type commands in a Unix pipeline. Some of them take additional optional or mandatory arguments.

-select <expr>
Includes in the output table only those rows for which the expression <expr> evaluates to true. <expr> must be a boolean-valued expression written in the syntax described in Section 4.
-sort [-down] [-nullsfirst] <colid-list>
Sorts the table according to the columns named in <colid-list>. <colid-list> is a space-separated list of column identifiers (names, $IDs or numbers, where 1 is the first column). One or more columns may be specified: sorting is done on the values in the first-specified field, but if they are equal the tie is resolved by looking at the second-specified field, and so on. If the -down flag is used, the sort order is descending instead of ascending. If the -nullsfirst flag is used, blank entries are considered to come at the start of the collation sequence instead of the end.
-sortexpr <expr>
Sorts the table according to the value of an algebraic expression. The syntax of <expr> is described in Section 4. Its value must be of a type that can be meaningfully ordered, for instance numeric.
-every <step>
Include only every <step>'th row in the result, starting with the first row.
-head <nrows>
Include only the first <nrows> rows of the table.
-tail <nrows>
Include only the last <nrows> rows of the table.
-addcol [-after <col-id> | -before <col-id>] <col-name> <expr>
Adds a new column called <col-name> defined by the algebraic expression <expr>. Expression syntax is described in Section 4. By default the new column appears after the last column of the table, but you can position it using either the -after or -before flag. In either case, <col-id> is either the column's name (if it is syntactically a Java identifier), or its number (the first column is 1), or its $ID ($1 is the first column).
-keepcols <colid-list>
The output table consists of only those columns named in <colid-list>, in that order. <colid-list> is a space-separated list of identifiers, each of which is either a column's name (if it is syntactically a Java identifier) or its number (the first column is 1) or its $ID ($1 is the first column).
-delcols <colid-list>
Delete named columns. <colid-list> is a space-separated list of identifiers which are either a column's name (if it is syntactically a Java identifier) or its number (the first column is 1) or its $ID ($1 is the first column).
-explode
Turns any column which is an N-element array into N scalar columns. Only works if the array size is fixed.
-cache
Stores in memory or on disk a temporary copy of the table at this point in the pipeline. This can provide improvements in efficiency if there is an expensive step upstream and a step which requires more than one read of the data downstream. If you see an error like "Can't re-read data from stream" then adding this flag near the start of the filters might help.
-progress
Monitors progress by displaying the number of rows processed so far on the terminal (standard error). This number is updated every second or thereabouts; if all the processing is done in under a second you won't see any output. If the total number of rows in the table is known, an ASCII-art progress bar is updated, otherwise just the number of rows seen so far is written.

Specifying -verbose has the effect of inserting a -progress flag at the start of the pipeline, so you can see how much progress has been made through the initial input table. By putting a -progress at different points in the pipeline you can monitor how far different stages of the processing have progressed. If you insert more than one -progress, however, output to the terminal is going to get quite messy.

-random
Ensures that steps downstream see the table as random access. Only useful for debugging.
-sequential
Ensures that steps downstream see the table as sequential access. Only useful for debugging.
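
The ordering rules given for -sort above (ties resolved by later columns, -down for descending order, -nullsfirst for blank entries) can be modelled in a few lines of Python. This is only an illustrative sketch, not how STILTS implements the filter: the function name sort_rows, the use of dictionaries for rows and of None for blank entries are all assumptions made for the example. It also assumes that -down does not move blank entries, which the description above does not state either way.

```python
from functools import cmp_to_key


def sort_rows(rows, keys, down=False, nulls_first=False):
    """Hypothetical model of -sort: order rows (dicts) by several columns."""

    def compare_values(a, b):
        # Blank entries (None here) sit at one end of the collation
        # sequence: the end by default, the start with nulls_first.
        # ASSUMPTION: the down flag does not change where blanks go.
        if a is None and b is None:
            return 0
        if a is None:
            return -1 if nulls_first else 1
        if b is None:
            return 1 if nulls_first else -1
        result = (a > b) - (a < b)
        return -result if down else result

    def compare_rows(r1, r2):
        # Compare on the first-specified column; move on to the next
        # column only if the values so far are equal.
        for key in keys:
            c = compare_values(r1[key], r2[key])
            if c != 0:
                return c
        return 0

    return sorted(rows, key=cmp_to_key(compare_rows))
```

Calling sort_rows(rows, ["a", "b"]) orders by column a and uses column b only to break ties in a, which is the behaviour described for a <colid-list> containing more than one column.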

If no filter specifiers are given, the input table will be sent directly to its destination without any modifications.
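
Because filters are applied in the order given, the same filters in a different order can produce a different table. The following Python sketch models a few of the row filters as functions over a list of rows to show this. It is an analogy under stated assumptions, not STILTS code: the names select, every, head, tail and run_pipeline are hypothetical helpers that merely echo the flag names, and rows are represented as dictionaries.

```python
def select(predicate):
    # Model of -select: keep rows for which the predicate is true.
    return lambda rows: [r for r in rows if predicate(r)]


def every(step):
    # Model of -every: keep every step'th row, starting with the first.
    return lambda rows: rows[::step]


def head(nrows):
    # Model of -head: keep only the first nrows rows.
    return lambda rows: rows[:nrows]


def tail(nrows):
    # Model of -tail: keep only the last nrows rows.
    return lambda rows: rows[-nrows:] if nrows > 0 else []


def run_pipeline(rows, filters):
    # Apply each filter in the order given; the output of one step
    # is the input of the next. An empty list leaves rows unchanged.
    for apply_filter in filters:
        rows = apply_filter(rows)
    return rows


if __name__ == "__main__":
    table = [{"x": i} for i in range(1, 11)]
    big = select(lambda r: r["x"] > 5)
    print(run_pipeline(table, [big, head(2)]))   # select first, then head
    print(run_pipeline(table, [head(2), big]))   # head first, then select
```

In this model, selecting rows with x > 5 and then taking the first two gives the rows with x = 6 and x = 7, whereas taking the first two rows and then selecting gives an empty table.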



STILTS - Starlink Tables Infrastructure Library Tool Set
Starlink User Note 256
STILTS web page: http://www.starlink.ac.uk/stilts/
Author email: m.b.taylor@bristol.ac.uk