public class ParquetTableBuilder extends DocumentedTableBuilder
| Constructor and Description |
|---|
| ParquetTableBuilder(): Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | canImport(DataFlavor flavor): Indicates whether this builder is able to turn a resource of media type indicated by flavor into a table. |
| boolean | canStream(): Returns false; parquet metadata is in the footer. |
| boolean | docIncludesExample(): Indicates whether the serialization of some (short) example table should be added to the user documentation for this handler. |
| Boolean | getCacheCols(): Returns policy for table construction. |
| String | getFormatName(): Returns the name of the format which can be read by this handler. |
| int | getReadThreadCount(): Returns the number of read threads to use when caching column data. |
| boolean | getTryUrl(): Indicates whether an attempt is made to open parquet files from non-file URLs. |
| String | getVOTableLocation(): Returns the location of a data-less VOTable document that will be used to supply additional metadata for a parquet table being read. |
| Boolean | getVOTableMetadata(): Indicates whether a DATA-less VOTable stored in input parquet file will be used to supply metadata. |
| String | getXmlDescription(): Returns user-directed documentation in XML format. |
| StarTable | makeStarTable(DataSource datsrc, boolean wantRandom, StoragePolicy storage): Constructs a StarTable based on a given DataSource. |
| static String | readUtf8FromLocation(String loc): Reads all the characters from a given location into a string. |
| void | setCacheCols(Boolean cacheCols): Determines policy for table construction. |
| void | setReadThreadCount(int nThread): Sets the number of read threads to use when caching column data. |
| void | setTryUrl(boolean tryUrl): Configures whether an attempt is made to open parquet files from non-file URLs. |
| void | setVOTableLocation(String votableLoc): Sets the location of a data-less VOTable document that will be used to supply additional metadata for a parquet table being read. |
| void | setVOTableMetadata(Boolean votMeta): Determines whether a DATA-less VOTable stored in input parquet file will be used to supply metadata. |
| void | streamStarTable(InputStream istrm, TableSink sink, String pos): Reads a table from an input stream and writes it a row at a time to a sink. |
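Methods inherited from class DocumentedTableBuilder: getExtensions, looksLikeFile
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface DocumentedIOHandler: matchesExtension, readText, toLink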
public String getFormatName()
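Description copied from interface: TableBuilder
Returns the name of the format which can be read by this handler.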
public boolean canStream()
Overrides: canStream in class DocumentedTableBuilder
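The reason canStream() returns false is that parquet metadata sits in the file footer. As a rough illustration of why that rules out one-pass streaming, the sketch below reads the 8-byte trailer that the Parquet file format places at the end of a file with a plaintext footer (a 4-byte little-endian footer length followed by the magic bytes PAR1). The class and method names are hypothetical and are not part of STIL.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Illustrative only; a hypothetical helper, not part of STIL. */
final class ParquetFooterSketch {

    /** Magic bytes that open and close a parquet file with a plaintext footer. */
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns the length in bytes of the footer metadata block, read from
     * the 8-byte trailer (4-byte little-endian length, then the magic).
     */
    static int readFooterLength(RandomAccessFile file) throws IOException {
        long size = file.length();
        if (size < 12) { // leading magic + footer length + trailing magic
            throw new IOException("Too short to be a parquet file");
        }
        byte[] tail = new byte[8];
        // Jumping to the end needs random access; a plain InputStream
        // would have to be consumed in full before reaching this point.
        file.seek(size - 8);
        file.readFully(tail);
        for (int i = 0; i < 4; i++) {
            if (tail[4 + i] != MAGIC[i]) {
                throw new IOException("Missing trailing PAR1 magic");
            }
        }
        return ByteBuffer.wrap(tail, 0, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }
}
```

Only once this length is known can a reader locate and decode the schema and row-group metadata, which is why the column data cannot be interpreted while it is still arriving on a stream.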
public boolean docIncludesExample()
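Description copied from interface: DocumentedIOHandler
Indicates whether the serialization of some (short) example table should be added to the user documentation for this handler. Instances for which the Documented.getXmlDescription() method already includes some example output should return false.
public String getXmlDescription()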
Description copied from interface: Documented
The output should be a sequence of one or more <P> elements, using XHTML-like XML. Since rendering may be done in a number of contexts, however, use of the full range of XHTML elements is discouraged. Where possible, the content should stick to simple markup such as the elements P, A, UL, OL, LI, DL, DT, DD, EM, STRONG, I, B, CODE, TT, PRE.
public StarTable makeStarTable(DataSource datsrc, boolean wantRandom, StoragePolicy storage) throws IOException
Description copied from interface: TableBuilder
Constructs a StarTable based on a given DataSource. If the source is not recognised or this builder does not know how to construct a table from it, then a TableFormatException should be thrown. If this builder thinks it should be able to handle the source but an error occurs during processing, an IOException can be thrown.
The wantRandom parameter is used to indicate whether, ideally, a random-access table should be returned. There is no requirement for the builder to honour this request, but if it knows how to make both random and non-random tables, it can use this flag to decide which to return.
Note: the presence of the wantRandom parameter is somewhat misleading. TableBuilder implementations usually should, and do, ignore it (it would be removed from the interface if it were not for backward compatibility issues). Regardless of the value of this parameter, implementations should return a random-access table only if it is easy for them to do so; in particular they should not use the supplied storage policy, or any other resource-expensive measure, to randomise a sequential table just because the wantRandom parameter is true.
Parameters:
datsrc - the DataSource containing the table resource
wantRandom - whether, preferentially, a random access table should be returned
storage - a StoragePolicy object which may be used to supply scratch storage if the builder needs it
Returns:
a StarTable made out of datsrc
Throws:
TableFormatException - if the table is not of a kind that can be handled by this handler
IOException - if an unexpected I/O error occurs during processing
public void streamStarTable(InputStream istrm, TableSink sink, String pos) throws TableFormatException
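Description copied from interface: TableBuilder
Reads a table from an input stream and writes it a row at a time to a sink. A builder that cannot stream its format should throw a TableFormatException.
The input stream should be prepared for use prior to calling this method, so implementations should not in general attempt to decompress or buffer istrm.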
Parameters:
istrm - input stream containing table data
sink - destination of the table
pos - position identifier describing the location of the table within the stream; see DataSource.getPosition() (may be null)
Throws:
TableFormatException - if the table can't be streamed or the data is malformed
public boolean canImport(DataFlavor flavor)
Description copied from interface: TableBuilder
Indicates whether this builder is able to turn a resource of media type indicated by flavor into a table. It should return true if it thinks that its TableBuilder.streamStarTable(java.io.InputStream, uk.ac.starlink.table.TableSink, java.lang.String) method stands a reasonable chance of successfully constructing a StarTable from a DataSource whose input stream is described by the DataFlavor flavor. It will typically make this determination based on the flavor's MIME type.
This method should only return true if the flavor looks like it is targeted at this builder; for instance a builder which uses a text-based format should return false for a flavor which indicates a MIME type of text/plain.
This method is used in supporting drag and drop functionality (see StarTableFactory.canImport(java.awt.datatransfer.DataFlavor[])).
Parameters:
flavor - the DataFlavor whose suitability as stream input is to be assessed
Returns:
true iff this builder reckons it stands a good chance of turning a stream of type flavor into a StarTable
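The general canImport contract described above (accept only flavors whose MIME type is specific to the builder, and reject generic ones such as text/plain) might be sketched as follows. This is a stand-in for some imaginary streaming builder, not the behaviour of ParquetTableBuilder itself; the class name and the choice of MIME type are assumptions made for illustration.

```java
import java.awt.datatransfer.DataFlavor;

/** Illustrative only; a stand-in for a streaming builder, not STIL code. */
final class FlavorCheckSketch {

    /** Hypothetical MIME type that this imaginary builder treats as its own. */
    static final String OWN_MIME_TYPE = "application/x-votable+xml";

    /** Accepts only flavors whose MIME type is specific to this builder. */
    static boolean canImport(DataFlavor flavor) {
        // isMimeTypeEqual compares primary type and subtype only;
        // MIME parameters such as charset are not part of the comparison.
        return flavor != null && flavor.isMimeTypeEqual(OWN_MIME_TYPE);
    }
}
```

A flavor built with `new DataFlavor("text/plain", "Plain text")` is rejected by this check, matching the guidance that a text-based builder should not claim every plain-text stream.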
@ConfigMethod(property="cachecols", usage="true|false|null", example="true", doc="<p>Forces whether to read all the column data at table load\ntime. If <code>true</code>, then when the table is loaded,\nall data is read by column into local scratch disk files,\nwhich is generally the fastest way to ingest all the data.\nIf <code>false</code>, the table rows are read as required,\nand possibly cached using the normal STIL mechanisms.\nIf <code>null</code> (the default), the decision is taken\nautomatically based on available information.\n</p>") public void setCacheCols(Boolean cacheCols)
Determines policy for table construction. If true, the makeStarTable method returns a CachedParquetStarTable, and if false a SequentialParquetStarTable. If null, the decision is made automatically on the basis of whether it looks like random access is required, file size, etc.
Parameters:
cacheCols - column data read policy
public Boolean getCacheCols()
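The cachecols setting is a tri-state Boolean: true and false are explicit choices, and null defers to an automatic decision. A minimal sketch of that resolution pattern is given below. The class, enum and method names are hypothetical, and the automatic decision is deliberately left to a caller-supplied function because the actual heuristic is not specified here.

```java
import java.util.function.BooleanSupplier;

/** Illustrative only; not the STIL implementation. */
final class CacheColsPolicySketch {

    /** Stand-ins for the two table kinds named in the documentation. */
    enum TableKind { CACHED, SEQUENTIAL }

    /**
     * Resolves a tri-state policy: TRUE and FALSE are taken as given,
     * while null defers to the supplied automatic decision.
     */
    static TableKind resolve(Boolean cacheCols, BooleanSupplier autoDecision) {
        boolean cache = cacheCols != null ? cacheCols.booleanValue()
                                          : autoDecision.getAsBoolean();
        return cache ? TableKind.CACHED : TableKind.SEQUENTIAL;
    }
}
```

For example, `resolve(null, () -> true)` yields CACHED, while `resolve(Boolean.FALSE, () -> true)` yields SEQUENTIAL because an explicit setting always wins over the automatic choice.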
@ConfigMethod(property="nThread", usage="<int>", example="4", doc="<p>Sets the number of read threads used for concurrently\nreading table columns if the columns are cached at load time\n- see the <code>cachecols</code> option.\nIf the value is <=0 (the default), a value is chosen\nbased on the number of apparently available processors.\n</p>") public void setReadThreadCount(int nThread)
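Sets the number of read threads to use when caching column data. This value is passed to the CachedParquetStarTable constructor, and ignored when constructing a SequentialParquetStarTable.
Parameters:
nThread - read thread count, or <=0 for auto
public int getReadThreadCount()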
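The "<=0 for auto" convention can be illustrated with a small resolver. The sketch below assumes that the automatic value is simply the number of processors reported by the JVM; the documentation only says the value is based on the number of apparently available processors, so the exact formula used by the library may differ. The class and method names are hypothetical.

```java
/** Illustrative only; not the STIL implementation. */
final class ReadThreadCountSketch {

    /** Returns the thread count to use for a configured value of nThread. */
    static int resolve(int nThread) {
        if (nThread > 0) {
            return nThread; // explicit positive setting is used as given
        }
        // Assumption: automatic mode falls back to the processor count,
        // clamped so that at least one thread is always used.
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }
}
```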
@ConfigMethod(property="tryUrl", doc="<p>Whether to attempt to open non-file URLs as parquet files.\nThis usually seems to fail with a cryptic error message,\nso it is not attempted by default, but it\'s possible that with\nsuitable library support on the classpath it might work,\nso this option exists to make the attempt.\n</p>", example="true") public void setTryUrl(boolean tryUrl)
Parameters:
tryUrl - true to attempt opening non-file URLs
public boolean getTryUrl()
@ConfigMethod(property="votmeta", example="false", doc="<p>If true, the content of the parquet extra metadata\nkey-value list item with key\n<code>IVOA.VOTable-Parquet.content</code>\nwill be read to supply the metadata for the input table,\nfollowing the\n<webref url=\'https://www.ivoa.net/documents/Notes/VOParquet/\'>VOParquet convention</webref>.\nIf false, any such VOTable metadata is ignored.\nIf set null, the default, then such VOTable metadata\nwill be used only if it is present and apparently consistent\nwith the parquet data and metadata.\n</p>") public void setVOTableMetadata(Boolean votMeta)
Parameters:
votMeta - whether to read metadata from VOTable text
public Boolean getVOTableMetadata()
@ConfigMethod(property="votable", doc="<p>Location of a UTF-8-encoded data-less VOTable\nthat will supply additional metadata for a parquet table\nbeing read, according to the\n<webref url=\'https://www.ivoa.net/documents/Notes/VOParquet/\'>VOParquet convention</webref>.\nThis is normally not required, but if present it overrides\nany such metadata VOTable embedded within the parquet file.\nThis value will only be used if the <code>votmeta</code>\nconfiguration is not false.\n</p>", usage="<filename-or-url>", example="./metadata.vot") public void setVOTableLocation(String votableLoc)
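Sets the location of a data-less VOTable document that will be used to supply additional metadata for a parquet table being read. This value is only used if getVOTableMetadata() returns a non-FALSE value.
Parameters:
votableLoc - filename or URL of UTF-8-encoded data-less VOTable
public String getVOTableLocation()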
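The interaction between the votmeta and votable options can be summarised as a precedence rule: a false votmeta disables VOTable metadata altogether, an external location otherwise takes priority, and an embedded VOTable is the fallback. The sketch below models only that precedence. All names are hypothetical, the consistency check applied when votmeta is null is not modelled, and returning NONE when votmeta is true but nothing is embedded is an assumption rather than documented behaviour.

```java
/** Illustrative only; not the STIL implementation. */
final class VotMetaSourceSketch {

    /** Where VOTable metadata would be taken from. */
    enum Source { EXTERNAL_VOTABLE, EMBEDDED_VOTABLE, NONE }

    /**
     * Picks a metadata source from the two configuration values and a flag
     * saying whether the parquet file carries an embedded VOTable.
     */
    static Source choose(Boolean votMeta, String votableLoc, boolean hasEmbedded) {
        if (Boolean.FALSE.equals(votMeta)) {
            return Source.NONE; // VOTable metadata explicitly ignored
        }
        if (votableLoc != null) {
            return Source.EXTERNAL_VOTABLE; // overrides any embedded VOTable
        }
        // Assumption: with no embedded VOTable there is nothing to use.
        return hasEmbedded ? Source.EMBEDDED_VOTABLE : Source.NONE;
    }
}
```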
public static String readUtf8FromLocation(String loc) throws IOException
Parameters:
loc - filename or URL
Throws:
IOException
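For readers who want a feel for what reading UTF-8 text from "a filename or URL" involves, here is a self-contained sketch using only the Java standard library (Java 9 or later, for InputStream.readAllBytes). It is not the STIL implementation, which may resolve locations differently; the class and method names are hypothetical, and trying the location as a file before treating it as a URL is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative only; not the STIL implementation. */
final class Utf8LocationSketch {

    /** Reads the whole content of a file or URL as UTF-8 text. */
    static String readUtf8(String loc) throws IOException {
        Path path = toExistingFile(loc);
        if (path != null) {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        }
        // Not an existing file, so try the location as an absolute URL.
        try (InputStream in = URI.create(loc).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            // Thrown for malformed or non-absolute URIs.
            throw new IOException("Not a readable file or URL: " + loc, e);
        }
    }

    /** Returns the path if loc names an existing regular file, else null. */
    private static Path toExistingFile(String loc) {
        try {
            Path path = Paths.get(loc);
            return Files.isRegularFile(path) ? path : null;
        } catch (InvalidPathException e) {
            return null; // e.g. a URL on a platform that forbids ':' in paths
        }
    }
}
```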
Copyright © 2025 Central Laboratory of the Research Councils. All Rights Reserved.