public class ParquetTableBuilder extends DocumentedTableBuilder
Constructor and Description |
---|
ParquetTableBuilder(): Constructor. |
Modifier and Type | Method and Description |
---|---|
boolean | canImport(DataFlavor flavor): Indicates whether this builder is able to turn a resource of media type indicated by flavor into a table. |
boolean | canStream(): Returns false; parquet metadata is in the footer. |
boolean | docIncludesExample(): Indicates whether the serialization of some (short) example table should be added to the user documentation for this handler. |
Boolean | getCacheCols(): Returns policy for table construction. |
String | getFormatName(): Returns the name of the format which can be read by this handler. |
int | getReadThreadCount(): Returns the number of read threads to use when caching column data. |
boolean | getTryUrl(): Indicates whether an attempt is made to open parquet files from non-file URLs. |
String | getXmlDescription(): Returns user-directed documentation in XML format. |
StarTable | makeStarTable(DataSource datsrc, boolean wantRandom, StoragePolicy storage): Constructs a StarTable based on a given DataSource. |
void | setCacheCols(Boolean cacheCols): Determines policy for table construction. |
void | setReadThreadCount(int nThread): Sets the number of read threads to use when caching column data. |
void | setTryUrl(boolean tryUrl): Configures whether an attempt is made to open parquet files from non-file URLs. |
void | streamStarTable(InputStream istrm, TableSink sink, String pos): Reads a table from an input stream and writes it a row at a time to a sink. |
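Methods inherited from class DocumentedTableBuilder: getExtensions, looksLikeFile
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface DocumentedIOHandler: matchesExtension, readText, toLink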
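public String getFormatName()
Description copied from interface: TableBuilder
Returns the name of the format which can be read by this handler.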
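public boolean canStream()
Returns false: parquet keeps its file metadata in a footer at the end of the file, so a table cannot be interpreted in a single forward pass over a stream.
canStream in class DocumentedTableBuilder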
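To illustrate why a footer-based layout rules out single-pass streaming, the following sketch reads the trailing bytes of a file directly. It relies only on the published parquet file layout (a 4-byte PAR1 magic at each end, with a 4-byte little-endian footer length just before the trailing magic). It is not taken from ParquetTableBuilder, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper, for illustration only; not part of STIL. */
final class ParquetFooterSketch {

    private static final byte[] MAGIC =
        "PAR1".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns the declared length in bytes of the footer metadata,
     * or -1 if the file does not carry the PAR1 magic at both ends.
     */
    static int readFooterLength(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long size = raf.length();
            // Smallest layout: leading magic + footer length + trailing magic.
            if (size < 12) {
                return -1;
            }
            byte[] head = new byte[4];
            raf.readFully(head);

            // Seeking to the end is the step a forward-only stream cannot do.
            byte[] tail = new byte[8];
            raf.seek(size - 8);
            raf.readFully(tail);

            boolean magicOk = Arrays.equals(head, MAGIC)
                && Arrays.equals(Arrays.copyOfRange(tail, 4, 8), MAGIC);
            if (!magicOk) {
                return -1;
            }
            // First four tail bytes: little-endian int32 footer length.
            return ByteBuffer.wrap(tail, 0, 4)
                             .order(ByteOrder.LITTLE_ENDIAN)
                             .getInt();
        }
    }
}
```

Because the footer length is only known after seeking to the last eight bytes, a reader needs random access to the whole resource before it can locate any column data.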
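public boolean docIncludesExample()
Description copied from interface: DocumentedIOHandler
Indicates whether the serialization of some (short) example table should be added to the user documentation for this handler. Handlers for which the Documented.getXmlDescription() method already includes some example output should return false.
public String getXmlDescription()
Description copied from interface: Documented
Returns user-directed documentation in XML format.
The output should be a sequence of one or more <P> elements, using XHTML-like XML. Since rendering may be done in a number of contexts, however, use of the full range of XHTML elements is discouraged. Where possible, the content should stick to simple markup such as the elements P, A, UL, OL, LI, DL, DT, DD, EM, STRONG, I, B, CODE, TT, PRE.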
public StarTable makeStarTable(DataSource datsrc, boolean wantRandom, StoragePolicy storage) throws IOException
Description copied from interface: TableBuilder
Constructs a StarTable based on a given DataSource. If the source is not recognised or this builder does not know how to construct a table from it, then a TableFormatException should be thrown. If this builder thinks it should be able to handle the source but an error occurs during processing, an IOException can be thrown.
The wantRandom parameter is used to indicate whether, ideally, a random-access table should be returned. There is no requirement for the builder to honour this request, but if it knows how to make both random and non-random tables, it can use this flag to decide which to return.
Note: the presence of the wantRandom parameter is somewhat misleading. TableBuilder implementations usually should, and do, ignore it (it would be removed from the interface if it were not for backward compatibility issues). Regardless of the value of this parameter, implementations should return a random-access table only if it is easy for them to do so; in particular they should not use the supplied storage policy, or any other resource-expensive measure, to randomise a sequential table just because the wantRandom parameter is true.
Parameters:
datsrc - the DataSource containing the table resource
wantRandom - whether, preferentially, a random access table should be returned
storage - a StoragePolicy object which may be used to supply scratch storage if the builder needs it
Returns:
a StarTable built from datsrc
Throws:
TableFormatException - if the table is not of a kind that can be handled by this handler
IOException - if an unexpected I/O error occurs during processing
public void streamStarTable(InputStream istrm, TableSink sink, String pos) throws TableFormatException
Description copied from interface: TableBuilder
Reads a table from an input stream and writes it a row at a time to a sink. Builders that cannot stream should throw a TableFormatException.
The input stream should be prepared for use prior to calling this method, so implementations should not in general attempt to decompress or buffer istrm.
Parameters:
istrm - input stream containing table data
sink - destination of the table
pos - position identifier describing the location of the table within the stream; see DataSource.getPosition() (may be null)
Throws:
TableFormatException - if the table can't be streamed or the data is malformed
public boolean canImport(DataFlavor flavor)
Description copied from interface: TableBuilder
Indicates whether this builder is able to turn a resource of media type indicated by flavor into a table.
It should return true if it thinks that its TableBuilder.streamStarTable(java.io.InputStream, uk.ac.starlink.table.TableSink, java.lang.String) method stands a reasonable chance of successfully constructing a StarTable from a DataSource whose input stream is described by the DataFlavor flavor. It will typically make this determination based on the flavor's MIME type.
This method should only return true if the flavor looks like it is targeted at this builder; for instance a builder which uses a text-based format should return false for a flavor which indicates a MIME type of text/plain.
This method is used in supporting drag and drop functionality (see StarTableFactory.canImport(java.awt.datatransfer.DataFlavor[])).
Parameters:
flavor - the DataFlavor whose suitability as stream input is to be assessed
Returns:
true iff this builder reckons it stands a good chance of turning a stream of type flavor into a StarTable
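The MIME-type test described for canImport can be sketched roughly as follows. This is a generic illustration of the interface contract and does not reproduce what ParquetTableBuilder itself returns, which this page does not state. The MIME type application/x-example-table and the class name are invented for the example.

```java
import java.awt.datatransfer.DataFlavor;

/** Hypothetical illustration of a MIME-type based canImport check. */
final class FlavorCheckSketch {

    // Invented MIME type, used only for this example.
    static final String HYPOTHETICAL_MIME = "application/x-example-table";

    /**
     * Accepts only flavors whose MIME type names this (hypothetical)
     * format specifically; generic types such as text/plain are rejected.
     */
    static boolean canImport(DataFlavor flavor) {
        // isMimeTypeEqual compares primary type and subtype, ignoring parameters.
        return flavor != null && flavor.isMimeTypeEqual(HYPOTHETICAL_MIME);
    }
}
```

Rejecting text/plain here follows the guidance above: a generic type gives no indication that the content is targeted at one particular builder.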
@ConfigMethod(property="cachecols", usage="true|false|null", example="true", doc="<p>Forces whether to read all the column data at table load\ntime. If <code>true</code>, then when the table is loaded,\nall data is read by column into local scratch disk files,\nwhich is generally the fastest way to ingest all the data.\nIf <code>false</code>, the table rows are read as required,\nand possibly cached using the normal STIL mechanisms.\nIf <code>null</code> (the default), the decision is taken\nautomatically based on available information.\n</p>") public void setCacheCols(Boolean cacheCols)
Determines policy for table construction. If true, the makeStarTable method returns a CachedParquetStarTable and if false a SequentialParquetStarTable. If null, the decision is made automatically on the basis of whether it looks like random access is required and file size etc.
Parameters:
cacheCols - column data read policy
public Boolean getCacheCols()
Returns policy for table construction.
@ConfigMethod(property="nThread", usage="<int>", example="4", doc="<p>Sets the number of read threads used for concurrently\nreading table columns if the columns are cached at load time\n- see the <code>cachecols</code> option.\nIf the value is <=0 (the default), a value is chosen\nbased on the number of apparently available processors.\n</p>") public void setReadThreadCount(int nThread)
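Sets the number of read threads to use when caching column data. The value is passed to the CachedParquetStarTable constructor, and ignored when constructing a SequentialParquetStarTable.
Parameters:
nThread - read thread count, or <=0 for auto
public int getReadThreadCount()
Returns the number of read threads to use when caching column data.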
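The "<=0 means automatic" convention for the read thread count could be resolved as in the sketch below. The exact rule STIL applies is not given on this page, so using the processor count directly is an assumption, and the class and method names are hypothetical.

```java
/** Hypothetical illustration of resolving an "auto" thread count. */
final class ThreadCountSketch {

    /**
     * Returns the requested count if positive; otherwise falls back to
     * the number of processors the JVM reports (assumed rule), never
     * less than one.
     */
    static int effectiveThreadCount(int nThread) {
        if (nThread > 0) {
            return nThread;
        }
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }
}
```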
@ConfigMethod(property="tryUrl", doc="<p>Whether to attempt to open non-file URLs as parquet files.\nThis usually seems to fail with a cryptic error message,\nso it is not attempted by default, but it\'s possible that with\nsuitable library support on the classpath it might work,\nso this option exists to make the attempt.\n</p>", example="true") public void setTryUrl(boolean tryUrl)
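Configures whether an attempt is made to open parquet files from non-file URLs.
Parameters:
tryUrl - true to attempt opening non-file URLs
public boolean getTryUrl()
Indicates whether an attempt is made to open parquet files from non-file URLs.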
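One way to picture the effect of the tryUrl flag is as a gate on the location's scheme, as in this sketch. How ParquetTableBuilder actually distinguishes file from non-file locations is not described here, so the scheme test, the class and the method are all assumptions for illustration.

```java
import java.net.URI;

/** Hypothetical illustration of gating non-file locations on a flag. */
final class TryUrlSketch {

    /**
     * Returns true if an open attempt would be made under the given
     * policy: always for file locations, and for other schemes only
     * when tryUrl is set.
     */
    static boolean shouldAttempt(URI location, boolean tryUrl) {
        String scheme = location.getScheme();
        // A missing scheme is treated as a plain file path (assumption).
        boolean isFile = scheme == null || "file".equalsIgnoreCase(scheme);
        return isFile || tryUrl;
    }
}
```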
Copyright © 2024 Central Laboratory of the Research Councils. All Rights Reserved.