public abstract class DataSource
extends java.lang.Object
 As well as the ability to return a stream, a DataSource may
 also have a position, which corresponds to the 'ref' or 'frag'
 part of a URL (the bit after the #).  This is an indication
 of a location in the stream; it is a string, and its interpretation
 is entirely up to the application (though may be specified by
 the documentation of specific DataSource subclasses).
 
 As well as providing the facility for several different objects to
 get their own copy of the underlying input stream, this class also
 handles decompression of the stream.
 Compression types are as understood by the associated Compression
 class.
 
For efficiency, a buffer of the bytes at the start of the stream called the 'intro buffer' is recorded the first time that the stream is read. This can then be used for magic number queries cheaply, without having to open a new input stream. In the case that the whole input stream is shorter than the intro buffer, the underlying input stream never has to be read again.
 Any implementation which implements getRawInputStream() in such
 a way as to return different byte sequences on different occasions
 may lead to unpredictable behaviour from this class.
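
The intro-buffer idea described above can be illustrated with a minimal, self-contained sketch. This is not the DataSource implementation: the class IntroSketch, its rawOpens counter and the methods startsWith, openStream and of are hypothetical names invented for the example. It assumes, as the text requires, that the raw stream yields the same bytes every time it is opened.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical stand-in for the intro-buffer idea; not the real DataSource. */
abstract class IntroSketch {
    private final int introLimit;
    private byte[] intro;   // filled lazily on the first request
    int rawOpens = 0;       // counts how often the raw stream was opened

    IntroSketch(int introLimit) {
        this.introLimit = introLimit;
    }

    /** Must return the same byte sequence on every call. */
    protected abstract InputStream getRawInputStream() throws IOException;

    /** Returns the cached intro, reading it from the raw stream on first use. */
    byte[] getIntro() throws IOException {
        if (intro == null) {
            try (InputStream in = getRawInputStream()) {
                rawOpens++;
                intro = in.readNBytes(introLimit);
            }
        }
        return intro;
    }

    /** Magic-number test answered from the cached intro alone. */
    boolean startsWith(byte[] magic) throws IOException {
        byte[] buf = getIntro();
        return buf.length >= magic.length
            && Arrays.equals(Arrays.copyOf(buf, magic.length), magic);
    }

    /** Serves the whole stream from the intro when the stream is shorter than the limit. */
    InputStream openStream() throws IOException {
        byte[] buf = getIntro();
        if (buf.length < introLimit) {
            return new ByteArrayInputStream(buf);   // no need to reopen
        }
        rawOpens++;
        return getRawInputStream();
    }

    /** Convenience factory over an in-memory byte array. */
    static IntroSketch of(byte[] data, int introLimit) {
        return new IntroSketch(introLimit) {
            @Override
            protected InputStream getRawInputStream() {
                return new ByteArrayInputStream(data);
            }
        };
    }
}
```

With a limit larger than the data, repeated calls to startsWith and openStream open the raw stream only once in this sketch; with a smaller limit, only openStream has to reopen it.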
See also: Compression

| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_INTRO_LIMIT |
| static java.lang.String | MARK_WORKAROUND_PROPERTY |

| Constructor and Description |
|---|
| DataSource() Constructs a DataSource with a default size of intro buffer. |
| DataSource(int introLimit) Constructs a DataSource with a given size of intro buffer. |

| Modifier and Type | Method and Description |
|---|---|
| void | close() Closes any open streams owned and not yet dispatched by this DataSource. |
| DataSource | forceCompression(Compression compress) Returns a DataSource representing the same underlying stream, but with a forced compression mode compress. |
| Compression | getCompression() Returns an object which will handle any required decompression for this stream. |
| java.io.InputStream | getHybridInputStream() Returns an input stream which appears just the same as the one returned by getInputStream(), but only incurs the expense of obtaining an actual input stream (by calling getRawInputStream()) if more bytes are read than the cached magic number. |
| java.io.InputStream | getInputStream() Returns an InputStream containing the whole of this DataSource. |
| static java.io.InputStream | getInputStream(java.lang.String location, boolean allowSystem) Returns an input stream based on the given location string. |
| byte[] | getIntro() Returns the intro buffer, first reading it if this hasn't been done before. |
| int | getIntroLimit() Returns the maximum length of the intro buffer. |
| long | getLength() Returns the length of the stream returned by getInputStream in bytes, if known. |
| static boolean | getMarkWorkaround() Returns true if we are working around potential bugs in InputStream.mark(int)/InputStream.reset() methods (common, including in J2SE classes). |
| java.lang.String | getName() Returns a name for this source. |
| java.lang.String | getPosition() Returns the position associated with this source. |
| protected abstract java.io.InputStream | getRawInputStream() Provides a new InputStream for this data source. |
| long | getRawLength() Returns the length in bytes of the stream returned by getRawInputStream, if known. |
| java.lang.String | getSystemId() Returns a System ID for this DataSource; this is a string representation of a file name or URL, as used by Source and friends. |
| java.net.URL | getURL() Returns a URL which corresponds to this data source, if one exists. |
| static DataSource | makeDataSource(java.lang.String loc) Attempts to make a source given a string identifying its location as a file, URL or system command output. |
| static DataSource | makeDataSource(java.lang.String loc, boolean allowSystem) Attempts to make a source given a string identifying its location as a file, URL or optionally a system command output. |
| static DataSource | makeDataSource(java.net.URL url) Makes a source from a URL. |
| void | setCompression(Compression compress) Sets the compression to be associated with this data source. |
| void | setIntroLimit(int limit) Sets the maximum size of the intro buffer to a new value. |
| static void | setMarkWorkaround(boolean workaround) Sets whether we want to work around bugs in InputStream mark/reset methods. |
| void | setName(java.lang.String name) Sets the name of this source. |
| void | setPosition(java.lang.String position) Sets the position associated with this source. |
| java.lang.String | toString() Returns a short description of this source (name plus compression type). |

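public static final int DEFAULT_INTRO_LIMIT

public static final java.lang.String MARK_WORKAROUND_PROPERTY

public DataSource(int introLimit)
Constructs a DataSource with a given size of intro buffer.
Parameters: introLimit - the maximum number of bytes in the intro buffer

public DataSource()
Constructs a DataSource with a default size of intro buffer.
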
protected abstract java.io.InputStream getRawInputStream() throws java.io.IOException
Provides a new InputStream for this data source.
Throws: java.io.IOException

public java.net.URL getURL()
Returns a URL which corresponds to this data source, if one exists. A URL.openConnection() method call on the URL
 returned by this method should provide a stream with the
 same content as the getRawInputStream() method of this
 data source.  If no such URL exists or is known, then null
 should be returned.
 If this source has a non-null position value, it will be appended to the main part of the URL after a '#' character (as the URL's ref part).
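
As a rough illustration of how a position ends up as the URL's ref part, the following sketch appends it after a '#' character. The class UrlPositionSketch and the method withPosition are hypothetical, and the sketch assumes the position contains only characters that are legal in a URI fragment; it is not taken from the DataSource implementation.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

/** Hypothetical helper showing a position carried as a URL ref part. */
class UrlPositionSketch {

    /**
     * Appends a non-null position to the URL after a '#' character.
     * URI.create rejects characters that are illegal in a fragment,
     * so this sketch only suits simple position strings.
     */
    static URL withPosition(URL base, String position) throws MalformedURLException {
        if (position == null) {
            return base;   // no position, so the URL is left untouched
        }
        return URI.create(base.toExternalForm() + "#" + position).toURL();
    }
}
```

The appended position can then be read back with URL.getRef().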
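Returns: the URL, or null if none exists or is known

public int getIntroLimit()
Returns the maximum length of the intro buffer.

public void setIntroLimit(int limit)
Sets the maximum size of the intro buffer to a new value.
Parameters: limit - the new maximum length of the intro buffer

public long getRawLength()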
Returns the length in bytes of the stream returned by getRawInputStream, if known.  If the length is not known
 then -1 should be returned.
 The implementation of this method in DataSource returns -1;
 subclasses should override it if they can determine their length.

public long getLength()
Returns the length of the stream returned by getInputStream
 in bytes, if known.
 A return value of -1 indicates that the length is unknown.
 The return value of this method may change from -1 to a positive
 value during the life of this object if it happens to work out
 how long it is.

public java.lang.String getName()
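Returns a name for this source.
getURL() method
 (or some suitable class-specific method) should be used.
 If this source has a position, it should probably form part of
 this name.

public void setName(java.lang.String name)
Sets the name of this source.
Parameters: name - a name
See also: getName()

public java.lang.String getPosition()
Returns the position associated with this source.
Returns: the position, or null

public void setPosition(java.lang.String position)
Sets the position associated with this source.
Parameters: position - the new position (may be null)

public java.lang.String getSystemId()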
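Returns a System ID for this DataSource; this is a string representation of a file name or URL, as used by Source and friends.
 The return value may be null if none is known.
 This does not contain any reference to the position.
Returns: the System ID, or null if none is known

public Compression getCompression() throws java.io.IOException
Returns an object which will handle any required decompression for this stream.
Compression.NONE is returned.
Throws: java.io.IOException

public byte[] getIntro() throws java.io.IOException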
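Returns the intro buffer, first reading it if this hasn't been done before. Its length is limited by introLimit and the length of the underlying uncompressed
 stream.
 The returned buffer is the original, not a copy; do not change its contents!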
Returns: the intro buffer, at most introLimit bytes long
Throws: java.io.IOException

public void setCompression(Compression compress)
Sets the compression to be associated with this data source. setCompression(Compression.NONE) can
 be used to force direct examination of the underlying stream
 without decompression, even if the underlying stream is in fact
 compressed.
 The effects of setting a compression to a mode (other than NONE) which does not match the actual compression mode of the underlying stream are undefined, so this method should be used with care.
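
The difference between reading a compressed stream raw and reading it with the matching decompression can be sketched with the standard java.util.zip classes. The enum Mode and the class RawVersusDecodedSketch are hypothetical stand-ins, not the Compression class, and gzip is used only as an example format. In this sketch a mismatched mode surfaces as an IOException; the behaviour of DataSource itself in that case is documented above as undefined.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical illustration of raw versus decompressed views of the same bytes. */
class RawVersusDecodedSketch {

    /** Stand-in for a compression mode; not the real Compression class. */
    enum Mode { NONE, GZIP }

    /** Guesses the mode from the gzip magic number 0x1f 0x8b. */
    static Mode detect(byte[] raw) {
        boolean gzip = raw.length >= 2
            && (raw[0] & 0xff) == 0x1f
            && (raw[1] & 0xff) == 0x8b;
        return gzip ? Mode.GZIP : Mode.NONE;
    }

    /** Opens the bytes under the requested mode; NONE gives the raw bytes. */
    static InputStream open(byte[] raw, Mode mode) throws IOException {
        InputStream in = new ByteArrayInputStream(raw);
        return mode == Mode.GZIP ? new GZIPInputStream(in) : in;
    }

    /** Helper that gzips a byte array, used to build example input. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }
}
```

Opening gzipped bytes with Mode.NONE returns the compressed bytes unchanged, which mirrors the use of setCompression(Compression.NONE) for direct examination of the underlying stream.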
Parameters: compress - the compression mode encoding the underlying stream

public DataSource forceCompression(Compression compress)
Returns a DataSource representing the same underlying stream, but with a forced compression mode compress.
 The returned DataSource object may be the same object
 as this one, but
 if it has a different compression mode from compress
 a new one will be created.  As with setCompression(Compression),
 the consequences of using a different value of compress
 than the correct one (other than Compression.NONE)
 are unpredictable.
Parameters: compress - the compression mode to be used for the returned data source
Returns: a DataSource with compression mode compress

public java.io.InputStream getInputStream() throws java.io.IOException
Returns an InputStream containing the whole of this DataSource.
Throws: java.io.IOException

public java.io.InputStream getHybridInputStream() throws java.io.IOException
Returns an input stream which appears just the same as the one returned by getInputStream(), but only incurs the
 expense of obtaining an actual input stream (by calling
 getRawInputStream()) if more bytes are read than the
 cached magic number.  This is an efficient way to read if you
 need an InputStream but may only end up reading the first
 few bytes of it.
Throws: java.io.IOException

public void close()
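Closes any open streams owned and not yet dispatched by this DataSource. Any IOException
 thrown while closing owned streams is simply discarded.

public java.lang.String toString()
Returns a short description of this source (name plus compression type).
Overrides: toString in class java.lang.Object

public static DataSource makeDataSource(java.lang.String loc) throws java.io.IOException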
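Attempts to make a source given a string identifying its location as a file, URL or system command output.
If a '#' character exists in the string, text after it will be
 interpreted as a position value.  Otherwise, the position is
 considered to be null.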
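
The '#' rule can be sketched as a simple string split. The class LocationSplitSketch and its split method are hypothetical; the sketch assumes that the first '#' in the string is the separator, which the text above does not specify.

```java
/** Hypothetical helper splitting a location string into main part and position. */
class LocationSplitSketch {

    /** Returns {mainPart, position}; position is null when no '#' is present. */
    static String[] split(String loc) {
        int hash = loc.indexOf('#');   // assumption: the first '#' separates the position
        if (hash < 0) {
            return new String[] { loc, null };
        }
        return new String[] { loc.substring(0, hash), loc.substring(hash + 1) };
    }
}
```

For example, split("data.fits#2") yields the main part "data.fits" and the position "2", while split("data.fits") yields a null position.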
 
Note: this method presents a security risk if the
 loc string is vulnerable to injection.
 Consider using the variant method
 makeDataSource(loc,false) in such cases.
 This method just calls makeDataSource(loc,true).
Parameters: loc - the location of the data, with optional position
Returns: a DataSource based on loc
Throws: java.io.IOException - if loc does not name an existing readable file or valid URL

public static DataSource makeDataSource(java.lang.String loc, boolean allowSystem) throws java.io.IOException
Attempts to make a source given a string identifying its location as a file, URL or optionally a system command output.
The supplied loc may be one of the following:

- a filename
- a URL
- if allowSystem=true: a string preceded by "<" or followed by "|", giving a shell command line (may not work on all platforms)

If a '#' character exists in the string, text after it will be
 interpreted as a position value.  Otherwise, the position is
 considered to be null.
 
Note: setting allowSystem=true may
 introduce a security risk if the loc string is
 vulnerable to injection.
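
How the allowSystem flag gates the command syntax can be sketched as follows. The class LocationKindSketch, its Kind enum and both methods are hypothetical, and the trimming of whitespace is an assumption; the sketch only classifies the string and never runs a command.

```java
/** Hypothetical classifier for location strings; it never executes anything. */
class LocationKindSketch {

    enum Kind { COMMAND, FILE_OR_URL }

    /** Treats "<cmd" or "cmd|" as a command line only when allowSystem is true. */
    static Kind classify(String loc, boolean allowSystem) {
        String s = loc.trim();   // assumption: surrounding whitespace is ignored
        boolean looksLikeCommand = s.startsWith("<") || s.endsWith("|");
        return (allowSystem && looksLikeCommand) ? Kind.COMMAND : Kind.FILE_OR_URL;
    }

    /** Strips the leading "<" or trailing "|" from a command-style location. */
    static String commandLine(String loc) {
        String s = loc.trim();
        if (s.startsWith("<")) {
            return s.substring(1).trim();
        }
        if (s.endsWith("|")) {
            return s.substring(0, s.length() - 1).trim();
        }
        return s;
    }
}
```

With allowSystem=false, a string such as "cat data.fits |" is never classified as a command in this sketch, which is the property that makes the two-argument variant safer for untrusted input.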
Parameters:
loc - the location of the data, with optional position
allowSystem - whether to allow system commands using the format above
Returns: a DataSource based on loc
Throws: java.io.IOException - if loc does not name an existing readable file or valid URL

public static DataSource makeDataSource(java.net.URL url)
Makes a source from a URL. If url is a file-protocol URL
 referencing an existing file then
 a FileDataSource will be returned, otherwise it will be
 a URLDataSource.  Under certain circumstances, it may
 be more efficient to use a FileDataSource than a URLDataSource,
 which is why this method may be worth using.
Parameters: url - location of the data stream
Returns: a DataSource based on url

public static java.io.InputStream getInputStream(java.lang.String location, boolean allowSystem) throws java.io.IOException
Returns an input stream based on the given location string. The location may be a URL, a filename, "-", or, if allowSystem=true, a string preceded by "<" or followed by "|", giving a shell command line (may not work on all platforms).

Note: setting allowSystem=true may
 introduce a security risk if the location string is
 vulnerable to injection.
Parameters:
location - URL, filename, "cmdline|"/"<cmdline", or "-"
allowSystem - whether to allow system commands using the format above
Returns: an input stream based on location
Throws:
java.io.FileNotFoundException - if location cannot be interpreted as a source of bytes
java.io.IOException - if there is an error obtaining the stream

public static boolean getMarkWorkaround()
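Returns true if we are working around potential bugs in InputStream.mark(int)/InputStream.reset()
 methods (common, including in J2SE classes).
 The return value is dependent on the system property named
 MARK_WORKAROUND_PROPERTY.

public static void setMarkWorkaround(boolean workaround)
Sets whether we want to work around bugs in InputStream mark/reset methods.
Parameters: workaround - true to employ the workaround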