uk.ac.starlink.table.join
Class RowMatcher

java.lang.Object
  |
  +--uk.ac.starlink.table.join.RowMatcher

public class RowMatcher
extends Object

Performs matching on the rows of one or more tables. An associated MatchEngine object defines what constitutes a matched row and supplies additional hints about how to determine this; the generic parts of the matching algorithms are implemented here.


Constructor Summary
RowMatcher(MatchEngine engine, StarTable[] tables)
          Constructs a new matcher with match characteristics defined by a given matching engine.
 
Method Summary
 List findGroupMatches(boolean[] useAll)
          Returns a list of RowLink objects corresponding to a match performed with this matcher's tables using its match engine.
 List findInternalMatches(boolean includeSingles)
          Returns a list of RowLink objects corresponding to all the internal matches in this matcher's sole table using its match engine.
 Map findPairMatches(boolean req1, boolean req2)
          Returns a mapping keyed by RowLink objects corresponding to a match between this matcher's two tables performed with its match engine.
 ProgressIndicator getIndicator()
          Returns the current progress indicator for this matcher.
 Set getPossibleInterLinks(int index1, int index2, Comparable[] min, Comparable[] max)
          Gets the set of all pairs of rows which constitute possible links between two tables.
 void setIndicator(ProgressIndicator indicator)
          Sets the progress indicator for this matcher.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RowMatcher

public RowMatcher(MatchEngine engine,
                  StarTable[] tables)
Constructs a new matcher with match characteristics defined by a given matching engine.

Parameters:
engine - matching engine
tables - the array of tables on which matches are to be done

Method Detail

setIndicator

public void setIndicator(ProgressIndicator indicator)
Sets the progress indicator for this matcher.

Parameters:
indicator - new indicator

getIndicator

public ProgressIndicator getIndicator()
Returns the current progress indicator for this matcher.

Returns:
the current progress indicator

findPairMatches

public Map findPairMatches(boolean req1,
                           boolean req2)
                    throws IOException,
                           InterruptedException
Returns a mapping whose keys are RowLink objects corresponding to a match between this matcher's two tables performed with its match engine. Each key corresponds to a matched pair with one entry from each of the input tables; the req1 and req2 arguments specify whether both input tables must be represented in every link. Each input table row appears in no more than one RowLink in the result.

The returned value is a RowLink->Number mapping; a non-null value is the match score for the corresponding link. A Map does not guarantee any iteration order, but sorting the keys by their natural ordering gives a sensible ordering of rows for the output table.
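
A minimal sketch of reading such a mapping in key order, with null scores handled explicitly. It assumes only that the keys are mutually comparable, as the text above states. The String keys are hypothetical stand-ins for RowLink objects, and the class and method names are illustrative, not part of this API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: String keys stand in for RowLink objects (an assumption).
class ScoreMapSketch {

    /** Describes each entry in the natural order of the keys; a null score is reported as "no score". */
    static <K extends Comparable<K>> List<String> describeInKeyOrder(Map<K, ? extends Number> scores) {
        List<String> lines = new ArrayList<String>();
        // Copying into a TreeMap sorts the entries by the keys' natural ordering.
        for (Map.Entry<K, Number> e : new TreeMap<K, Number>(scores).entrySet()) {
            Number score = e.getValue();
            lines.add(e.getKey() + ": " + (score == null ? "no score" : score.toString()));
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<String, Double>();
        scores.put("link-b", 0.25);
        scores.put("link-a", null);   // a link with no associated score
        for (String line : describeInKeyOrder(scores)) {
            System.out.println(line);
        }
    }
}
```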

Parameters:
req1 - whether an entry from the first table must be present in each element of the result
req2 - whether an entry from the second table must be present in each element of the result
Returns:
RowLink->Number mapping
Throws:
IOException
InterruptedException
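
The effect of req1 and req2 can be pictured with a small stand-alone model. This is one reading of the parameter descriptions above, not the library's implementation. A link is modelled as a two-element list [row1, row2] with null where a table is not represented, and the already-matched pairs are supplied as a hypothetical row1->row2 map. All names here are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: a link is modelled as [row1, row2], with null for an absent table.
class PairSelectionSketch {

    static List<List<Integer>> selectLinks(Map<Integer, Integer> matched,
                                           int nrow1, int nrow2,
                                           boolean req1, boolean req2) {
        List<List<Integer>> links = new ArrayList<List<Integer>>();
        // Matched pairs represent both tables, so they qualify whatever the flags are.
        for (Map.Entry<Integer, Integer> e : new TreeMap<Integer, Integer>(matched).entrySet()) {
            links.add(Arrays.asList(e.getKey(), e.getValue()));
        }
        // Unmatched rows of table 1 lack a table 2 entry: allowed only if req2 is false.
        if (!req2) {
            for (int i = 0; i < nrow1; i++) {
                if (!matched.containsKey(i)) {
                    links.add(Arrays.asList(i, (Integer) null));
                }
            }
        }
        // Unmatched rows of table 2 lack a table 1 entry: allowed only if req1 is false.
        if (!req1) {
            for (int j = 0; j < nrow2; j++) {
                if (!matched.containsValue(j)) {
                    links.add(Arrays.asList((Integer) null, j));
                }
            }
        }
        return links;
    }
}
```

Under this reading, passing true for both flags keeps only complete pairs, while passing false for both also keeps every unmatched row from either table.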

findGroupMatches

public List findGroupMatches(boolean[] useAll)
                      throws IOException,
                             InterruptedException
Returns a list of RowLink objects corresponding to a match performed with this matcher's tables using its match engine. Each element in the returned list corresponds to a matched group of input rows, with no more than one entry from each table. Each input table row appears in no more than one RowLink in the returned list. Whether each returned RowLink must contain an entry from every input table is determined by the useAll argument. Any number of tables can be matched.

Parameters:
useAll - array of booleans, one per table, indicating whether all rows of that table are to be used (otherwise only its matched rows are used)
Returns:
list of RowLinks corresponding to the selected rows
Throws:
IOException
InterruptedException

findInternalMatches

public List findInternalMatches(boolean includeSingles)
                         throws IOException,
                                InterruptedException
Returns a list of RowLink objects corresponding to all the internal matches in this matcher's sole table using its match engine.

Parameters:
includeSingles - whether to include unmatched (singleton) row links in the returned link set
Returns:
a list of RowLink objects giving all the groups of matched objects in this matcher's sole table
Throws:
IOException
InterruptedException

getPossibleInterLinks

public Set getPossibleInterLinks(int index1,
                                 int index2,
                                 Comparable[] min,
                                 Comparable[] max)
                          throws IOException,
                                 InterruptedException
Gets the set of all pairs of rows which constitute possible links between two tables. For efficiency, the table at index1 should be the one with fewer rows in the match region.

Parameters:
index1 - index of the first table
index2 - index of the second table
min - array of tuple elements giving the minimum values to consider for the match. If one of the elements, or min itself, is null, no minimum is in effect for that element (or for any element, respectively)
max - array of tuple elements giving the maximum values to consider for the match. If one of the elements, or max itself, is null, no maximum is in effect for that element (or for any element, respectively)
Returns:
set of RowLink objects which constitute possible matches
Throws:
IOException
InterruptedException
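
The null handling described for min and max can be sketched as a stand-alone range test. This only illustrates the documented convention and is not taken from the library. It assumes the bounds are inclusive, that tuple elements are non-null, and that each tuple element is comparable with the bound at the same position; the class and method names are hypothetical.

```java
// Illustration only: models the "null means no bound" convention for min and max.
class BoundsSketch {

    /** Returns true if every tuple element lies within its bounds; null bounds are ignored. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean inRange(Comparable[] tuple, Comparable[] min, Comparable[] max) {
        for (int i = 0; i < tuple.length; i++) {
            // A null array means no bounds at all; a null element means no bound for that position.
            Comparable lo = (min == null) ? null : min[i];
            Comparable hi = (max == null) ? null : max[i];
            // Assumption: bounds are inclusive.
            if (lo != null && tuple[i].compareTo(lo) < 0) {
                return false;
            }
            if (hi != null && tuple[i].compareTo(hi) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Comparable[] tuple = { 1.0, 5.0 };
        // No bounds at all: everything is in range.
        System.out.println(inRange(tuple, null, null));
        // Upper bound on the second element only.
        System.out.println(inRange(tuple, null, new Comparable[] { null, 4.0 }));
    }
}
```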

Copyright © 2004 CLRC: Central Laboratory of the Research Councils. All rights reserved.