public class RowMatcher extends Object
The specifics of what constitutes a match are supplied by an associated MatchEngine object, but the generic parts of the matching algorithms are done here.
Note that since the LinkSets and other objects handled by this class may be very large when large tables are being matched, the algorithms in this class are coded carefully to use as little memory as possible. Techniques include removing items from one collection as they are added to another. This means that in many cases input values may be modified by the methods.
Some of the computationally intensive work done by this abstract class is defined as abstract methods to be implemented by concrete subclasses.
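The memory-saving technique mentioned above (removing items from one collection as they are added to another) can be sketched as follows. This is a minimal illustration only: `DrainSketch` and `drainInto` are hypothetical names, not part of RowMatcher or its library, and the real implementation may work differently.

```java
import java.util.Collection;
import java.util.Iterator;

/** Hypothetical helper for illustration; not part of RowMatcher. */
final class DrainSketch {

    /**
     * Moves every element from source into target, removing each one
     * from source as soon as it has been added to target, so the two
     * collections never both hold the full content at once.
     * The source collection is empty on return (the input is modified).
     */
    static <T> void drainInto(Collection<T> source, Collection<? super T> target) {
        for (Iterator<T> it = source.iterator(); it.hasNext();) {
            target.add(it.next());
            it.remove();  // drop the source's reference immediately
        }
    }
}
```

The source collection must support `Iterator.remove()`. A linked or hash-based collection suits this pattern better than an `ArrayList`, where each removal from the front shifts the remaining elements.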
Modifier and Type | Field and Description |
---|---|
static int | DFLT_PARALLELISM: Actual value for default parallelism (also limited by machine). |
static int | DFLT_PARALLELISM_LIMIT: Maximum suggested value for parallelism. |
Modifier and Type | Method and Description |
---|---|
LinkSet | createLinkSet(): Constructs a new empty LinkSet for use by this matcher. |
static RowMatcher | createMatcher(MatchEngine engine, StarTable[] tables, RowRunner runner): Creates a RowMatcher instance. |
LinkSet | findGroupMatches(MultiJoinType[] joinTypes): Returns a list of RowLink objects corresponding to a match performed with this matcher's tables using its match engine. |
LinkSet | findInternalMatches(boolean includeSingles): Returns a list of RowLink objects corresponding to all the internal matches in this matcher's sole table using its match engine. |
LinkSet | findMultiPairMatches(int index0, boolean bestOnly, MultiJoinType[] joinTypes): Returns a set of RowLink objects each of which represents matches between one of the rows of a reference table and any of the other tables which can provide matches. |
LinkSet | findPairMatches(PairMode pairMode): Returns a set of RowLink objects corresponding to a pairwise match between this matcher's two tables performed with its match engine. |
ProgressIndicator | getIndicator(): Returns the current progress indicator for this matcher. |
void | setIndicator(ProgressIndicator indicator): Sets the progress indicator for this matcher. |
public static final int DFLT_PARALLELISM_LIMIT
public static final int DFLT_PARALLELISM
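This page does not say how DFLT_PARALLELISM is derived from DFLT_PARALLELISM_LIMIT. One plausible reading of "also limited by machine" is that the suggested limit is reduced when fewer processors are available. The sketch below shows that reading only; `ParallelismSketch` and `defaultParallelism` are hypothetical names, and the actual constant values are not assumed here.

```java
/** Hypothetical helper for illustration; not part of RowMatcher. */
final class ParallelismSketch {

    /**
     * Returns the suggested limit, reduced if the machine has fewer
     * processors than that. The result is never less than 1.
     */
    static int defaultParallelism(int suggestedLimit, int availableProcessors) {
        return Math.max(1, Math.min(suggestedLimit, availableProcessors));
    }

    /** Same calculation, using the processor count reported by the JVM. */
    static int defaultParallelism(int suggestedLimit) {
        return defaultParallelism(suggestedLimit,
                                  Runtime.getRuntime().availableProcessors());
    }
}
```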
public void setIndicator(ProgressIndicator indicator)
Parameters:
indicator - new indicator
public ProgressIndicator getIndicator()
public LinkSet createLinkSet()
public LinkSet findPairMatches(PairMode pairMode) throws IOException, InterruptedException
Parameters:
pairMode - matching mode to determine which rows appear in the result
Throws:
IOException
InterruptedException
public LinkSet findMultiPairMatches(int index0, boolean bestOnly, MultiJoinType[] joinTypes) throws IOException, InterruptedException
PairsRowLink.
Parameters:
index0 - index of the reference table in the list of tables owned by this row matcher
bestOnly - true if only the best match between the reference table and any other table should be retained
joinTypes - inclusion criteria for output table rows
Throws:
IOException
InterruptedException
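The effect of a flag like bestOnly can be illustrated with a toy one-dimensional matcher. This is a sketch under stated assumptions, not the library's algorithm: it handles a single reference value against a single other table, uses absolute difference as the match score, and `BestOnlySketch` and `matches` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy matcher for illustration; not part of RowMatcher. */
final class BestOnlySketch {

    /**
     * Returns the indices in others whose values lie within tol of ref.
     * If bestOnly is true, at most one index (the nearest) is returned.
     */
    static List<Integer> matches(double ref, double[] others,
                                 double tol, boolean bestOnly) {
        List<Integer> hits = new ArrayList<>();
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < others.length; i++) {
            double dist = Math.abs(others[i] - ref);
            if (dist <= tol) {
                hits.add(i);
                if (dist < bestDist) {  // remember the closest hit so far
                    bestDist = dist;
                    best = i;
                }
            }
        }
        if (bestOnly) {
            hits.clear();
            if (best >= 0) {
                hits.add(best);
            }
        }
        return hits;
    }
}
```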
public LinkSet findGroupMatches(MultiJoinType[] joinTypes) throws IOException, InterruptedException
Parameters:
joinTypes - inclusion criteria for output table rows
Returns:
RowLinks corresponding to the selected rows
Throws:
IOException
InterruptedException
public LinkSet findInternalMatches(boolean includeSingles) throws IOException, InterruptedException
Parameters:
includeSingles - whether to include unmatched (singleton) row links in the returned link set
Returns:
RowLink objects giving all the groups of matched objects in this matcher's sole table
Throws:
IOException
InterruptedException
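To make the idea of internal match groups and the includeSingles flag concrete, here is a toy one-dimensional sketch. It assumes that rows are linked transitively when neighbouring values lie within a tolerance, which may not be how a given MatchEngine behaves; `InternalGroupSketch` and `groups` are hypothetical names, not part of this class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical toy grouping for illustration; not part of RowMatcher. */
final class InternalGroupSketch {

    /**
     * Groups row indices whose values chain together within tol.
     * Groups of size one are returned only if includeSingles is true.
     */
    static List<List<Integer>> groups(double[] values, double tol,
                                      boolean includeSingles) {
        // Sort row indices by value so that matches are adjacent.
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> values[i]));

        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (int k = 0; k < order.length; k++) {
            // A gap larger than tol closes the current group.
            if (k > 0 && values[order[k]] - values[order[k - 1]] > tol) {
                if (includeSingles || current.size() > 1) {
                    result.add(current);
                }
                current = new ArrayList<>();
            }
            current.add(order[k]);
        }
        if (!current.isEmpty() && (includeSingles || current.size() > 1)) {
            result.add(current);
        }
        return result;
    }
}
```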
public static RowMatcher createMatcher(MatchEngine engine, StarTable[] tables, RowRunner runner)
Parameters:
engine - matching engine
tables - the array of tables on which matches are to be done
runner - RowRunner to control multithreading, or null to fall back to sequential implementation

Copyright © 2024 Central Laboratory of the Research Councils. All Rights Reserved.