uk.ac.starlink.table.join
Class EqualsMatchEngine

java.lang.Object
  extended by uk.ac.starlink.table.join.EqualsMatchEngine
All Implemented Interfaces:
MatchEngine

public class EqualsMatchEngine
extends java.lang.Object
implements MatchEngine

Match engine which considers two rows matched if they contain objects which are non-blank and equal. The objects will typically be strings, but could equally be something else. Match scores are always either 0.0 (equal) or -1.0 (unequal).

The equality is roughly in the sense of Object.equals(java.lang.Object), but some additional work is done, so that for instance (multi-dimensional) arrays are compared (recursively) on their contents, and blank objects are compared in the sense used in the rest of STIL. A blank value is not considered equal to anything, including another blank value.
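
The comparison rule described above can be illustrated with a minimal, self-contained sketch. It is not the STIL implementation: the class name BlankAwareEquality and its methods are hypothetical, and the notion of "blank" used here (null, NaN, or an empty string) is an assumption that may differ from the definition used in the rest of STIL.

```java
import java.util.Arrays;

/** Hypothetical sketch of blank-aware equality; not the STIL implementation. */
class BlankAwareEquality {

    /** Assumed notion of "blank": null, a NaN float/double, or an empty string. */
    static boolean isBlank(Object o) {
        if (o == null) return true;
        if (o instanceof Double) return ((Double) o).isNaN();
        if (o instanceof Float) return ((Float) o).isNaN();
        if (o instanceof String) return ((String) o).isEmpty();
        return false;
    }

    /** True only if both values are non-blank and equal; arrays are compared by content. */
    static boolean matches(Object a, Object b) {
        if (isBlank(a) || isBlank(b)) return false;
        // Wrapping each value in a one-element array lets deepEquals handle
        // primitive arrays and nested arrays as well as ordinary objects.
        return Arrays.deepEquals(new Object[] { a }, new Object[] { b });
    }

    /** Score in the convention described above: 0.0 for a match, -1.0 otherwise. */
    static double score(Object a, Object b) {
        return matches(a, b) ? 0.0 : -1.0;
    }

    public static void main(String[] args) {
        System.out.println(score("M31", "M31"));                                // 0.0
        System.out.println(score(null, null));                                  // -1.0
        System.out.println(score(new int[][] {{1, 2}}, new int[][] {{1, 2}}));  // 0.0
    }
}
```

Note that two blank values score -1.0 here, mirroring the statement that a blank value is not considered equal to anything.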

Since:
25 Mar 2004

Field Summary
 
Fields inherited from interface uk.ac.starlink.table.join.MatchEngine
NO_BINS
 
Constructor Summary
EqualsMatchEngine()
           
 
Method Summary
 boolean canBoundMatch()
          Indicates that the MatchEngine.getMatchBounds(java.lang.Comparable[], java.lang.Comparable[]) method can be invoked to provide some sort of useful result.
 java.lang.Object[] getBins(java.lang.Object[] tuple)
          Returns a set of keys for bins into which possible matches for a given tuple might fall.
 java.lang.Comparable[][] getMatchBounds(java.lang.Comparable[] min, java.lang.Comparable[] max)
          Given a range of tuple values, returns a range outside which no match to anything within that range can result.
 DescribedValue[] getMatchParameters()
          Returns a set of DescribedValue objects whose values can be modified to modify the matching criteria.
 ValueInfo getMatchScoreInfo()
          The match score is uninteresting, since it's either -1 or 0.
 DescribedValue[] getTuningParameters()
          Returns a set of DescribedValue objects whose values can be modified to tune the performance of the match.
 ValueInfo[] getTupleInfos()
          Returns a set of ValueInfo objects indicating what is required for the elements of each tuple.
 double matchScore(java.lang.Object[] tuple1, java.lang.Object[] tuple2)
          Indicates whether two tuples count as matching each other, and if so how closely.
 java.lang.String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

EqualsMatchEngine

public EqualsMatchEngine()
Method Detail

matchScore

public double matchScore(java.lang.Object[] tuple1,
                         java.lang.Object[] tuple2)
Description copied from interface: MatchEngine
Indicates whether two tuples count as matching each other, and if so how closely. If tuple1 and tuple2 are considered as a matching pair, then a non-negative value should be returned indicating how close the match is - the higher the number the worse the match, and a return value of zero indicates a 'perfect' match. If the two tuples do not constitute a matching pair, then a negative number (conventionally -1.0) should be returned. This return value can be thought of as (and will often correspond physically with) the distance in some real or notional space between the points represented by the two submitted tuples.

If there's no reason to do otherwise, the range 0..1 is recommended for successful matches. However, if the result has some sort of physical meaning (such as a distance in real space) that may be used instead.

Specified by:
matchScore in interface MatchEngine
Parameters:
tuple1 - one tuple
tuple2 - the other tuple
Returns:
'distance' between tuple1 and tuple2; 0 is a perfect match, larger values indicate worse matches, negative values indicate no match
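
As a sketch of how a caller might interpret scores under this convention, the following picks the candidate with the lowest non-negative score and ignores negative (non-matching) scores. The class BestMatchSketch, the method bestMatch, and the stand-in scorer distanceScore (a one-element numeric tuple matched within a tolerance of 1.0) are hypothetical and not part of STIL.

```java
import java.util.function.ToDoubleBiFunction;

/** Hypothetical sketch of interpreting match scores; not part of STIL. */
class BestMatchSketch {

    /**
     * Returns the index of the candidate with the lowest non-negative score
     * against the target, or -1 if no candidate matches.
     */
    static int bestMatch(Object[] target, Object[][] candidates,
                         ToDoubleBiFunction<Object[], Object[]> scorer) {
        int best = -1;
        double bestScore = Double.POSITIVE_INFINITY;
        for (int i = 0; i < candidates.length; i++) {
            double s = scorer.applyAsDouble(target, candidates[i]);
            // Negative scores mean "no match"; among matches, lower is better.
            if (s >= 0 && s < bestScore) {
                best = i;
                bestScore = s;
            }
        }
        return best;
    }

    /** Stand-in scorer (an assumption): absolute difference, matching within 1.0. */
    static double distanceScore(Object[] t1, Object[] t2) {
        double d = Math.abs(((Number) t1[0]).doubleValue()
                          - ((Number) t2[0]).doubleValue());
        return d <= 1.0 ? d : -1.0;
    }
}
```

For an engine whose scores are only ever 0.0 or -1.0, such a search reduces to finding the first candidate that matches at all.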

getBins

public java.lang.Object[] getBins(java.lang.Object[] tuple)
Description copied from interface: MatchEngine
Returns a set of keys for bins into which possible matches for a given tuple might fall. The returned objects can be anything, but should have their equals and hashCode methods implemented properly for comparison.

Specified by:
getBins in interface MatchEngine
Parameters:
tuple - tuple
Returns:
set of bin keys which might be returned by invoking this method on other tuples which count as matches for the submitted tuple
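
The purpose of bin keys can be sketched as follows: rows are grouped by key so that only rows sharing a bin need a full comparison. This is an illustration under assumptions, not the STIL implementation: the class BinningSketch and its methods are hypothetical, each tuple is taken to hold a single non-array value that serves as its own key, and only null is treated as blank.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of bin-based candidate grouping; not the STIL implementation. */
class BinningSketch {

    /**
     * Assumed bin function for an equality match: the value is its own bin key.
     * A blank (here: null) value can match nothing, so it gets no bins.
     */
    static Object[] bins(Object[] tuple) {
        Object value = tuple[0];
        return value == null ? new Object[0] : new Object[] { value };
    }

    /** Groups row indices by bin key, relying on the keys' equals and hashCode. */
    static Map<Object, List<Integer>> binRows(Object[][] rows) {
        Map<Object, List<Integer>> map = new HashMap<>();
        for (int i = 0; i < rows.length; i++) {
            for (Object key : bins(rows[i])) {
                map.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
            }
        }
        return map;
    }
}
```

Because the grouping uses a hash map, keys whose equals and hashCode are not content-based (arrays, for instance) would need wrapping before use in such a scheme.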

getMatchScoreInfo

public ValueInfo getMatchScoreInfo()
The match score is uninteresting, since it's either -1 or 0. We flag this by returning null here.

Specified by:
getMatchScoreInfo in interface MatchEngine
Returns:
null

getTupleInfos

public ValueInfo[] getTupleInfos()
Description copied from interface: MatchEngine
Returns a set of ValueInfo objects indicating what is required for the elements of each tuple. The length of this array is the number of elements in the tuple. Each element should at least have a defined name and content class. The info's nullable attribute has a special meaning: if true it means that it makes sense for this element of the tuple to be always blank (for instance assigned to no column).

Specified by:
getTupleInfos in interface MatchEngine
Returns:
array of objects describing the requirements on each element of the tuples used for matching

getMatchParameters

public DescribedValue[] getMatchParameters()
Description copied from interface: MatchEngine
Returns a set of DescribedValue objects whose values can be modified to modify the matching criteria. Typically at least one of these will be some sort of tolerance separation which determines how close tuples must be to count as a match. This match engine's behaviour can be modified by calling DescribedValue.setValue(java.lang.Object) on the returned objects.

Specified by:
getMatchParameters in interface MatchEngine
Returns:
array of described values which influence the match

getTuningParameters

public DescribedValue[] getTuningParameters()
Description copied from interface: MatchEngine
Returns a set of DescribedValue objects whose values can be modified to tune the performance of the match. This match engine's performance can be influenced by calling DescribedValue.setValue(java.lang.Object) on the returned objects.

Changing these values will make no difference to the output of MatchEngine.matchScore(java.lang.Object[], java.lang.Object[]), but may change the output of MatchEngine.getBins(java.lang.Object[]). This may change the CPU and memory requirements of the match, but will not change the result. The default value should be something sensible, so that setting the value of these parameters is not in general required.

Specified by:
getTuningParameters in interface MatchEngine
Returns:
array of described values which may influence match performance

canBoundMatch

public boolean canBoundMatch()
Description copied from interface: MatchEngine
Indicates that the MatchEngine.getMatchBounds(java.lang.Comparable[], java.lang.Comparable[]) method can be invoked to provide some sort of useful result.

Specified by:
canBoundMatch in interface MatchEngine
Returns:
true iff getMatchBounds may provide useful information

getMatchBounds

public java.lang.Comparable[][] getMatchBounds(java.lang.Comparable[] min,
                                               java.lang.Comparable[] max)
Description copied from interface: MatchEngine
Given a range of tuple values, returns a range outside which no match to anything within that range can result. If the tuples on which this engine works represent some kind of space, the input values and output values specify a hyper-rectangular region of this space. In the common case in which the match criteria are based on proximity in this space up to a certain error, this method should return a rectangle which is like the input one but broadened in each direction by an amount corresponding to the error.

Both the input and output rectangles are specified by tuples representing their opposite corners; equivalently, they are the minimum and maximum values of each tuple element. In either the input or output min/max tuples, any element may be null to indicate that no information is available on the bounds of that tuple element (coordinate).

This method can be used by match algorithms which know in advance the range of coordinates they will match against and wish to reduce workload by not attempting matches which are bound to fail.

For example, a 2-d Cartesian match engine with an isotropic match error 0.5 would turn input values of ((0,200),(10,210)) into output values ((-0.5,199.5),(10.5,210.5)).
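
The broadening in this example can be reproduced with a short sketch. The class BoundsSketch and its methods are hypothetical and only illustrate the arithmetic; null elements (no bound information) and non-numeric elements are passed through unchanged.

```java
/** Hypothetical sketch of broadening match bounds by a fixed error; not part of STIL. */
class BoundsSketch {

    /** Shifts a numeric value by delta; null or non-numeric values are returned as-is. */
    static Comparable<?> shift(Comparable<?> value, double delta) {
        if (value instanceof Number) {
            return Double.valueOf(((Number) value).doubleValue() + delta);
        }
        return value;
    }

    /**
     * Returns a 2-element array (minTuple, maxTuple) in which each minimum is
     * lowered by err and each maximum is raised by err.
     */
    static Comparable<?>[][] broaden(Comparable<?>[] min, Comparable<?>[] max, double err) {
        Comparable<?>[] outMin = new Comparable<?>[min.length];
        Comparable<?>[] outMax = new Comparable<?>[max.length];
        for (int i = 0; i < min.length; i++) {
            outMin[i] = shift(min[i], -err);
        }
        for (int i = 0; i < max.length; i++) {
            outMax[i] = shift(max[i], +err);
        }
        return new Comparable<?>[][] { outMin, outMax };
    }
}
```

Calling broaden with min (0,200), max (10,210) and err 0.5 gives (-0.5,199.5) and (10.5,210.5), the output values quoted above.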

This method will only be called if MatchEngine.canBoundMatch() returns true. Thus engines that cannot provide any useful information along these lines (for instance because none of their tuple elements is Comparable) do not need to implement it in a meaningful way.

Specified by:
getMatchBounds in interface MatchEngine
Parameters:
min - tuple consisting of the minimum values of each tuple element in a possible match (to put it another way - coordinates of one corner of a tuple-space rectangle containing such a match)
max - tuple consisting of the maximum values of each tuple element in a possible match (to put it another way - coordinates of the other corner of a tuple-space rectangle containing such a match)
Returns:
2-element array of tuples - effectively (minTuple,maxTuple) broadened by errors
See Also:
MatchEngine.canBoundMatch()

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

Copyright © 2004 CLRC: Central Laboratory of the Research Councils. All rights reserved.