This package implements a general tree comparison as a special case of visiting generic trees pairwise. Trees are wrapped as GenericTreeNode, which has a static method boolean traverse(..) which accepts a visitor and traverses a pair of trees, calling the visitor on each.

This package supports four forms of generality through the following classes:
Classes Capability
GenericTreeNode Able to handle trees of varying types by wrapping in a generic form.
GenericTreeNodesVisitorI, GenericTreeNode.traverse(..) Can handle any type of pairwise visitor function by accepting visitor in the traverse method.
{java.util.Comparator}, GenericTreeNode.getComparator() The object comparison can be sensitive to the type of the object on a per-object basis, established during the process of wrapping the tree.
GenericTreeListOrdererI, GenericTreeListOrdererFactoryI The ordering of children can be appropriate to the objective of the traversal. e.g., when computing "set intersection" rather than "list equals", the order of children might be changed to align matching children for the visits.
This ordering can be determined as appropriate for each list comparison by implementing a factory which selects from the appropriate orderers. Any factory product is used by the traverse(..) method to order children before visiting.

Supported variants: The following variants are implemented or planned using the components above:
ComponentDescription
Current
GenericTreeNode.PRINTALL A visitor which prints out the entire tree.
GenericTreeNode.PRINTERR A visitor which prints the nonmatching pairs.
GenericTreeNode.EXACT A visitor which returns false if any pairs do not match.
TreeCompare A sample program to read in serialized trees and compare them. (but see Structure in the compare subpackage for a better example)
CompareUtil Misc comparison utilities (e.g., for short-circuiting comparisons).
Planned
GenericTreeNode.KIDDIFF A visitor which calculates tree differences, using ordering children (catches swaps, missed or added elements, within children)
GenericTreeNode.TREEDIFF A visitor which calculates tree differences, accumulating the tree (catches swaps, missed or added elements, throughout tree.)

Use: Run TreeCompare to use the comparer from the command line on a supported tree, (currently only the Swing TreeNode implemented as DefaultMutableTreeNode).

Programming:
To support a new tree, see the Structure example in the compare subpackage or use example of the Swing TreeNode:

  • Write an adapter that uses GenericTreeNode to wrap your tree nodes
  • Write a factory that produces a wrapped tree
  • Write a Comparator that compares the underlying node object to include with each node object wrapped. Be sure to implement the Comparator.equals(Object) method correctly, i.e., returning true when it is equivalent to another comparator. It is best to use a singleton for each type of node you support.
  • Optionally write a visitor to perform whatever operations you wish. Note that visitors must tolerate a single null input in order to completely traverse unmatching trees.
  • To perform operations requiring preprocessing of child List's, write children orderer(s) and provide a factory to select them.
  • To design new algorithms/applications, bear in mind the main tools:

  • The comparator that follows the node object
  • The visitor that traverses the tree
  • The child orderer that may preprocess the child lists

  • In particular, when going beyond pair-wise comparisons to list-wise or tree-wise comparisons, you'll have to decide where to put the appropriate logic. You may have a relatively lightweight visitor and a heavyweight orderer, or no orderer at all. In that case, you may need to invoke a special method on your visitor after the traversal completes to do any final processing.

    Future Work

    :

    Smarter group comparisons:
    Does calculating maps help with diagnosing problems?

    Given two lists,
      A [ a, b, c, d, e, f, g, h ]
      B [ a, e, c, d, b, g, h ]
    The result should say:
      - B swapped order of e and b
      - B omitted f
    Note the match-map (index of matching element, if any):
      A->B [ 0, 4, 2, 3, 1, -, 5, 6 ]
      B->A [ 0, 4, 2, 3, 1, 6, 7 ]
    the shift-map (difference between expected and actual order):
      A->B [ 0, 3, 0, 0, -3, -, -1, -1 ]
      B->A [ 0, 3, 0, 0, -3, 1, 1 ]
    
    Thus:
    - detect swaps as complementary out-of-index order pairs
      (todo: three-way or n-ary?)
      - fix or ignore swaps
    - detect shifts as complementary series
      where shift-width n is +/- for both
      - -n => 
        - if negative element is actual, n elements omitted
        - if negative element is expected, n elements added