diff options
author | Shawn O. Pearce <spearce@spearce.org> | 2010-10-29 17:50:26 -0700 |
---|---|---|
committer | Shawn O. Pearce <spearce@spearce.org> | 2010-11-01 14:08:45 -0700 |
commit | b88b693a3d8ad07801ce2027aaa25f68704b0fcf (patch) | |
tree | b79fc8db56867ac537b387ede356db9f045da45b /org.eclipse.jgit.test/tst/org/eclipse/jgit/diff | |
parent | 33ae28b4822653d89ac22633d3227895d4a32c1d (diff) | |
download | jgit-b88b693a3d8ad07801ce2027aaa25f68704b0fcf.tar.gz jgit-b88b693a3d8ad07801ce2027aaa25f68704b0fcf.zip |
Fix broken HistogramDiff
HistogramDiff failed on cases where the initial element for the LCS
was actually very common (e.g. has 20 occurrences), and the first
element of the inserted region after the LCS was also common but
had fewer occurrences (e.g. 10), while the LCS also contained a
unique element (1 occurrence).
This happens often in Java source code. The initial element for
the LCS might be the empty line ("\n"), and the inserted but common
element might be "\t/**\n", with the LCS being a large span of
lines that contains unique method declarations. Even though "/**"
occurs less often than the empty line its not a better LCS if the
LCS we already have contains a unique element.
The logic in HistogramDiff would normally have worked fine, except I
tried to optimize scanning of B by making tryLongestCommonSequence
return the end of the region when there are matching elements
found in A. This allows us to skip over the current LCS region,
as it has already been examined, but caused us to fail to identify
an element that had a lower occurrence count within the region.
The solution used here is to trade space-for-time by keeping a
table of A positions to their occurrence counts. This allows the
matching logic to always use the smallest count for this region,
even if the smallest count doesn't appear on the initial element.
The new unit test testEdit_LcsContainsUnique() verifies this new
behavior works as expected.
Bug: 328895
Change-Id: Id170783b891f645b6a8cf6f133c6682b8de40aaf
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Diffstat (limited to 'org.eclipse.jgit.test/tst/org/eclipse/jgit/diff')
-rw-r--r-- | org.eclipse.jgit.test/tst/org/eclipse/jgit/diff/HistogramDiffTest.java | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/org.eclipse.jgit.test/tst/org/eclipse/jgit/diff/HistogramDiffTest.java b/org.eclipse.jgit.test/tst/org/eclipse/jgit/diff/HistogramDiffTest.java index ba0d7d8e35..f9353af174 100644 --- a/org.eclipse.jgit.test/tst/org/eclipse/jgit/diff/HistogramDiffTest.java +++ b/org.eclipse.jgit.test/tst/org/eclipse/jgit/diff/HistogramDiffTest.java @@ -74,6 +74,14 @@ public class HistogramDiffTest extends AbstractDiffTestCase { assertEquals(new Edit(3, 3, 2, 5), r.get(1)); // INSERT "SRR" } + public void testEdit_LcsContainsUnique() { + EditList r = diff(t("nqnjrnjsnm"), t("AnqnjrnjsnjTnmZ")); + assertEquals(new Edit(0, 0, 0, 1), r.get(0)); // INSERT "A"; + assertEquals(new Edit(9, 9, 10, 13), r.get(1)); // INSERT "jTn"; + assertEquals(new Edit(10, 10, 14, 15), r.get(2)); // INSERT "Z"; + assertEquals(3, r.size()); + } + public void testExceedsChainLength_DuringScanOfA() { HistogramDiff hd = new HistogramDiff(); hd.setFallbackAlgorithm(null); |