]> source.dussan.org Git - jgit.git/log
jgit.git
13 years agoUse readFully() instead of read() 20/1820/1
Robin Stocker [Fri, 29 Oct 2010 12:52:52 +0000 (14:52 +0200)]
Use readFully() instead of read()

Fixes the "Method ignores results of InputStream.read()" warning.

This is the only place where read() was used instead of readFully()
and the return value was not checked. So it was either an oversight
or should be documented. This change assumes it was an oversight.

Change-Id: I859404a7d80449c538a552427787f3e57d7c92b4

13 years agoMerge "Fix Severe Bug in Merge Algorithm"
Shawn Pearce [Thu, 28 Oct 2010 19:54:36 +0000 (15:54 -0400)]
Merge "Fix Severe Bug in Merge Algorithm"

13 years agoFix Severe Bug in Merge Algorithm 09/1809/3
Christian Halstrick [Wed, 27 Oct 2010 20:12:46 +0000 (22:12 +0200)]
Fix Severe Bug in Merge Algorithm

As described in Bug 328551 there was a bug that the merge algorithm
was not always reporting conflicts when the same line was deleted
and modified. This problem was introduced during commit
0c017188b4d41cc80c297e35097095026734b3d4 when reported conflicts have
been checked for common pre- and suffixes.

This was fixed here by better determining whether after stripping
off common prefixes and suffixes from a conflicting region there
is still some conflicting part left.
I also added a unit test to test this situation.

Bug: 328551
Change-Id: Iec6c9055d00e5049938484a27ab98dda2577afc4
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
13 years agoPullCommand: support upstream configuration for local branches 47/1747/4
Mathias Kinzler [Tue, 19 Oct 2010 06:54:28 +0000 (08:54 +0200)]
PullCommand: support upstream configuration for local branches

When creating a local branch based on another local branch, the
upstream configuration contains "." as origin and the source branch
as "merge". The PullCommand should support this by skipping the
fetch step altogether and use the base branch to merge with.

Change-Id: I260a1771aeeffca5b0161d1494fd63c672ecc2a6
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
13 years agoMerge "Fix FQCN of moved classes in FindBugsExcludeFilter.xml"
Shawn Pearce [Thu, 28 Oct 2010 15:47:44 +0000 (11:47 -0400)]
Merge "Fix FQCN of moved classes in FindBugsExcludeFilter.xml"

13 years agoMerge "Make AbbreviatedObjectId serializable"
Shawn Pearce [Thu, 28 Oct 2010 15:47:06 +0000 (11:47 -0400)]
Merge "Make AbbreviatedObjectId serializable"

13 years agoMerge "Fix oddness check in MyersDiff for negative numbers"
Shawn Pearce [Thu, 28 Oct 2010 15:46:41 +0000 (11:46 -0400)]
Merge "Fix oddness check in MyersDiff for negative numbers"

13 years agoMake AbbreviatedObjectId serializable 12/1812/1
Robin Stocker [Thu, 28 Oct 2010 15:40:15 +0000 (17:40 +0200)]
Make AbbreviatedObjectId serializable

AmbiguousObjectException contains an AbbreviatedObjectId and is
supposed to be serializable, so it should be serializable as well.

Change-Id: I8056e78aee20fdd3cb9600b52cd8ed988544293d

13 years agoFix FQCN of moved classes in FindBugsExcludeFilter.xml 11/1811/1
Robin Stocker [Thu, 28 Oct 2010 15:38:16 +0000 (17:38 +0200)]
Fix FQCN of moved classes in FindBugsExcludeFilter.xml

FindBugs would generate warnings for these even though they should
be ignored.

Change-Id: Ieccadbf11fd55853541c04857d8e79a4db014cb4

13 years agoFix oddness check in MyersDiff for negative numbers 13/1813/1
Robin Stocker [Thu, 28 Oct 2010 15:37:21 +0000 (17:37 +0200)]
Fix oddness check in MyersDiff for negative numbers

It's probably not possible that these numbers are negative in the
algorithm, but it's cleaner this way and gets rid of three more
FindBugs warnings.

Change-Id: Ifbce4e2c787fb9a7cd309c605e8d86211ef8a352

13 years agoFix FindBugs and Eclipse warnings in org.eclipse.jgit.ui 14/1814/1
Robin Stocker [Thu, 28 Oct 2010 14:37:49 +0000 (16:37 +0200)]
Fix FindBugs and Eclipse warnings in org.eclipse.jgit.ui

Change-Id: Ie6b3ff7d470cc9b7044fd6288cbf86dcc58220eb

13 years agoMerge "Call ProgressMonitor.update() from main thread"
Shawn O. Pearce [Wed, 27 Oct 2010 15:37:55 +0000 (11:37 -0400)]
Merge "Call ProgressMonitor.update() from main thread"

13 years agoCall ProgressMonitor.update() from main thread 07/1707/5
Shawn O. Pearce [Thu, 14 Oct 2010 20:06:54 +0000 (13:06 -0700)]
Call ProgressMonitor.update() from main thread

Don't permit transient worker threads to access the underlying output
stream of a ProgressMonitor, as they might get marked as the stream's
writer thread.  Instead proxy update events from the workers back onto
the application's real work thread.  This ensures that the stream only
sees a single thread, and its the thread that will remain alive for
the entire life cycle of the operation.

This fixes IOException("Write end dead") during local repository fetch
when threaded delta search is enabled.  One of the transient delta
search threads became the designated writer for the pipe, and when it
terminated the reader end thought the writer was dead, even though the
main writer thread was still executing in PackWriter.

Bug: 326557
Change-Id: I01d1b20a3d7be1c0b480c7fb5c9773c161fe5c15
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPrevent endless loop of events fired by RefsDirectory 04/1804/1
Christian Halstrick [Wed, 27 Oct 2010 08:37:05 +0000 (10:37 +0200)]
Prevent endless loop of events fired by RefsDirectory

RefsDirectory fires a RefsChangedEvent when it detect that one
ref changed (new, modified, deleted). But there was a potential
of wrong events beeing fired leading to a endless loop in EGit.
Problem is that when calling getRefs(ALL) we don't want to report
additional refs and by that we remove the additional refs from
the list of "refs reported upwards last time". We fire an
RefsChangedEvent because we think that the special refs are not
there anymore.
I fixed this by removing eventing for the additional refs. Another
alternative would be to always scan also for additional refs and
put them in the list of refs. But getRefs(ALL) would then remove
the additional refs again. I didn't do that for performance reasons
and also because I am not sure whether we want evnting for
additional refs.

Change-Id: Icb9398b55a8c6bbf03e38f6670feb67754ce91e0
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
13 years agoOptimize DirCacheCheckout 82/1782/4
Lluis Sanchez [Tue, 26 Oct 2010 09:18:18 +0000 (11:18 +0200)]
Optimize DirCacheCheckout

When checking out a tree, files that are identical to the file in
the current index and working directory don't need to be updated.

Change-Id: I9e025a53facd42410796eae821baaeff684a25c5
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years agoAdd option to select diff algorithm for diff command 03/1803/1
Christian Halstrick [Tue, 26 Oct 2010 16:04:35 +0000 (18:04 +0200)]
Add option to select diff algorithm for diff command

The diff command in the pgm package was enhanced to allow
choosing the diff algorithm (currently myers or histogram)

Change-Id: I72083e78fb5c92868eb5d8ec512277d212a39349
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
13 years agoMerge "[findbugs] Respect exclude filter in maven build"
Shawn Pearce [Mon, 25 Oct 2010 15:06:15 +0000 (11:06 -0400)]
Merge "[findbugs] Respect exclude filter in maven build"

13 years agoMerge "Allow setting a filter in IndexDiff"
Christian Halstrick [Mon, 25 Oct 2010 12:37:59 +0000 (08:37 -0400)]
Merge "Allow setting a filter in IndexDiff"

13 years agoAllow setting a filter in IndexDiff 79/1779/3
Jens Baumgart [Mon, 25 Oct 2010 11:00:13 +0000 (13:00 +0200)]
Allow setting a filter in IndexDiff

IndexDiff now allows to set an additional filter. This can be used
e.g. for restricting the tree walk to a given set of files.

Change-Id: I642de17e74b997fa0c5878c90631f6640ed70bdd
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
13 years agoAdd support for special symref FETCH_HEAD and MERGE_HEAD 72/1772/3
Christian Halstrick [Wed, 20 Oct 2010 15:39:19 +0000 (17:39 +0200)]
Add support for special symref FETCH_HEAD and MERGE_HEAD

The RefDirectory class was not returning FETCH_HEAD and
MERGE_HEAD when trying to get all refs via getRefs(RefDatabase.ALL).
This fix adds constants for FETCH_HEAD and ORIG_HEAD and adds a
new getter getAdditionalRefs() to get these additional refs.
To be compatible with c git the getRefs(ALL) method will not return
FETCH_HEAD, MERGE_HEAD and ORIG_HEAD.

Change-Id: Ie114ca92e9d5e7d61d892f4413ade65acdc08c32
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
13 years ago[findbugs] Fix illegal format specifier 85/1785/2
Matthias Sohn [Sat, 23 Oct 2010 21:56:06 +0000 (23:56 +0200)]
[findbugs] Fix illegal format specifier

For integral arguments the precision is not applicable, would cause a
runtime exception when executed, see
http://download.oracle.com/javase/1.5.0/docs/api/java/util/Formatter.html#syntax

Change-Id: I4738c64c1153a8d4ef5430e15d0fe54f0a37949f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years ago[findbugs] Respect exclude filter in maven build 86/1786/1
Matthias Sohn [Sat, 23 Oct 2010 21:29:25 +0000 (23:29 +0200)]
[findbugs] Respect exclude filter in maven build

Change-Id: Ic29310dc14f120ebdb49d33cbf4bd6d380ae1393
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years ago[findbugs] Static comparator made final 84/1784/1
Matthias Sohn [Sat, 23 Oct 2010 20:25:29 +0000 (22:25 +0200)]
[findbugs] Static comparator made final

Fixing FindBugs warning MS_SHOULD_BE_FINAL.

Change-Id: Ic69e6f6425e0a8950ce809eb3894f48a33e860aa
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoAdd FindBugs and CPD to the build. 70/1770/5
Chris Aniszczyk [Thu, 14 Oct 2010 16:26:46 +0000 (09:26 -0700)]
Add FindBugs and CPD to the build.

We need to use findbugs-maven-plugin:2.3.2-SNAPSHOT
since otherwise build fails with maven-3.0 [1], [2].
We should switch to the release version as soon
as this becomes available.

[1] http://www.sonatype.com/people/2010/10/maven-3-0-has-landed/
[2] http://jira.codehaus.org/browse/MFINDBUGS-122

Bug: 327799
Change-Id: I1c57f81cf6f0450e56411881488c4ee754e458e3
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoMake ObjectDirectory getPacks() work the first time 58/1758/3
Shawn O. Pearce [Tue, 19 Oct 2010 22:51:44 +0000 (00:51 +0200)]
Make ObjectDirectory getPacks() work the first time

If an object hasn't been accessed yet the pack list for a repository
may not have been scanned from disk.  If an application (e.g. the dumb
transport servlet support code) asks for the pack list for an
ObjectDirectory, we should load it immediately.

Change-Id: I93d7b1bca422d905948e8e83b2afa83c8894a68b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoUpdate CachedObjectDirectory when inserting objects 60/1760/1
Shawn O. Pearce [Mon, 18 Oct 2010 06:10:47 +0000 (23:10 -0700)]
Update CachedObjectDirectory when inserting objects

If an ObjectInserter is created from a CachedObjectDirectory, we need
to ensure the cache is updated whenever a new loose object is actually
added to the loose objects directory, otherwise a future read from an
ObjectReader on the CachedObjectDirectory might not be able to open
the newly created object.

We mostly had the infrastructure in place to implement this due to the
injection of unpacked large deltas, but we didn't have a way to pass
the ObjectId from ObjectDirectoryInserter to CachedObjectDirectory,
because the inserter was using the underlying ObjectDirectory and not
the CachedObjectDirectory.  Redirecting to CachedObjectDirectory
ensures the cache is updated.

Change-Id: I1f7bdfacc7ad77ebdb885f655e549cc570652225
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoIndexPack: Make translated progress messages non-static 59/1759/1
Shawn O. Pearce [Mon, 18 Oct 2010 05:21:38 +0000 (22:21 -0700)]
IndexPack: Make translated progress messages non-static

These messages may need to change depending on the current
thread's configured locale, and thus cannot be static.

Change-Id: I96751a63852ec9c4bf6c47edadcf8752700543df
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoMerge "Allow pgm Main to be extended"
Matthias Sohn [Sun, 17 Oct 2010 22:21:49 +0000 (18:21 -0400)]
Merge "Allow pgm Main to be extended"

14 years agoFix possible NPE in DirCacheCheckout 55/1755/2
Christian Halstrick [Fri, 15 Oct 2010 12:51:52 +0000 (14:51 +0200)]
Fix possible NPE in DirCacheCheckout

There was a chance that we hit a NPE which doing a checkout
with DirCacheCheckout when there is no HEAD (e.g. initial
checkout). This is fixed here.

Change-Id: Ie3b8cae21dcd90ba8352823ea94a700f8ee9221a
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoAllow pgm Main to be extended 57/1757/2
Shawn O. Pearce [Sun, 17 Oct 2010 04:33:06 +0000 (21:33 -0700)]
Allow pgm Main to be extended

3rd party packages that use repository types other than FileRepository
may wish to extend our pgm package and implement their own resolution
scheme for repository "names" that are passed in by the --git-dir
command line option.  Make that possible by allowing the package to
extend the Main class and override the lookup.

This is primarily useful when developing new storage implementations
and trying to experiment with the results, without linking all of it
into the core JGit package.

Change-Id: Id30e168da16341e5da43365688a63aa30c7b7e2c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoAdd Cherry-Pick command 34/1734/3
Christian Halstrick [Fri, 8 Oct 2010 13:39:04 +0000 (15:39 +0200)]
Add Cherry-Pick command

Implemented the initial version of a cherry-pick command.
A correct error handling is missing (what happens if the
checkout fails, the cherry-pick leads to conflicts etc).
But straightforward cherry-picks works.

Change-Id: I235c0eb3a7a2d5bdfe40400f1deed06f29d746e1
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoAdd getString utility functions to RawText 44/1744/1
Shawn O. Pearce [Thu, 14 Oct 2010 02:05:52 +0000 (19:05 -0700)]
Add getString utility functions to RawText

These routines can be useful when debugging, because we can add an
expression to the Eclipse "Expressions" panel to show the text that
appears on a line.  Gerrit Code Review also uses these in its own
subclass of RawText in order to format patch files, so pulling it up
to be part of core JGit may help other applications too.

Change-Id: I20a6b112e3403ecfc1c2715ae75dcecc1a85b167
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoRemove dead RawText(RawTextComparator) constructor 43/1743/1
Shawn O. Pearce [Thu, 14 Oct 2010 03:47:08 +0000 (20:47 -0700)]
Remove dead RawText(RawTextComparator) constructor

Since the introduction of HashedSequence we no longer need to supply
the RawTextComparator at the time of constructing a RawText.  Drop the
definition from the constructor, because it doesn't make sense as part
of our public API.

Change-Id: Iaab34611d60eee4a2036830142b089b2dae81842
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoFix RawTextComparator reduceCommonStartEnd at empty lines 42/1742/1
Shawn O. Pearce [Thu, 14 Oct 2010 03:43:11 +0000 (20:43 -0700)]
Fix RawTextComparator reduceCommonStartEnd at empty lines

When an empty line was inserted at the beginning of the common end
part of a RawText the comparator incorrectly considered it to be
common, which meant the DiffAlgorithm would later not even have it be
part of the region it examines.  This would cause JGit to skip a line
of insertion, which later confused Gerrit Code Review when it tried to
match up the pre and post RawText files for a difference that had this
type of insertion.

Define two new unit tests to check for this insertion of a blank line
condition and correct for it by removing the LF from the common region
when the condition is detected.

Change-Id: I2108570eb2929803b9a56f9fb9c400c758e7156b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoCorrect spelling of tests in HistogramDiffTest 41/1741/1
Shawn O. Pearce [Thu, 14 Oct 2010 03:41:17 +0000 (20:41 -0700)]
Correct spelling of tests in HistogramDiffTest

Change-Id: I003b601f384ff1213da6750dd13846367a511d0b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoFix three-word command names 22/1722/2
Shawn O. Pearce [Mon, 11 Oct 2010 00:49:26 +0000 (17:49 -0700)]
Fix three-word command names

Command names like MakeCacheTree weren't coming up with hyphens between
every word, so they read "debug-make-cachetree" rather than the
expected "debug-make-cache-tree".  On each lowercase character reset
the lastWasDash flag so the next uppercase will insert a hyphen before
the next word.

Change-Id: I539fabb339e60896165619c307dec71e3317b0d8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoGenerate correct version for jgit source bundle 39/1739/2
Matthias Sohn [Wed, 13 Oct 2010 13:26:33 +0000 (15:26 +0200)]
Generate correct version for jgit source bundle

The maven 2 build for jgit source bundle didn't create a correct
OSGi version string, instead of
    org.eclipse.jgit.source_0.10.0.<timestamp>
the generated OSGi version was
    org.eclipse.jgit.source_0.10.0.SNAPSHOT
This caused trouble when trying to install it from p2 repository.

Bug: 327616
Change-Id: Ic27c763ae9a4bcbb5bd6ed9562cd98bb4da3386b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoRename method to ResolveMerger.setWorkingTreeIterator() 38/1738/1
Christian Halstrick [Wed, 13 Oct 2010 09:11:12 +0000 (11:11 +0200)]
Rename method to ResolveMerger.setWorkingTreeIterator()

renamed an ugly methodname

Change-Id: I26bda06ef64b8644fd3a555dc55dff43cdb56a71
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoDeleteBranchCommand does not clean up upstream configuration 27/1727/2
Mathias Kinzler [Mon, 11 Oct 2010 13:33:32 +0000 (15:33 +0200)]
DeleteBranchCommand does not clean up upstream configuration

It wrongly uses the full name of the branch to remove the
configuration entries but must use the shortened one.

Change-Id: Ie386a128a6c6beccc20bafd15c2e36254c5f560d
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
14 years agoMerge "Update Tag to use TagCommand API"
Shawn Pearce [Tue, 12 Oct 2010 19:05:55 +0000 (15:05 -0400)]
Merge "Update Tag to use TagCommand API"

14 years agoFix NPE when calling CreateBranch without explict startpoint 33/1733/1
Christian Halstrick [Tue, 12 Oct 2010 11:36:02 +0000 (13:36 +0200)]
Fix NPE when calling CreateBranch without explict startpoint

When creating a branch with CreateBranchCommand.call() without
specifying an explicit startPoint HEAD should be used as startPoint.
There was a bug leading to an NPE in such a case.

Change-Id: Ic0a5dc1f33a0987d66c09996c8012c45785500ff
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoRemove wrong comment in MergeCommand 32/1732/1
Christian Halstrick [Thu, 7 Oct 2010 13:58:34 +0000 (15:58 +0200)]
Remove wrong comment in MergeCommand

There was a wrong javadoc comment telling that MergeCommand
only supports fast-forward merges. This has been fixed.

Change-Id: I7edea779a83528beee34a1753026288c384881ce
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoRemove AmbiguousObjectException from BranchCreateCommand.call() 31/1731/1
Christian Halstrick [Mon, 11 Oct 2010 14:53:18 +0000 (16:53 +0200)]
Remove AmbiguousObjectException from BranchCreateCommand.call()

We wanted to wrap all LowLevel JGit excpetions into a
JGitInternalException so that users of this high-level interface
don't have to explicitly catch all of them. This
was forgotten on BranchCreateCommand.call() and I added
it.

Change-Id: Ie140e99574fb004137c66e80fb92eb6c6d0fa5e1
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoDelete PatienceDiff 63/1663/5
Shawn O. Pearce [Sat, 25 Sep 2010 02:03:10 +0000 (19:03 -0700)]
Delete PatienceDiff

HistogramDiff outperforms it for any case where PatienceDiff needs to
fallback to another algorithm.  Consequently it's not worth keeping
around, because we would always want a fallback enabled.

Change-Id: I39b99cb1db4b3be74a764dd3d68cd4c9ecd91481
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoUse HistogramDiff by default in DiffFormatter 45/1645/8
Shawn O. Pearce [Tue, 21 Sep 2010 22:26:03 +0000 (15:26 -0700)]
Use HistogramDiff by default in DiffFormatter

Its behavior is similar to PatienceDiff, and runs nearly as fast,
often beating the performance of MyersDiff.

Change-Id: I43c3faefa8109f1a68ef57522bec9cf27b5df252
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agodebug-diff-algorithms: Real world performance test implementations 21/1721/3
Shawn O. Pearce [Sun, 10 Oct 2010 20:36:30 +0000 (13:36 -0700)]
debug-diff-algorithms: Real world performance test implementations

When working on a difference algorithm's implementation, its generally
more important to care about how it behaves on real-world inputs than
it does on fake inputs created for unit test cases.  Run each
implementation against a number of real-world repositories, looking at
changes between files in each commit.  This gives a better picture of
how a particular algorithm performs.

This test suite run against JGit and linux-2.6 with the current
available algorithms shows HistogramDiff always out-performs
MyersDiff, and by a wide margin on the linux-2.6 sources.  As
HistogramDiff has similar output properties as PatienceDiff, the
resulting edits are probably also more human-readable.  These test
results show that HistogramDiff is a good choice for the default
implementation, and also show that PatienceDiff isn't worth keeping.

  jgit: start at baa83ae
            2686 files,          760 commits
    N=         3 min lines,     3016 max lines
  Algorithm                     Time(ns) (  Time(ns) on   Time(ns) on )
                                         (          N=3        N=3016 )
  ---------------------------------------------------------------------
  histogram_myers              314652100 (         3900        298100 )
  histogram                    315973000 (         3800        302100 )
  patience                     774724900 (         4500        347900 )
  patience_histogram_myers     786332800 (         3700        351200 )
  myers                        819359300 (         4100        379100 )
  patience_myers               843416700 (         3800        348000 )

  linux-2.6.git: start at 85a3318
            4001 files,         2680 commits
    N=         2 min lines,    39098 max lines
  Algorithm                     Time(ns) (  Time(ns) on   Time(ns) on )
                                         (          N=2       N=39098 )
  ---------------------------------------------------------------------
  histogram_myers             1229870000 (         5900       2642700 )
  histogram                   1235654100 (         6000       2695400 )
  patience                    3856546000 (         5900       2627700 )
  patience_histogram_myers    3866728100 (         7000       2624000 )
  patience_myers              4004875300 (         8000       2651700 )
  myers                       9794679000 (         7200       2716200 )

Change-Id: I2502684d31f7851e720356820d04d8cf767f7229
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoMerge "Define LowLevelDiffAlgorithm to bypass re-hashing"
Chris Aniszczyk [Mon, 11 Oct 2010 21:18:05 +0000 (17:18 -0400)]
Merge "Define LowLevelDiffAlgorithm to bypass re-hashing"

14 years agoMerge changes I50dcec81,Ieab28bb3
Chris Aniszczyk [Mon, 11 Oct 2010 19:00:51 +0000 (15:00 -0400)]
Merge changes I50dcec81,Ieab28bb3

* changes:
  Fix empty block corner case in PatienceDiff
  Fix infinite loop in PatienceDiff

14 years agoFix corrupted large deltas 23/1723/1
Shawn O. Pearce [Mon, 11 Oct 2010 01:48:15 +0000 (18:48 -0700)]
Fix corrupted large deltas

Large objects stored as deltas get unpacked by JGit into a loose
object, so they are cheaper to access later on.  This unpacking was
broken because TeeInputStream copied the wrong length into the loose
object, sometimes copying too many bytes into the result.  This
created a loose object that did not have the correct content, and
whose length did not match the length denoted in the object header.

Change-Id: I3ce1fd9f3dc5bd195249c7872b3bec49570424a2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoDefine LowLevelDiffAlgorithm to bypass re-hashing 62/1662/3
Shawn O. Pearce [Sat, 25 Sep 2010 00:22:42 +0000 (17:22 -0700)]
Define LowLevelDiffAlgorithm to bypass re-hashing

When passing to a fallback algorithm, we can avoid creating a new copy
of the hash codes for each sequence by passing in the hashed sequences
directly.  This makes it cheaper to switch from HistogramDiff down to
MyersDiff in a single pass.

Change-Id: Ibf2e81be57c083862eeb134279aed676653bf9b5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoFix empty block corner case in PatienceDiff 20/1720/1
Shawn O. Pearce [Sun, 10 Oct 2010 20:36:39 +0000 (13:36 -0700)]
Fix empty block corner case in PatienceDiff

There is a corner case where we get an EMPTY region during recursion,
but we didn't expect to receive that.  Its harmless to ignore the
region since the region is empty and has no content, so do so rather
than throwing an exception

Change-Id: I50dcec81ecba763072bb739adfab5879fb48b23a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoFix infinite loop in PatienceDiff 19/1719/1
Shawn O. Pearce [Sun, 10 Oct 2010 20:08:50 +0000 (13:08 -0700)]
Fix infinite loop in PatienceDiff

Certain inputs caused an infinite loop because the prior match data
couldn't be used as expected.  Rather than incrementing the match
pointer before looking at an element, do it after, so the loop breaks
when we wrap around to the starting point.

Change-Id: Ieab28bb3485a914eeddc68aa38c256f255dd778c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoUpdate Push to use latest API 18/1718/1
Chris Aniszczyk [Sun, 10 Oct 2010 20:55:03 +0000 (15:55 -0500)]
Update Push to use latest API

Change-Id: I57ea8634a46472f40046f4ec69de505abbf5f6cf
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoMerge "Cleanup RefUpdateTest"
Chris Aniszczyk [Sun, 10 Oct 2010 20:41:08 +0000 (16:41 -0400)]
Merge "Cleanup RefUpdateTest"

14 years agoAdd "Branch" command 04/1704/6
Mathias Kinzler [Fri, 8 Oct 2010 12:01:55 +0000 (14:01 +0200)]
Add "Branch" command

The need for branching becomes more pressing with pull
support: we need to make sure the upstream configuration entries
are written correctly when creating and renaming branches
(and of course are cleaned up when deleting them).
This adds support for listing, adding, deleting and renaming
branches including the more common options.

Bug: 326938
Change-Id: I00bcc19476e835d6fd78fd188acde64946c1505c
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoCleanup RefUpdateTest 17/1717/1
Shawn O. Pearce [Sat, 9 Oct 2010 05:41:18 +0000 (22:41 -0700)]
Cleanup RefUpdateTest

Application code, including unit tests for storage implementations,
should not extend RevCommit outside of the scope of using it for a
RevWalk.  Its a lot of overhead and unlikely to work long-term.

Instead for the one test that matters, use a custom subclass of the
ObjectId type.  This lets us measure exactly what we are looking for,
which is that the subclass isn't retained.

A lot of other tests were unnecessarily wrapping an object with a
RevCommit and storing that back into the RefUpdate.  This is just
retesting what the earlier no-cache test was doing, and complicated
the test considerably.  Drop that code and just rely on the value that
was configured by the helper method.

Change-Id: I5b31813484eaa306e9bc4de9622dd5bd4846b16d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoAdd support for single-slash URI 03/1703/4
Christian Halstrick [Wed, 6 Oct 2010 15:35:06 +0000 (17:35 +0200)]
Add support for single-slash URI

In bug 323571 it is mentioned that if you call
'toURI().toURL().toString()' on a java.io.File you cannot pass
that string to jgit as an URIish. Problem is that the passed
URI looks like 'file:/C:/a/b.txt' and that we where expecting
double slashes after scheme':'. This fix adds support for this
single-slash file URLs.

Bug: 323571
Change-Id: I866a76a4fcd0c3b58e0d26a104fc4564e7ba5999
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoMerge "Update build to use tycho 0.10.0"
Chris Aniszczyk [Fri, 8 Oct 2010 13:59:31 +0000 (09:59 -0400)]
Merge "Update build to use tycho 0.10.0"

14 years agoAdd "Pull" command 96/1696/4
Mathias Kinzler [Thu, 7 Oct 2010 09:37:16 +0000 (11:37 +0200)]
Add "Pull" command

This is the minimal implementation of a "Pull" command. It does not
have any parameters besides the generic progress monitor and timeout.
It works on the currently checked-out branch and assumes that the
configuration contains the keys "branch.<branch name>.remote" and
"branch.<branch name>.merge" to determine the remote configuration
for the fetch and the remote branch name for the merge.

Bug: 303404
Change-Id: I7fe09029996d0cfc09a7d8f097b5d6af1488fa93
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoRefactored URI parsing to detect wrong URIs 90/1690/5
Christian Halstrick [Wed, 6 Oct 2010 15:29:25 +0000 (17:29 +0200)]
Refactored URI parsing to detect wrong URIs

There where quite some bugs regarding wrong URI parsing. In order
to solve them the parsing has to be refactored. We now have
specialized regexps for 'scheme://host/...', scp URIs and local
file names. Now we can detect problems while parsing 'git://host:/abc' which
was previously not possible.

Bug: 315571
Bug: 292897
Bug: 307017
Bug: 323571
Bug: 317388
Change-Id: If72576576ebb6b9d9dc8b7e51ddd87c9909e8b62
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoFixed URI regexp regarding user/password part 02/1702/4
Christian Halstrick [Wed, 6 Oct 2010 13:43:42 +0000 (15:43 +0200)]
Fixed URI regexp regarding user/password part

The regular expression which should handle the
user/password part in an URI was potentially
processing too many chars. This led to problems
when user/pwd and port was specified

Change-Id: I87db02494c4b367283e1d00437b1c06d2c8fdd28
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoUpdate build to use tycho 0.10.0 12/1712/1
Matthias Sohn [Fri, 8 Oct 2010 09:00:51 +0000 (11:00 +0200)]
Update build to use tycho 0.10.0

Change-Id: Ib3328379841fa79641fe1cd70cd87ee057eefb1a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoFix URIish tests to contain a hostname for git protocol 01/1701/3
Christian Halstrick [Wed, 6 Oct 2010 13:21:01 +0000 (15:21 +0200)]
Fix URIish tests to contain a hostname for git protocol

URIs for the git protocol have to have a hostname.
(see http://www.kernel.org/pub/software/scm/git/docs
/git-clone.html#_git_urls_a_id_urls_a) Some tests tested
URIs like git:/abc.git which is not allowed. Fixed this.

Change-Id: Ia3b8b681ad6592f03b090a874a6e91068a8301fe
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoIntroduce commented constants for the segments of an URI regex 00/1700/3
Christian Halstrick [Wed, 6 Oct 2010 12:02:05 +0000 (14:02 +0200)]
Introduce commented constants for the segments of an URI regex

The regular expressions used to parse URI's are constructed by
concatenating different segments to a big String. Introduce
String constants for these segements and document them.

Change-Id: If8b9dbaaf57ca333ac0b6c9610c3d3a515c540f9
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoExternalize strings in TransportHttp 10/1710/1
Matthias Sohn [Thu, 7 Oct 2010 22:03:17 +0000 (01:03 +0300)]
Externalize strings in TransportHttp

Some strings were not externalized. Also use them in HTTP tests to
ensure that they will also succeed when message bundles are
translated.

Change-Id: Id02717176557e7d57e676e1339cd89f2be88d330
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoFix HTTP tests 09/1709/1
Matthias Sohn [Thu, 7 Oct 2010 20:40:40 +0000 (23:40 +0300)]
Fix HTTP tests

Since 858b2c92 we have a HTTP authentication implementation hence
we now get different exception messages when required authentication
headers are not available. This broke the HTTP tests.

Change-Id: Ie08c1ec37e497c2a6f70a75f7c59f0805812a5cc
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoMerge "Support HTTP basic and digest authentication"
Chris Aniszczyk [Thu, 7 Oct 2010 16:25:15 +0000 (12:25 -0400)]
Merge "Support HTTP basic and digest authentication"

14 years agoSplit URI regex strings differently 99/1699/2
Christian Halstrick [Wed, 6 Oct 2010 10:15:30 +0000 (12:15 +0200)]
Split URI regex strings differently

The strings used to construct the regex to parse
URIs are split differently. This makes it easier
to introduce meaningful String constants later on.

Change-Id: I9355fd42e57e0983204465c5d6fe5b6b93655074
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
14 years agoAdd pull operation related constants 06/1706/1
Chris Aniszczyk [Wed, 6 Oct 2010 16:46:13 +0000 (11:46 -0500)]
Add pull operation related constants

Change-Id: Idb7526800e80e17624ec05fb10bbc19e7f744f49
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoAdd PushCommand API 72/1672/3
Chris Aniszczyk [Tue, 28 Sep 2010 14:54:44 +0000 (09:54 -0500)]
Add PushCommand API

Change-Id: Iff144a51fdc9a1112a21492c390a873a2b293bc9
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoIncrease core.streamFileThreshold default to 50 MiB 10/1610/2
Shawn O. Pearce [Fri, 17 Sep 2010 00:02:27 +0000 (17:02 -0700)]
Increase core.streamFileThreshold default to 50 MiB

Projects like org.eclipse.mdt contain large XML files about 6 MiB
in size.  So does the Android project platform/frameworks/base.
Doing a clone of either project with JGit takes forever to checkout
the files into the working directory, because delta decompression
tends to be very expensive as we need to constantly reposition the
base stream for each copy instruction.  This can be made worse by
a very bad ordering of offsets, possibly due to an XML editor that
doesn't preserve the order of elements in the file very well.

Increasing the threshold to the same limit PackWriter uses when
doing delta compression (50 MiB) permits a default configured
JGit to decompress these XML file objects using the faster
random-access arrays, rather than re-seeking through an inflate
stream, significantly reducing checkout time after a clone.

Since this new limit may be dangerously close to the JVM maximum
heap size, every allocation attempt is now wrapped in a try/catch
so that JGit can degrade by switching to the large object stream
mode when the allocation is refused.  It will run slower, but the
operation will still complete.

The large stream mode will run very well for big objects that aren't
delta compressed, and is acceptable for delta compressed objects that
are using only forward referencing copy instructions.  Copies using
prior offsets are still going to be horrible, and there is nothing
we can do about it except increase core.streamFileThreshold.

We might in the future want to consider changing the way the delta
generators work in JGit and native C Git to avoid prior offsets once
an object reaches a certain size, even if that causes the delta
instruction stream to be slightly larger.  Unfortunately native
C Git won't want to do that until its also able to stream objects
rather than malloc them as contiguous blocks.

Change-Id: Ief7a3896afce15073e80d3691bed90c6a3897307
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoMerge "Update Fetch to use FetchCommand API"
Chris Aniszczyk [Fri, 1 Oct 2010 16:02:25 +0000 (12:02 -0400)]
Merge "Update Fetch to use FetchCommand API"

14 years agoMerge "Add reflog message to TagCommand"
Chris Aniszczyk [Wed, 29 Sep 2010 14:09:43 +0000 (10:09 -0400)]
Merge "Add reflog message to TagCommand"

14 years agoComment the use of System.gc in LocalDiskRepositoryTestCase 66/1666/2
Robin Rosenberg [Thu, 9 Sep 2010 22:42:07 +0000 (00:42 +0200)]
Comment the use of System.gc in LocalDiskRepositoryTestCase

Change-Id: Ic5e9bda4275006ef3bf6ea6255ddf1c0eecc3770
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
14 years agoShut up findbugs/protect the shutdownHook in LocalDiskRepositoryTestcase 65/1665/2
Robin Rosenberg [Thu, 9 Sep 2010 22:38:58 +0000 (00:38 +0200)]
Shut up findbugs/protect the shutdownHook in LocalDiskRepositoryTestcase

Singleton references should be protected from multiple threads. As far as we
know this cannot happen as JUnit is used today since we currently don't run
tests in parallel, but now this code will not prevent anyone.

Change-Id: I29109344d2e8025fa2a3ccaf7c2c16469544ce05
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
14 years agoSupport HTTP basic and digest authentication 70/1670/1
Shawn O. Pearce [Sun, 21 Mar 2010 04:04:33 +0000 (21:04 -0700)]
Support HTTP basic and digest authentication

Natively support the HTTP basic and digest authentication methods
by setting the Authorization header without going through the JREs
java.net.Authenticator API.  The Authenticator API is difficult to
work with in a multi-threaded server environment, where its using
a singleton for the entire JVM.  Instead compute the Authorization
header from the URIish user and pass, if available.

Change-Id: Ibf83fea57cfb17964020d6aeb3363982be944f87
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
14 years agoMerge "Use only a single instance for NLS translation bundles"
Chris Aniszczyk [Mon, 27 Sep 2010 21:59:36 +0000 (17:59 -0400)]
Merge "Use only a single instance for NLS translation  bundles"

14 years agoUpdate Fetch to use FetchCommand API 69/1669/2
Chris Aniszczyk [Fri, 24 Sep 2010 20:23:34 +0000 (15:23 -0500)]
Update Fetch to use FetchCommand API

Change-Id: I06ddc74f1ef658f4876e2bbcc3eaad3475a5371e
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoMerge "Update FetchCommand with dry run and thin options"
Chris Aniszczyk [Mon, 27 Sep 2010 14:07:49 +0000 (10:07 -0400)]
Merge "Update FetchCommand with dry run and thin options"

14 years agoReturn the documented value from DirCacheCheckout.checkout 67/1667/2
Robin Rosenberg [Thu, 9 Sep 2010 22:55:29 +0000 (00:55 +0200)]
Return the documented value from DirCacheCheckout.checkout

Change-Id: I34d773b18e6a1ee05774d7b9471f9915c48aa63e
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
14 years agoMerge "Extend merge support for bare repositories"
Christian Halstrick [Mon, 27 Sep 2010 08:46:06 +0000 (04:46 -0400)]
Merge "Extend merge support for bare repositories"

14 years agoUse only a single instance for NLS translation bundles 64/1664/1
Robin Rosenberg [Thu, 9 Sep 2010 22:11:53 +0000 (00:11 +0200)]
Use only a single instance for NLS translation  bundles

As findbugs pointed out, there was a small risk for creating multiple instances of
translation bundles. If that happens, drop the second instance.

Change-Id: I3aacda86251d511f6bbc2ed7481d561449ce3b6c
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
14 years agoImplement HistogramDiff 43/1643/5
Shawn O. Pearce [Tue, 21 Sep 2010 20:41:47 +0000 (13:41 -0700)]
Implement HistogramDiff

HistogramDiff is an alternative implementation of patience diff,
performing a search over all matching locations and picking the
longest common subsequence that has the lowest occurrence count.
If there are unique common elements, its behavior is identical to
that of patience diff.

Actual performance on real-world source files usually beats
MyersDiff, sometimes by a factor of 3, especially for complex
comparators that ignore whitespace.

Change-Id: I1806cd708087e36d144fb824a0e5ab7cdd579d73
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoReuse DiffPerformanceTest support code to validate algorithms 41/1641/6
Shawn O. Pearce [Tue, 21 Sep 2010 21:47:33 +0000 (14:47 -0700)]
Reuse DiffPerformanceTest support code to validate algorithms

Each algorithm should produce a particular number of results
given one of the standard inputs used during the performance
tests.  To help ensure those tests are accurate, assert that
the edit list length is correct.

Change-Id: I292f8fde0cec6a60a75ce09e70814a00ca47cb99
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoMicro-optimize EditList.addAll 42/1642/3
Shawn O. Pearce [Tue, 21 Sep 2010 15:23:12 +0000 (08:23 -0700)]
Micro-optimize EditList.addAll

Pass through the addAll request to our underlying ArrayList.

This way the underlying ArrayList grows no more than once during the
call, which may be important if the list was originally allocated
at the default size of 16, but 64 Edits are being added.

Change-Id: I31c3261e895766f82c3c832b251a09f6e37e8860
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoUpdate FetchCommand with dry run and thin options 61/1661/2
Chris Aniszczyk [Fri, 24 Sep 2010 20:32:36 +0000 (15:32 -0500)]
Update FetchCommand with dry run and thin options

FetchCommand was missing the ability to set dry run and thin
preferences on the transport operation.

Change-Id: I0bef388a9b8f2e3a01ecc9e7782aaed7f9ac82ce
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agodebug-text-hashfunctions: Test suite for content hashes 60/1660/1
Shawn O. Pearce [Fri, 24 Sep 2010 19:49:29 +0000 (12:49 -0700)]
debug-text-hashfunctions: Test suite for content hashes

This is the test suite I was using to help understand why we had
such a high collision rate with RawTextComparator, and to select
a replacement function.

Since its not something we will run very often, lets make it a
program in the debug package rather than a JUnit test.  This way
we can run it on demand against any corpus of files we choose,
but we aren't bottlenecking our daily builds running tests with
no assertions.

Adding a new hash function to this suite is simple, just define
a new instance member of type "Hash" with the logic applied to
the region passed in.

Change-Id: Iec0b176adb464cf95b06cda157932b79c0b59886
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoExtend merge support for bare repositories 29/1629/5
Dmitry Fink [Tue, 21 Sep 2010 16:37:01 +0000 (09:37 -0700)]
Extend merge support for bare repositories

Optional inCore parameter to Resolver/Strategy will
instruct it to perform all the operations in memory
and avoid modifying working folder even if there is one.

Change-Id: I5b873dead3682f79110f58d7806e43f50bcc5045

14 years agoReduce content hash function collisions 56/1656/1
Shawn O. Pearce [Fri, 24 Sep 2010 00:13:20 +0000 (17:13 -0700)]
Reduce content hash function collisions

The hash code returned by RawTextComparator (or that is used
by the SimilarityIndex) play an important role in the speed of
any algorithm that is based upon them.  The lower the number of
collisions produced by the hash function, the shorter the hash
chains within hash tables will be, and the less likely we are to
fall into O(N^2) runtime behaviors for algorithms like PatienceDiff.

Our prior hash function was absolutely horrid, so replace it with
the proper definition of the DJB hash that was originally published
by Professor Daniel J. Bernstein.

To support this assertion, below is a table listing the maximum
number of collisions that result when hashing the unique lines in
each source code file of 3 randomly chosen projects:

  test_jgit: 931 files; 122 avg. unique lines/file
   Algorithm    | Collisions
   -------------+-----------
   prior_hash            418
   djb                     5
   sha1                    6
   string_hash31          11

  test_linux26: 30198 files; 258 avg. unique lines/file
   Algorithm    | Collisions
   -------------+-----------
   prior_hash           8675
   djb                    32
   sha1                    8
   string_hash31          32

  test_frameworks_base: 8381 files; 184 avg. unique lines/file
   Algorithm    | Collisions
   -------------+-----------
   prior_hash           4615
   djb                    10
   sha1                    6
   string_hash31          13

We can clearly see that prior_hash performed very poorly, resulting
in 8,675 collisions (elements in the same hash bucket) for at least
one file in the Linux kernel repository.  This leads to some very
bad O(N) style insertion and lookup performance, even though the
hash table was sized to be the next power-of-2 larger than the
total number of unique lines in the file.

The djb hash we are replacing prior_hash with performs closer to
SHA-1 in terms of having very few collisions.  This indicates it
provides a reasonably distributed output for this type of input,
despite being a much simpler algorithm (and therefore will be much
faster to execute).

The string_hash31 function is provided just to compare results with,
it is the algorithm commonly used by java.lang.String hashCode().

However, life isn't quite this simple.

djb produces a 32 bit hash code, but our hash tables are always
smaller than 2^32 buckets.  Mashing the 32 bit code into an array
index used to be done by simply taking the lower bits of the hash
code by a bitwise and operator.  This unfortuntely still produces
many collisions, e.g. 32 on the linux-2.6 repository files.

From [1] we can apply a final "cleanup" step to the hash code to
mix the bits together a little better, and give priority to the
higher order bits as they include data from more bytes of input:

  test_jgit: 931 files; 122 avg. unique lines/file
   Algorithm    | Collisions
   -------------+-----------
   prior_hash            418
   djb                     5
   djb + cleanup           6

  test_linux26: 30198 files; 258 avg. unique lines/file
   Algorithm    | Collisions
   -------------+-----------
   prior_hash           8675
   djb                    32
   djb + cleanup           7

  test_frameworks_base: 8381 files; 184 avg. unique lines/file
   Algorithm    | Collisions
   -------------+-----------
   prior_hash           4615
   djb                    10
   djb + cleanup           7

This is a massive improvement, as the number of collisions for
common inputs drops to acceptable levels, and we haven't really
made the hash functions any more complex than they were before.

[1] http://lkml.org/lkml/2009/10/27/404

Change-Id: Ia753b695de9526a157ddba265824240bd05dead1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoFix PatienceDiffTest 55/1655/1
Shawn O. Pearce [Thu, 23 Sep 2010 21:45:07 +0000 (14:45 -0700)]
Fix PatienceDiffTest

Because PatienceDiff works by looking for common unique lines within
the region, the DiffTestDataGenerator needs to be modified to produce
a unique character for each region.  If we don't give PatienceDiff
a few unique points, it will just offer back a single REPLACE edit
that covers the entire files, and this doesn't tell us very much.

Change-Id: I5129faea1e763c74739118ca20d86bd62e0deaef
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoUpdate Tag to use TagCommand API 49/1649/1
Chris Aniszczyk [Wed, 22 Sep 2010 16:02:16 +0000 (11:02 -0500)]
Update Tag to use TagCommand API

Change-Id: I4f7f8e29c47980536398d73f2a71ed2b2c00f4f2
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoAdd reflog message to TagCommand 48/1648/1
Chris Aniszczyk [Wed, 22 Sep 2010 15:47:44 +0000 (10:47 -0500)]
Add reflog message to TagCommand

Ensure we update the reflog when tagging.

Change-Id: I3f4a4d68cbfc62d2276e3a47e3e3720f02cb2522
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
14 years agoDefine an abstract DiffAlgorithm test framework 40/1640/2
Shawn O. Pearce [Tue, 21 Sep 2010 16:24:12 +0000 (09:24 -0700)]
Define an abstract DiffAlgorithm test framework

For certain tiny input sequences, every DiffAlgorithm should produce
exactly the same results, as there should be no ambiguity.  Package
these up in an abstract TestCase that algorithms can extend from in
order to perform basic validation of their implementation.

Since these tests are more complete than what we used to have for
the MyersDiff algorithm, throw away Johannes' tests and only use
this new package.

Change-Id: I9a044330887c849ad4c78aa5c7aa04c783c10252
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoPerform common start/end elimination by default for DiffAlgorithm 39/1639/2
Shawn O. Pearce [Tue, 21 Sep 2010 15:12:51 +0000 (08:12 -0700)]
Perform common start/end elimination by default for DiffAlgorithm

As it turns out, every single diff algorithm we might try to
implement can benfit from using the SequenceComparator's native
concept of the simple reduceCommonStartEnd() step.  For most inputs,
there can be a significant number of elements that can be removed
from the space the DiffAlgorithm needs to consider, which will
reduce the overall running time for the final solution.

Pool this logic inside of DiffAlgorithm itself as a default, but
permit a specific algorithm to override it when necessary.

Convert MyersDiff to use this reduction to reduce the space it
needs to search, making it perform slightly better on common inputs.

Change-Id: I14004d771117e4a4ab2a02cace8deaeda9814bc1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoRemove unnecessary hash cache from PatienceDiffIndex 38/1638/2
Shawn O. Pearce [Tue, 21 Sep 2010 20:53:56 +0000 (13:53 -0700)]
Remove unnecessary hash cache from PatienceDiffIndex

PatienceDiff always uses a HashedSequence, which promises to provide
constant time access for hash codes during the equals method and
aborts fast if the hash codes don't match.  Therefore we don't need
to cache the hash codes inside of the index, saving us memory.

Change-Id: I80bf1e95094b7670e6c0acc26546364a1012d60e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoImplement Bram Cohen's Patience Diff 99/1499/10
Shawn O. Pearce [Thu, 2 Sep 2010 21:41:15 +0000 (14:41 -0700)]
Implement Bram Cohen's Patience Diff

Change-Id: Ic7a76df2861ea6c569ab9756a62018987912bd13
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoMove cached element hash codes to HashedSequence 85/1585/2
Shawn O. Pearce [Sat, 11 Sep 2010 00:48:24 +0000 (17:48 -0700)]
Move cached element hash codes to HashedSequence

Most diff implementations really want to use cached hash codes for
elements, rather than element equality, as they need to perform many
compares and unique hash codes for elements can really speed that
process up.

To make it easier to define element hash functions, move the caching
of hash codes into a wrapper sequence type, so that individual
sequence types like RawText don't need to do this themselves.  This
has a nice property of also allowing the sequence to no longer care
about the specific SequenceComparator that is going to be used, and
permits the caching to only examine the middle region that isn't
common to the two inputs.

Change-Id: If8623556da9419117b07c5073e8bce39de02570e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoMicro-optimize reduceCommonStartEnd for RawText 86/1586/3
Shawn O. Pearce [Sat, 11 Sep 2010 05:14:57 +0000 (22:14 -0700)]
Micro-optimize reduceCommonStartEnd for RawText

This is a faster exact match based form that tries to improve
performance for the common case of the header and trailer of
a text file not changing at all. After this fast path we use
the slower path based on the super class' using equals() to
allow for whitespace ignore modes to still work.

Some simple performance testing showed a major improvement over the
older implementation for a common edit we see in JGit.  The test
compared blob 29a89bc and 372a978, which is the ObjectDirectory.java
file difference in commit 41dd9ed1c054f9f9e1ab52fc7bbf1a55a56cf543.
The two text files are approximately 22 KiB in size.

  DEFAULT        old   203900 ns
  DEFAULT        new   100400 ns

This new version is 2x faster for the DEFAULT comparator, which does
not treat space specially.  This is because we can now examine a
larger swath of text with fewer instructions per byte compared.  The
older algorithm had to stop at each line break and recompute how to
examine the next line, while the new algorithm only stops when the
first difference is found.

  WS_IGNORE_ALL  old   298500 ns
  WS_IGNORE_ALL  new    63300 ns

Its 4.7x faster for the whitespace ignore comparator, as the common
header and footer do not have a whitespace difference.  Avoiding the
special case handling for whitespace on each byte considered saves a
lot of time.

Since most edits to source code (and other text like files) appears in
the interior of the file, fast elimination of common header/footer
means faster diff throughput.  In the less common case of an actual
header or footer edit, the common header/footer elimination is stopped
rather quickly either way, so there is very little downside to the
optimiation applied here.

Change-Id: I1d501b4c3ff80ed086b20bf12faf51ae62167db7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoAdd Subsequence utility methods 35/1635/1
Shawn O. Pearce [Tue, 21 Sep 2010 01:04:00 +0000 (18:04 -0700)]
Add Subsequence utility methods

DiffAlgorithm implementations may find it useful to construct an Edit
and use that to later subsequence the two base sequences, so define
two new utility methods a() and b() to construct the A and B ranges.

Once a subsequence has had Edits created for it the indexes are
within the space of the subsequence.  These must be shifted back to
the original base sequence's indexes.  Define toBase() as a utility
method to perform that shifting work in-place, so DiffAlgorithm
implementations have an efficient way to convert back to the caller's
original space.

Change-Id: I8d788e4d158b9f466fa9cb4a40865fb806376aee
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years agoRemove duplicate resource bundle entry 28/1628/1
Matthias Sohn [Sun, 19 Sep 2010 06:36:52 +0000 (08:36 +0200)]
Remove duplicate resource bundle entry

Change-Id: Ifdf9fa5dd49bc3f4a0cc8a1ed505d77ec3fa526b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>