source.dussan.org Git - jgit.git/log
jgit.git
13 years agoPackWriter: Refactor object writing loop 21/2621/1
Shawn O. Pearce [Tue, 1 Mar 2011 00:08:05 +0000 (16:08 -0800)]
PackWriter: Refactor object writing loop

This simple refactoring makes it easier to pre-process each of the
object lists before it's handed into the actual write routine.

Change-Id: Iea95e5ecbc7374f6bcbb43d1c75285f4f564d09d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPackWriter: Don't reuse commit or tag deltas 20/2620/1
Shawn O. Pearce [Mon, 28 Feb 2011 23:39:31 +0000 (15:39 -0800)]
PackWriter: Don't reuse commit or tag deltas

JGit doesn't generate deltas for commit or tag objects when it packs
a repository from scratch.  This is an explicit design decision that
is (mostly) justified by the fact that these objects do not delta
compress well.

Annotated tags are made once on stable points of the project history;
it is unlikely they will ever appear again with sufficient common
text to justify using a delta over just deflating the raw content.
JGit never tries to delta compress annotated tags and I take the
stance that these are best stored as non-deltas given how frequently
they might be accessed by repository viewers.

Commits only have sufficient common text when they are cherry-picked
to forward-port or back-port a change from one branch to another.
Even in these cases the distance between the commits as returned
by the log traversal has to be small enough that they would both
appear in the delta search window at the same time in order to
delta compress one of the messages against the other.  JGit never
tries to delta compress commits, as it requires a lot of CPU time
but typically does not produce a smaller pack file.

Avoid reusing deltas for either of these types when constructing a
new pack.  To avoid killing performance during serving of network
clients, UploadPack disables this code change by allowing PackWriter
to reuse delta commits.  Repositories that were already repacked by
C Git will not have their delta commits decompressed and recompressed
on the fly during object writing, saving server-side CPU resources.

Change-Id: I749407e7c5c677e05e4d054b40db7656cfa7fca8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPackWriter: Do not delta compress already packed objects 19/2619/1
Shawn O. Pearce [Tue, 1 Mar 2011 17:28:11 +0000 (09:28 -0800)]
PackWriter: Do not delta compress already packed objects

This is a tiny optimization to how delta search works.  Checking for
isReuseAsIs() avoids doing delta compression search on non-delta
objects already stored in packs within the repository.  Such objects
are not likely to be delta compressible, as they were already delta
searched when their containing pack was generated and they were
not delta compressed at that time.  Doing delta compression now is
unlikely to produce a different result, but would waste a lot of CPU.

The isReuseAsIs() flag is checked before isDoNotDelta() because it
is very common to reuse objects in the output pack.  Most objects
get reused, and only a handful have the isDoNotDelta() bit set.
Moving the check earlier allows the loop to more quickly skip
through objects that will never need to be considered.
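
The ordering of the two checks can be sketched as below.  This is a
minimal illustration only: the Obj class and selectForDeltaSearch()
helper are hypothetical stand-ins, not the actual PackWriter code.

```java
import java.util.ArrayList;
import java.util.List;

class DeltaCandidateFilter {
    /** Hypothetical stand-in for an object queued for packing. */
    static final class Obj {
        final String name;
        final boolean reuseAsIs;  // already stored in an existing pack
        final boolean doNotDelta; // explicitly excluded from delta search

        Obj(String name, boolean reuseAsIs, boolean doNotDelta) {
            this.name = name;
            this.reuseAsIs = reuseAsIs;
            this.doNotDelta = doNotDelta;
        }
    }

    /** Returns only the objects that still need a delta search. */
    static List<Obj> selectForDeltaSearch(List<Obj> all) {
        List<Obj> out = new ArrayList<>();
        for (Obj o : all) {
            if (o.reuseAsIs)
                continue; // the common case, so it is tested first
            if (o.doNotDelta)
                continue; // rare, so it is tested second
            out.add(o);
        }
        return out;
    }
}
```

Either order yields the same result; testing the common condition
first only lets the loop skip most objects after a single comparison.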

Change-Id: Ied757363f775058177fc1befb8ace20fe9759bac
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPaper bag fix BatchingProgressMonitor alarm queue 17/2617/1
Shawn O. Pearce [Tue, 1 Mar 2011 18:06:39 +0000 (10:06 -0800)]
Paper bag fix BatchingProgressMonitor alarm queue

The alarm queue threads were started with an empty task body, which
meant the thread started and terminated immediately, leaving the
queue itself with no worker.

Change-Id: I2a9b5fe9c2bdff4a5e0f7ec7ad41a54b41a4ddd6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoMerge "ProgressMonitor: Refactor to use background alarms"
Chris Aniszczyk [Tue, 1 Mar 2011 17:05:59 +0000 (12:05 -0500)]
Merge "ProgressMonitor: Refactor to use background alarms"

13 years agoMerge "Show notes in Log CLI command - Part 2"
Chris Aniszczyk [Tue, 1 Mar 2011 16:13:00 +0000 (11:13 -0500)]
Merge "Show notes in Log CLI command - Part 2"

13 years agoShow notes in Log CLI command - Part 2 97/2597/2
Sasa Zivkov [Thu, 24 Feb 2011 10:12:06 +0000 (11:12 +0100)]
Show notes in Log CLI command - Part 2

This change fixes issues identified in the commit
5f3d577e5a1e8f23a2b6ea6a2bf24516806e01b8.

Change-Id: Idbd935f5f60ad043faa0d4982b3e101ef7c07d60
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
13 years agoProgressMonitor: Refactor to use background alarms 15/2615/1
Shawn O. Pearce [Tue, 1 Mar 2011 03:34:06 +0000 (19:34 -0800)]
ProgressMonitor: Refactor to use background alarms

Instead of polling the system clock on every update(1) method call,
use a scheduled executor to toggle a volatile once per second until
the task is done.  Check the volatile on each update(int), looking
to see if output should occur.

This limits progress output to either once per 1% complete, or once
per second.  To save time during update calls the timer isn't reset
during each 1% of output, which means we may see one unnecessary
output trigger if at least 1% completed during the one second of the
alarm time.
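
A minimal sketch of the alarm approach, assuming a simplified monitor.
The AlarmProgress class and its method names are hypothetical and do
not mirror BatchingProgressMonitor; only the volatile flag toggled by
a scheduled executor is taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class AlarmProgress {
    private final int total;
    private final List<String> output = new ArrayList<>();
    private volatile boolean alarmFired; // set by the timer thread
    private int done;
    private int lastPercent;
    private ScheduledFuture<?> timer;

    AlarmProgress(int total) {
        this.total = total;
    }

    /** Schedule the once-per-second alarm on the supplied executor. */
    void start(ScheduledExecutorService alarms) {
        timer = alarms.scheduleAtFixedRate(this::alarm, 1, 1, TimeUnit.SECONDS);
    }

    /** Body of the alarm task: only toggles the flag. */
    void alarm() {
        alarmFired = true;
    }

    /** Called by the worker; never reads the system clock. */
    void update(int completed) {
        done += completed;
        int percent = (int) (done * 100L / total);
        if (percent != lastPercent) {
            // Output once per 1%.  The alarm flag is deliberately left
            // alone, so one extra line may follow on the next update.
            lastPercent = percent;
            output.add(percent + "% (" + done + "/" + total + ")");
        } else if (alarmFired) {
            alarmFired = false;
            output.add(percent + "% (" + done + "/" + total + ")");
        }
    }

    /** Stop the alarm once the task is done. */
    void end() {
        if (timer != null)
            timer.cancel(false);
    }

    List<String> output() {
        return output;
    }
}
```

A caller would pass something like
Executors.newSingleThreadScheduledExecutor() to start() and call
end() when the task completes.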

Change-Id: I8fdd7e31c37bef39a5d1b3da7105da0ef879eb84
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoFix NPE on checkout of remote tracking branch 14/2614/1
Matthias Sohn [Mon, 28 Feb 2011 23:21:14 +0000 (00:21 +0100)]
Fix NPE on checkout of remote tracking branch

Checkout of remote tracking branch failed when no local branch
existed. Also enhance RepositoryTestCase to enable checking index
state of another test repository.

Bug: 337695
Change-Id: Idf4c05bdf23b5161688818342b2bf9a45b49f479
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoMerge branch 'stable-0.11' 05/2605/1
Shawn O. Pearce [Sat, 26 Feb 2011 01:24:55 +0000 (17:24 -0800)]
Merge branch 'stable-0.11'

* stable-0.11:
  JGit 0.11.3
  Fix NullPointer when pulling from a deleted local branch
  smart-http: Fix recognition of gzip encoding
  Fix processing of broken symbolic references in RefDirectory
  CreateBranchCommand: Wrong existence check
  Qualify post 0.11.1 builds

Conflicts:
org.eclipse.jgit.console/META-INF/MANIFEST.MF
org.eclipse.jgit.console/pom.xml
org.eclipse.jgit.http.server/META-INF/MANIFEST.MF
org.eclipse.jgit.http.server/pom.xml
org.eclipse.jgit.http.test/META-INF/MANIFEST.MF
org.eclipse.jgit.http.test/pom.xml
org.eclipse.jgit.iplog/META-INF/MANIFEST.MF
org.eclipse.jgit.iplog/pom.xml
org.eclipse.jgit.junit.http/META-INF/MANIFEST.MF
org.eclipse.jgit.junit.http/pom.xml
org.eclipse.jgit.junit/META-INF/MANIFEST.MF
org.eclipse.jgit.junit/pom.xml
org.eclipse.jgit.packaging/org.eclipse.jgit.feature/feature.xml
org.eclipse.jgit.packaging/org.eclipse.jgit.feature/pom.xml
org.eclipse.jgit.packaging/org.eclipse.jgit.junit.feature/feature.xml
org.eclipse.jgit.packaging/org.eclipse.jgit.junit.feature/pom.xml
org.eclipse.jgit.packaging/org.eclipse.jgit.source.feature/feature.xml
org.eclipse.jgit.packaging/org.eclipse.jgit.source.feature/pom.xml
org.eclipse.jgit.packaging/org.eclipse.jgit.updatesite/pom.xml
org.eclipse.jgit.packaging/pom.xml
org.eclipse.jgit.pgm/META-INF/MANIFEST.MF
org.eclipse.jgit.pgm/pom.xml
org.eclipse.jgit.test/META-INF/MANIFEST.MF
org.eclipse.jgit.test/pom.xml
org.eclipse.jgit.ui/META-INF/MANIFEST.MF
org.eclipse.jgit.ui/pom.xml
org.eclipse.jgit/META-INF/MANIFEST.MF
org.eclipse.jgit/META-INF/SOURCE-MANIFEST.MF
org.eclipse.jgit/pom.xml
pom.xml

Change-Id: I08067c028666f194687943a574512f5bc5ca9552

13 years agoUnpackedObject: Fix readSome() when initial read is short 04/2604/1
Shawn O. Pearce [Sat, 26 Feb 2011 01:20:14 +0000 (17:20 -0800)]
UnpackedObject: Fix readSome() when initial read is short

JDK7 changed behavior slightly on some InputStream types, resulting in
the first read being shorter than the count requested.  That caused us
to overwrite the earlier part of the buffer with later data, as the
offset index wasn't updated in the loop.

Fix the loop to increment offset by the number of bytes read in this
iteration, so the next read appends to the buffer rather than doing an
overwrite.
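
The corrected loop has roughly the following shape.  This is a
sketch written from the description above, not a copy of the
UnpackedObject source; the ReadSome class name is made up.

```java
import java.io.IOException;
import java.io.InputStream;

class ReadSome {
    /**
     * Read up to cnt bytes into buf starting at off, tolerating short
     * reads.  Returns the number of bytes actually read.
     */
    static int readSome(InputStream in, byte[] buf, int off, int cnt)
            throws IOException {
        int avail = 0;
        while (0 < cnt) {
            int n = in.read(buf, off, cnt);
            if (n <= 0)
                break; // end of stream
            off += n;  // the fix: the next read appends, not overwrites
            cnt -= n;
            avail += n;
        }
        return avail;
    }
}
```

Without the "off += n" step, a stream that returns fewer bytes than
requested causes the second read to land on top of the first.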

Bug: 338119
Change-Id: I222fb2f993cd9b637b6b8d93daab5777ef7ec7a6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoMerge "RevWalk: Don't release during inMergeBase()"
Chris Aniszczyk [Thu, 24 Feb 2011 16:23:47 +0000 (11:23 -0500)]
Merge "RevWalk: Don't release during inMergeBase()"

13 years agoMerge "Fix formatting of pom.xml"
Shawn Pearce [Thu, 24 Feb 2011 15:29:47 +0000 (10:29 -0500)]
Merge "Fix formatting of pom.xml"

13 years agoFetchCommand: do not set a null credentials provider 86/2586/2
Matthias Sohn [Thu, 24 Feb 2011 12:52:24 +0000 (13:52 +0100)]
FetchCommand: do not set a null credentials provider

FetchCommand no longer sets a null credentials provider on
Transport, because doing so replaces the default provider with
null and breaks the default mechanism for providing
credentials.

Change-Id: I44096aa856f031545df39d4b09af198caa2c21f6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoFix formatting of pom.xml 87/2587/2
Matthias Sohn [Thu, 24 Feb 2011 11:44:59 +0000 (12:44 +0100)]
Fix formatting of pom.xml

Change-Id: I508def09cb2d4e5bd27b412f4ad5d43984388749
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoRevWalk: Don't release during inMergeBase() 82/2582/2
Shawn O. Pearce [Wed, 23 Feb 2011 20:00:25 +0000 (12:00 -0800)]
RevWalk: Don't release during inMergeBase()

In bc1af8459e ("RevWalk: Don't reset ObjectReader when stopping") we
stopped releasing the reader when the current log traversal is over.
This should have also been applied to the merge base logic that is
buried within MergeGenerator, but got missed.

Change-Id: I8328f43f02cba06fd545e22134872e781b9d4d36
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoMerge "Respect core.excludesfile to enable global ignore rules "
Shawn Pearce [Wed, 23 Feb 2011 23:08:50 +0000 (18:08 -0500)]
Merge "Respect core.excludesfile to enable global ignore rules "

13 years agoRespect core.excludesfile to enable global ignore rules 75/2575/2
Matthias Sohn [Wed, 23 Feb 2011 22:44:50 +0000 (23:44 +0100)]
Respect core.excludesfile to enable global ignore rules

Also use FS.resolve() to properly resolve files from path strings.

Bug: 328428 (partial fix)
Change-Id: I41d94694f220dcb85605c9acadfffb1fa23beaeb
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoPackWriter: Add missing timers to Statistics 81/2581/1
Shawn O. Pearce [Wed, 23 Feb 2011 02:56:51 +0000 (18:56 -0800)]
PackWriter: Add missing timers to Statistics

We did not record the time spent on the object reuse search or the
object size lookup, both of which occur between the counting phase and
the compressing phase.  If there are enough objects involved, these
times can be significant, so it's worth timing and recording them.

Change-Id: I89084acfc598bb6533d75d90cb8de459f0ed93be
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoShow notes in Log CLI command 64/2564/3
Sasa Zivkov [Mon, 21 Feb 2011 15:43:06 +0000 (16:43 +0100)]
Show notes in Log CLI command

Support for --no-standard-notes and --show-notes=REF options is added
to the Log command. The --show-notes option can be specified more than
once if more than one notes branch should be used for showing notes.

The notes are displayed from note branches in the order in which the
note branches are specified on the command line. However, the standard
note, from refs/notes/commits, is always displayed first unless
the --no-standard-notes option is given.

Change-Id: I4e7940804ed9d388b625b8e8a8e25bfcf5ee15a6
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years agoPackWriter: Fix total delta count 72/2572/1
Shawn O. Pearce [Wed, 23 Feb 2011 00:53:22 +0000 (16:53 -0800)]
PackWriter: Fix total delta count

The total delta count is supposed to include reused deltas, not
just newly created deltas.

Change-Id: I98cbdcef80d59714a4f62ff322e7b709b08b6d26
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoMerge "Create empty GIT_DIR/hooks directory"
Shawn O. Pearce [Tue, 22 Feb 2011 15:46:09 +0000 (10:46 -0500)]
Merge "Create empty GIT_DIR/hooks directory"

13 years agoMerge "Fix potential NullPointerException in PlotCommit"
Shawn Pearce [Tue, 22 Feb 2011 15:45:51 +0000 (10:45 -0500)]
Merge "Fix potential NullPointerException in PlotCommit"

13 years agoCreate empty GIT_DIR/hooks directory 66/2566/1
Shawn O. Pearce [Tue, 22 Feb 2011 15:38:51 +0000 (07:38 -0800)]
Create empty GIT_DIR/hooks directory

Bug: 337801
Change-Id: I5e0c4d838a211509fb4cc7e048dba6efaec15d5c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoFix potential NullPointerException in PlotCommit 56/2556/2
Mathias Kinzler [Tue, 22 Feb 2011 08:11:42 +0000 (09:11 +0100)]
Fix potential NullPointerException in PlotCommit

Change-Id: Ib7f661a259561251e74337fa233036e041c42423
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
13 years agoJGit 0.11.3 47/2547/1 stable-0.11 v0.11.3
Matthias Sohn [Sun, 20 Feb 2011 23:59:47 +0000 (00:59 +0100)]
JGit 0.11.3

Change-Id: I0a3d4d4400e7643c43d64bf60e566d533b5dcee1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoFix NullPointer when pulling from a deleted local branch 46/2546/1
Stefan Lay [Wed, 16 Feb 2011 14:46:26 +0000 (15:46 +0100)]
Fix NullPointer when pulling from a deleted local branch

A checked Exception is thrown instead.

The reason for throwing an Exception is that the state of the
repository is inconsistent in this case: There is a merge
configuration containing a non-existing local branch. Ideally the
deletion of a local branch should also delete the corresponding
merge configuration.

Bug: 337315
Change-Id: I8ed57d5aaed60aaab685fc11a8695e474e60215f
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agosmart-http: Fix recognition of gzip encoding 45/2545/1
Shawn O. Pearce [Tue, 15 Feb 2011 22:09:42 +0000 (14:09 -0800)]
smart-http: Fix recognition of gzip encoding

Some clients coming through proxies may advertise a different
Accept-Encoding, for example "Accept-Encoding: gzip(proxy)".
Matching by substring produces a false positive: we conclude
that the client understands gzip encoding and will inflate the
response before reading it.

In this particular case however it doesn't.  It's the reverse proxy
server in front of JGit letting us know the proxy<->JGit link can
be gzip compressed, while the client<->proxy part of the link is not:

  client <-- no gzip --> proxy <-- gzip --> JGit

Use a more standard method of parsing by splitting the value into
tokens, and only using gzip if one of the tokens is exactly the
string "gzip".  Add a unit test to make sure this isn't broken in
the future.
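
Token matching of this kind can be sketched as follows.  The
AcceptEncoding class and acceptsGzip() method are hypothetical names
for illustration, and the sketch ignores quality parameters such as
"gzip;q=0.5", which a full parser would need to handle.

```java
class AcceptEncoding {
    /**
     * Returns true only if one comma-separated token of the header
     * value is exactly "gzip".
     */
    static boolean acceptsGzip(String accepts) {
        if (accepts == null)
            return false;
        for (String token : accepts.split(",")) {
            if (token.trim().equals("gzip"))
                return true;
        }
        return false;
    }
}
```

With this approach "deflate, gzip" is accepted, while "gzip(proxy)"
is rejected because no token equals "gzip" exactly.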

Change-Id: Ib4c40f9db177322c7a2640808a6c10b3c4a73819
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoFix processing of broken symbolic references in RefDirectory 44/2544/1
Marc Strapetz [Wed, 9 Feb 2011 11:54:09 +0000 (12:54 +0100)]
Fix processing of broken symbolic references in RefDirectory

Change-Id: Ic1ceb9c99dca2c69e61ea0ef03ec64f13714b80a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoCreateBranchCommand: Wrong existence check 43/2543/1
Mathias Kinzler [Mon, 14 Feb 2011 14:48:43 +0000 (15:48 +0100)]
CreateBranchCommand: Wrong existence check

Bug: 337044
Change-Id: I89224719712c1f1ab80ea34280139dfeb00be3d0
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoQualify post 0.11.1 builds 42/2542/1
Matthias Sohn [Sun, 20 Feb 2011 23:41:45 +0000 (00:41 +0100)]
Qualify post 0.11.1 builds

Change-Id: I48cca12fcc6212fbe6c42109e44e4a2dc20ecada
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years agoPackWriter: Hoist and cluster reference targets 41/2541/1
Shawn O. Pearce [Sat, 19 Feb 2011 01:55:53 +0000 (17:55 -0800)]
PackWriter: Hoist and cluster reference targets

Many source browsers and network related tools like UploadPack need
to find and parse the target of all branches and annotated tags
within the repository during their startup phase.  Clustering these
together into the same part of the pack file will improve locality,
reducing thrashing when an application starts and needs to load
all of these into memory at once.

To prevent bottlenecking basic log viewing tools that are scanning
backwards from the tip of a current branch (and don't need tags)
we place this cluster of older targets after 4096 newer commits
have already been placed into the pack stream.  4096 was chosen as
a rough guess, but was based on a few factors:

  - log viewers typically show 5-200 commits per page
  - users only view the first page or two

  - DHT can cram 2200-4000 commits per 1 MiB chunk
    thus these will fall into the second commit chunk (roughly)

Unfortunately this placement hurts history tools that are scanning
backwards through the commit graph and completely ignored tags or
branch heads when they started.

An ancient tagged commit is no longer positioned behind its first
child (it's now much earlier), resulting in a page fault for the
parser to reload this cluster of objects on demand.  This may be
an acceptable loss.  If a user is walking backwards and has already
scanned through more than 4096 commits of history, waiting for the
region to reload isn't really that bad compared to the amount of
time already spent.

If the repository is so small that there are fewer than 4096 commits,
this change has no impact on the placement of objects.
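
The placement rule can be illustrated with a small sketch.  It
assumes commits are identified by plain strings and listed newest
first; the RefTargetClustering class and order() method are
hypothetical and much simpler than the real PackWriter logic.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RefTargetClustering {
    /**
     * Keep the traversal order, but once threshold commits have been
     * placed, emit every not-yet-placed reference target as one
     * cluster before continuing with the remaining commits.
     */
    static List<String> order(List<String> commits,
            Set<String> refTargets, int threshold) {
        Set<String> placed = new LinkedHashSet<>();
        boolean hoisted = false;
        for (String c : commits) {
            placed.add(c); // no-op if c was already hoisted
            if (!hoisted && placed.size() >= threshold) {
                for (String t : commits) {
                    if (refTargets.contains(t))
                        placed.add(t);
                }
                hoisted = true;
            }
        }
        return new ArrayList<>(placed);
    }
}
```

When the list is shorter than the threshold the cluster is never
hoisted and the order comes back unchanged, which matches the small
repository case described above.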

Change-Id: If3052e430d305e17878d94145c93754f56b74c61
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPackWriter: Parse tag target objects in a batch 40/2540/1
Shawn O. Pearce [Sat, 19 Feb 2011 01:31:32 +0000 (17:31 -0800)]
PackWriter: Parse tag target objects in a batch

If the underlying storage has a high latency per SHA-1 lookup
(e.g. the DHT support we are working on), parsing each wanted
annotated tag object back to its underlying commit is too slow:
it's a sequential lookup for each tag.  With hundreds of tags in
a repository this takes far too long.

Instead queue up a list of the tags whose objects need to be found,
and then locate all of those in one parseAny batch.  This works
for the common case of annotated tag to single tree or commit.
For the less often used tag->tag->commit, it at least gets us
one level parsed in the larger batch before we have to go back to
sequential lookups.

Change-Id: I94beef3f14281406f15c8cf9fa02d83faf102a19
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPackWriter: Correct total delta count when reusing pack 38/2538/1
Shawn O. Pearce [Sat, 19 Feb 2011 01:21:09 +0000 (17:21 -0800)]
PackWriter: Correct total delta count when reusing pack

If the CachedPack knows its delta count, we need to increment both
the totalDeltas and reusedDeltas fields of the stats object.

Change-Id: I70113609c22476ce7f1e4d9a92f486e9b0f59e44
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPackWriter: Short-circuit counting on full cached pack reuse 39/2539/1
Shawn O. Pearce [Sat, 19 Feb 2011 01:06:36 +0000 (17:06 -0800)]
PackWriter: Short-circuit counting on full cached pack reuse

If one or more cached packs fully covers the request, don't bother
with looking up the objects and trying to walk the graph.  Just use
the cached packs and return immediately.

This helps clones of quiet repositories that have not been modified
since their last repack; it's likely the cached packs are accurate
and no graph walking is required.

Change-Id: I9062a5ac2f71b525322590209664a84051fd5f8a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPackWriter: Fix warning about untyped collection 37/2537/1
Shawn O. Pearce [Sat, 19 Feb 2011 00:56:54 +0000 (16:56 -0800)]
PackWriter: Fix warning about untyped collection

Change-Id: I44699d8ab9768844ba91f7224a7d4ee685c93ce6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoBundleWriter: Always use OFS_DELTA 35/2535/2
Shawn O. Pearce [Fri, 18 Feb 2011 22:14:56 +0000 (14:14 -0800)]
BundleWriter: Always use OFS_DELTA

CGit just learned to always use OFS_DELTA when writing out bundle
files.  This makes sense because bundle came about well after
OFS_DELTA was established, so any version of CGit that can read a
bundle file can also read OFS_DELTA.  Since OFS_DELTA is smaller,
always use it when writing bundles.

Change-Id: I44f9921494798ea0c99e16eab58b87bebeb9aff5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoMerge "PackWriter: Sort commits by parse order to improve locality"
Chris Aniszczyk [Fri, 18 Feb 2011 19:30:19 +0000 (14:30 -0500)]
Merge "PackWriter: Sort commits by parse order to improve locality"

13 years agoWrong constant used when configuring a repository 31/2531/2
Tomasz Zarna [Fri, 18 Feb 2011 10:41:19 +0000 (11:41 +0100)]
Wrong constant used when configuring a repository

Bug: 337546
Change-Id: Ib2f31d621caa5f8b24ce74ce82499889d4f30550

13 years agoPackWriter: Sort commits by parse order to improve locality 29/2529/1
Shawn O. Pearce [Thu, 17 Feb 2011 01:41:35 +0000 (17:41 -0800)]
PackWriter: Sort commits by parse order to improve locality

RevWalk in JGit and the revision code in C Git both parse commits out
of the pack file in an order that differs from strict timestamp and
topological sorting.  Both implementations pop a commit from the head
of a date queue, and then immediately parse all of its parents in
order to insert those into the date queue at the proper positions as
determined by their committer timestamp field.  This implies that the
parents are parsed when their most recent child is popped from the
queue, and not where they are popped during traversal.

Hoisting a parent commit to be immediately behind its child improves
locality by making sure all parents of a merge are clustered together,
and thus can be paged into the parser by the pack file buffering
system (aka WindowCache in JGit) together.
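
The difference between parse order and pop order can be seen in a
small simulation.  The ParseOrder and Commit classes below are
hypothetical and only model a date queue; they are not the RevWalk
implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

class ParseOrder {
    /** Hypothetical commit: an id, a committer time and its parents. */
    static final class Commit {
        final String id;
        final int time;
        final List<Commit> parents;

        Commit(String id, int time, Commit... parents) {
            this.id = id;
            this.time = time;
            this.parents = Arrays.asList(parents);
        }
    }

    /** Returns commit ids in the order they would be parsed. */
    static List<String> parseOrder(Commit tip) {
        List<String> parsed = new ArrayList<>();
        Set<Commit> seen = new HashSet<>();
        // Date queue: the most recent committer time is popped first.
        PriorityQueue<Commit> queue = new PriorityQueue<>(
                (a, b) -> Integer.compare(b.time, a.time));
        seen.add(tip);
        parsed.add(tip.id);
        queue.add(tip);
        while (!queue.isEmpty()) {
            Commit c = queue.poll();
            // All parents are parsed as soon as their child is popped.
            for (Commit p : c.parents) {
                if (seen.add(p)) {
                    parsed.add(p.id);
                    queue.add(p);
                }
            }
        }
        return parsed;
    }
}
```

For a merge M with parents A and B, where A is newer than B and A has
a further ancestor A1 that is also newer than B, the pop order is
M, A, A1, B but the parse order is M, A, B, A1: both parents of the
merge are parsed together.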

Change-Id: I80f9e64cafa2e8f082776b43845edf23065386a2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoMerge "Changed TreeWalk.forPath(...) to work with recursive paths."
Shawn Pearce [Fri, 18 Feb 2011 05:21:59 +0000 (00:21 -0500)]
Merge "Changed TreeWalk.forPath(...) to work with recursive paths."

13 years agoAdd Reset to the JGit CLI 23/2523/2
Chris Aniszczyk [Thu, 17 Feb 2011 17:03:48 +0000 (11:03 -0600)]
Add Reset to the JGit CLI

Change-Id: I85368c849c0964b9a539fa1991920adb2ace94df
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years agoChanged TreeWalk.forPath(...) to work with recursive paths. 10/2310/7
Jesse Greenwald [Sat, 22 Jan 2011 15:51:29 +0000 (07:51 -0800)]
Changed TreeWalk.forPath(...) to work with recursive paths.

Previously, this method would not (always) work when a recursive path
such as "a/b" was passed into it.

Change-Id: I0752a1f5fc7fef32064d8f921b33187c0bdc7227
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years agoAdd git-reset to the Git API 43/2443/4
Chris Aniszczyk [Tue, 1 Feb 2011 14:47:04 +0000 (08:47 -0600)]
Add git-reset to the Git API

Bug: 334764
Change-Id: Ice404629687d7f2a595d8d4eccf471b12f7e32ec
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years agoMerge "Fix NullPointer when pulling from a deleted local branch"
Shawn Pearce [Wed, 16 Feb 2011 15:15:31 +0000 (10:15 -0500)]
Merge "Fix NullPointer when pulling from a deleted local branch"

13 years agoFix NullPointer when pulling from a deleted local branch 14/2514/1
Stefan Lay [Wed, 16 Feb 2011 14:46:26 +0000 (15:46 +0100)]
Fix NullPointer when pulling from a deleted local branch

A checked Exception is thrown instead.

The reason for throwing an Exception is that the state of the
repository is inconsistent in this case: There is a merge
configuration containing a non-existing local branch. Ideally the
deletion of a local branch should also delete the corresponding
merge configuration.

Bug: 337315
Change-Id: I71e56ffb90e11e6e3c1bbd964ad63972d67990c0
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
13 years agosmart-http: Support progress in ReceivePack 11/2511/1
Shawn O. Pearce [Tue, 15 Feb 2011 22:46:30 +0000 (14:46 -0800)]
smart-http: Support progress in ReceivePack

As PackParser supports a progress meter for the "Resolving deltas"
phase of its work, we should export this to smart HTTP clients so
they know the server is still working on their (large) upload.

However this isn't as simple as just dropping in a binding for
the SmartOutputStream to flush when it's told to.  We want to
avoid spurious flushes triggered by the use of sideband, or the
status report formatting in the send-pack/receive-pack protocol.

Change-Id: Ibd88022a298c5fed0edb23dfaf2e90278807ba8b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agosmart-http: Fix recognition of gzip encoding 10/2510/1
Shawn O. Pearce [Tue, 15 Feb 2011 22:09:42 +0000 (14:09 -0800)]
smart-http: Fix recognition of gzip encoding

Some clients coming through proxies may advertise a different
Accept-Encoding, for example "Accept-Encoding: gzip(proxy)".
Matching by substring produces a false positive: we conclude
that the client understands gzip encoding and will inflate the
response before reading it.

In this particular case however it doesn't.  It's the reverse proxy
server in front of JGit letting us know the proxy<->JGit link can
be gzip compressed, while the client<->proxy part of the link is not:

  client <-- no gzip --> proxy <-- gzip --> JGit

Use a more standard method of parsing by splitting the value into
tokens, and only using gzip if one of the tokens is exactly the
string "gzip".  Add a unit test to make sure this isn't broken in
the future.

Change-Id: I30cda8a6d11ad235b56457adf54a2d27095d964e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agohttp.test: Delete badly named JUnit configurations 09/2509/1
Shawn O. Pearce [Tue, 15 Feb 2011 22:13:59 +0000 (14:13 -0800)]
http.test: Delete badly named JUnit configurations

We also have org.eclipse.jgit.http--All-Tests, which matches the
style of the org.eclipse.jgit.core--All-Tests name. Drop the others,
as these are just duplicates.

Change-Id: I8600a343f6a85d21dc07bda68a8cb834c82946b5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoPackWriter: Try for accurate delta reuse on cached pack 08/2508/1
Shawn O. Pearce [Tue, 15 Feb 2011 17:40:16 +0000 (09:40 -0800)]
PackWriter: Try for accurate delta reuse on cached pack

If a cached pack is used, it might know how many deltas are contained
within it.  Record that count as part of our reusedDeltas field
for the stats line we show clients.

Change-Id: I1c61fb817305a95eeac654cccf132cba20b2339c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoUploadPack: Expose advertised refs to callers 07/2507/1
Shawn O. Pearce [Mon, 14 Feb 2011 17:02:57 +0000 (09:02 -0800)]
UploadPack: Expose advertised refs to callers

Like ReceivePack, callers that embed UploadPack within their
service may wish to see the set of references that were sent
to the client. We already have the map on hand, it just needs
to be exposed with a getter.

Change-Id: I123b23e475860d5bb968906bef59068985088b7b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoRepositoryBuilder: Allow callers to require repository exists 92/2492/4
Shawn O. Pearce [Sun, 13 Feb 2011 02:44:39 +0000 (18:44 -0800)]
RepositoryBuilder: Allow callers to require repository exists

The setMustExist() method allows callers to require the repository
exists in order for build() to succeed. This is useful within a
RepositoryResolver where existence is required.

Change-Id: I6a1154551435cf0da6c2b4a7f4dce266abea5dff
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years agopgm: Make --git-dir a string 41/2441/6
Shawn O. Pearce [Mon, 7 Feb 2011 01:42:28 +0000 (17:42 -0800)]
pgm: Make --git-dir a string

DHT based repository types don't use a java.io.File to name the
repository.  Moving the type to a string starts to open up more types
of repository names, making the standard pgm package easier to reuse
on other storage systems.

Change-Id: I262ccc8c01cd6db88f832ef317b0e1e5db2d016a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years agoMerge "daemon: Use HTTP's resolver and factory pattern"
Chris Aniszczyk [Tue, 15 Feb 2011 17:09:39 +0000 (12:09 -0500)]
Merge "daemon: Use HTTP's resolver and factory pattern"

13 years agoMerge "Fix processing of broken symbolic references in RefDirectory"
Shawn Pearce [Tue, 15 Feb 2011 14:33:31 +0000 (09:33 -0500)]
Merge "Fix processing of broken symbolic references in RefDirectory"

13 years agoFix processing of broken symbolic references in RefDirectory 62/2462/2
Marc Strapetz [Wed, 9 Feb 2011 11:54:09 +0000 (12:54 +0100)]
Fix processing of broken symbolic references in RefDirectory

Change-Id: I1f85890fe718f38ef4b62ebe711f0668267873a2

13 years agodaemon: Use HTTP's resolver and factory pattern 40/2440/5
Shawn O. Pearce [Mon, 7 Feb 2011 00:38:02 +0000 (16:38 -0800)]
daemon: Use HTTP's resolver and factory pattern

Using a resolver and factory pattern for the anonymous git:// Daemon
class makes transport.Daemon more useful on non-file storage systems,
or in embedded applications where the caller wants more precise
control over the work tasks constructed within the daemon.

Rather than defining new interfaces, move the existing HTTP ones
into transport.resolver and make them generic on the connection
handle type.  For HTTP, continue to use HttpServletRequest, and
for transport.Daemon use DaemonClient.

To remain compatible with transport.Daemon, FileResolver needs to
learn how to use multiple base directories, and how to export any
Repository instance at a fixed name.

Change-Id: I1efa6b2bd7c6567e983fbbf346947238ea2e847e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoMerge "UploadPack: Expose PackWriter activity to a logger"
Chris Aniszczyk [Mon, 14 Feb 2011 23:15:34 +0000 (18:15 -0500)]
Merge "UploadPack: Expose PackWriter activity to a logger"

13 years agoMerge "RevWalk: Avoid unnecessary re-parsing of commit bodies"
Chris Aniszczyk [Mon, 14 Feb 2011 23:14:59 +0000 (18:14 -0500)]
Merge "RevWalk: Avoid unnecessary re-parsing of commit bodies"

13 years agoMerge "RevWalk: Don't reset ObjectReader when stopping"
Chris Aniszczyk [Mon, 14 Feb 2011 23:13:39 +0000 (18:13 -0500)]
Merge "RevWalk: Don't reset ObjectReader when stopping"

13 years agoMerge "UploadPack: Donate parsed commits to PackWriter"
Chris Aniszczyk [Mon, 14 Feb 2011 23:13:00 +0000 (18:13 -0500)]
Merge "UploadPack: Donate parsed commits to PackWriter"

13 years agoCreateBranchCommand: Wrong existence check 95/2495/1
Mathias Kinzler [Mon, 14 Feb 2011 14:48:43 +0000 (15:48 +0100)]
CreateBranchCommand: Wrong existence check

Bug: 337044
Change-Id: I3bc42fea1f552f10d4729999cab6fb4241b70325
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
13 years agoUploadPack: Expose PackWriter activity to a logger 39/2439/3
Shawn O. Pearce [Sun, 6 Feb 2011 23:58:24 +0000 (15:58 -0800)]
UploadPack: Expose PackWriter activity to a logger

The UploadPackLogger interface allows applications that embed
GitServlet or otherwise use UploadPack to service clients to
track and log how PackWriter was used, and what it sent.  This
provides more granularity into the request activity than might
be available from the HTTP server logs, helping administrators
to better understand utilization and Git server performance.

Change-Id: I1d36b060eb3385339d5f986e68192789ef70fc4e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years agoRevWalk: Avoid unnecessary re-parsing of commit bodies 50/2450/2
Shawn O. Pearce [Tue, 8 Feb 2011 00:30:49 +0000 (16:30 -0800)]
RevWalk: Avoid unnecessary re-parsing of commit bodies

If the RevFilter doesn't actually require the commit body,
we shouldn't reparse it if the body was disposed.  This happens
often inside of UploadPack during common ancestor negotiation, where the
RevWalk is reset and re-run over roughly the same commit space,
but the bodies are discarded because the commit message is not
relevant to the process.

Change-Id: I87b6b6a5fb269669867047698abf718d366bd002
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago RevWalk: Don't reset ObjectReader when stopping 33/2433/4
Shawn O. Pearce [Sun, 6 Feb 2011 09:15:33 +0000 (01:15 -0800)]
RevWalk: Don't reset ObjectReader when stopping

Applications like UploadPack reset() and reuse the same RevWalk
multiple times in very rapid succession.  Releasing the ObjectReader's
internal state on each use, only to allocate it again on the next
cycle kills performance if the ObjectReader has internal caches, or
even if the Inflater gets returned and pulled from the InflaterCache
too frequently.

Make releasing the ObjectReader the application's responsibility
when it is done with the RevWalk, which most already do by wrapping
their loop in a try/finally block.

Change-Id: I3ad188a719e8d7f6bf27d1a7ca16d465534713f4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago UploadPack: Donate parsed commits to PackWriter 32/2432/4
Shawn O. Pearce [Sun, 6 Feb 2011 09:00:44 +0000 (01:00 -0800)]
UploadPack: Donate parsed commits to PackWriter

When UploadPack has computed the merge base between the client's have
set and the want set, it has already loaded and parsed all of the
interesting commits that PackWriter needs to transmit to the client.
Switching the RevWalk and its object pool over to be an ObjectWalk
saves PackWriter from needing to re-parse these same commits from the
ObjectDatabase, reducing the startup latency for the enumeration
phase of packing.

UploadPack doesn't want to use an ObjectWalk for the okToGiveUp()
tests because it is slower: for each commit popped it needs to cache
the tree into the pendingObjects list, and during each reset() it
discards a bunch of ObjectWalk specific state and reallocates some
internal collections.  ObjectWalk was never meant to be rapidly
reset() like UploadPack does, so it is perhaps somewhat cleaner to
allow "upgrading" a RevWalk to an ObjectWalk.

Bug: 301639
Change-Id: I97ef52a0b79d78229c272880aedb7f74d0f7532f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago Setup the default remote and merge config in CloneCommand 56/2456/3
Chris Aniszczyk [Tue, 8 Feb 2011 14:18:18 +0000 (08:18 -0600)]
Setup the default remote and merge config in CloneCommand

Bug: 336621
Change-Id: I8c889d7b42f6f121d096acad1fada8e3752d74f9
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years ago UploadPack: Rely on peeled ref data for include-tag 49/2449/2
Shawn O. Pearce [Tue, 8 Feb 2011 00:46:06 +0000 (16:46 -0800)]
UploadPack: Rely on peeled ref data for include-tag

The peeled reference information for tags is more efficient to
work with than parsing the tag objects, as it usually comes from
the packed-refs file, which stores the peeled information for us.
Rely on the peeled information to decide if the tag should be
included or not, instead of using our RevWalk to parse the object.

Change-Id: I6714a8560a1c04b5578e9c5b469ea3c77188dff3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago UploadPack: Assume okToGiveUp is initially false 31/2431/2
Shawn O. Pearce [Sun, 6 Feb 2011 08:42:23 +0000 (00:42 -0800)]
UploadPack: Assume okToGiveUp is initially false

When negotiate() starts there is at least one want, but no haves, and
thus no common base exists.  It is not ok to give up yet; the client
should try to find a common base with the server.  Avoid scanning our
history along the want chains until we have found at least one commit
in common with the client.  Finding one sets okToGiveUp to null,
which enables okToGiveUp() to perform the scan.
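
The tri-state cache described above can be sketched roughly as follows.
This is an illustration only, not the UploadPack source: the class
NegotiationState, the method onCommonCommitFound() and the scanWants
callback are hypothetical names standing in for the real internals.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of the cached okToGiveUp state. */
final class NegotiationState {
    // FALSE: no common commit seen yet, so giving up is not allowed.
    // null:  a common commit arrived, the answer must be recomputed.
    private Boolean okToGiveUp = Boolean.FALSE;

    // Stands in for the expensive scan along the want chains (assumption).
    private final BooleanSupplier scanWants;

    // Counts how often the expensive scan actually ran.
    int scans;

    NegotiationState(BooleanSupplier scanWants) {
        this.scanWants = scanWants;
    }

    /** Called when a "have" line names a commit the server also has. */
    void onCommonCommitFound() {
        okToGiveUp = null; // invalidate the cached answer
    }

    boolean okToGiveUp() {
        if (okToGiveUp == null) {
            scans++;
            okToGiveUp = scanWants.getAsBoolean();
        }
        return okToGiveUp;
    }

    public static void main(String[] args) {
        NegotiationState state = new NegotiationState(() -> true);
        System.out.println(state.okToGiveUp() + " after " + state.scans + " scans");
        state.onCommonCommitFound();
        System.out.println(state.okToGiveUp() + " after " + state.scans + " scans");
    }
}
```

Until onCommonCommitFound() is called, okToGiveUp() answers from the
cached FALSE and never touches the history.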

Bug: 301639
Change-Id: I98a82a5424fd4c9995924375c7910f76ca4f03af
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago UploadPack: Avoid walking the entire project history 30/2430/2
Shawn O. Pearce [Sun, 6 Feb 2011 03:00:15 +0000 (19:00 -0800)]
UploadPack: Avoid walking the entire project history

If the client presents a common commit on a side branch, and there is
a want for a disconnected branch, UploadPack was walking back on the
entire history of the disconnected branch because it never would find
the common commit.

Limit our search back along any given want to be no earlier than the
oldest common commit received via a "have" line from our client.  This
prevents us from looking at all of the project history.
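
A minimal sketch of such a bounded walk, under stated assumptions: the
Commit type, its fields and the reachesCommon() method are hypothetical,
and the sketch assumes a parent is never newer than its child, so a
commit older than the cutoff cannot lead to a common commit.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of cutting the walk off by commit time. */
final class BoundedWantWalk {
    /** Simplified commit: a name, a commit time, and its parents. */
    static final class Commit {
        final String name;
        final int time;
        final List<Commit> parents;
        boolean common; // true if the client named it in a "have" line

        Commit(String name, int time, Commit... parents) {
            this.name = name;
            this.time = time;
            this.parents = Arrays.asList(parents);
        }
    }

    // Number of commits actually examined by the last walks.
    int visited;

    /** True if a common commit is reachable from want at or after the cutoff time. */
    boolean reachesCommon(Commit want, int oldestCommonTime) {
        Deque<Commit> queue = new ArrayDeque<>();
        Set<Commit> seen = new HashSet<>();
        queue.add(want);
        while (!queue.isEmpty()) {
            Commit c = queue.poll();
            // Skip repeats and anything older than the oldest common commit.
            if (!seen.add(c) || c.time < oldestCommonTime)
                continue;
            visited++;
            if (c.common)
                return true;
            queue.addAll(c.parents);
        }
        return false;
    }

    public static void main(String[] args) {
        Commit root = new Commit("root", 1);
        Commit mid = new Commit("mid", 2, root);
        Commit tip = new Commit("tip", 100, mid);
        BoundedWantWalk walk = new BoundedWantWalk();
        boolean found = walk.reachesCommon(tip, 50);
        System.out.println(found + ", visited " + walk.visited);
    }
}
```

With a cutoff of 50 the walk from the disconnected tip stops as soon as
it meets a commit older than the cutoff, rather than descending to the
root.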

Bug: 301639
Change-Id: Iffaaa2250907150d6efa1cf2f2fcf59851d5267d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years ago Merge "UploadPack: Tag non-commits SATISIFIED earlier"
Chris Aniszczyk [Sun, 13 Feb 2011 21:23:35 +0000 (16:23 -0500)]
Merge "UploadPack: Tag non-commits SATISIFIED earlier"

13 years ago Merge "UploadPack: Don't discard COMMON, SATISIFIED flags"
Chris Aniszczyk [Sun, 13 Feb 2011 21:23:07 +0000 (16:23 -0500)]
Merge "UploadPack: Don't discard COMMON, SATISIFIED flags"

13 years ago Merge "UploadPack: Fix want-is-satisfied test"
Chris Aniszczyk [Sun, 13 Feb 2011 21:21:11 +0000 (16:21 -0500)]
Merge "UploadPack: Fix want-is-satisfied test"

13 years ago Merge "UploadPack: Avoid parsing want list on clone"
Chris Aniszczyk [Sun, 13 Feb 2011 21:20:27 +0000 (16:20 -0500)]
Merge "UploadPack: Avoid parsing want list on clone"

13 years ago Qualify post 0.11 builds 87/2487/1
Matthias Sohn [Sat, 12 Feb 2011 02:30:05 +0000 (03:30 +0100)]
Qualify post 0.11 builds

Change-Id: Ibcef4fc4c986c2cda01e943d16aa1c53eff99f25
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years ago JGit 0.11.1 82/2482/1 v0.11.1
Matthias Sohn [Fri, 11 Feb 2011 22:25:34 +0000 (23:25 +0100)]
JGit 0.11.1

Change-Id: I9ac2fdfb4326536502964ba614d37d0bd103f524
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years ago Fix version.sh 81/2481/1
Matthias Sohn [Fri, 11 Feb 2011 22:03:40 +0000 (23:03 +0100)]
Fix version.sh

Change-Id: Ia010c9cecefbfb90ae54786adc7c8d838525d2f3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years ago Merge "Fix NPE on reading global config on MAC" into stable-0.11
Chris Aniszczyk [Wed, 9 Feb 2011 17:27:36 +0000 (12:27 -0500)]
Merge "Fix NPE on reading global config on MAC" into stable-0.11

13 years ago Fix NPE on reading global config on MAC 69/2469/1
Jens Baumgart [Wed, 9 Feb 2011 14:12:31 +0000 (15:12 +0100)]
Fix NPE on reading global config on MAC

Bug: 336610

Change-Id: Iefcb85e791723801faa315b3ee45fb19e3ca52fb
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
13 years ago Add isOutdated method to DirCache 67/2467/1
Jens Baumgart [Wed, 9 Feb 2011 14:02:22 +0000 (15:02 +0100)]
Add isOutdated method to DirCache

isOutdated returns true iff the memory state differs from the index
file.

Change-Id: If35db06743f5f588ab19d360fd2a18a07c918edb
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
13 years ago PullCommand: use default remote instead of throwing Exception 53/2453/1
Mathias Kinzler [Tue, 8 Feb 2011 07:56:19 +0000 (08:56 +0100)]
PullCommand: use default remote instead of throwing Exception

When pulling into a local branch that has no upstream configuration,
pull should try to use the default remote ("origin") instead of
throwing an Exception.
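
The fallback can be sketched as below.  This is not the PullCommand
code: DefaultRemote and remoteFor() are hypothetical, and the
configuration is modelled as a flat map keyed by the standard Git key
branch.<name>.remote.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of falling back to the default remote. */
final class DefaultRemote {
    static final String DEFAULT_REMOTE = "origin";

    /** Returns branch.<name>.remote if configured, otherwise "origin". */
    static String remoteFor(String branch, Map<String, String> config) {
        String remote = config.get("branch." + branch + ".remote");
        return remote != null ? remote : DEFAULT_REMOTE;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        System.out.println(remoteFor("topic", config));
        config.put("branch.topic.remote", "upstream");
        System.out.println(remoteFor("topic", config));
    }
}
```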

Bug: 336504
Change-Id: Ife75858e89ea79c0d6d88ba73877fe8400448e34
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
13 years ago Remove quoting of command over SSH 35/2435/1
Shawn O. Pearce [Sun, 6 Feb 2011 22:04:39 +0000 (14:04 -0800)]
Remove quoting of command over SSH

If the command contains spaces, it needs to be evaluated by the remote
shell.  Quoting the command breaks this, making it impossible to run a
remote command that needs additional options.

Bug: 336301
Change-Id: Ib5d88f0b2151df2d1d2b4e08d51ee979f6da67b5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago UploadPack: Tag non-commits SATISIFIED earlier 29/2429/1
Shawn O. Pearce [Sun, 6 Feb 2011 02:47:42 +0000 (18:47 -0800)]
UploadPack: Tag non-commits SATISIFIED earlier

This gets non-commits out of the wantSatisfied() main loop by making
use of the cached SATISIFIED flag and its existing bypass.  Anything
that isn't a commit cannot be discovered by the have negotiation, so
it is always assumed to be SATISIFIED by the server.

Bug: 301639
Change-Id: I1ef354fbf2e2ed44c9020a4069d7179f2159f19f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago UploadPack: Don't discard COMMON, SATISIFIED flags 28/2428/1
Shawn O. Pearce [Sun, 6 Feb 2011 02:23:18 +0000 (18:23 -0800)]
UploadPack: Don't discard COMMON, SATISIFIED flags

When the walker resets, it is going to scrub the COMMON and SATISIFIED
flags off a commit if the commit is contained within another commit
the client wants.  This is common if the client asks for both a
'maint' and 'master' branch, and 'maint' is also fully merged into
'master'.

COMMON shouldn't be scrubbed during reset because it is used to control
membership of the commonBase collection, which is a List.  commonBase
should technically be a set, but membership is cheaper with a RevFlag.
COMMON appears on a commit reachable from a WANT when there is also a
PEER_HAS flag present, as this is a merge base.  Scrubbing this off
when another branch is tested isn't useful.

SATISIFIED is a cache to tell us if wantSatisfied() has already
completed for this particular WANT.  If it has, there isn't a need to
recompute on that branch.  Scrubbing it off 'maint' when we test
'master' just means we would later need to re-test 'maint', wasting
CPU time on the server.
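
As a rough sketch of the idea, flags can be modelled as bits in an int
and a reset as a mask that keeps only the retained bits.  The class and
the bit layout are assumptions made for illustration; the flag names
follow the spelling used in this message.

```java
/** Hypothetical illustration of retaining flags across a walker reset. */
final class FlagResetSketch {
    static final int WANT = 1 << 0;
    static final int PEER_HAS = 1 << 1;
    static final int COMMON = 1 << 2;
    static final int SATISIFIED = 1 << 3;

    /** Clears every flag except those named in retainMask. */
    static int reset(int flags, int retainMask) {
        return flags & retainMask;
    }

    public static void main(String[] args) {
        int maint = WANT | COMMON | SATISIFIED;
        // Retaining COMMON and SATISIFIED means 'maint' need not be re-tested.
        int afterReset = reset(maint, COMMON | SATISIFIED);
        System.out.println((afterReset & SATISIFIED) != 0);
    }
}
```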

Bug: 301639
Change-Id: I3bb67d68212e4f579e8c5dfb138f007b406d775f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago UploadPack: Fix want-is-satisfied test 27/2427/1
Shawn O. Pearce [Sun, 6 Feb 2011 01:49:01 +0000 (17:49 -0800)]
UploadPack: Fix want-is-satisfied test

okToGiveUpImp() has been missing a ! for a long time.  This loop over
wantAll() is looking for an object where wantSatisfied() returns
false, because there is no common merge base present.  Unfortunately
it was missing a !, causing the loop to break and return false after
at least one want was satisfied.
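
The corrected loop has roughly the following shape.  This is a generic
sketch, not the okToGiveUpImp() source; GiveUpCheck and its parameters
are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical illustration of the want-is-satisfied loop. */
final class GiveUpCheck {
    /** It is ok to give up only when every want is satisfied. */
    static <T> boolean okToGiveUp(List<T> wants, Predicate<T> wantSatisfied) {
        for (T want : wants) {
            // The negation is the fix: stop at the first unsatisfied want.
            if (!wantSatisfied.test(want))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> wants = Arrays.asList("maint", "master");
        System.out.println(okToGiveUp(wants, w -> w.equals("maint")));
    }
}
```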

Bug: 301639
Change-Id: Ifdbe0b22c9cd0a9181546d090b4990d792d70c82
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago Fix JGit --upload-pack, --receive-pack options 16/2416/4
Shawn O. Pearce [Fri, 4 Feb 2011 13:51:27 +0000 (05:51 -0800)]
Fix JGit --upload-pack, --receive-pack options

JGit did not use sh -c to run the receive-pack or upload-pack programs
locally, which caused errors if these strings contained spaces and
needed the local shell to evaluate them.

Win32 support using cmd.exe /c is completely untested, but seems like
it should work based on the limited information I could get through
Google search results.
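
A minimal sketch of wrapping a command string for the local shell.
LocalShell and wrap() are hypothetical names, and the sketch leaves out
any quoting of additional arguments such as a repository path.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of running a command string through a shell. */
final class LocalShell {
    /** Builds an argument vector that lets the shell evaluate the command. */
    static List<String> wrap(String command, boolean windows) {
        return windows
                ? Arrays.asList("cmd.exe", "/c", command)
                : Arrays.asList("sh", "-c", command);
    }

    public static void main(String[] args) {
        // The whole string is one argument, so the shell splits it on spaces.
        System.out.println(wrap("git upload-pack --strict", false));
    }
}
```

The resulting list is in the form accepted by
java.lang.ProcessBuilder(List<String>).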

Bug: 336301
Change-Id: I22e5e3492fdebbae092d1ce6b47ad411e57cc1ba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago In iplog list approved CQs as "active" 26/2426/1
Matthias Sohn [Sun, 6 Feb 2011 00:19:52 +0000 (01:19 +0100)]
In iplog list approved CQs as "active"

Change-Id: I69c60576ae648fea2a730c9e9f042004bccecc90
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
13 years ago UploadPack: Avoid parsing want list on clone 11/2411/3
Shawn O. Pearce [Thu, 3 Feb 2011 20:37:39 +0000 (12:37 -0800)]
UploadPack: Avoid parsing want list on clone

If a client wants to perform a clone of the repository, it sends
wants, but no haves.  There is no point in parsing the want list
within UploadPack, as there won't be a common merge base search.
Instead just defer the parsing to PackWriter, which will do its
own parsing and object enumeration.

If the client does have a "have" set, defer parsing of the want list
until the have list is also parsed, and parse them together in a
single batch queue.  This lets the underlying storage system use a
larger lookup batch if there is significant latency involved when
resolving an ObjectId to a RevObject.

Change-Id: I9c30d34f8e344da05c8a2c041a6dc181d8e8bc19
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago Reuse cached SHA-1 when computing from WorkingTreeIterator 27/1927/7
Shawn O. Pearce [Fri, 19 Nov 2010 01:15:19 +0000 (17:15 -0800)]
Reuse cached SHA-1 when computing from WorkingTreeIterator

Change-Id: I2b2170c29017993d8cb7a1d3c8cd94fb16c7dd02
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
13 years ago PackWriter: Support reuse of entire packs 88/2388/4
Shawn O. Pearce [Mon, 31 Jan 2011 16:35:17 +0000 (08:35 -0800)]
PackWriter: Support reuse of entire packs

The most expensive part of packing a repository for transport to
another system is enumerating all of the objects in the repository.
Once this gets to the size of the linux-2.6 repository (1.8 million
objects), enumeration can take several CPU minutes and costs a lot
of temporary working set memory.

Teach PackWriter to efficiently reuse an existing "cached pack"
by answering a clone request with a thin pack followed by a larger
cached pack appended to the end.  This requires the repository
owner to first construct the cached pack by hand, and record the
tip commits inside of $GIT_DIR/objects/info/cached-packs:

  cd $GIT_DIR
  root=$(git rev-parse master)
  tmp=objects/.tmp-$$
  names=$(echo $root | git pack-objects --keep-true-parents --revs $tmp)
  for n in $names; do
    chmod a-w $tmp-$n.pack $tmp-$n.idx
    touch objects/pack/pack-$n.keep
    mv $tmp-$n.pack objects/pack/pack-$n.pack
    mv $tmp-$n.idx objects/pack/pack-$n.idx
  done

  (echo "+ $root";
   for n in $names; do echo "P $n"; done;
   echo) >>objects/info/cached-packs

  git repack -a -d

When a clone request needs to include $root, the corresponding
cached pack will be copied as-is, rather than enumerating all of
the objects that are reachable from $root.
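
Reading the file written by the script above could look roughly like
this.  The layout is inferred only from that script ("+ " lines for
tips, "P " lines for pack names, a blank line ending each entry); the
CachedPackList class is hypothetical and the real parser may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for objects/info/cached-packs. */
final class CachedPackList {
    /** One cached pack entry: its tip commits and its pack names. */
    static final class Entry {
        final List<String> tips = new ArrayList<>();
        final List<String> packNames = new ArrayList<>();
    }

    static List<Entry> parse(String text) {
        List<Entry> entries = new ArrayList<>();
        Entry cur = null;
        for (String line : text.split("\n")) {
            if (line.isEmpty()) {
                cur = null; // a blank line ends the current entry
                continue;
            }
            boolean tip = line.startsWith("+ ");
            boolean pack = line.startsWith("P ");
            if (!tip && !pack)
                continue; // unknown lines are ignored in this sketch
            if (cur == null) {
                cur = new Entry();
                entries.add(cur);
            }
            (tip ? cur.tips : cur.packNames).add(line.substring(2));
        }
        return entries;
    }

    public static void main(String[] args) {
        List<Entry> entries = parse("+ aaa\nP 111\nP 222\n\n");
        System.out.println(entries.get(0).tips + " " + entries.get(0).packNames);
    }
}
```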

For a linux-2.6 kernel repository that should be about 376 MiB,
the above process creates two packs of 368 MiB and 38 MiB[1].
This is a local disk usage increase of ~26 MiB, due to reduced
delta compression between the large cached pack and the smaller
recent activity pack.  The overhead is similar to 1 full copy of
the compressed project sources.

With this cached pack in hand, JGit daemon completes a clone request
in about 1m16s less time, but with a slightly larger data transfer
(+2.39 MiB):

  Before:
    remote: Counting objects: 1861830, done
    remote: Finding sources: 100% (1861830/1861830)
    remote: Getting sizes: 100% (88243/88243)
    remote: Compressing objects: 100% (88184/88184)
    Receiving objects: 100% (1861830/1861830), 376.01 MiB | 19.01 MiB/s, done.
    remote: Total 1861830 (delta 4706), reused 1851053 (delta 1553844)
    Resolving deltas: 100% (1564621/1564621), done.

    real  3m19.005s

  After:
    remote: Counting objects: 1601, done
    remote: Counting objects: 1828460, done
    remote: Finding sources: 100% (50475/50475)
    remote: Getting sizes: 100% (18843/18843)
    remote: Compressing objects: 100% (7585/7585)
    remote: Total 1861830 (delta 2407), reused 1856197 (delta 37510)
    Receiving objects: 100% (1861830/1861830), 378.40 MiB | 31.31 MiB/s, done.
    Resolving deltas: 100% (1559477/1559477), done.

    real 2m2.938s

Repository owners can periodically refresh their cached packs by
repacking their repository, folding all newer objects into a larger
cached pack.  Since repacking is already considered to be a normal
Git maintenance activity, this isn't a very big burden.

[1] In this test $root was set back about two weeks.

Change-Id: Ib87131d5c4b5e8c5cacb0f4fe16ff4ece554734b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago PackWriter: Display totals after sending objects 87/2387/2
Shawn O. Pearce [Mon, 31 Jan 2011 16:58:23 +0000 (08:58 -0800)]
PackWriter: Display totals after sending objects

CGit pack-objects displays a totals line after the pack data
was fully written.  This can be useful to understand some of
the decisions made by the packer, and has been a great tool
for helping to debug some of that code.

Track some of the basic values, and send it to the client when
packing is done:

  remote: Counting objects: 1826776, done
  remote: Finding sources: 100% (55121/55121)
  remote: Getting sizes: 100% (25654/25654)
  remote: Compressing objects: 100% (11434/11434)
  remote: Total 1861830 (delta 3926), reused 1854705 (delta 38306)
  Receiving objects: 100% (1861830/1861830), 386.03 MiB | 30.32 MiB/s, done.

Change-Id: If3b039017a984ed5d5ae80940ce32bda93652df5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago RefAdvertiser: Avoid object parsing 10/2410/1
Shawn O. Pearce [Wed, 2 Feb 2011 21:48:26 +0000 (13:48 -0800)]
RefAdvertiser: Avoid object parsing

It isn't strictly necessary to validate every reference's target
object is reachable in the repository before advertising it to a
client. This is an expensive operation when there are thousands of
references, and it is very unlikely that a reference uses a missing
object, because garbage collection proceeds from the references and
walks down through the graph. So trying to hide a dangling reference
from clients is relatively pointless.

Even if we are trying to avoid giving a client a corrupt repository,
this simple check isn't sufficient.  It is possible for a reference to
point to a valid commit, but that commit to have a missing blob in its
root tree.  This can be caused by staging a file into the index,
waiting several weeks, then committing that file while also racing
against a prune.  The prune may delete the blob, since its
modification time is more than 2 weeks ago, but retain the commit,
since its modification time is right now.

Such graph corruption is already caught during PackWriter as it
enumerates the graph from the client's want list and digs back
to the roots or common base.  Leave the reference validation also
for that same phase, where we know we have to parse the object to
support the enumeration.

Change-Id: Iee70ead0d3ed2d2fcc980417d09d7a69b05f5c2f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago Merge "Expose some constants needed for reading the Pull configuration"
Chris Aniszczyk [Wed, 2 Feb 2011 15:22:23 +0000 (10:22 -0500)]
Merge "Expose some constants needed for reading the Pull configuration"

13 years ago Merge "Adapt expected commit message in tests"
Chris Aniszczyk [Wed, 2 Feb 2011 15:18:31 +0000 (10:18 -0500)]
Merge "Adapt expected commit message in tests"

13 years ago Adapt expected commit message in tests 03/2403/1
Robin Stocker [Wed, 2 Feb 2011 15:11:39 +0000 (16:11 +0100)]
Adapt expected commit message in tests

Because of change I28ae5713, the commit message lost the "into HEAD" and
caused the MergeCommandTest to fail. This change fixes it.

Bug: 336059
Change-Id: Ifac0138c6c6d66c40d7295b5e11ff3cd98bc9e0c

13 years ago Expose some constants needed for reading the Pull configuration 02/2402/1
Mathias Kinzler [Wed, 2 Feb 2011 13:45:37 +0000 (14:45 +0100)]
Expose some constants needed for reading the Pull configuration

Change-Id: I72cb1cc718800c09366306ab2eebd43cd82023ff
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
13 years ago PushCommand: do not set a null credentials provider 00/2400/1
Jens Baumgart [Wed, 2 Feb 2011 12:13:28 +0000 (13:13 +0100)]
PushCommand: do not set a null credentials provider

PushCommand now does not set a null credentials provider on
Transport because in this case the default provider is replaced with
null and the default mechanism for providing credentials is not
working.

Bug: 336023
Change-Id: I7a7a9221afcfebe2e1595a5e59641e6c1ae4a207
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
13 years ago Don't print "into HEAD" when merging refs/heads/master 96/2396/1
Robin Stocker [Tue, 1 Feb 2011 21:27:33 +0000 (22:27 +0100)]
Don't print "into HEAD" when merging refs/heads/master

When MergeMessageFormatter was given a symbolic ref HEAD which points to
refs/heads/master (which is the case when merging a branch in EGit), it
would result in a merge message like the following:

  Merge branch 'a' into HEAD

But it should print the following (as C Git does):

  Merge branch 'a'

The solution is to use the leaf ref when checking for refs/heads/master.
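
A rough sketch of checking the leaf ref rather than the symbolic name.
MergeMessageSketch is hypothetical, symbolic refs are modelled as a
plain map, and the "into <short name>" suffix for other branches is an
assumption modelled on C Git's output.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of formatting against the leaf ref. */
final class MergeMessageSketch {
    static final String HEADS = "refs/heads/";
    static final String MASTER = HEADS + "master";

    /** Follows symbolic refs (name -> target) until a leaf is reached. */
    static String leaf(String ref, Map<String, String> symbolicRefs) {
        String cur = ref;
        // Bounded so that a cyclic map cannot hang the sketch.
        for (int i = 0; i < 5 && symbolicRefs.containsKey(cur); i++)
            cur = symbolicRefs.get(cur);
        return cur;
    }

    static String format(String branch, String target, Map<String, String> symbolicRefs) {
        String resolved = leaf(target, symbolicRefs);
        StringBuilder msg = new StringBuilder("Merge branch '").append(branch).append("'");
        if (!MASTER.equals(resolved)) {
            String shortName = resolved.startsWith(HEADS)
                    ? resolved.substring(HEADS.length())
                    : resolved;
            msg.append(" into ").append(shortName);
        }
        return msg.toString();
    }

    public static void main(String[] args) {
        Map<String, String> symbolicRefs = new HashMap<>();
        symbolicRefs.put("HEAD", MASTER);
        System.out.println(format("a", "HEAD", symbolicRefs));
    }
}
```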

Change-Id: I28ae5713b7e8123a0176fc6d7356e469900e7e97

13 years ago PackWriter: Make thin packs more efficient 86/2386/2
Shawn O. Pearce [Mon, 31 Jan 2011 07:40:08 +0000 (23:40 -0800)]
PackWriter: Make thin packs more efficient

There is no point in pushing all of the files within the edge
commits into the delta search when making a thin pack.  This floods
the delta search window with objects that are unlikely to be useful
bases for the objects that will be written out, resulting in lower
data compression and higher transfer sizes.

Instead observe the path of a tree or blob that is being pushed
into the outgoing set, and use that path to locate up to WINDOW
ancestor versions from the edge commits.  Push only those objects
into the edgeObjects set, reducing the number of objects seen by the
search window.  This allows PackWriter to only look at ancestors
for the modified files, rather than all files in the project.
Limiting the search to WINDOW size makes sense, because more than
WINDOW edge objects will just skip through the window search as
none of them need to be delta compressed.

To further improve compression, sort edge objects into the front
of the window list, rather than randomly throughout.  This puts
non-edges later in the window and gives them a better chance at
finding their base, since they search backwards through the window.
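
The ordering step can be sketched as a stable sort that moves edge
objects to the front while keeping the existing relative order.  The
Candidate type and edgesFirst() are hypothetical; the real window list
is sorted on more criteria than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of placing edge objects first in the window list. */
final class WindowOrder {
    static final class Candidate {
        final String name;
        final boolean edge; // true if the object comes from an edge commit

        Candidate(String name, boolean edge) {
            this.name = name;
            this.edge = edge;
        }
    }

    /** Returns a copy with edge objects first; the sort is stable. */
    static List<Candidate> edgesFirst(List<Candidate> in) {
        List<Candidate> out = new ArrayList<>(in);
        // false sorts before true, so "not an edge" == false comes first.
        out.sort(Comparator.comparing((Candidate c) -> !c.edge));
        return out;
    }

    public static void main(String[] args) {
        List<Candidate> in = Arrays.asList(
                new Candidate("new-1", false),
                new Candidate("edge-1", true),
                new Candidate("new-2", false));
        for (Candidate c : edgesFirst(in))
            System.out.println(c.name);
    }
}
```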

These changes make a significant difference in the thin-pack:

  Before:
    remote: Counting objects: 144190, done
    remote: Finding sources: 100% (50275/50275)
    remote: Getting sizes: 100% (101405/101405)
    remote: Compressing objects: 100% (7587/7587)
    Receiving objects: 100% (50275/50275), 24.67 MiB | 9.90 MiB/s, done.
    Resolving deltas: 100% (40339/40339), completed with 2218 local objects.

    real    0m30.267s

  After:
    remote: Counting objects: 61549, done
    remote: Finding sources: 100% (50275/50275)
    remote: Getting sizes: 100% (18862/18862)
    remote: Compressing objects: 100% (7588/7588)
    Receiving objects: 100% (50275/50275), 11.04 MiB | 3.51 MiB/s, done.
    Resolving deltas: 100% (43160/43160), completed with 5014 local objects.

    real    0m22.170s

The resulting pack is 13.63 MiB smaller, even though it contains the
same exact objects.  82,543 fewer objects had to have their sizes
looked up, which saved about 8s of server CPU time.  2,796 more
objects from the client were used as part of the base object set,
which contributed to the smaller transfer size.

Change-Id: Id01271950432c6960897495b09deab70e33993a9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years ago PackWriter: Cleanup findObjectToPack method 85/2385/2
Shawn O. Pearce [Sun, 30 Jan 2011 23:10:51 +0000 (15:10 -0800)]
PackWriter: Cleanup findObjectToPack method

Some of this code predates making ObjectId.equals() final
and fixing RevObject.equals() to match ObjectId.equals().
It was therefore more complex than it needs to be, because
it tried to work around RevObject's broken equals() rules
by converting to ObjectId in a different collection.

Also combine the setUpWalker() and findObjectsToPack() methods;
these can be one method and the code is actually cleaner.

Change-Id: I0f4cf9997cd66d8b6e7f80873979ef1439e507fe
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>