Jesse Greenwald [Wed, 9 Mar 2011 17:48:52 +0000 (09:48 -0800)]
Fixed ordering of Config.getSubsections(...)
A standard HashSet was being used to store the list of subsections as
they were being parsed. This was changed to use a LinkedHashSet so
that iterating over the set would return values in the same order as
they are listed in the config file.
Matthias Sohn [Tue, 8 Mar 2011 22:41:47 +0000 (23:41 +0100)]
[findbugs] Avoid futile attempt to change max pool size
Javadoc for ScheduledThreadPoolExecutor says [1]:
While ScheduledThreadPoolExecutor inherits from ThreadPoolExecutor, a
few of the inherited tuning methods are not useful for it. In
particular, because it acts as a fixed-sized pool using corePoolSize
threads and an unbounded queue, adjustments to maximumPoolSize have no
useful effect.
Shawn O. Pearce [Tue, 8 Mar 2011 01:49:08 +0000 (17:49 -0800)]
PackWriter: Reduce GC during enumeration
Instead of resizing an ArrayList until all objects have been added,
append objects into a specialized List type that uses small arrays
of 1024 entries for each 1024 objects added.
For a large repository like linux-2.6, PackWriter will now allocate
1,758 smaller arrays to hold the object list, without creating any
garbage from the intermediate states due to list expansion.
1024 was chosen as the block size (and initial directory size) as this
is a reasonable balance for the PackWriter code. Each block uses
approximately 4096 bytes in a 32 bit JVM, as does the default top
level block directory. The top level directory doesn't expand until 1
million items have been added to the list, which for linux-2.6 won't
yet occur as the lists are per-object-type and are thus bounded to
about 1/3 of 1.8 million.
Change-Id: If9e4092eb502394c5d3d044b58cf49952772f6d6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 7 Mar 2011 20:29:59 +0000 (12:29 -0800)]
Remove deprecated TreeVisitor
This type and its associated methods has been deprecated for a while
now. Time to remove it. Applications can use a TreeWalk instead to
access the elements of any tree-like object.
Change-Id: I047e552ac77b77e2de086f63cb4fb318da57c208 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 5 Mar 2011 02:56:16 +0000 (18:56 -0800)]
PackFile: Fix copy as-is for small objects
When I disabled validation I broke the code that handled copying small
objects whose contents were below 8192 bytes in size but spanned over
the end of one window and into the next window. These objects did not
ever populate the temporary write buffer, resulting in garbage writing
into the output stream instead of valid object contents.
Change-Id: Ie26a2aaa885d0eee4888a9b12c222040ee4a8562 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Robin Rosenberg [Fri, 4 Mar 2011 15:00:25 +0000 (16:00 +0100)]
Fix DirCache re-read.
During unit tests and most likely elsewhere, updates come too fast for
a simple timestamp comparison (with one seconds resolution) to work.
I.e. DirCache thinks it hasn't changed.
Use FileSnapshot instead which has more advanced logic.
Change-Id: Ib850f84398ef7d4b8a8a6f5a0ae6963e37f2b470 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Shawn O. Pearce [Fri, 4 Mar 2011 00:17:29 +0000 (16:17 -0800)]
resolve(): Fix wrong parsing of branch "foo-gbed2-dev"
When parsing a string such as "foo-gbed2" resolve() was assuming the
suffix was from git describe output. This lead to JGit trying to find
the completion for the object abbreviation "bed2", rather than using
the current value of the reference. If there was only one such object
in the repository, JGit might actually use the wrong value here, as
resolve() would return the completion of the abbreviation "bed2"
rather than the current value of the reference "refs/heads/foo-gbed2".
Move the parsing of git describe abbreviations out of the operator
portion of the resolve() method and into the simple portion that is
supposed to handle only object ids or reference names, and only do the
describe parsing after all other approaches have already failed to
provide a resolution.
Add new unit tests to verify the behavior is as expected by users.
Bug: 338839
Change-Id: I52054d7b89628700c730f9a4bd7743b16b9042a9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 3 Mar 2011 22:36:19 +0000 (14:36 -0800)]
RemoteRefUpdate: Accept Ref and ObjectId arguments for source
Applications may already have a Ref or ObjectId on hand that they want
the remote to be updated to. Instead of converting these into a
String and relying on the parsing rules of resolve(), allow the
application to supply the Ref or ObjectId directly.
Bug: 338839
Change-Id: If5865ac9eb069de1c8f224090b6020fc422f9f12 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 2 Mar 2011 20:49:00 +0000 (12:49 -0800)]
PackWriter: Validate reused cached packs
If object reuse validation is enabled, the output pack is going to
probably be stored locally. When reusing an existing cached pack
to save object enumeration costs, ensure the cached pack has not
been corrupted by checking its SHA-1 trailer. If it has, writing
will abort and the output pack won't be complete. This prevents
anyone from trying to use the output pack, and catches corruption
before it can be carried any further.
Change-Id: If89d0d4e429d9f4c86f14de6c0020902705153e6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 2 Mar 2011 20:23:55 +0000 (12:23 -0800)]
PackWriter: Avoid CRC-32 validation when feeding IndexPack
There is no need to validate the object contents during
copyObjectAsIs if the result is going to be parsed by unpack-objects
or index-pack. Both programs will compute the SHA-1 of the object,
and also validate most of the pack structure. For git daemon
like servers, this work is already done on the client end of the
connection, so the server doesn't need to repeat that work itself.
Disable object validation for the 3 transport cases where we know
the remote side will handle object validation for us (push, bundle
creation, and upload pack). This improves performance on the server
side by reducing the work that must be done.
Change-Id: Iabb78eec45898e4a17f7aab3fb94c004d8d69af6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 1 Mar 2011 00:30:23 +0000 (16:30 -0800)]
PackWriter: Position tags after commits
Annotated tags need to be parsed by many viewing tools, but putting
them at the end of the pack hurts because kernel prefetching might
not have loaded them, since they are so far from the commits they
reference.
Position tags right behind the commits, but before the trees.
Typically the annotated tag set for a repository is very small,
so the extra prefetch burden it puts on tools that don't need
annotated tags (but do need commits and trees) is fairly low.
Change-Id: Ibbabdd94e7d563901c0309c79a496ee049cdec50 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 28 Feb 2011 23:39:31 +0000 (15:39 -0800)]
PackWriter: Don't reuse commit or tag deltas
JGit doesn't generate deltas for commit or tag objects when it packs
a repository from scratch. This is an explicit design decision that
is (mostly) justified by the fact that these objects do not delta
compress well.
Annotated tags are made once on stable points of the project history,
it is unlikely they will ever appear again with sufficient common
text to justify using a delta over just deflating the raw content.
JGit never tries to delta compress annotated tags and I take the
stance that these are best stored as non-deltas given how frequently
they might be accessed by repository viewers.
Commits only have sufficient common text when they are cherry-picked
to forward-port or back-port a change from one branch to another.
Even in these cases the distance between the commits as returned
by the log traversal has to be small enough that they would both
appear in the delta search window at the same time in order to
delta compress one of the messages against the other. JGit never
tries to delta compress commits, as it requires a lot of CPU time
but typically does not produce a smaller pack file.
Avoid reusing deltas for either of these types when constructing a
new pack. To avoid killing performance during serving of network
clients, UploadPack disables this code change by allowing PackWriter
to reuse delta commits. Repositories that were already repacked by
C Git will not have their delta commits decompressed and recompressed
on the fly during object writing, saving server-side CPU resources.
Change-Id: I749407e7c5c677e05e4d054b40db7656cfa7fca8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 1 Mar 2011 17:28:11 +0000 (09:28 -0800)]
PackWriter: Do not delta compress already packed objects
This is a tiny optimization to how delta search works. Checking for
isReuseAsIs() avoids doing delta compression search on non-delta
objects already stored in packs within the repository. Such objects
are not likely to be delta compressable, as they were already delta
searched when their containing pack was generated and they were
not delta compressed at that time. Doing delta compression now is
unlikely to produce a different result, but would waste a lot of CPU.
The isReuseAsIs() flag is checked before isDoNotDelta() because it
is very common to reuse objects in the output pack. Most objects
get reused, and only a handful have the isDoNotDelta() bit set.
Moving the check earlier allows the loop to more quickly skip
through objects that will never need to be considered.
Change-Id: Ied757363f775058177fc1befb8ace20fe9759bac Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 1 Mar 2011 18:06:39 +0000 (10:06 -0800)]
Paper bag fix BatchingProgressMonitor alarm queue
The alarm queue threads were started with an empty task body, which
meant the thread started and terminated immediately, leaving the
queue itself with no worker.
Change-Id: I2a9b5fe9c2bdff4a5e0f7ec7ad41a54b41a4ddd6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 1 Mar 2011 03:34:06 +0000 (19:34 -0800)]
ProgressMonitor: Refactor to use background alarms
Instead of polling the system clock on every update(1) method call,
use a scheduled executor to toggle a volatile once per second until
the task is done. Check the volatile on each update(int), looking
to see if output should occur.
This limits progress output to either once per 1% complete, or once
per second. To save time during update calls the timer isn't reset
during each 1% of output, which means we may see one unnecessary
output trigger if at least 1% completed during the one second of the
alarm time.
Change-Id: I8fdd7e31c37bef39a5d1b3da7105da0ef879eb84 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Matthias Sohn [Mon, 28 Feb 2011 23:21:14 +0000 (00:21 +0100)]
Fix NPE on checkout of remote tracking branch
Checkout of remote tracking branch failed when no local branch
existed. Also enhance RepositoryTestCase to enable checking index
state of another test repository.
Bug: 337695
Change-Id: Idf4c05bdf23b5161688818342b2bf9a45b49f479 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn O. Pearce [Sat, 26 Feb 2011 01:24:55 +0000 (17:24 -0800)]
Merge branch 'stable-0.11'
* stable-0.11:
JGit 0.11.3
Fix NullPointer when pulling from a deleted local branch
smart-http: Fix recognition of gzip encoding
Fix processing of broken symbolic references in RefDirectory
CreateBranchCommand: Wrong existence check
Qualify post 0.11.1 builds
Shawn O. Pearce [Sat, 26 Feb 2011 01:20:14 +0000 (17:20 -0800)]
UnpackedObject: Fix readSome() when initial read is short
JDK7 changed behavior slightly on some InputStream types, resulting in
the first read being shorter than the count requested. That caused us
to overwrite the earlier part of the buffer with later data, as the
offset index wasn't updated in the loop.
Fix the loop to increment offset by the number of bytes read in this
iteration, so the next read appends to the buffer rather than doing an
overwrite.
Bug: 338119
Change-Id: I222fb2f993cd9b637b6b8d93daab5777ef7ec7a6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Matthias Sohn [Thu, 24 Feb 2011 12:52:24 +0000 (13:52 +0100)]
FetchCommand: do not set a null credentials provider
FetchCommand now does not set a null credentials provider on
Transport because in this case the default provider is replaced with
null and the default mechanism for providing credentials is not
working.
Change-Id: I44096aa856f031545df39d4b09af198caa2c21f6 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn O. Pearce [Wed, 23 Feb 2011 20:00:25 +0000 (12:00 -0800)]
RevWalk: Don't release during inMergeBase()
In bc1af8459e ("RevWalk: Don't reset ObjectReader when stopping") we
stopped releasing the reader when the current log traversal is over.
This should have also been applied to the merge base logic that is
buried within MergeGenerator, but got missed.
Change-Id: I8328f43f02cba06fd545e22134872e781b9d4d36 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 23 Feb 2011 02:56:51 +0000 (18:56 -0800)]
PackWriter: Add missing timers to Statistics
We did not record the time spent on the object reuse search or the
object size lookup, both of which occur between the counting phase and
the compressing phase. If there are enough objects involved, these
times can be significant so its worth timing them and recording it.
Change-Id: I89084acfc598bb6533d75d90cb8de459f0ed93be Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Sasa Zivkov [Mon, 21 Feb 2011 15:43:06 +0000 (16:43 +0100)]
Show notes in Log CLI command
Support for --no-standard-notes and --show-notes=REF options is added
to the Log command. The --show-notes option can be specified more than
once if more than one notes branch should be used for showing notes.
The notes are displayed from note branches in the order how the note
branches are specified in the command line. However, the standard note,
from the refs/notes/commits, is always displayed as first unless
the --no-standard-notes options is given.
Change-Id: I4e7940804ed9d388b625b8e8a8e25bfcf5ee15a6 Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Stefan Lay [Wed, 16 Feb 2011 14:46:26 +0000 (15:46 +0100)]
Fix NullPointer when pulling from a deleted local branch
A checked Exception is thrown instead.
The reason for throwing an Exception is that the state of the
repository is inconsistent in this case: There is a merge
configuration containing a non-existing local branch. Ideally the
deletion of a local branch should also delete the corresponding
merge configuration.
Bug: 337315
Change-Id: I8ed57d5aaed60aaab685fc11a8695e474e60215f Signed-off-by: Stefan Lay <stefan.lay@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn O. Pearce [Tue, 15 Feb 2011 22:09:42 +0000 (14:09 -0800)]
smart-http: Fix recognition of gzip encoding
Some clients coming through proxies may advertise a different
Accept-Encoding, for example "Accept-Encoding: gzip(proxy)".
Matching by substring causes us to identify this as a false positive;
that the client understands gzip encoding and will inflate the
response before reading it.
In this particular case however it doesn't. Its the reverse proxy
server in front of JGit letting us know the proxy<->JGit link can
be gzip compressed, while the client<->proxy part of the link is not:
client <-- no gzip --> proxy <-- gzip --> JGit
Use a more standard method of parsing by splitting the value into
tokens, and only using gzip if one of the tokens is exactly the
string "gzip". Add a unit test to make sure this isn't broken in
the future.
Change-Id: Ib4c40f9db177322c7a2640808a6c10b3c4a73819 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn O. Pearce [Sat, 19 Feb 2011 01:55:53 +0000 (17:55 -0800)]
PackWriter: Hoist and cluster reference targets
Many source browsers and network related tools like UploadPack need
to find and parse the target of all branches and annotated tags
within the repository during their startup phase. Clustering these
together into the same part of the pack file will improve locality,
reducing thrashing when an application starts and needs to load
all of these into memory at once.
To prevent bottlenecking basic log viewing tools that are scannning
backwards from the tip of a current branch (and don't need tags)
we place this cluster of older targets after 4096 newer commits
have already been placed into the pack stream. 4096 was chosen as
a rough guess, but was based on a few factors:
- log viewers typically show 5-200 commits per page
- users only view the first page or two
- DHT can cram 2200-4000 commits per 1 MiB chunk
thus these will fall into the second commit chunk (roughly)
Unfortunately this placement hurts history tools that are scanning
backwards through the commit graph and completely ignored tags or
branch heads when they started.
An ancient tagged commit is no longer positioned behind its first
child (its now much earlier), resulting in a page fault for the
parser to reload this cluster of objects on demand. This may be
an acceptable loss. If a user is walking backwards and has already
scanned through more than 4096 commits of history, waiting for the
region to reload isn't really that bad compared to the amount of
time already spent.
If the repository is so small that there are less than 4096 commits,
this change has no impact on the placement of objects.
Change-Id: If3052e430d305e17878d94145c93754f56b74c61 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 19 Feb 2011 01:31:32 +0000 (17:31 -0800)]
PackWriter: Parse tag target objects in a batch
If the underlying storage has a high latency per SHA-1 lookup
(e.g. the DHT support we are working on), parsing each wanted
annotated tag object back to its underlying commit is too slow,
its a sequential lookup for each tag. With hundreds of tags in
a repository this takes far too long.
Instead queue up a list of the tags whose objects need to be found,
and then locate all of those in one parseAny batch. This works
for the common case of annotated tag to single tree or commit.
For the less often used tag->tag->commit, it at least gets us
one level parsed in the larger batch before we have to go back to
sequential lookups.
Change-Id: I94beef3f14281406f15c8cf9fa02d83faf102a19 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 19 Feb 2011 01:06:36 +0000 (17:06 -0800)]
PackWriter: Short-circuit counting on full cached pack reuse
If one or more cached packs fully covers the request, don't bother
with looking up the objects and trying to walk the graph. Just use
the cached packs and return immediately.
This helps clones of quiet repositories that have not been modified
since their last repack, its likely the cached packs are accurate
and no graph walking is required.
Change-Id: I9062a5ac2f71b525322590209664a84051fd5f8a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 18 Feb 2011 22:14:56 +0000 (14:14 -0800)]
BundleWriter: Always use OFS_DELTA
CGit just learned to always use OFS_DELTA when writing out bundle
files. This makes sense because bundle came about well after
OFS_DELTA was established, so any version of CGit that can read a
bundle file can also read OFS_DELTA. Since OFS_DELTA is smaller,
always use it when writing bundles.
Change-Id: I44f9921494798ea0c99e16eab58b87bebeb9aff5 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 17 Feb 2011 01:41:35 +0000 (17:41 -0800)]
PackWriter: Sort commits by parse order to improve locality
RevWalk in JGit and the revision code in C Git both parse commits out
of the pack file in an order that differs from strict timestamp and
topological sorting. Both implementations pop a commit from the head
of a date queue, and then immediately parse all of its parents in
order to insert those into the date queue at the proper positions as
determined by their committer timestamp field. This implies that the
parents are parsed when their most recent child is popped from the
queue, and not where they are popped during traversal.
Hoisting a parent commit to be immediately behind its child improves
locality by making sure all parents of a merge are clustered together,
and thus can be paged into the parser by the pack file buffering
system (aka WindowCache in JGit) together.
Change-Id: I80f9e64cafa2e8f082776b43845edf23065386a2 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Stefan Lay [Wed, 16 Feb 2011 14:46:26 +0000 (15:46 +0100)]
Fix NullPointer when pulling from a deleted local branch
A checked Exception is thrown instead.
The reason for throwing an Exception is that the state of the
repository is inconsistent in this case: There is a merge
configuration containing a non-existing local branch. Ideally the
deletion of a local branch should also delete the corresponding
merge configuration.
Bug: 337315
Change-Id: I71e56ffb90e11e6e3c1bbd964ad63972d67990c0 Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Shawn O. Pearce [Tue, 15 Feb 2011 22:46:30 +0000 (14:46 -0800)]
smart-http: Support progress in ReceivePack
As PackParser supports a progress meter for the "Resolving deltas"
phase of its work, we should export this to smart HTTP clients so
they know the server is still working on their (large) upload.
However this isn't as simple as just dropping in a binding for
the SmartOutputStream to flush when its told to. We want to
avoid spurious flushes triggered by the use of sideband, or the
status report formatting in the send-pack/receive-pack protocol.
Change-Id: Ibd88022a298c5fed0edb23dfaf2e90278807ba8b Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 15 Feb 2011 22:09:42 +0000 (14:09 -0800)]
smart-http: Fix recognition of gzip encoding
Some clients coming through proxies may advertise a different
Accept-Encoding, for example "Accept-Encoding: gzip(proxy)".
Matching by substring causes us to identify this as a false positive;
that the client understands gzip encoding and will inflate the
response before reading it.
In this particular case however it doesn't. Its the reverse proxy
server in front of JGit letting us know the proxy<->JGit link can
be gzip compressed, while the client<->proxy part of the link is not:
client <-- no gzip --> proxy <-- gzip --> JGit
Use a more standard method of parsing by splitting the value into
tokens, and only using gzip if one of the tokens is exactly the
string "gzip". Add a unit test to make sure this isn't broken in
the future.
Change-Id: I30cda8a6d11ad235b56457adf54a2d27095d964e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 15 Feb 2011 22:13:59 +0000 (14:13 -0800)]
http.test: Delete badly named JUnit configurations
We also have org.eclipse.jgit.http--All-Tests, which matches the
style of the org.eclipse.jgit.core--All-Tests name. Drop the others
as these are just redundant duplicates.
Change-Id: I8600a343f6a85d21dc07bda68a8cb834c82946b5 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 15 Feb 2011 17:40:16 +0000 (09:40 -0800)]
PackWriter: Try for accurate delta reuse on cached pack
If a cached pack is used, it might know how many deltas are contained
within it. Record that count as part of our reusedDeltas field
for the stats line we show clients.
Change-Id: I1c61fb817305a95eeac654cccf132cba20b2339c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 14 Feb 2011 17:02:57 +0000 (09:02 -0800)]
UploadPack: Expose advertised refs to callers
Like ReceivePack, callers that embed UploadPack within their
service may wish to see the set of references that were sent
to the client. We already have the map on hand, it just needs
to be exposed with a getter.
Change-Id: I123b23e475860d5bb968906bef59068985088b7b Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sun, 13 Feb 2011 02:44:39 +0000 (18:44 -0800)]
RepositoryBuilder: Allow callers to require repository exists
The setMustExist() method allows callers to require the repository
exists in order for build() to succeed. This is useful within a
RepositoryResolver where existence is required.
Change-Id: I6a1154551435cf0da6c2b4a7f4dce266abea5dff Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Mon, 7 Feb 2011 01:42:28 +0000 (17:42 -0800)]
pgm: Make --git-dir a string
DHT based repository types don't use a java.io.File to name the
repository. Moving the type to a string starts to open up more types
of repository names, making the standard pgm package easier to reuse
on other storage systems.
Change-Id: I262ccc8c01cd6db88f832ef317b0e1e5db2d016a Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Mon, 7 Feb 2011 00:38:02 +0000 (16:38 -0800)]
daemon: Use HTTP's resolver and factory pattern
Using a resolver and factory pattern for the anonymous git:// Daemon
class makes transport.Daemon more useful on non-file storage systems,
or in embedded applications where the caller wants more precise
control over the work tasks constructed within the daemon.
Rather than defining new interfaces, move the existing HTTP ones
into transport.resolver and make them generic on the connection
handle type. For HTTP, continue to use HttpServletRequest, and
for transport.Daemon use DaemonClient.
To remain compatible with transport.Daemon, FileResolver needs to
learn how to use multiple base directories, and how to export any
Repository instance at a fixed name.
Change-Id: I1efa6b2bd7c6567e983fbbf346947238ea2e847e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sun, 6 Feb 2011 23:58:24 +0000 (15:58 -0800)]
UploadPack: Expose PackWriter activity to a logger
The UploadPackLogger interface allows applications that embed
GitServlet or otherwise use UploadPack to service clients to
track and log how PackWriter was used, and what it sent. This
provides more granularity into the request activity than might
be available from the HTTP server logs, helping administrators
to better understand utilization and Git server performance.
Change-Id: I1d36b060eb3385339d5f986e68192789ef70fc4e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 8 Feb 2011 00:30:49 +0000 (16:30 -0800)]
RevWalk: Avoid unnecessary re-parsing of commit bodies
If the RevFilter doesn't actually require the commit body,
we shouldn't reparse it if the body was disposed. This happens
often inside of UploadPack during common ancestor negotation, the
RevWalk is reset and re-run over roughly the same commit space,
but the bodies are discarded because the commit message is not
relevant to the process.
Change-Id: I87b6b6a5fb269669867047698abf718d366bd002 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sun, 6 Feb 2011 09:15:33 +0000 (01:15 -0800)]
RevWalk: Don't reset ObjectReader when stopping
Applications like UploadPack reset() and reuse the same RevWalk
multiple times in very rapid succession. Releasing the ObjectReader's
internal state on each use, only to allocate it again on the next
cycle kills performance if the ObjectReader has internal caches, or
even if the Inflater gets returned and pulled from the InflaterCache
too frequently.
Making releasing the ObjectReader the application's responsibility
when it is done with the RevWalk, which most already do by wrapping
their loop in a try/finally block.
Change-Id: I3ad188a719e8d7f6bf27d1a7ca16d465534713f4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sun, 6 Feb 2011 09:00:44 +0000 (01:00 -0800)]
UploadPack: Donate parsed commits to PackWriter
When UploadPack has computed the merge base between the client's have
set and the want set, its already loaded and parsed all of the
interesting commits that PackWriter needs to transmit to the client.
Switching the RevWalk and its object pool over to be an ObjectWalk
saves PackWriter from needing to re-parse these same commits from the
ObjectDatabase, reducing the startup latency for the enumeration
phase of packing.
UploadPack doesn't want to use an ObjectWalk for the okToGiveUp()
tests because its slower, during each commit popped it needs to cache
the tree into the pendingObjects list, and during each reset() it
discards a bunch of ObjectWalk specific state and reallocates some
internal collections. ObjectWalk was never meant to be rapidly
reset() like UploadPack does, so its perhaps somewhat cleaner to allow
"upgrading" a RevWalk to an ObjectWalk.
Bug: 301639
Change-Id: I97ef52a0b79d78229c272880aedb7f74d0f7532f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 8 Feb 2011 00:46:06 +0000 (16:46 -0800)]
UploadPack: Rely on peeled ref data for include-tag
The peeled reference information for tags is more efficient to
work with than parsing the tag objects, as usually its coming from
the packed-refs file, which stores the peeled information for us.
Rely on the peeled information to decide if the tag should be
included or not, instead of using our RevWalk to parse the object.
Change-Id: I6714a8560a1c04b5578e9c5b469ea3c77188dff3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sun, 6 Feb 2011 08:42:23 +0000 (00:42 -0800)]
UploadPack: Assume okToGiveUp is initially false
When negotiate() starts there is at least one want, but no haves, and
thus no common base exists. Its not ok to give up yet, the client
should try to find a common base with the server. Avoid scanning our
history along the want chains until we have found at least one commit
in common with the client, this will trigger okToGiveUp to be set to
null, enabling okToGiveUp() to perform the scan.
Bug: 301639
Change-Id: I98a82a5424fd4c9995924375c7910f76ca4f03af Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sun, 6 Feb 2011 03:00:15 +0000 (19:00 -0800)]
UploadPack: Avoid walking the entire project history
If the client presents a common commit on a side branch, and there is
a want for a disconnected branch UploadPack was walking back on the
entire history of the disconnected branch because it never would find
the common commit.
Limit our search back along any given want to be no earlier than the
oldest common commit received via a "have" line from our client. This
prevents us from looking at all of the project history.
Bug: 301639
Change-Id: Iffaaa2250907150d6efa1cf2f2fcf59851d5267d Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>