Philipp Thun [Mon, 4 Apr 2011 16:04:21 +0000 (18:04 +0200)]
Fix DirCache.isModified()
Change I61a1b45db2d60fdcc0f87373ac6fd75ac4c4a202 fixed a possible NPE
occurring for newly created repositories - but in that case a wrong
value (false = not modified) was returned.
If a current version of the index file exists (liveFile), but there is
no snapshot, this means that there have been modifications (i.e. true
has to be returned).
Change-Id: I698f78112249f9924860fc58eb7eab7afdf87eb7 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
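The decision described in the DirCache.isModified() fix above can be sketched roughly as follows. This is a minimal illustration, not the JGit source: IndexState and Snapshot are hypothetical stand-ins for DirCache and its snapshot field, and the behavior when no index file exists on disk is an assumption.

```java
import java.io.File;

/** Hypothetical stand-in for a DirCache-like object; not the JGit class. */
final class IndexState {
    /** Hypothetical stand-in for a snapshot of a file's last known state. */
    interface Snapshot {
        boolean isModified(File f);
    }

    private final File liveFile;
    private final Snapshot snapshot; // may be null for a newly created repository

    IndexState(File liveFile, Snapshot snapshot) {
        this.liveFile = liveFile;
        this.snapshot = snapshot;
    }

    boolean isModified() {
        // Assumption: with no index file on disk there is nothing to be out of date with.
        if (liveFile == null || !liveFile.exists())
            return false;
        // The file exists but was never snapshotted: report it as modified
        // instead of returning false (and instead of throwing an NPE).
        if (snapshot == null)
            return true;
        return snapshot.isModified(liveFile);
    }
}
```

The null check comes before any call on the snapshot, so the earlier NPE fix is preserved while the return value for the "no snapshot" case becomes true.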
When a new Git instance for an existing git repository should be
created there are two use-cases: either the application has already a
Repository instance in hand or the application knows where the
repository resides in the filesystem. Two methods are added to
explicitly support these use-cases: wrap(Repository db) and open(File
gitDir)
Change-Id: I2970e4aa8d4602cb1298f01e5b76bf0f96c492e5 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Carsten Pfeiffer [Wed, 16 Mar 2011 15:57:35 +0000 (16:57 +0100)]
Support reading first SHA-1 from large FETCH_HEAD files
When reading refs, avoid reading huge files that were put there
accidentally, but still read the top of e.g. FETCH_HEAD, which
may be longer than our limit. We're only interested in the first line
anyway.
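The idea of reading only the top of an oversized file can be sketched like this. It is an illustration under assumptions, not the JGit implementation: the class name FirstLineReader, the method signature, and the caller-supplied limit are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; not part of JGit. */
final class FirstLineReader {
    /**
     * Reads at most {@code limit} bytes from {@code in} and returns the text
     * before the first newline found in that prefix. The rest of the stream
     * is never read, no matter how large it is.
     */
    static String firstLine(InputStream in, int limit) {
        byte[] buf = new byte[limit];
        int n = 0;
        try {
            while (n < limit) {
                int r = in.read(buf, n, limit - n);
                if (r < 0)
                    break; // end of stream before the limit was reached
                n += r;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int end = 0;
        while (end < n && buf[end] != '\n')
            end++;
        return new String(buf, 0, end, StandardCharsets.UTF_8);
    }
}
```

If the first line is itself longer than the limit, this sketch returns the truncated prefix; a real caller would have to decide whether that is acceptable.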
The 'Counting objects' phase of PackWriter requires good hit rates
from the DeltaBaseCache while walking trees, the deltas need to find
their bases in the cache in order to inflate in a reasonable time.
If JGit is running in a multi-threaded server, such as Gerrit Code
Review, each thread needs its own DeltaBaseCache to prevent one thread
from evicting the other thread's relevant bases. Move the cache to be
per-ObjectReader, lazily allocated when required by a PackFile.
Change-Id: If9d5ed06728e813632ae96dcfb811f4860b276e8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Fix ReceivePack connectivity validation with alternates
If a repository has an alternate object database, the alternate has
its references advertised as ".have" lines, which permits the client
to use these as delta base candidates when generating the pack. If
setCheckReferencedObjectsAreReachable(true) is used, these additional
have lines need to be considered in addition to the advertised refs.
Change-Id: Ie39c6696f9d3ff147ef4405cd5624f6011700ce5 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We sometimes face the problem that the file .git/index.lock
can't be deleted, causing JGit operations to fail. The problem is
that LockFile.unlock() simply deletes the lockfile and ignores the
return value of File.delete(). Instead use
FileUtils.delete() with the retry option. This method will retry the
deletion of the file at most 10 times with sleeps in between.
Bug: 335959
Change-Id: I9598edea9f2304fe12e6f470301211b503434848 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
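A retrying delete of the kind described in the LockFile change above might look like the sketch below. It uses only java.io.File and is not the FileUtils.delete() implementation; the class name RetryingDelete, the parameters, and the choice to treat an already-missing file as success are assumptions made for illustration.

```java
import java.io.File;

/** Hypothetical helper; not the JGit FileUtils class. */
final class RetryingDelete {
    /**
     * Tries to delete {@code f} up to {@code maxAttempts} times, sleeping
     * {@code sleepMillis} between attempts. Returns true if the file is gone
     * afterwards, false if every attempt failed.
     */
    static boolean delete(File f, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Unlike a bare File.delete() call, the return value is checked.
            if (f.delete() || !f.exists())
                return true;
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

A caller would use something like RetryingDelete.delete(lockFile, 10, 100) and report an error when false comes back, rather than silently leaving the lock in place.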
There should be a way to explicitly refresh the refs cached in the
RefDirectory. Since commit c261b28 (use of FileSnapshot) this is
not needed anymore for storage in the filesystem. But for DHT based
storage an explicit refresh may be needed.
Change-Id: I7d30c3496c05e1fb6e9519f3af9f23c6adb93bf9 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Shawn O. Pearce [Mon, 13 Dec 2010 21:41:19 +0000 (13:41 -0800)]
RefDirectory: Use FileSnapshot for packed-refs
Instead of tracking the length and modification time by hand, rely
on FileSnapshot to tell RefDirectory when the $GIT_DIR/packed-refs
file has been changed or should be re-read from disk.
Change-Id: I067d268dfdca1d39c72dfa536b34e6a239117cc3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 17 Mar 2011 02:19:15 +0000 (19:19 -0700)]
smart HTTP: Return errors inside payload
When the client is clearly making a smart HTTP request to our smart
HTTP server, return any errors like RepositoryNotFoundException or
ServiceNotEnabledException inside of the payload as a Git level ERR
message, rather than an HTTP error code.
This prevents the C Git command line client from retrying a failed
"$URL/info/refs?service=git-upload-pack" request without the smart
service URL, only to fail again with "403 Forbidden" when the dumb
as-is service has been disabled by the server configuration, or is
unavailable because the repository is not on the local filesystem.
Change-Id: I57e8756d5026e885e0ca615979bfcd729703be6c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 16 Mar 2011 20:46:53 +0000 (13:46 -0700)]
UploadPack: Add a PreUploadHook to monitor and control behavior
Embedding applications can use this hook to watch actions within
UploadPack and possibly reject them. This could be useful to prevent
clones of a large repository from this server, or to stop abusive
negotiation rounds that offer thousands of objects in a single batch.
Change-Id: Id96f1885ac4d61f22c80b6418fff54184b7348ba Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 15 Mar 2011 21:00:43 +0000 (14:00 -0700)]
Allow application filters on smart HTTP operations
Permit applications embedding GitServlet to wrap the
info/refs?service=$name and /$name operations with a
servlet Filter.
To help applications inspect state of the operation,
expose the UploadPack or ReceivePack object into a
request attribute. This can be useful for logging,
or to implement throttling of requests like Gerrit
Code Review uses to prevent server overload.
Change-Id: Ib8773c14e2b7a650769bd578aad745e6651210cb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 18 Mar 2011 17:23:21 +0000 (10:23 -0700)]
PackWriter: Fix the way delta chain cycles are prevented
Take a very simple approach to avoiding delta chain cycles during object
reuse: objects are now always selected from the oldest pack that
contains them. This prevents cycles because a pack must not have
a cycle in the delta chain. If both objects A and B are chosen
out of the same source pack then there cannot be an A->B->A cycle.
The oldest pack is also the most likely to have the smallest deltas.
It's the biggest pack in the system and probably came from the clone
(or last GC) of this repository, where all objects were previously
considered and packed tightly together. If an object appears again
(for example due to a revert and a push into this repository) the
newer copy of it won't be nearly as small as the older delta version
of it, even if the newer one is also itself a delta.
ObjectDirectory already enumerates objects during selection in this
newest->oldest order, so it already is supplying these assumptions
to PackWriter. Taking advantage of this can speed up selection by
a tiny amount by avoiding some tests, but can also help to prevent
a cycle needing to be broken on the fly during writing.
The previous cycle breaking logic wasn't fully correct either.
If a different delta base was chosen, the new delta base might not
have been written into the output pack before the current object,
forcing the use of REF_DELTA when OFS_DELTA is always smaller.
This logic has now been reworked to always re-check the delta base
and ensure it gets written before the current object.
If a cycle occurs, it gets broken the same way as before, by
disabling delta reuse and finding an alternative form of the
object, which may require inflating/deflating in whole format.
Change-Id: I9953ab8be54ceb8b588e1280d6f7edd688887747 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 18 Mar 2011 16:10:07 +0000 (09:10 -0700)]
PackWriter: Combine small reuse batches together
If the total number of objects to look for reuse on is under 4096
this is really close to a reasonable batch size for the DHT storage
system to lookup at once. Combine all of the objects into a single
temporary list, perform reuse, and then prune the main lists if any
duplicate objects were detected from a selected CachedPack.
The intention here is to try and avoid 4 tiny sequential lookups
on the storage system when the time to wait for each of those to
finish is higher than the CPU time required to build (and later
GC) this temporary list.
Change-Id: I528daf9d2f7744dc4a6281750c2d61d8f9da9f3a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
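The batching decision described in the small-reuse-batches change above can be sketched as follows. This is an assumption-laden illustration rather than PackWriter code: the class ReuseBatching and the method planLookups are hypothetical, and the later pruning of duplicates found in a CachedPack is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of PackWriter. */
final class ReuseBatching {
    /** Threshold taken from the commit message above. */
    static final int SMALL_BATCH_LIMIT = 4096;

    /**
     * Returns the lookups to issue against the storage system: one combined
     * list when the total is small, otherwise one lookup per non-empty list.
     */
    static <T> List<List<T>> planLookups(List<List<T>> perTypeLists) {
        int total = 0;
        for (List<T> list : perTypeLists)
            total += list.size();

        List<List<T>> plan = new ArrayList<>();
        if (total < SMALL_BATCH_LIMIT) {
            // One round trip instead of several tiny sequential ones.
            List<T> combined = new ArrayList<>(total);
            for (List<T> list : perTypeLists)
                combined.addAll(list);
            plan.add(combined);
        } else {
            for (List<T> list : perTypeLists)
                if (!list.isEmpty())
                    plan.add(list);
        }
        return plan;
    }
}
```

The temporary combined list costs some allocation, which is the trade-off the commit message accepts in exchange for fewer storage round trips.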
Shawn O. Pearce [Fri, 18 Mar 2011 15:37:28 +0000 (08:37 -0700)]
PackWriter: Remove dummy list 0
Instead of looping over the objectsLists array, always set slot 0 to
null and explicitly work on the 4 indexes that matter. This kills
some loops and increases the length of the code slightly, but I've
always really disliked that dummy 0 slot.
Change-Id: I5ad938501c1c61f637ffdaff0d0d88e3962d8942 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 18 Mar 2011 15:29:26 +0000 (08:29 -0700)]
PackWriter: Speed up pruning of objects from cached packs
During object enumeration for the thin pack, very few objects come
out that are duplicated with the cached pack. Typically these are
only cases where a blob or tree was cherry-picked forward, got a
copy or rename, or was reverted... all relatively infrequent events.
Speed up pruning of the thin pack object list by combining the phase
with the object representation selection. Implementers should already
be offering to reuse the object from the cached pack if it is stored
there, at which point the implementation can perform a very fast type
of containment test using the cached pack's identity rather than yet
another index lookup. For the local disk case this is probably not a
big improvement, but it does help on the DHT implementation where the
two passes combined into one reduces latency.
Change-Id: I6a07fc75d9075bf6233e967360b6546f9e9a2b33 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This merge command accepts the merge strategy as option and uses the
resolve strategy as default. It expects exactly one other
revision which is merged with current head.
Change-Id: Ia8c188b93ade4afabe6a9ccf267faf045f359a3a Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Philipp Thun [Mon, 28 Mar 2011 15:54:24 +0000 (17:54 +0200)]
Fix possible NPE in DirCache.isModified()
The snapshot field of a DirCache object for a newly created repository
can be null. This fix prevents an NPE when isModified() is called in
such a situation.
Change-Id: I61a1b45db2d60fdcc0f87373ac6fd75ac4c4a202 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Philipp Thun [Mon, 28 Mar 2011 14:01:44 +0000 (16:01 +0200)]
Do not categorize merge failures as 'abnormal'
This change contains a simple renaming. Instead of using the
expression 'abnormal failure', we just treat this kind of situation
as 'failure'. This is specific enough as conflicts are already handled
separately.
Change-Id: I535acdc7d022543ed0f5ac6151b09a6985f4ef38 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Robin Rosenberg [Sun, 27 Mar 2011 20:42:57 +0000 (22:42 +0200)]
Do not normalize URIishes
We used to normalize URIs since it seemed simple. This however causes
inconsistencies for the user and for our tests. Just pass backslashes
through and make sure our parser can handle them.
Bug: 341062 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Change-Id: I2c8e917a086faabcd8749160c2acc9dd05a42838
Sasa Zivkov [Fri, 25 Mar 2011 13:28:56 +0000 (14:28 +0100)]
Detaching HEAD when checking out the same commit.
Detaching HEAD didn't work in some corner checkout cases. If, for example,
HEAD is symbolic ref to refs/heads/master, refs/heads/master is ref to commit
c0ffee... then:
checkout c0ffee...
would leave the HEAD unchanged.
The same symptom occurs when checking out a remote tracking branch or a tag
that references the same commit as refs/heads/master.
In the above case, the RefUpdate class didn't have enough information to decide
if the update needed to detach symbolic ref because it dealt only with new/old
objectIDs. Therefore, this fix introduced the RefUpdate.detachingSymbolicRef
flag.
Bug: 315166
Change-Id: I085c98b77ea8f9104a213978ea0d4ac6fd58f49b Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Sasa Zivkov [Thu, 24 Mar 2011 10:13:16 +0000 (11:13 +0100)]
Registering the Checkout command and fixed a typo.
The Checkout command line command was added to JGit but it wasn't
registered in the list of available commands.
Additionally, the 'force' option was named '---force' (triple '-').
Change-Id: I259773932fa9aec3bb29e215740e67c834566f6f Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Allow users of the GIT api to get to know the state of their
workingtree and index by adding a status command. The implementation
is mainly a wrapper around IndexDiff class. Better support for multiple
stages in the index (conflict situations) is still missing. An
appropriate change to IndexDiff and StatusCommand will come in a
subsequent commit.
Bug: 337296
Change-Id: Idb390375a68611853c1c903299ec678c89b081dc Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Roland Schulz [Sat, 5 Mar 2011 00:50:14 +0000 (19:50 -0500)]
Create RemoteSession interface
The RemoteSession interface operates like a simplified version of
java.lang.Runtime with a single exec method (and a disconnect
method). It returns a java.lang.Process, which should begin execution
immediately. Note that this greatly simplifies the interface for
running commands. There is no longer a connect method, and most
implementations will contain the bulk of their code inside
Process.exec, or a constructor called by Process.exec. (See the
revised implementations of JschSession and ExtSession.)
Implementations can now configure their connections properly without
either ignoring the proper use of the interface or trying to adhere
to an overly strict interface with odd rules about what methods are
called first. For example, Jsch needs to create the output stream
before executing, which it now does in the process constructor. These
changes should make it much easier to add alternate session
implementations in the future.
Also-by: John D Eblen <jdeblen@comcast.net>
Bug: 336749
CQ: 5004
Change-Id: Iece43632086afadf175af6638255041ccaf2bfbb Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Philipp Thun [Wed, 23 Mar 2011 09:24:14 +0000 (10:24 +0100)]
Introduce CherryPickResult
In order to distinguish cherry-pick failures caused by conflicts vs.
'abnormal failures' (e.g. due to unstaged changes or a dirty
worktree), a CherryPickResult class is introduced and returned by
CherryPickCommand.call() instead of a RevCommit. This new class is
similar to MergeResult and RebaseResult. The CherryPickResult contains
all necessary information, e.g. paths causing the cherry-pick (a merge
called within, respectively) to fail. This allows callers to better
react to failures.
Change-Id: I5db57b9259e82ed118e4bf4ec94463efe68b8c1f Signed-off-by: Philipp Thun <philipp.thun@sap.com> Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Marc Strapetz [Mon, 21 Mar 2011 07:33:40 +0000 (08:33 +0100)]
Fix: possible IndexOutOfBoundsException in ReflogReader
java.lang.IndexOutOfBoundsException
at java.nio.ByteBuffer.wrap(ByteBuffer.java:352)
at org.eclipse.jgit.util.RawParseUtils.decodeNoFallback(RawParseUtils.java:913)
at org.eclipse.jgit.util.RawParseUtils.decode(RawParseUtils.java:880)
at org.eclipse.jgit.util.RawParseUtils.decode(RawParseUtils.java:839)
at org.eclipse.jgit.storage.file.ReflogReader$Entry.<init>(ReflogReader.java:102)
at org.eclipse.jgit.storage.file.ReflogReader.getReverseEntries(ReflogReader.java:183)
at org.eclipse.jgit.storage.file.ReflogReader.getReverseEntries(ReflogReader.java:162)
Philipp Thun [Mon, 21 Mar 2011 11:33:58 +0000 (12:33 +0100)]
Improve MergeResult
Add paths causing abnormal merge failures (e.g. due to unstaged
changes) to the MergeResult returned by MergeCommand. This helps
callers to better handle (e.g. present) merge results.
Change-Id: Idb8cf04c5cecfb6a12cb880e16febfc3b9358564 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Shawn O. Pearce [Fri, 18 Mar 2011 14:27:41 +0000 (07:27 -0700)]
PackWriter: Collect stats by object type
Frequently enough I'm wondering how much of a pack is commits vs.
trees, and the total line doesn't really tell us this because it's
a gross total from the pack. Computing the counts per object type
is simple during packing, as PackWriter already has everything in
memory broken up by object type. It's virtually free to get these
values and track them.
Change-Id: Id5e6b1902ea909c72f103a0fbca5d8bc316f9ab3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 18 Mar 2011 15:21:39 +0000 (08:21 -0700)]
PackFile: Cache the packName string
Instead of computing this on every request, compute it once and
hold onto the result. This improves performance for LocalCachedPack
which does a lot of tests against the pack name string.
Change-Id: I3803745e3a5dda7b5f0faf39aae9423e2c777e7f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Philipp Thun [Thu, 17 Mar 2011 09:48:44 +0000 (10:48 +0100)]
Abort merge when file to be checked out is dirty
In case a file needs to be checked out (from THEIRS) during a merge
operation, it has to be checked if the worktree version of this file
is dirty. If this is true, merge shall fail.
Change-Id: I17c24845584700aad953c3d4f2bea77a0d665ec4 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Philipp Thun [Fri, 18 Mar 2011 12:33:36 +0000 (13:33 +0100)]
Refactor ResolveMerger
1. Perform an explicit check for untracked files.
2. Extract 'dirty checks' into separate methods
3. Clean up comments.
4. Tests: also check contents of files not affected by merge.
Change-Id: Ieb089668834d0a395c9ab192c555538917dfdc47 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Moved tests for commit -o option to own test class
We test the -o option of the commit command very accurately by
writing tests for each line of a decision table. In order to
still be able to point new jgit users to the CommitAndLogCommandTest
to find out how to use log() and commit() I factored out these 1200
lines of very specific tests into their own class.
Change-Id: Icf7c517f790a8fa79c8afd9b7f4a2805cf79196e Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Shawn O. Pearce [Tue, 15 Mar 2011 01:52:00 +0000 (18:52 -0700)]
Assume refs of alternates are reachable during fetch
When fetching from a remote peer, consider all of the refs of any
alternate repository to be reachable locally, in addition to the refs
of the local repository. This mirrors the push protocol and may avoid
unnecessary object transfer when the local repository is empty, but
its alternate and the remote share a lot of common history.
Junio C Hamano recently proposed a similar change to C Git's fetch
client, in order to work around a performance bug I identified when
fetching between two repositories that actually shared the same
alternate repository on the local system.
Change-Id: Iffb0b70e1223901ce2caac3b87ba7e0d6634d265 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 17 Mar 2011 04:50:42 +0000 (21:50 -0700)]
UploadPack: Report invalid want lines with ERR
Instead of aborting hard with a server-side exception, report an error
to the client with "ERR %s" in a context where the client is expecting
ACK/NAK. Older clients will report this text to the user, but newer
ones know how to format this message in a more user-friendly way.
Change-Id: I1879b38988ba66f648c069c10dbfa14c3f34adb2 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 17 Mar 2011 04:44:34 +0000 (21:44 -0700)]
Handle "ERR %s" when ACK/NAK is expected
If the remote peer replies with "ERR %s" instead of "ACK %s common" or
"NAK" during ancestor negotiation in the fetch-pack/upload-pack
protocol, treat that as an exception that aborts processing with the
error text as supplied by the remote system.
This matches behavior with "ERR %s" during the advertisements, which
is also a way for the remote to abort processing.
Change-Id: I2fe818e75c7f46156744ef4f703c40173cbc76d0 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
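The handling described in the "ERR %s" change above can be illustrated with a small line classifier. This is a sketch under assumptions, not the PacketLineIn code: the names AckParser, Result, and RemoteErrorException are hypothetical, and an unchecked exception is used only to keep the example short.

```java
/** Hypothetical parser for negotiation replies; not the JGit implementation. */
final class AckParser {
    enum Result { NAK, ACK, ACK_CONTINUE, ACK_COMMON, ACK_READY }

    /** Carries the error text supplied by the remote system. */
    static final class RemoteErrorException extends RuntimeException {
        RemoteErrorException(String message) {
            super(message);
        }
    }

    static Result parse(String line) {
        if (line.equals("NAK"))
            return Result.NAK;
        // "ERR <text>" aborts processing with the remote's own message.
        if (line.startsWith("ERR "))
            throw new RemoteErrorException(line.substring(4));
        if (line.startsWith("ACK ")) {
            String[] parts = line.split(" ");
            if (parts.length == 2)
                return Result.ACK;
            if (parts.length == 3) {
                switch (parts[2]) {
                    case "continue": return Result.ACK_CONTINUE;
                    case "common":   return Result.ACK_COMMON;
                    case "ready":    return Result.ACK_READY;
                }
            }
        }
        throw new IllegalArgumentException("expected ACK/NAK, got: " + line);
    }
}
```

Keeping the remote's text as the exception message lets the caller show it to the user unchanged, which mirrors how "ERR %s" is treated during the advertisement phase.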
Shawn O. Pearce [Thu, 17 Mar 2011 04:33:25 +0000 (21:33 -0700)]
PacketLineIn: Reuse internal lineBuffer for small strings
Most "ACK %s continue", "ACK %s common", "NAK" strings that are read
by the readACK() method and readString() are shorter than the
lineBuffer already available. Reuse that buffer when reading from
the network stream and converting to a string with RawParseUtils to
avoid unnecessary temporary byte array allocations.
Change-Id: Ibc778d9f7721943a065041d80fc427ea50d90fff Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Matthias Sohn [Wed, 16 Mar 2011 15:23:14 +0000 (16:23 +0100)]
Expose if name or email is based on a guess
This enables applications to differentiate between explicitly set
configuration parameters and best effort attempts to guess these
parameters from the operating system.
Change-Id: I67cc4099238a40c6dca795e64f0155ced6008ef1 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Philipp Thun [Wed, 16 Mar 2011 00:36:56 +0000 (01:36 +0100)]
Use parent directory in InitCommand if directory is "."
If no directory is set before executing an InitCommand, the current
directory (".") is used by default. By calling File.getParentFile() we
get the actual directory this points to. Using this directory makes it
easier to read paths.
Change-Id: I6245941395dae920e4f90b8985be6ef3cce570d3 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Shawn O. Pearce [Mon, 14 Mar 2011 15:14:46 +0000 (08:14 -0700)]
PushCommand: Allow adding any reference string
The simplified form of add(String) makes it easier for applications
to pass down user input and allow PushCommand to convert it to the
internal RefSpec object.
Change-Id: Ibd2e95852db0e52ea4a36032942c4c42a7fb4261 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Philipp Thun [Fri, 11 Mar 2011 14:49:27 +0000 (15:49 +0100)]
Make --git-dir optional for 'jgit init'
For compatibility reasons with regards to native git and also to
make the init command easier to use from the command line,
argument --git-dir should not be required.
Additionally the path created in case --git-dir is not supplied now is
canonical and thus easier to read.
Change-Id: Idb7d77e983a78c4b21fbf232fc1e75ef581e5ed1 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
Shawn O. Pearce [Mon, 14 Mar 2011 22:36:17 +0000 (15:36 -0700)]
Improve native Git transport when following repository
If the client is only following the remote repository and has not
created any new non-common commits, the client will wind up sending
a "have %s" line for each tag in the repository. For some projects
like git.git, this is 339 tags and growing, resulting in more than
16 KiB needing to be POSTed over 12 HTTP requests.
Teach UploadPack (server side) to always execute the okToGiveUp()
logic at least once per negotiation round to determine if the server
can compute a pack right now. If it can, shove in an "ACK %s ready"
message to tell the client this and try to prevent receiving ancient
tags in future negotiation rounds.
Teach BasePackFetchConnection (client side) to honor a "ACK %s ready"
from the remote and break out of its SEND_HAVE loop once the remote
knows it can create a pack. This avoids sending the remaining 307
tags of git.git.
These two changes together reduce the number of HTTP RPCs from 13
down to 3 in order to fetch from git.git over smart HTTP. If either
side is missing the change, the older behavior (and its 13 RPCs)
is used.
Change-Id: I64736318fd0abf9ee5e56bd0b737707adb580b37 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
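The client-side half of the change above (leaving the SEND_HAVE loop once the remote signals "ready") can be sketched as a simple batch loop. This is an illustration only: HaveLoopSketch, the batch size parameter, and the predicate standing in for the server's reply are hypothetical and do not reflect how BasePackFetchConnection is actually structured.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of the client's "have" loop; not JGit code. */
final class HaveLoopSketch {
    /**
     * Sends {@code haves} in batches and stops as soon as the server replies
     * "ready" for a batch. Returns how many haves were sent in total.
     */
    static int sendHaves(List<String> haves, int batchSize,
            Predicate<List<String>> serverRepliesReady) {
        if (batchSize <= 0)
            throw new IllegalArgumentException("batchSize must be positive");
        int sent = 0;
        while (sent < haves.size()) {
            int end = Math.min(sent + batchSize, haves.size());
            boolean ready = serverRepliesReady.test(haves.subList(sent, end));
            sent = end;
            if (ready)
                break; // the server can build a pack now; skip the remaining haves
        }
        return sent;
    }
}
```

When the predicate never returns true, every have is sent, which corresponds to the older behavior that applies if either side lacks the change.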
Shawn O. Pearce [Mon, 14 Mar 2011 16:17:39 +0000 (09:17 -0700)]
FS: Allow userHome to be set and cached
This permits callers to modify the meaning of userHome, which
may be useful if their application allows the user to select
different user settings locations.
Bug: 337101
Change-Id: I076815edeec1c20dea028f7840be3930337dff77 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 14 Mar 2011 16:07:53 +0000 (09:07 -0700)]
FS: Allow gitPrefix to be set and cached
This permits callers to modify the meaning of gitPrefix, which
may be useful if their application allows the user to select
the location where C Git is installed.
Bug: 337101
Change-Id: I07362a5772da4955e01406bdeb8eaf87416be1d6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 14 Mar 2011 15:53:41 +0000 (08:53 -0700)]
Always fetch tags during clone
C Git always fetches tags during clone, even if the tag doesn't
point to an object that was fetched by the branch specifications.
Match that behavior, as users expect it.
Bug: 326611
Change-Id: I81a82b7359a9649f18a172219da44ed54e77ca2f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 14 Mar 2011 14:06:35 +0000 (07:06 -0700)]
Fix dumb transport push
PackWriter incorrectly returned 0 from getObjectsNumber() when the
pack has not been written yet. This caused dumb transports like
amazon-s3:// and sftp:// to abort early and never write out a pack,
under the assumption that the pack had no objects.
Until the pack header is written to the output stream, compute the
current object count each time it is requested. Once the header is
started, use the object count from the stats object.
Change-Id: I041a2368ae0cfe6f649ec28658d41a6355933900 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
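The two-phase counting described in the dumb transport fix above can be sketched like this. The class ObjectCountSketch and its methods are hypothetical and heavily simplified; the sketch only shows the idea of computing the count on demand until the header is started and using the recorded total afterwards.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the counting logic; not PackWriter. */
final class ObjectCountSketch {
    private final List<String> pending = new ArrayList<>();
    private boolean headerStarted;
    private long statsTotal; // only meaningful once the header has been started

    void addObject(String id) {
        if (headerStarted)
            throw new IllegalStateException("pack header already started");
        pending.add(id);
    }

    /** Called when the pack header is about to be written. */
    void startHeader() {
        statsTotal = pending.size();
        headerStarted = true;
    }

    long getObjectCount() {
        // Before the header is written, compute the count each time it is
        // requested, so callers never see 0 for a non-empty pack.
        return headerStarted ? statsTotal : pending.size();
    }
}
```

A transport that checks getObjectCount() before writing anything then sees the real number of objects and does not abort early.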
Shawn O. Pearce [Thu, 10 Mar 2011 23:42:32 +0000 (15:42 -0800)]
ObjectIdOwnerMap: More lightweight map for ObjectIds
OwnerMap is about 200 ms faster than SubclassMap, more friendly to the
GC, and uses less storage: testing the "Counting objects" part of
PackWriter on 1886362 objects:
The major difference with OwnerMap is entries must extend from
ObjectIdOwnerMap.Entry, where the OwnerMap has injected its own
private "next" field into each object. This allows the OwnerMap to use
a singly linked list for chaining collisions within a bucket. By
putting collisions in a linked list, we gain the entire table back for
the SHA-1 bits to index their own "private" slot.
Unfortunately this means that each object can appear in at most ONE
OwnerMap, as there is only one "next" field within the object instance
to thread into the map. For types that are very object map heavy like
RevWalk (entity RevObject) and PackWriter (entity ObjectToPack) this
is sufficient, these entity types are only put into one map by their
container. By introducing a new map type, we don't break existing
applications that might be trying to use ObjectIdSubclassMap to track
RevCommits they obtained from a RevWalk.
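The intrusive chaining described in the two paragraphs above can be illustrated with a much simplified map. This is a sketch, not ObjectIdOwnerMap: the class OwnerMapSketch is hypothetical, it keys on a String instead of a SHA-1, and it uses one flat table rather than a directory of segments.

```java
/** Hypothetical, simplified intrusive hash map; not the JGit class. */
final class OwnerMapSketch<V extends OwnerMapSketch.Entry> {
    /** Entries must extend this type so the map can thread its chain through them. */
    static class Entry {
        final String id;
        Entry next; // owned by the single map that contains this entry

        Entry(String id) {
            this.id = id;
        }
    }

    private Entry[] table = new Entry[16];
    private int size;

    int size() {
        return size;
    }

    @SuppressWarnings("unchecked")
    V get(String id) {
        for (Entry e = table[index(id, table.length)]; e != null; e = e.next)
            if (e.id.equals(id))
                return (V) e;
        return null;
    }

    /** Adds an entry; the caller must ensure it is not already in any map. */
    void add(V entry) {
        if (size == table.length)
            grow(); // 100% load factor target
        int i = index(entry.id, table.length);
        entry.next = table[i]; // collisions are chained inside the entries themselves
        table[i] = entry;
        size++;
    }

    private void grow() {
        Entry[] bigger = new Entry[table.length * 2];
        for (Entry head : table) {
            while (head != null) {
                Entry following = head.next;
                int i = index(head.id, bigger.length);
                head.next = bigger[i];
                bigger[i] = head;
                head = following;
            }
        }
        table = bigger;
    }

    private static int index(String id, int length) {
        return (id.hashCode() & 0x7fffffff) % length;
    }
}
```

Because the chain lives in the single next field of each entry, adding the same entry to a second map would overwrite the first map's chain, which is why each object can belong to only one such map.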
The OwnerMap uses less memory. Each object uses 1 reference more (so
we're up 1,886,362 references), but the table is 1/2 the size (2^20
rather than 2^21). The table itself wastes only 210,790 slots, rather
than 2,307,942. So OwnerMap is wasting 200k fewer references.
OwnerMap is more friendly to the GC, because it hardly ever generates
garbage. As the map reaches its 100% load factor target, it doubles in
size by allocating additional segment arrays of 2048 entries. (So the
first grow allocates 1 segment, second 2 segments, third 4 segments,
etc.) These segments are hooked into the pre-allocated directory of
1024 spaces. This permits the map to grow to 2 million objects before
the directory itself has to grow. By using segments of 2048 entries,
we are asking the GC to acquire 8,204 bytes in a 32 bit JVM. This is
easier to satisfy than 2,307,942 bytes (for the 512k table that is
just an intermediate step in the SubclassMap). By reusing the
previously allocated segments (they are re-hashed in-place) we don't
release any memory during a table grow.
When the directory grows, it does so by discarding the old one and
using one that is 4x larger (so the directory goes to 4096 entries on
its first grow). A directory of size 4096 can handle up to 8 million
objects. The second directory grow (16384) goes to 33 million objects.
At that point we're starting to really push the limits of the JVM
heap, but at least it's many small arrays. Previously SubclassMap would
need a table of 67108864 entries to handle that object count, which
needs a single contiguous allocation of 256 MiB. That's hard to come
by in a 32 bit JVM. Instead OwnerMap uses 8192 arrays of about 8 KiB
each. This is much easier to fit into a fragmented heap.
Change-Id: Ia4acf5cfbf7e9b71bc7faa0db9060f6a969c0c50 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
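The capacity figures quoted in the ObjectIdOwnerMap message above follow from simple arithmetic, which the sketch below reproduces. The class OwnerMapSizing is hypothetical, and the 4-byte reference size is an assumption matching the 32 bit JVM the message talks about.

```java
/** Hypothetical helper that reproduces the sizing arithmetic; not JGit code. */
final class OwnerMapSizing {
    /** Entries per segment, as stated in the commit message. */
    static final int SEGMENT_ENTRIES = 2048;

    /** Objects a directory can hold at 100% load: one segment per directory slot. */
    static long capacity(int directorySize) {
        return (long) directorySize * SEGMENT_ENTRIES;
    }

    /** Bytes needed for one contiguous table of references (array header not counted). */
    static long flatTableBytes(long entries, int bytesPerReference) {
        return entries * bytesPerReference;
    }

    public static void main(String[] args) {
        System.out.println(capacity(1024));   // 2097152, about 2 million objects
        System.out.println(capacity(4096));   // 8388608, about 8 million objects
        System.out.println(capacity(16384));  // 33554432, about 33 million objects
        // One segment holds 2048 references of 4 bytes each: 8192 bytes of payload.
        System.out.println(flatTableBytes(SEGMENT_ENTRIES, 4));
        // A flat table of 67108864 references: 268435456 bytes, i.e. 256 MiB.
        System.out.println(flatTableBytes(67108864L, 4));
    }
}
```

The 8,204 bytes mentioned in the message is slightly more than the 8192 bytes of payload; the difference is presumably per-array overhead in the JVM, which this sketch does not model.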