This allows callers to force the iterator back to its starting point,
so it can be traversed again. The default way to do this is to use
back(1) until first() is true, but this isn't very efficient for any
iterator. All current implementations have better ways to implement
reset without needing to seek backwards.
Change-Id: Ia26e6c852fdac8a0e9c80ac72c8cca9d897463f4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When we are asked to create a difference between two files the caller
really wants to see that output. Instead of punting because a file
is too big to process, consider it to be binary. This reduces the
accuracy of our output display, but makes it a lot more likely that
the formatter can still generate something semi-useful.
We set our default binary threshold to 50 MiB, which is the same
threshold that PackWriter uses before punting and deciding a file
is too big to delta compress. Anything under this size we try to
load and process, anything over that size (or that won't allocate
in the heap) gets tagged as binary.
Change-Id: I69553c9ef96db7f2058c6210657f1181ce882335 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When adding or deleting a file, we shouldn't ever prefix /dev/null
with the a/ or b/ prefixes. Doing so is a mistake and confuses a
patch parser which handles /dev/null magically, while a/dev/null is
a file called null in the dev directory of the project.
Also when adding or deleting the "diff --git" line has the "real"
path on both sides, so we should see the following when adding the
file called foo:
diff --git a/foo b/foo
--- /dev/null
+++ b/foo
The --- and +++ lines do not appear in a pure rename or copy delta,
C Git diff seems to omit these, so we now omit them as well. We also
omit the index line when the ObjectIds are exactly equal.
Change-Id: Ic46892dea935ee8bdee29088aab96307d7ec6d3d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of trying to stream out the header, we can drop a redundant
code path by formatting the header into a temporary buffer and then
streaming out the actual line differences later.
Its a small amount of unnecessary work to buffer the file header,
but these are typically very tiny so the cost to format and reparse
is relatively low.
Change-Id: Id14a527a74ee0bd7e07f46fdec760c22b02d5bdf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Marc Strapetz [Tue, 31 Aug 2010 11:26:29 +0000 (13:26 +0200)]
Partial support for index file format "3".
Extended flags are processed and available via DirCacheEntry's
new isSkipWorkTree() and isIntentToAdd() methods. "resolve-undo"
information is completely ignored since its an optional extension.
Improve MergeAlgorithm to produce smaller conflicts
The merge algorithm was reporting conflicts which where to big.
Example: The common base was "ABC", the "ours" version contained
"AB1C" (the addition of "1" after pos 2) and the "theirs" version also
contained "AB1C". We have two potentially conflicting edits in the
same region which happen to bring in exactly the same content. This
should not be a conflict - but was previously reported as
"AB<<<1===1>>>C".
This is fixed by checking every conflicting chunk whether the
conflicting regions have a common prefix or suffix and by removing
this regions from the conflict.
Change-Id: I4dc169b8ef7a66ec6b307e9a956feef906c9e15e Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
This adds the first merge strategy to JGit which does real
content-merges if necessary. The new merge strategy "resolve" takes as
input three commits: a common base, ours and theirs. It will simply takeover
changes on files which are only touched in ours or theirs. For files
touched in ours and theirs it will try to merge the two contents
knowing taking into account the specified common base.
Rename detection has not been introduced for now.
Change-Id: I49a5ebcdcf4f540f606092c0f1dc66c965dc66ba Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Matthias Sohn [Sun, 29 Aug 2010 21:41:10 +0000 (23:41 +0200)]
Wait for JIT optimization before measuring diff performance
On Mac OS X MyerDiffPerformanceTest was failing since during the
first few tests the JIT compiler is running in parallel slowing down
the tests. When setting the JVM option -Xbatch forcing the JIT to do
its work prior to running the code this effect can be avoided. Instead
we chose to run some tests without recording prior to the recorded
tests since relying on -X JVM parameters isn't portable across JVMs.
Use 10k * powers of 2 as sample size instead of odd numbers used
before and also improve formatting of performance readings.
Bug: 323766
Change-Id: I9a46d73f81a785f399d3cf5a90c8c0516526e048 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Matthias Sohn [Fri, 27 Aug 2010 23:18:20 +0000 (01:18 +0200)]
Generate source plugin and source feature for jgit
Change-Id: Ibc5a302078bfc01d9ee45a4c0ab0b79b2abd185a Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Mon, 30 Aug 2010 18:53:25 +0000 (11:53 -0700)]
Improve LargeObjectException reporting
Use 3 different types of LargeObjectException for the 3 major ways
that we can fail to load an object. For each of these use a unique
string translation which describes the root cause better than just
the ObjectId.name() does.
Change-Id: I810c98d5691b74af9fc6cbd46fc9879e35a7bdca Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 30 Aug 2010 18:01:33 +0000 (11:01 -0700)]
IndexPack: Use byte limited form of getCachedBytes
Currently our algorithm requires that we have the delta base as
a contiguous byte array... but getCachedBytes() might not work
if the object is considered to be large by its underlying loader.
Use the limited form to obtain the object as a byte array instead.
Change-Id: I33f12a8811cb6a4a67396174733f209db8119b42 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 30 Aug 2010 17:58:19 +0000 (10:58 -0700)]
Undo translation of protocol string 'unpack error'
This string is part of the network protocol, and isn't meant to
be translated into another language. Clients actually scan for
the string "unpack error " off the wire and react magically to
this information. If it were translated, they would instead have
a protocol exception, which isn't very useful when there is already
an error occurring.
Change-Id: Ia5dc8d36ba65ad2552f683bb637e80b77a7d92f0 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 27 Aug 2010 20:28:14 +0000 (13:28 -0700)]
Buffer very large delta streams to reduce explosion of CPU work
Large delta streams are unpacked incrementally, but because a delta
can seek to a random position in the base to perform a copy we may
need to inflate the base repeatedly just to complete one delta.
So work around it by copying the base to a temporary file, and then
we can read from that temporary file using random seeks instead.
Its far more efficient because we now only need to inflate the
base once.
This is still really ugly because we have to dump to a temporary
file, but at least the code can successfully process a large
file without throwing OutOfMemoryError. If speed is an
issue, the user will need to increase the JVM heap and ensure
core.streamFileThreshold is set to a higher value, so we don't use
this code path as often.
Unfortunately we lose the "optimization" of skipping over portions
of a delta base that we don't actually need in the final result.
This is going to cause us to inflate and write to disk useless
regions that were deleted and do not appear in the final result.
We could later improve on our code by trying to flatten delta
instruction streams before we touch the bottom base object, and
then only store the portions of the base we really need for the
final result and that appear out-of-order. Since that is some
pretty complex code I'm punting on it for now and just doing this
simple whole-object buffering.
Because the process umask might be permitting other users to read
files we create, we put the temporary buffers into $GIT_DIR/objects.
We can reasonably assume that if a reader can read our temporary
buffer file in that directory, they can also read the base pack
file we are pulling it from and therefore its not a security breach
to expose the inflated content in a file. This requires a reader
to have write access to the repository, but only if the file is
really big. I'd rather err on the side of caution here and refuse
to read a very big file into /tmp than to possibly expose a secured
content because the Java 5 JVM won't let us create a protected
temporary file that only the current user can access.
Change-Id: I66fb80b08cbcaf0f65f2db0462c546a495a160dd Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Chris Aniszczyk [Thu, 12 Aug 2010 16:25:31 +0000 (11:25 -0500)]
Add TagCommand
A tag command is added to the Git porcelain API. Tests were
also added to stress test the tag command.
Change-Id: Iab282a918eb51b0e9c55f628a3396ff01c9eb9eb Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Implementation of a checkout (or 'git read-tree') operation which
works together with DirCache. This implementation does similar things
as WorkDirCheckout which main problem is that it works with deprecated
GitIndex. Since GitIndex doesn't support multiple stages of a file
which is required in merge situations this new implementation is
required to enable merge support.
Change-Id: I13f0f23ad60d98e5168118a7e7e7308e066ecf9c Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The Merger was was only exposing the merge base as an
AbstractTreeIterator. Since we need the merge base as
RevCommit to generate the merge result I expose it here.
Change-Id: Ibe846370a35ac9bdb0c97ce2e36b2287577fbcad Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Marc Strapetz [Thu, 19 Aug 2010 13:34:44 +0000 (15:34 +0200)]
Fix parsing of multiple authors in PersonIdent.
PersonIdent should be parsable for an invalid commit which
contains multiple authors, like "A <a@a.org>, B <b@b.org>".
PersonIdent(String) constructor now delegates to
RawParseUtils.parsePersonIdent().
Shawn O. Pearce [Thu, 26 Aug 2010 00:05:31 +0000 (17:05 -0700)]
Increase the default streaming threshold to 15 MiB
Applying deltas in the large streaming mode is horrifically slow.
Trying to pack icu4c is impossible because a single 11 MiB file
sits on top of a 15 MiB file though a 10 deep delta chain, which
results in this very slow inflate process.
Upping the default limit to 15 MiB lets us process this large in a
reasonable time, but its still sufficiently low enough to prevent
exploding the heap of a very large process like Eclipse or Gerrit
Code Review.
We have to revisit the streaming delta application process and do
something much smarter, like flatten the delta chain before we apply
it to the base. But even that is ugly, I've seen a 155 MiB delta
sitting on top of a 450 MiB file to produce a 300 MiB result object.
If the chain is deep, we may have trouble flatting it down.
Change-Id: If5a0dcbf9d14ea683d75546f104b09bb8cd8fdbb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 25 Aug 2010 23:46:43 +0000 (16:46 -0700)]
Fix reuse from pack file for REF_DELTA types
We miscomputed the CRC32 checksum for a REF_DELTA type of object, by
not including the full 20 byte ObjectId of the delta base in the CRC
code we use when the delta is too large to go through our two faster
small reuse code paths. This resulted in a corruption error during
packing, where the PackFile erroneously suspected the data was wrong
on the local filesystem and aborted writing, because the CRC didn't
match what we had read from the index.
Change-Id: I7d12cdaeaf2c83ddc11223ce0108d9bd6886e025 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 25 Aug 2010 00:35:52 +0000 (17:35 -0700)]
Support parsing git describe style output
We now match on the -gABBREV style output created by git describe
when its describing a non-tagged commit, and resolve that back to
the full ObjectId using the abbreviation resolution feature that
we already support.
Change-Id: Ib3033f9483d9e1c66c8bb721ff48d4485bcdaef1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 24 Aug 2010 23:26:40 +0000 (16:26 -0700)]
Throw AmbiguousObjectException during resolve if its ambiguous
Its wrong to return null if we are resolving an abbreviation and we
have proven it matches more than one object. We know how to resolve
it if we had more nybbles, as there are two or more objects with the
same prefix. Declare that to the caller quite clearly by giving them
an AmbiguousObjectException.
Change-Id: I01bb48e587e6d001b93da8575c2c81af3eda5a32 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 24 Aug 2010 22:50:36 +0000 (15:50 -0700)]
Complete an abbreviation when formatting a patch
If we are given a DiffEntry header that already has abbreviated
ObjectIds on it, we may still be able to resolve those locally and
output the difference. Try to do that through the new resolve API
on ObjectReader.
Change-Id: I0766aa5444b7b8fff73620290f8c9f54adc0be96 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 24 Aug 2010 19:50:59 +0000 (12:50 -0700)]
Use limited getCachedBytes in RevWalk
Parsing is rewritten to use the size limited form of getCachedBytes,
thus freeing the revwalk infrastructure from needing to care about
a large object vs. a small object when it gets an ObjectLoader.
Right now we hardcode our upper bound for a commit or annotated
tag to be 15 MiB. I don't know of any that is more than 1 MiB in
the wild, so going 15x that should give us some reasonable headroom.
Change-Id: If296c211d8b257d76e44908504e71dd9ba70ffa8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 24 Aug 2010 19:46:56 +0000 (12:46 -0700)]
Add brute force byte array loading to ObjectLoader
Some algorithms are coded in a way that requires us to provide them
the entire object contents as a contiguous byte array. The parsers
in RevCommit and RevTag, or our RawText objects are really good
examples of these.
Instead of duplicating this logic everywhere, lets put it into the
base ObjectLoader type. That way the caller only needs to give us
their upper size bound, and we'll do the rest of the heavy work to
figure out if the object still fits within that bound, and get them
an array that has the complete contents.
Change-Id: Id95a7f79d2b97e39f6949370ccca2f2c9cfb1a0f Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Tue, 24 Aug 2010 23:11:41 +0000 (16:11 -0700)]
Add ObjectId to the LargeObjectException
A chunk of code that throws LargeObjectException may or may not have
the specific ObjectId on hand when its thrown. If it does, we want
to cache it in the exception, and put that in the message. If it is
missing we want to be able to set it later from a higher level stack
frame that does have the object handy.
Change-Id: Ife25546158868bdfa886037e4493ef8235ebe4b9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Tue, 24 Aug 2010 19:59:10 +0000 (12:59 -0700)]
Don't copy more than the object size
If the loader's stream is broken and returns to us more content
than it originally declared as the size of the object, don't
copy that onto the output stream. Instead throw EOFException
and abort fast. This way we don't follow an infinite stream,
but instead will at least stop when the size was reached.
Change-Id: I7ec0c470c875f03b1f12a74a9b4d2f6e73b659bb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 24 Aug 2010 19:56:47 +0000 (12:56 -0700)]
Use the ObjectStream size during copyTo
If the stream is a delta decompression stream, getting the size
can be expensive. Its cheaper to get it from the stream itself
rather than from the object loader.
Change-Id: Ia7f0af98681f6d56ea419a48c6fa8eea09274b28 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 25 Aug 2010 00:20:50 +0000 (17:20 -0700)]
Fix ObjectDirectory abbreviation resolution to notice new packs
If we can't resolve an abbreviation, it might be because there is
a new pack file we haven't picked up yet. Try scanning the packs
again and recheck each pack if there were differences from the last
scan we did.
Because of this, we don't have to open a pack during the test where
we generate a pack on the fly. We'll miss on the first loop during
which the PackList is the NO_PACKS magic initialization constant,
and pick up the newly created index during this retry logic.
Change-Id: I7b97efb29a695ee60c90818be380f7ea23ad13a3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 23 Aug 2010 22:53:11 +0000 (15:53 -0700)]
Fully implement SHA-1 abbreviations
ObjectReader implementations are now responsible for creating the
unique abbreviation of an ObjectId, or for resolving an abbreviation
back to its full form. In this latter case the reader can offer up
multiple candidates to the caller, who may be able to disambiguate
them based on context.
Repository.resolve() doesn't take multiple candidates into account
right now, but it could in the future by looking for a remaining
^0 or ^{commit} suffix and take an expansion if there is only one
commit that matches the input abbreviation. It could also use
the distance from an annotated tag to resolve "tag-NNN-gcommit"
style strings that are often output by `git describe`.
Change-Id: Icd3250adc8177ae05278b858933afdca0cbbdb56 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 23 Aug 2010 17:30:58 +0000 (10:30 -0700)]
Add openEntryStream to WorkingTreeIterator
This makes it easier for abstract tools like AddCommand to open the
file from the working tree, without knowing internal details about
how the tree is managed.
Change-Id: Ie64a552f07895d67506fbffb3ecf1c1be8a7b407 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 23 Aug 2010 17:29:50 +0000 (10:29 -0700)]
Add setLength(long) to DirCacheEntry
Applications should favor the long style interface, especially when
their source input is a long type, e.g. coming from java.io.File.
This way when the index format is later changed to support a
larger file size than 2 GiB we can handle it by just changing the
entry code, and not need to fix a lot of applications.
Change-Id: I332563caeb110014e2d544dc33050ce67ae9e897 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 23 Aug 2010 17:13:25 +0000 (10:13 -0700)]
Move commit and tag formatting to CommitBuilder, TagBuilder
These objects should be responsible for their own formatting,
rather than delegating it to some obtuse type called ObjectInserter.
While we are at it, simplify the way we insert these into a database.
Passing in the type and calling format in application code turned
out to be a huge mistake in terms of ease-of-use of the insert API.
Change-Id: Id5bb95ee56aa2a002243e9b7853b84ec8df1d7bf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 23 Aug 2010 16:46:14 +0000 (09:46 -0700)]
Rename Commit, Tag to CommitBuilder, TagBuilder
Since these types no longer support reading, calling them a Builder
is a better description of what they do. They help the caller to
build a commit or a tag object.
Change-Id: I53cae5a800a66ea1721b0fe5e702599df31da05d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 23 Aug 2010 16:40:41 +0000 (09:40 -0700)]
Add documentation explaining how to read Commit and Tag
Since we stopped supporting these types for reading, but their
name is a natural candidate for someone to try and use in code,
explain where they should be looking instead.
Change-Id: I091a1b0ef71b842016020f938ba3161431aab9c9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Marc Strapetz [Thu, 29 Jul 2010 14:21:37 +0000 (16:21 +0200)]
Perform automatic CRLF to LF conversion during WorkingTreeIterator
WorkingTreeIterator now optionally performs CRLF to LF conversion for
text files. A basic framework is left in place to support enabling
(or disabling) this feature based on gitattributes, and also to
support the more generic smudge/clean filter system. As there is
no gitattribute support yet in JGit this is left unimplemented,
but the mightNeedCleaning(), isBinary() and filterClean() methods
will provide reasonable places to plug that into in the future.
[sp: All bugs inside of WorkingTreeIterator are my fault, I wrote
most of it while cherry-picking this patch and building it on
top of Marc's original work.]
Shawn O. Pearce [Wed, 4 Aug 2010 00:40:01 +0000 (17:40 -0700)]
Expose pack fetch/push connections for subclassing
These classes need to be visible if an application wants to define
its own native pack based protocol embedded within another layer,
much like we already support for smart HTTP.
Change-Id: I7e2ac3ad01d15b94d340128a395fe0b2f560ff35 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 3 Aug 2010 00:00:35 +0000 (17:00 -0700)]
Allow ObjectReuseAsIs to have more control over write ordering
The reuse system used by an object database may be able to benefit
from knowing what objects are coming next, and even improve data
throughput by delaying (or moving up) objects that are stored near
each other in the source database.
Pushing the iteration down into the reuse code makes it possible
for a smarter implementation to aggregate reuse. But for the
standard pack file format on disk we don't bother, its quite
efficient already.
Change-Id: I64f0048ca7071a8b44950d6c2a5dfbca3be6bba6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 2 Aug 2010 19:26:01 +0000 (12:26 -0700)]
Allow ObjectToPack subclasses to use up to 4 bits of flags
Some instances may benefit from having access to memory efficient
storage for some small values, like single flag bits. Give up a
portion of our delta depth field to make 4 bits available to any
subclass that wants it.
This still gives us room for delta chains of 1,048,576 objects,
and that is just insane. Unpacking 1 million objects to get to
something is longer than most users are willing to wait for data
from Git.
Change-Id: If17ea598dc0ddbde63d69a6fcec0668106569125 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sun, 1 Aug 2010 03:08:10 +0000 (20:08 -0700)]
Implement async/batch lookup of object data
An ObjectReader implementation may be very slow for a single object,
but yet support bulk queries efficiently by batching multiple small
requests into a single larger request. This easily happens when the
reader is built on top of a database that is stored on another host,
as the network round-trip time starts to dominate the operation cost.
RevWalk, ObjectWalk, UploadPack and PackWriter are the first major
users of this new bulk interface, with the goal being to support an
efficient way to pack a repository for a fetch/clone client when the
source repository is stored in a high-latency storage system.
Processing the want/have lists is now done in bulk, to remove
the high costs associated with common ancestor negotiation.
PackWriter already performs object reuse selection in bulk, but it
now can also do the object size lookup and object counting phases
with higher efficiency. Actual object reuse, deltification, and
final output are still doing sequential lookups, making them a bit
more expensive to perform.
Change-Id: I4c966f84917482598012074c370b9831451404ee Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
By giving the reader information about the roots of a revision
traversal, some readers may be able to prefetch information from
their backing store using background threads in order to reduce
data access latency. However this isn't typically necessary so
the default reader implementation doesn't react to the advice.
Change-Id: I72c6cbd05cff7d8506826015f50d9f57d5cda77e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
ObjectReader implementations may wish to use multiple threads in
order to evaluate object reuse faster. Let the reader make that
decision by passing the iteration down into the reader.
Because the work is pushed into the reader, it may need to locate a
given ObjectToPack given its ObjectId. This can easily occur if the
reader has sent a list of ObjectIds to the object database and gets
back information keyed only by ObjectId, without the ObjectToPack
handle. Expose lookup using the PackWriter's own internal map,
so the reader doesn't need to build a redundant copy to track the
assocation of ObjectId back to ObjectToPack.
Change-Id: I0c536405a55034881fb5db92a2d2a99534faed34 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 2 Aug 2010 17:17:21 +0000 (10:17 -0700)]
Flush the pack header as soon as its ready
When the output stream is deeply buffered (e.g. 1 MiB or more in
an HTTP servlet on some containers) trying to kick out the header
earlier will prevent the client from stalling hard while the first
1 MiB is received and it can process the pack header. Forcing a
flush here lets the client see the header and start its progress
monitor for "Receiving objects: (1/N)" so the user knows there
is still activity occurring, even though the buffering may cause
there to be some lag as the buffer fills up on the sending side.
Change-Id: I3edf39e8f703fe87a738dc236d426b194db85e3a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Storage implementations may find this useful when implementing the
ObjectReuseAsIs interface on their ObjectReader. Expose it so we
don't force them to create a redundant copy of the information.
Change-Id: I802ec8113c00884fccde5d0e92b9849716316f62 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 20 Aug 2010 22:41:49 +0000 (15:41 -0700)]
Add a public RevTag.parse() method
Callers might have a canonical tag encoding on hand that they
wish to convert into a clean structure for presentation purposes,
and the object may not be available in a repository. (E.g. maybe
its a "draft" tag being written in an editor.)
Change-Id: I387a462afb70754aa7ee20891e6c0262438fdf32 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 20 Aug 2010 16:32:07 +0000 (09:32 -0700)]
Add a public RevCommit.parse() method
Callers might have a canonical commit encoding on hand that they
wish to convert into a clean structure for presentation purposes,
and the object may not be available in a repository. (E.g. maybe
its a "draft" commit being written in an editor.)
Change-Id: I21759cff337cbbb34dbdde91aec5aa4448a1ef37 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 20 Aug 2010 22:18:25 +0000 (15:18 -0700)]
Make Tag class only for writing
The Tag class now only supports the creation of an annotated tag
object. To read an annotated tag, applictions should use RevTag.
This permits us to have exactly one implementation, and RevTag's
is faster and more bug-free.
Change-Id: Ib573f7e15f36855112815269385c21dea532e2cf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 20 Aug 2010 15:53:46 +0000 (08:53 -0700)]
Make Commit class only for writing
The Commit class now only supports the creation of a commit object.
To read a commit, applictions should use RevCommit. This permits
us to have exactly one implementation, and RevCommit's is faster
and more bug-free.
Change-Id: Ib573f7e15f36855112815269385c21dea532e2cf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 20 Aug 2010 22:42:52 +0000 (15:42 -0700)]
Correct PersonIdent hashCode() and equals() to ignore milliseconds
Git doesn't store millisecond accuracy in person identity lines,
so a line that we create in Java and round-trip through a Git object
wouldn't compare as being equal. Truncate to seconds when comparing
values to ensure the same identity is equal.
Change-Id: Ie4ebde64061f52c612714e89ad34de8ac2694b07 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 20 Aug 2010 16:25:47 +0000 (09:25 -0700)]
Try really hard to load a commit or tag
When we need the canonical form of a commit or a tag in order to
parse it into our RevCommit or RevTag fields, we really need it as a
single contiguous byte array. However the ObjectDatabase may choose
to give us a large loader. In general commits or tags are always
under the several MiB limit, so even if the loader calls it "large"
we should still be able to afford the JVM heap memory required to
get a single byte array. Coerce even large loaders into a single
byte array anyway.
Change-Id: I04efbaa7b31c5f4b0a68fc074821930b1132cfcf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
ReadTreeTests relied on Repository.getIndex() which on
platforms which coarse FileSystemTimers failed to detect
index modifications. By explicitly reloading and writing
the index this problem is solved.
Change-Id: I0a98babfc2068a3b6b7d2257834988e1154f5b26 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn O. Pearce [Thu, 19 Aug 2010 18:50:20 +0000 (11:50 -0700)]
Make ObjectId.compareTo final
Since equals() is now final and does not permit being overridden,
we should do the same thing with compareTo() to prevent different
subclasses from having different ordering behaviors. This could
lead to the same mess that we had with different equals() behaviors.
Change-Id: I35a849b6efccee5fe74cc5788a3566a1516004b7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 19 Aug 2010 18:46:22 +0000 (11:46 -0700)]
Make ObjectId.hashCode final too
Since equals() is now final and does not permit being overridden,
we should do the same thing with hashCode() to prevent different
subclasses from having different hashing behaviors. This could
lead to the same mess that we had with different equals() behaviors.
Change-Id: I35a849b6efccee5fe74cc5788a3566a1516004b7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 19 Aug 2010 18:43:39 +0000 (11:43 -0700)]
Remove unnecessary ObjectId.copy() calls
When RevObject overrode equals() to provide only reference equality
we used to need to convert a RevObject into an ObjectId by copy()
just to use standard Java tools like JUnit assertEquals(), or to
use contains() or get() on standard java.util collection types.
Now that we have removed this override and made ObjectId's equals()
final (preventing any of this mess in the future), some copy()
calls are unnecessary. Anytime the value is being used as an input
to a lookup routine, or to an equals, we can avoid the copy().
However we still want to use copy() anytime we are given an ObjectId
that may exist long-term, where we don't want the high cost of the
additional storage from a RevCommit extension. So we can't remove
all uses of copy(), just some of them.
Change-Id: Ief275dace435c0ddfa362ac8e5d93558bc7e9fc3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The MergeResult class is enhanced to report more data about a
three-way merge. Information about conflicts and the base, ours,
theirs commits can be retrived.
Change-Id: Iaaf41a1f4002b8fe3ddfa62dc73c787f363460c2 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Chris Aniszczyk [Thu, 19 Aug 2010 01:56:27 +0000 (20:56 -0500)]
Remove getter and setter for author in Tag
There was a duplicated getter and setter for tagger in Tag.
There's no needed to have two getters and setters that represent
the same things. The appropriate tests were updated also.
Change-Id: If46dc00c4c0f31ea4234c6d3bda3c03e6ebbafac Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
indexState() encodes the complete state of the index
into one readable String. This helps to write tests
against the index. indexState() is enhanced to optionally
also contain the content of the files in the index.
Change-Id: Ie988f93768d864f4cbd55809a786bd5759fc24a5 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Added a utility method to set the reset an index to match exactly
some content in the filesystem. This can be used by tests to prepare
commits in the working-tree and set the index in one shot.
[sp: Cleaned up formatting, added getEntryFile(), released inserter.]
Change-Id: If38b1f7cacaaf769f51b14541c5da0c1e24568a5 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Add a convenience API in FileRepository to pass in a String that
points to the GIT_DIR location. This is converted to a File and
sent through the usual constructor.
Matthias Sohn [Mon, 16 Aug 2010 15:42:41 +0000 (17:42 +0200)]
Set default file encoding used for JGit tests to UTF-8
This patch fixes the problem that JGit tests run from Maven fail on
Mac OS X [1]. In Eclipse the tests succeed since we set Eclipse
workspace encoding to UTF-8 via "Preferences > General > Workspace
> Text file encoding", checked via JConsole that this setting changes
the JVM system property of the test run. This change copies this
setting to the Maven test environment so that we get consistent test
results on all platforms.