Nico Sallembien [Tue, 9 Feb 2010 23:01:27 +0000 (15:01 -0800)]
Allow users of ReceivePack access to the objects being sent
When implementing branch read access, we need to prove that the
newly created reference(s) point to objects that the user can see.
There are two ways that an object is reachable:
1) It's reachable from a branch or change the user can see
2) It was uploaded as part of the pack file the user sent us
This change adds additional methods in ReceivePack that will allow a
server to check the above conditions, in order to ensure that a user
is not trying to create a reference that they cannot see, or that a
malicious user isn't attempting to forge the SHA-1 of an object that
they cannot see in order to base a change off of it.
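The two reachability conditions above can be sketched as a single predicate. This is only an illustration under stated assumptions: `VisibilityCheckSketch`, `mayPointTo` and the use of plain strings for object ids are hypothetical, and the actual methods added to ReceivePack are not shown in this log.

```java
import java.util.Set;

/** Hypothetical illustration; not part of the JGit API. */
final class VisibilityCheckSketch {
    /**
     * A new reference may point at an object only if the object is
     * (1) reachable from a branch or change the user can see, or
     * (2) contained in the pack file the user just uploaded.
     */
    static boolean mayPointTo(String objectId,
            Set<String> reachableFromVisibleRefs,
            Set<String> uploadedInPack) {
        return reachableFromVisibleRefs.contains(objectId)
                || uploadedInPack.contains(objectId);
    }
}
```

A server would reject the update command when the predicate returns false, which covers both the hidden-reference case and the forged SHA-1 case.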
Nico Sallembien [Tue, 9 Feb 2010 17:53:53 +0000 (09:53 -0800)]
Add a RefFilter interface to ReceivePack and UploadPack
When a user of ReceivePack or UploadPack wants to control what refs
are sent to the client, for instance when some refs should be hidden
from some clients, this interface can be extended to provide a fine
grained control over what refs are sent to the client.
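A rough sketch of what such a filter could look like. The interface below is a hypothetical stand-in that maps ref names to object id strings; it is not the RefFilter type added by this commit, and the prefix-based hiding rule is an assumption chosen for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a ref filter; values are object ids as strings. */
interface RefFilterSketch {
    Map<String, String> filter(Map<String, String> refs);
}

/** Hides every ref whose name starts with the given prefix. */
final class HidePrefixFilter implements RefFilterSketch {
    private final String hiddenPrefix;

    HidePrefixFilter(String hiddenPrefix) {
        this.hiddenPrefix = hiddenPrefix;
    }

    @Override
    public Map<String, String> filter(Map<String, String> refs) {
        // Copy only the refs this client is allowed to see.
        Map<String, String> visible = new TreeMap<>();
        for (Map.Entry<String, String> e : refs.entrySet()) {
            if (!e.getKey().startsWith(hiddenPrefix)) {
                visible.put(e.getKey(), e.getValue());
            }
        }
        return visible;
    }
}
```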
Shawn O. Pearce [Tue, 9 Feb 2010 23:58:37 +0000 (15:58 -0800)]
Remove pointless boolean during native push
The boolean field sentCommand is always true at this point, as it
was assigned just 5 lines above. So we always set the status of
the update command object to AWAITING_REPORT.
Simplify the logic by dropping the ?: operator. I assume this is
older code from an attempt to manage dry-run push support within
the native connection, but in fact dry-run support is done higher
up inside of PushProcess.
Change-Id: I450d491bbbb5afecdbf5444ab7169222e856a3bb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Robin Rosenberg [Thu, 4 Feb 2010 06:17:18 +0000 (07:17 +0100)]
Intermediate workaround for JGit's lack of core.autocrlf support
Windows users by default have core.autocrlf set to true. JGit
does not recognize the flag and thus works as if it is unset. In order
to make JGit more compatible with msysgit we set the flag to false
in repositories that JGit creates.
Bug: 301775
Change-Id: I7ea462fe3516e5060b87aa1f7ed63689936830c2 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Use keep(1) instead of add() when skipping an entry
Doing a keep call with a length of 1 will copy the current entry just
like the previous add was doing, but it avoids doing any validation
on the entry. This is sane because the entry can be assumed to be
already valid, since it originates from the destination index.
Change-Id: I250d902fc98580444af1ba4b8fedceb654541451
Originally: http://thread.gmane.org/gmane.comp.version-control.git/128214/focus=128213 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
A 0 file mode in a DirCacheEntry is not a valid mode. To C git
such a value indicates the record should not be present. We were
already catching this bad state and throwing an exception when writing tree
objects to disk, but we did not fail when writing the dircache back
to disk. This allowed JGit applications to create a dircache file
which C git would not like to read.
Instead of checking the mode during writes, we now check during
mutation. This allows application bugs to be detected sooner and
closer to the cause site. It also allows us to avoid checking most
of the records which we read in from disk, as we can assume these
are formatted correctly.
Some of our unit tests were not setting the FileMode on their test
entry, so they had to be updated to use REGULAR_FILE.
Change-Id: Ie412053c390b737c0ece57b8e063e4355ee32437
Originally: http://thread.gmane.org/gmane.comp.version-control.git/128214/focus=128213 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Adam W. Hawks <awhawks@writeme.com>
A dircache record must not use a path string like "/a" or "a//b"
as this results in a tree entry being written with a zero length
name component in the record. C git does not support an empty name,
and neither does any modern filesystem.
A record also must not have a stage outside of the standard 0-3
value range, as there are only 2 bits of space available in the
on-disk format of the record to store the stage information.
Any other values would be truncated into this space, storing a
different value than the caller expected.
If an application tries to create a DirCache record with either of
these wrong values, we abort with an IllegalArgumentException.
Change-Id: I699de149efdfccd85d8adde07d3efd080e3b49c2
Originally: http://thread.gmane.org/gmane.comp.version-control.git/128214 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Adam W. Hawks <awhawks@writeme.com>
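The path and stage rules from the commit above can be sketched as a small validator. The class and method names are hypothetical, and rejecting an empty path or a trailing "/" is an assumption that follows the same zero-length-component reasoning; the commit itself only names "/a" and "a//b".

```java
/** Hypothetical validator mirroring the rules described above. */
final class DirCachePathCheckSketch {
    static void check(String path, int stage) {
        // Any of these shapes would produce a zero-length name component.
        if (path.isEmpty() || path.startsWith("/") || path.endsWith("/")
                || path.contains("//")) {
            throw new IllegalArgumentException("Invalid path: " + path);
        }
        // Only 2 bits are available on disk, so the stage must be 0..3.
        if (stage < 0 || stage > 3) {
            throw new IllegalArgumentException(
                    "Invalid stage " + stage + " for path " + path);
        }
    }
}
```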
Shawn O. Pearce [Wed, 3 Feb 2010 16:23:34 +0000 (08:23 -0800)]
Ensure RawText closes the FileInputStream when read is complete
Rather than implementing the file reading logic ourselves, and
winding up leaking the FileInputStream's file descriptor until the
next GC, use IO.readFully(File) which wraps the read loop inside
of a try/finally to ensure the stream is closed before it exits.
Change-Id: I85a3fe87d5eff88fa788962004aebe19d2e91bb4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Reviewed-by: Roland Grunberg <rgrunber@redhat.com>
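A minimal sketch of the try/finally pattern the commit above relies on. `ReadFullySketch` is a hypothetical helper written for illustration and is not the body of JGit's IO.readFully(File).

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

/** Hypothetical helper; illustrates closing the stream on every exit path. */
final class ReadFullySketch {
    static byte[] readFully(File path) throws IOException {
        FileInputStream in = new FileInputStream(path);
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            // Runs on normal return and on exception, so the file
            // descriptor is released without waiting for the GC.
            in.close();
        }
    }
}
```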
Shawn O. Pearce [Wed, 3 Feb 2010 04:03:03 +0000 (20:03 -0800)]
Cleanup OSGi Import-Package specifications to use versions
Actually set the range of versions we are willing to accept for
each package we import, lest we import something in the future
that isn't compatible with our needs.
Change-Id: I25dbbb9eaabe852631b677e0c608792b3ed97532 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 2 Feb 2010 22:21:27 +0000 (14:21 -0800)]
Micro-optimize CanonicalTreeParser next() for ObjectWalk
ObjectWalk is invoking next() for each record we consider in a tree.
Rather than doing several method calls against the current parser,
and testing if we are at eof() at least twice per next() invocation,
do it only once and inline the logic to move the parser forward.
Change-Id: If5938f5d7b3ca24f500a184c9bd2ef193015414e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In this tree, "B" was being skipped because "A/A" as an empty tree
was immediately followed by "A/B", also an empty tree, but the
ObjectWalk broke out too early and never visited "B".
Bug: 286653
Change-Id: I25bcb0bc99d0cbbbdd9c2bd625ad6a691a6d0335 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 2 Feb 2010 19:39:24 +0000 (11:39 -0800)]
Ensure the tree parser resets in ObjectWalk
During dispose() or reset() we are supposed to be restoring the
ObjectWalk instance back to the original pre-walk state, but we
failed to reset the tree parser. This can lead to confusing state
if the ObjectWalk was reused by the caller, as entries from the
old walk might be reported as part of the new walk.
Change-Id: I6237bae7bfd3794e8b9a92b4dd475559cc72e634 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 2 Feb 2010 17:09:26 +0000 (09:09 -0800)]
Correctly skip over unrecognized optional dircache extensions
We didn't skip the correct number of bytes when we skipped over an
unrecognized but optional dircache extension. We missed skipping
the 8 byte header that makes up the extension's name and length.
We also didn't include the skipped extension's payload as part of
our index checksum, resulting in a checksum failure when the index
was done reading. So ensure we always scan through a skipped
section and include it in the checksum computation.
Add a test case for a currently unsupported index extension, 'ZZZZ',
to verify we can still read the DirCache object even though we
don't know what 'ZZZZ' is supposed to mean.
Bug: 301287
Change-Id: I4bdde94576fffe826d0782483fd98cab1ea628fa Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
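The fix in the commit above can be sketched as follows. This is not JGit's DirCache reader: the class name is hypothetical, and the layout (4-byte name, 4-byte big-endian length, optional when the name starts with 'A'..'Z') is taken from the Git index format description rather than from this log.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;

/** Hypothetical reader; skips one optional extension and checksums it. */
final class SkipExtensionSketch {
    static boolean isOptional(byte[] header) {
        return 'A' <= header[0] && header[0] <= 'Z';
    }

    static void skip(InputStream in, MessageDigest md) throws IOException {
        // The 8-byte header (name + length) must be consumed too.
        byte[] header = new byte[8];
        new DataInputStream(in).readFully(header);
        if (!isOptional(header)) {
            throw new IOException("Unsupported mandatory extension");
        }
        md.update(header);

        // Bytes 4..7 hold the payload length, big-endian.
        long remaining = ((header[4] & 0xffL) << 24)
                | ((header[5] & 0xffL) << 16)
                | ((header[6] & 0xffL) << 8)
                | (header[7] & 0xffL);
        byte[] buf = new byte[4096];
        while (remaining > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) {
                throw new EOFException("Truncated extension payload");
            }
            md.update(buf, 0, n); // skipped bytes still count toward the checksum
            remaining -= n;
        }
    }
}
```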
Shawn O. Pearce [Tue, 2 Feb 2010 16:46:13 +0000 (08:46 -0800)]
Remove RepositoryTestCase from DirCacheCGitCompatabilityTest
This test doesn't actually depend upon the large data set we have
in the RepositoryTestCase, so drop that from the dependency and
use the simpler LocalDiskRepositoryTestCase instead.
Change-Id: I0fd4affe1dd5ec86e8c3253db42df11d3b612e36 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Fix .classpath to make jgit easily runnable from inside eclipse
When running jgit from inside Eclipse (e.g. right-click on project
org.eclipse.jgit.pgm and select Run as->Java application) no commands
are found. This is because the commands are loaded from a resource file
/META-INF/services/org.eclipse.jgit.pgm.TextBuiltin and this file is
no longer on the classpath.
I fixed this by modifying .classpath to contain the META-INF directory.
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Shawn O. Pearce [Thu, 28 Jan 2010 19:13:11 +0000 (11:13 -0800)]
Generate an Eclipse IP log with jgit eclipse-iplog
The new plugin contains the bulk of the logic to scan a Git repository,
and query IPZilla, in order to produce an XML formatted IP log for the
requested revision of any Git based project. This plugin is suitable
for embedding into a servlet container, or into the Eclipse workbench.
The command line pgm package knows how to invoke this plugin through
the eclipse-iplog subcommand, permitting storage of the resulting
log as a local XML file.
Change-Id: If01d9d98d07096db6980292bd5f91618c55d00be Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 25 Jan 2010 22:51:56 +0000 (14:51 -0800)]
Fix racy HTTP tests by waiting for requests to finish
Ensure the background Jetty threads have been able to write the
request log record before the JUnit thread tries to read the set
of requests back. This wait is necessary because the JUnit thread
may be able to continue as soon as Jetty has finished writing
the response onto the socket, and hasn't necessarily finished the
post-response logging activity.
By using a semaphore with a fixed number of resources, and using
one resource per request, but all of them when we want to read the
log, we implement a simple lock that requires there be no active
requests when we want to get the log from the JUnit thread.
Change-Id: I499e1c96418557185d0e19ba8befe892f26ce7e4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
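The locking scheme in the commit above could look roughly like this. `RequestLogSketch` and the permit count of 1024 are assumptions made for illustration; the actual test support classes are not shown in this log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

/** Hypothetical request log guarded by a fixed-size semaphore. */
final class RequestLogSketch {
    private static final int MAX = 1024; // assumed bound on concurrent requests

    private final Semaphore active = new Semaphore(MAX);
    private final List<String> entries = new ArrayList<>();

    /** Server thread: holds one permit until its log record is written. */
    void handle(String request) throws InterruptedException {
        active.acquire();
        try {
            synchronized (entries) {
                entries.add(request);
            }
        } finally {
            active.release();
        }
    }

    /** Test thread: takes every permit, so no request can be in flight. */
    List<String> snapshot() throws InterruptedException {
        active.acquire(MAX);
        try {
            synchronized (entries) {
                return new ArrayList<>(entries);
            }
        } finally {
            active.release(MAX);
        }
    }
}
```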
Shawn O. Pearce [Sun, 24 Jan 2010 01:17:23 +0000 (17:17 -0800)]
Don't confuse empty configuration variables with booleans
Config was confusing the following two variables when writing the
file back to text format:
  [my]
    empty =
    enabled
When parsed, we say that my.empty has 1 value, null, and my.enabled
is an empty string value that in boolean context should be evaluated
as true.
Saving this configuration file back to text format was ignoring the
null value for my.empty, producing a completely different file than
what Config read:
  [my]
    empty
    enabled
Instead handle the writing differently to ensure the original format
is output. New test cases cover the expected behavior and return
values from accessor methods.
Change-Id: Id37379ce20cb27e3330923cf989444dd9f2bdd96 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
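One way to picture the distinction drawn in the commit above is a line formatter that treats null and the empty string differently. The mapping below follows the commit's own description (my.empty parses to null, my.enabled to an empty string); the class name and the tab indentation are assumptions, and this is not Config's actual writer.

```java
/** Hypothetical formatter for a single variable line inside a section. */
final class ConfigLineSketch {
    static String format(String name, String value) {
        if (value == null) {
            // "empty =" : the variable is present with no value.
            return "\t" + name + " =";
        }
        if (value.isEmpty()) {
            // "enabled" : bare name, true in boolean context.
            return "\t" + name;
        }
        return "\t" + name + " = " + value;
    }
}
```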
Shawn O. Pearce [Sat, 23 Jan 2010 21:11:58 +0000 (13:11 -0800)]
Check for remote server exec failures and report
If remote.name.uploadpack or .receivepack is misconfigured and points
to a non-existent command on the remote system, we should receive back
exit status 127. Report this case specially with the command we used
so the user knows what is going on.
Bug: 293703
Change-Id: I7504e7b6238d5d8e698d37db7411c4817a039d08 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 23 Jan 2010 20:11:38 +0000 (12:11 -0800)]
Relax ObjectChecker to permit missing tagger lines
Annotated tags created with C Git versions before the introduction
of c818566 ([PATCH] Update tags to record who made them, 2005-07-14),
do not have a "tagger" line present in the object header. This line
did not appear in C Git until v0.99.1~9.
Ancient projects such as the Linux kernel contain such tags, for
example Linux 2.6.12 is older than when this feature first appeared
in C Git. Linux v2.6.13-rc4 in late July 2005 is the first kernel
version tag to actually contain a tagger line.
It is therefore acceptable for the header to be missing, and for
the RevTag.getTaggerIdent() method to return null.
Since the Javadoc for getTaggerIdent() already explained that the
identity may be null, we just need to test that this is true when
the header is missing, and allow the ObjectChecker to pass anyway.
Change-Id: I34ba82e0624a0d1a7edcf62ffba72260af6f7e5d
See: http://code.google.com/p/gerrit/issues/detail?id=399 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 23 Jan 2010 19:40:31 +0000 (11:40 -0800)]
Correct bundle, provider names to be consistent
Technically our project name is "JGit", not "Java Git". In fact
there is already another project called "JavaGit" (no space) that we
don't want to become confused with. Ensure we always call ourselves
"JGit" in user visible assets, like the bundle name.
Other Eclipse products list their provider as "Eclipse.org",
not "eclipse.org". So list ourselves that way in all of our
plugin.properties files.
Change-Id: Ibcea1cd6dda2af757a8584099619fc23b7779a84 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 23 Jan 2010 19:11:06 +0000 (11:11 -0800)]
Merge branch 'ref-abstract'
* ref-abstract:
Optimize RefAdvertiser performance by avoiding sorting
branch: Add -m option to rename a branch
Replace writeSymref with RefUpdate.link
Rewrite reference handling to be abstract and accurate
Create new RefList and RefMap utility types
Shawn O. Pearce [Sat, 23 Jan 2010 02:42:12 +0000 (18:42 -0800)]
Optimize RefAdvertiser performance by avoiding sorting
Don't copy and sort the set of references if they are passed through
in a RefMap or a SortedMap using the key's natural sort ordering.
Either map is already in the order we want to present the items
to the client in, so copying and sorting is a waste of local CPU
and memory.
Change-Id: I49ada7c1220e0fc2a163b9752c2b77525d9c82c1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
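The shortcut in the commit above can be sketched with the standard collections. The class and method names are hypothetical, plain strings stand in for Ref objects, and the RefMap case is not modeled; only the SortedMap check is shown.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

/** Hypothetical helper; sorts only when the input is not already ordered. */
final class AdvertiseOrderSketch {
    static Collection<String> namesInOrder(Map<String, String> refs) {
        if (refs instanceof SortedMap
                && ((SortedMap<String, String>) refs).comparator() == null) {
            // A null comparator means the keys use their natural ordering,
            // so the map is already in the order we want to send.
            return refs.keySet();
        }
        List<String> names = new ArrayList<>(refs.keySet());
        Collections.sort(names);
        return names;
    }
}
```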
Shawn O. Pearce [Sun, 10 Jan 2010 02:56:45 +0000 (18:56 -0800)]
Replace writeSymref with RefUpdate.link
By using RefUpdate for symbolic reference creation we can reuse
the logic related to updating the reflog with the event, without
needing to expose something such as the legacy ReflogWriter class
(which we no longer have).
Applications using writeSymref must update their code to use the
new pattern of changing the reference through the updateRef method:
  String refName = "refs/heads/master";
  RefUpdate u = repository.updateRef(Constants.HEAD);
  u.setRefLogMessage("checkout: moving to " + refName, false);
  switch (u.link(refName)) {
  case NEW:
  case FORCED:
  case NO_CHANGE:
    // A successful update of the reference
    break;
  default:
    // Handle the failure, e.g. for older behavior
    throw new IOException(u.getResult().name());
  }
Change-Id: I1093e1ec2970147978a786cfdd0a75d0aebf8010 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 22 Jan 2010 22:54:40 +0000 (14:54 -0800)]
Rewrite reference handling to be abstract and accurate
This commit actually does three major changes to the way references
are handled within JGit. Unfortunately they were easier to do as
a single massive commit than to break them up into smaller units.
Reporting a symbolic reference such as HEAD as though it were
any other normal reference like refs/heads/master causes subtle
programming errors. We have been bitten by this error on several
occasions, as have some downstream applications written by myself.
Instead of reporting HEAD as a reference whose name differs from
its "original name", report it as an actual SymbolicRef object
whose type the application can test and whose target it can examine.
With this change, Ref is now an abstract type with different
subclasses for the different types.
In the classical example of "HEAD" being a symbolic reference to
branch "refs/heads/master", the Repository.getAllRefs() method
will now return:
  Map<String, Ref> all = repository.getAllRefs();
  SymbolicRef HEAD = (SymbolicRef) all.get("HEAD");
  ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master");
A nice side-effect of this change is the storage type of the
symbolic reference is no longer ambiguous with the storage type
of the underlying reference it targets. In the above example,
if master was only available in the packed-refs file, then HEAD
reports Storage.LOOSE while master reports Storage.PACKED.
(Prior to this change we returned the ambiguous storage of
LOOSE_PACKED for HEAD, which was confusing since it wasn't
actually true on disk).
Another nice side-effect of this change is all intermediate
symbolic references are preserved, and are therefore visible
to the application when they walk the target chain. We can
now correctly inspect chains of symbolic references.
As a result of this change the Ref.getOrigName() method has been
removed from the API. Applications should identify a symbolic
reference by testing for isSymbolic() and not by using an arcane
string comparison between properties.
Abstract the RefDatabase storage:
---------------------------------
RefDatabase is now abstract, similar to ObjectDatabase, and a
new concrete implementation called RefDirectory is used for the
traditional on-disk storage layout. In the future we plan to
support additional implementations, such as a pure in-memory
RefDatabase for unit testing purposes.
Optimize RefDirectory:
----------------------
The implementation of the in-memory reference cache, reading, and
update routines has been completely rewritten. Much of the code
was heavily borrowed or cribbed from the prior implementation,
so copyright notices have been left intact as much as possible.
The RefDirectory cache no longer confuses symbolic references
with normal references. This permits the cache to resolve the
value of a symbolic reference as late as possible, ensuring it
is always current, without needing to maintain reverse pointers.
The cache is now 2 sorted RefLists, rather than 3 HashMaps.
Using sorted lists allows the implementation to reduce the
in-memory footprint when storing many refs. Using specialized
types for the elements allows the code to avoid additional map
lookups for auxiliary stat information.
To improve scan time during getRefs(), the lists are returned via
a copy-on-write contract. Most callers of getRefs() do not modify
the returned collections, so the copy-on-write semantics improves
access on repositories with a large number of packed references.
Iterator traversals of the returned Map<String,Ref> are performed
using a simple merge-join of the two cache lists, ensuring we can
perform the entire traversal in linear time as a function of the
number of references: O(PackedRefs + LooseRefs).
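The merge-join traversal described above can be sketched as follows. `MergeJoinSketch` and its `Entry` type are hypothetical stand-ins for the cache lists and Ref objects, and letting the loose entry win on a name collision is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical merge-join over two name-sorted lists. */
final class MergeJoinSketch {
    static final class Entry {
        final String name;
        final String id;

        Entry(String name, String id) {
            this.name = name;
            this.id = id;
        }

        @Override
        public String toString() {
            return name + "=" + id;
        }
    }

    /** Runs in O(packed + loose); each input is visited exactly once. */
    static List<Entry> merge(List<Entry> packed, List<Entry> loose) {
        List<Entry> out = new ArrayList<>(packed.size() + loose.size());
        int p = 0;
        int l = 0;
        while (p < packed.size() && l < loose.size()) {
            int cmp = packed.get(p).name.compareTo(loose.get(l).name);
            if (cmp < 0) {
                out.add(packed.get(p++));
            } else if (cmp > 0) {
                out.add(loose.get(l++));
            } else {
                out.add(loose.get(l++)); // same name: loose overrides packed
                p++;
            }
        }
        while (p < packed.size()) {
            out.add(packed.get(p++));
        }
        while (l < loose.size()) {
            out.add(loose.get(l++));
        }
        return out;
    }
}
```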
Scans of the loose reference space to update the cache run in
O(LooseRefs log LooseRefs) time, as the directory contents
are sorted before being merged against the in-memory cache.
Since the majority of stable references are kept packed, there
typically are only a handful of reference names to be sorted,
so the sorting cost should not be very high.
Locking is reduced during getRefs() by taking advantage of the
copy-on-write semantics of the improved cache data structure.
This permits concurrent readers to pull back references without
blocking each other. If there is contention updating the cache
during a scan, one or more updates are simply skipped and will
get picked up again in a future scan.
Writing to the $GIT_DIR/packed-refs during reference delete is
now fully atomic. The file is locked, reparsed fresh, and written
back out if a change is necessary. This avoids all race conditions
with concurrent external updates of the packed-refs file.
The RefLogWriter class has been fully folded into RefDirectory
and is therefore deleted. Maintaining the reference's log is
the responsibility of the database implementation, and not all
implementations will use java.io for access.
Future work still remains to be done to abstract the ReflogReader
class away from local disk IO.
Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 23 Jan 2010 00:27:03 +0000 (16:27 -0800)]
Create new RefList and RefMap utility types
These types can be used by RefDatabase implementations to manage
the collection.
A RefList stores items sorted by their name, and is an immutable
type using copy-on-write semantics to perform modifications to
the collection. Binary search is used to locate an existing item
by name, or to locate the proper insertion position if an item does
not exist.
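A minimal sketch of the idea behind such a list, assuming plain strings instead of Ref objects. `SortedNamesSketch` is hypothetical and far simpler than the RefList type this commit introduces.

```java
import java.util.Arrays;

/** Hypothetical immutable, name-sorted list with copy-on-write updates. */
final class SortedNamesSketch {
    private final String[] names; // sorted; never modified after construction

    SortedNamesSketch() {
        this(new String[0]);
    }

    private SortedNamesSketch(String[] names) {
        this.names = names;
    }

    int size() {
        return names.length;
    }

    /** Index of name, or -(insertionPoint + 1) when it is absent. */
    int find(String name) {
        return Arrays.binarySearch(names, name);
    }

    /** Returns a new list containing name; this instance is unchanged. */
    SortedNamesSketch add(String name) {
        int idx = find(name);
        if (idx >= 0) {
            return this;
        }
        int at = -(idx + 1);
        String[] copy = new String[names.length + 1];
        System.arraycopy(names, 0, copy, 0, at);
        copy[at] = name;
        System.arraycopy(names, at, copy, at + 1, names.length - at);
        return new SortedNamesSketch(copy);
    }
}
```

Because readers only ever see a fully built array, they need no locking; a writer publishes a new instance instead of mutating the old one.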
A RefMap can merge up to 3 RefList collections at once during its
entry iteration, allowing items in the resolved or loose RefList
to override items by the same name in the packed RefList.
The RefMap's goal is O(log N) lookup time, and O(N) iteration time,
which is suitable for returning from a RefDatabase. By relying on
the immutable RefList we might be able to make map construction
nearly constant, making Repository.getAllRefs() an inexpensive
operation if the caches are current. Since modification is not
common, changes require up to O(N + log N) time to copy the internal
list and collapse or expand the list's array. As most changes
are made to the loose collection and not the packed collection,
in practice most changes would require less than the full O(N)
time, due to a significantly smaller N in the loose list.
Almost complete test coverage is included in the corresponding
unit tests. A handful of methods on RefMap are not tested in this
change, as writing the proper test depends on a future refactoring
of how the Ref class represents symbolic reference names.
Change-Id: Ic2095274000336556f719edd75a5c5dd6dd1d857 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Added caching for loose object lookup during pack indexing
On Windows systems, file system lookup is a slow operation, so
checking whether each object exists during indexing (after receiving
the pack) could take a significant time. This patch introduces
CachedObjectDirectory that pre-caches lookup results.
Robin Rosenberg [Thu, 14 Jan 2010 22:53:11 +0000 (23:53 +0100)]
Introduce a named constant for the .git directory.
Not all occurrences of ".git" are replaced by this constant, only
those where it actually refers to the directory with that name, i.e.
not the ".git" directory suffix.
Asserts and comments are also excluded from replacement.
Change-Id: I65a9da89aedd53817f2ea3eaab4f9c2bed35d7ee Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Shawn O. Pearce [Wed, 6 Jan 2010 18:21:05 +0000 (10:21 -0800)]
client side smart HTTP
During fetch over http:// clients now try to take advantage of
the info/refs?service=git-upload-pack URL to determine if the
remote side will support a standard upload-pack command stream.
If so, each block of 32 'have' lines is sent in one POST request,
prefixed by all of the 'want' lines and any previously discovered
common bases as 'have' lines.
During push over http:// clients now try to take advantage of
the info/refs?service=git-receive-pack URL to determine if the
remote side will support a standard receive-pack command stream.
If so, commands are sent along with their pack in a single HTTP
POST request.
Bug: 291002
Change-Id: I8c69b16ac15c442e1a4c3bd60b4ea1a47882b851 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 6 Jan 2010 19:13:25 +0000 (11:13 -0800)]
server side: smart fetch over HTTP
Clients can request smart fetch support by examining the info/refs URL
with the service parameter set to the magic git-upload-pack string:
  GET /$GIT_DIR/info/refs?service=git-upload-pack HTTP/1.1
The response is formatted with the upload pack capabilities, using
the standard packet line formatter. A special header line is put
in front of the standard upload-pack advertisement to let clients
know the service was recognized and is supported.
If the requested service is disabled an authorization status code is
returned, allowing the user agent to retry once they have obtained
credentials from a human, in case authentication is required by
the configured UploadPackFactory implementation.
Change-Id: Ib0f1a458c88b4b5509b0f882f55f83f5752bc57a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
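For the commit above, a small sketch of packet line framing and of the service header. The exact header payload ("# service=" plus the service name and a newline, followed by a flush packet) comes from the Git smart HTTP protocol description, not from this log, and `PktLineSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical formatter for pkt-line framing. */
final class PktLineSketch {
    /** Four lowercase hex digits giving the total length, then the payload. */
    static String pktLine(String payload) {
        int len = payload.getBytes(StandardCharsets.UTF_8).length + 4;
        return String.format("%04x", len) + payload;
    }

    /** Service announcement followed by a flush packet ("0000"). */
    static String serviceHeader(String service) {
        return pktLine("# service=" + service + "\n") + "0000";
    }
}
```

A client that sees this header in front of the advertisement knows the server recognized the requested service; the regular ref advertisement follows it.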
Shawn O. Pearce [Wed, 6 Jan 2010 19:13:05 +0000 (11:13 -0800)]
server side: smart push over HTTP
Clients can request smart push support by examining the info/refs URL
with the service parameter set to the magic git-receive-pack string:
  GET /$GIT_DIR/info/refs?service=git-receive-pack HTTP/1.1
The response is formatted with the receive pack capabilities, using
the standard packet line formatter. A special header block is put
in front of the standard receive-pack advertisement to let clients
know the service was recognized and is supported.
If the requested service is disabled an authorization status code is
returned, allowing the user agent to retry once they have obtained
credentials from a human, in case authentication is required by
the configured ReceivePackFactory implementation.
Change-Id: Ie4f6e0c7b68a68ec4b7cdd5072f91dd406210d4f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 6 Jan 2010 20:26:54 +0000 (12:26 -0800)]
Simple dumb HTTP server for Git
This is a simple HTTP server that provides the minimum server side
support required for dumb (non-git aware) transport clients.
We produce the info/refs and objects/info/packs file on the fly
from the local repository state, but otherwise serve data as raw
files from the on-disk structure.
In the future we could better optimize the FileSender class and the
servlets that use it to take advantage of direct file to network
APIs in more advanced servlet containers like Jetty.
Our glue package borrows the idea of a micro embedded DSL from
Google Guice and uses it to configure a collection of Filters
and HttpServlets, all of which are matched against requests using
regular expressions. If a subgroup exists in the pattern, it is
extracted and used for the path info component of the request.
Change-Id: Ia0f1a425d07d035e344ae54faf8aeb04763e7487 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
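The regular expression matching described in the commit above could be sketched like this. `PathInfoSketch` and the example pattern are hypothetical, and returning the whole path when the pattern has no subgroup is an assumption; the commit only says what happens when a subgroup exists.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; derives path info from a matched request path. */
final class PathInfoSketch {
    /** Returns null when the pattern does not match the request path. */
    static String pathInfo(Pattern pattern, String requestPath) {
        Matcher m = pattern.matcher(requestPath);
        if (!m.matches()) {
            return null;
        }
        // Use the first subgroup as path info when the pattern has one.
        return m.groupCount() >= 1 ? m.group(1) : requestPath;
    }
}
```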
Shawn O. Pearce [Wed, 7 Oct 2009 02:23:33 +0000 (19:23 -0700)]
Expose RefAdvertiser for reuse outside of the transport package
By making this class and its methods public, and the actual writing
abstract, we can reuse this code for other formats like writing an
info/refs file for HTTP transports.
Change-Id: Id0e349c30a0f5a8c1527e0e7383b80243819d9c5 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 7 Oct 2009 07:10:51 +0000 (00:10 -0700)]
Teach UploadPack how to use an RPC style interface
If biDirectionalPipe is false UploadPack does not start out with
the advertisement but instead assumes it should read one block of
want/have lines, process that, and write the ACK/NAKs out.
This means it only is doing one read through the input followed by
one write to the output, which fits with the HTTP request processing
model, and any other type of RPC system.
Change-Id: Ia9f7c46ee556f996367180f15d2caa8572cdd59f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 7 Oct 2009 01:43:41 +0000 (18:43 -0700)]
Teach ReceivePack how to use an RPC style interface
If biDirectionalPipe is false ReceivePack does not start out with the
advertisement but instead assumes it should read the command set once,
process that, and write the status report out. This means it only is
doing one read through the input followed by one write to the output,
which fits with the HTTP request processing model, and any other type
of RPC system... assuming that the payload for input can be a very big
entity like the command stream followed by the pack file.
Change-Id: I6f31f6537a3b7498803a8a54e10b0622105718c1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 27 Nov 2009 04:16:30 +0000 (20:16 -0800)]
Refactor TemporaryBuffer to support reuse in other contexts
Later we are going to add support for smart HTTP, which requires us to
buffer at least some of the request created by a client before we ship
it to the server. For many requests, we can fit it completely into a
1 MiB buffer, but if it doesn't we can drop back to using the chunked
transfer encoding to send an unknown stream length.
Rather than recoding the block based memory buffer, we refactor the
local file overflow strategy into a subclass, allowing the HTTP client
code to replace this portion of the logic with its own approach to
start the chunked encoding request.
Change-Id: Iac61ea1017b14e0ad3c4425efc3d75718b71bb8e Signed-off-by: Shawn O. Pearce <sop@google.com>
Shawn O. Pearce [Wed, 4 Nov 2009 02:00:50 +0000 (18:00 -0800)]
Implement multi_ack_detailed protocol extension
The multi_ack_detailed extension breaks out the "ACK %s continue" status
code into "ACK %s common" and "ACK %s ready" states, making it easier to
discover which objects are truly common, and which objects are simply
on a chain the server doesn't care to learn about.
Change-Id: Ie8e907424cfbbba84996ca205d49eacf339f9d04 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
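A sketch of how a client might classify the status lines named in the commit above. `AckLineSketch` is hypothetical; the bare "ACK %s" and "NAK" forms come from the general fetch protocol rather than from this log.

```java
/** Hypothetical classifier for upload-pack negotiation replies. */
final class AckLineSketch {
    enum Kind { NAK, ACK, ACK_CONTINUE, ACK_COMMON, ACK_READY }

    static Kind parse(String line) {
        if (line.equals("NAK")) {
            return Kind.NAK;
        }
        String[] parts = line.split(" ");
        if (!parts[0].equals("ACK") || parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Unexpected line: " + line);
        }
        if (parts.length == 2) {
            return Kind.ACK; // final acknowledgement, no status word
        }
        switch (parts[2]) {
        case "continue":
            return Kind.ACK_CONTINUE; // multi_ack
        case "common":
            return Kind.ACK_COMMON; // multi_ack_detailed
        case "ready":
            return Kind.ACK_READY; // multi_ack_detailed
        default:
            throw new IllegalArgumentException("Unexpected line: " + line);
        }
    }
}
```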
Shawn O. Pearce [Tue, 5 Jan 2010 19:44:52 +0000 (11:44 -0800)]
Abstract out utility functions for creating test commits
These routines create a fairly clean DSL for writing out the
structure of a repository in a test case. Abstract them into
a helper class that we can reuse in other test environments.
Change-Id: I55cce3d557e1a28afe2fdf37b3a5b67e2651c9f1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 6 Jan 2010 23:16:05 +0000 (15:16 -0800)]
Fix PersonIdent to always use SystemReader
Under unit tests we want the when and timezone to come from the
MockSystemReader and be stable. We did this for the default
constructor based on the Repository, but failed to do it for the
name,emailAddress variant of the constructor.
Change-Id: I608ac7cf01673729303395e19b379b38fef136b3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Mon, 4 Jan 2010 23:00:45 +0000 (15:00 -0800)]
Fix RefWriter creation of info/refs to omit HEAD
We really mean to omit HEAD here, but botched the difference between
getOrigName and getName on the Ref object. We tested on the wrong
value, picking up the target of the symbolic ref and therefore
including it twice.
Change-Id: If780c65166ccada2e63a4f42bbab752a56b16564 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 12 Jan 2010 19:41:35 +0000 (11:41 -0800)]
Finish removing Apache Felix maven-bundle-plugin
Since Robin reverted using the maven-bundle-plugin to produce the
OSGi manifest, there is no reason for us to reference it from our
build process anymore.
Also, when Robin reverted to the Eclipse way of doing things,
we failed to update the ignore files to ignore our generated files
but not ignore our tracked .classpath.
Finally, we cannot delete the MANIFEST.MF file during a Maven build,
as this is once again a source file.
Change-Id: I53f77f2002cb4285f728968829560e835651e188 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Robin Rosenberg [Sun, 10 Jan 2010 12:46:33 +0000 (13:46 +0100)]
Partial revert "Switch build to Apache Felix maven-bundle-plugin"
This restores the ability to build using just Eclipse without
strange procedures, extra plugins and it is again possible to
work on both JGit and EGit in the same Eclipse workspace with
ease.
Change-Id: I0af08127d507fbce186f428f1cdeff280f0ddcda Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Igor Fedorenko [Thu, 7 Jan 2010 02:18:44 +0000 (21:18 -0500)]
Explicitly release resources used by java.util.zip.Deflater
Deflater can use a significant amount of native (i.e. C) heap
space. Failure to promptly release this memory results
in a native memory leak in some cases, particularly severe for
VMs with large java max heap size. For example, running
Team->Commit in one of my EGit workspaces results in ~500M
java process size increase without any significant change
to amount of used java heap when JVM is started with -Xmx1024m.
Change-Id: I649679a8df5683ebedd9380d703513d31c625932 Signed-off-by: Igor Fedorenko <igor@ifedorenko.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
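As an illustration of the pattern the commit above describes, here is a minimal sketch that calls Deflater.end() in a finally block so the native buffers are released as soon as compression finishes, instead of waiting for garbage collection. The class and method names (DeflaterRelease, compress) are hypothetical and are not taken from the JGit change itself.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

final class DeflaterRelease {
    /** Compresses data and always frees the Deflater's native memory. */
    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            // Release the native (C heap) buffers now rather than at finalization.
            deflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] packed = compress(new byte[1000]);
        System.out.println("compressed 1000 bytes to " + packed.length);
    }
}
```

The try/finally shape means end() runs even when deflate() throws, which is the case most likely to leak otherwise.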
Igor Fedorenko [Wed, 6 Jan 2010 23:51:39 +0000 (18:51 -0500)]
Use build timestamp as OSGi version qualifier for SNAPSHOT builds
Default maven-bundle-plugin behaviour results in use of the same
.SNAPSHOT OSGi bundle version qualifier for all snapshot builds.
This causes problems for the Eclipse update manager and other consumers
that rely on OSGi bundle metadata to select the "newer" or "best
matching" version of the jgit bundle.
To solve the problem, maven-bundle-plugin is configured to replace
.SNAPSHOT with the build timestamp in a format like 20100106-1234.
Change-Id: I0999c7bd68aa2ee74dffaed54a8dc4e1b67cf80d Signed-off-by: Igor Fedorenko <igor@ifedorenko.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
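The effect of the qualifier substitution can be sketched in plain Java. This is only an illustration of the string transformation, not the plugin configuration from the commit: the class name SnapshotQualifier, the yyyyMMdd-HHmm pattern (inferred from the 20100106-1234 example), and the use of UTC are all assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

final class SnapshotQualifier {
    private static final String SUFFIX = ".SNAPSHOT";

    /** Replaces a trailing .SNAPSHOT qualifier with a build timestamp. */
    static String qualify(String version, Date buildTime) {
        if (!version.endsWith(SUFFIX)) {
            return version; // release versions are left untouched
        }
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd-HHmm");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumed time zone
        String base = version.substring(0, version.length() - SUFFIX.length());
        return base + "." + fmt.format(buildTime);
    }

    public static void main(String[] args) {
        System.out.println(qualify("0.6.0.SNAPSHOT", new Date()));
    }
}
```

Because the timestamp sorts lexicographically in build order, a later snapshot compares as a newer OSGi version than an earlier one.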
Shawn O. Pearce [Wed, 6 Jan 2010 17:53:45 +0000 (09:53 -0800)]
Merge branch 'cq-diff'
Per CQ 3559 "JGit - Eugene Myers O(ND) difference algorithm" we
have approval to check this into our master branch.
* cq-diff:
Add file content merge algorithm
Add performance tests for MyersDiff
Add javadoc comments, remove unused code, shift comments to correct place
Fixed MyersDiff to be able to handle more than 100k
Fix some warnings regarding unnecessary imports and accessing static methods
Add the "jgit diff" command
Prepare RawText for diff-index and diff-files
Add a test class for Myers' diff algorithm
Add Myers' algorithm to generate diff scripts
Add set to IntList
Adds the file content merge algorithm and tests for merge to jgit.
The merge algorithm:
- Gets as input parameters the common base, the two new contents
called "ours" and "theirs".
- Computes the Edits from base to ours and from base to theirs with
the help of MyersDiff.
- Iterates over the edits.
- Independent edits from ours or from theirs will just be applied
to the result.
- For conflicting edits we first harmonize the ranges of the edits
so that in the end we have exactly two edits starting and ending
at the same points in the common base. Then we write the two
conflicting contents into the result stream.
Change-Id: I411862393e7bf416b6f33ca55ec5af608ff4663 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
[sp: Fixed up two awkward comments in documentation.] Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
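The range harmonization step described in the merge commit above can be sketched as follows. This is a simplified illustration, not the JGit implementation: it works only on line ranges in the common base (no content), assumes each side's edits are sorted and non-overlapping, and uses hypothetical names (EditRanges, Range, conflicts).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EditRanges {
    /** Half-open range [begin, end) of base lines touched by one edit. */
    static final class Range {
        final int begin;
        final int end;

        Range(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + begin + "," + end + ")";
        }
    }

    /** Returns one harmonized base range per group of overlapping edits. */
    static List<Range> conflicts(List<Range> ours, List<Range> theirs) {
        List<Range> result = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < ours.size() && j < theirs.size()) {
            Range o = ours.get(i);
            Range t = theirs.get(j);
            if (o.end <= t.begin) { i++; continue; } // independent edit from ours
            if (t.end <= o.begin) { j++; continue; } // independent edit from theirs
            // Overlap: grow a common region until no further edit reaches into it.
            int begin = Math.min(o.begin, t.begin);
            int end = Math.max(o.end, t.end);
            i++;
            j++;
            boolean grew = true;
            while (grew) {
                grew = false;
                if (i < ours.size() && ours.get(i).begin < end) {
                    end = Math.max(end, ours.get(i++).end);
                    grew = true;
                }
                if (j < theirs.size() && theirs.get(j).begin < end) {
                    end = Math.max(end, theirs.get(j++).end);
                    grew = true;
                }
            }
            result.add(new Range(begin, end));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Range> ours = Arrays.asList(new Range(0, 2), new Range(5, 8));
        List<Range> theirs = Arrays.asList(new Range(1, 3), new Range(10, 12));
        System.out.println(conflicts(ours, theirs));
    }
}
```

Edits that never appear in a returned range are the independent ones, which the algorithm applies directly to the result. Adjacent edits that merely touch are treated as independent here, which is a choice of this sketch.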
Shawn O. Pearce [Sat, 28 Nov 2009 01:22:40 +0000 (17:22 -0800)]
UnionInputStream: combines sequential InputStreams into one
The UnionInputStream utility class combines multiple sequential
InputStreams so they appear to the caller as a single stream with
no gaps. This can be used to concatenate streams coming from multiple
independent HTTP connections (for example).
The companion unit test covers the class's full functionality.
Change-Id: I0676c7b5e082a5886bf0e8f43f9fd6c46a666228 Signed-off-by: Shawn O. Pearce <sop@google.com>
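To show the idea of presenting several sequential streams as one, the sketch below uses the JDK's java.io.SequenceInputStream as a stand-in. It is not the UnionInputStream class added by the commit above, whose API is not shown here; the helper names (ConcatStreams, concat, drain) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ConcatStreams {
    /** Presents the given streams, in order, as a single stream. */
    static InputStream concat(List<? extends InputStream> parts) {
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    /** Reads a stream to its end and closes it. */
    static byte[] drain(InputStream in) {
        try (InputStream src = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            int n;
            while ((n = src.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream a = new ByteArrayInputStream("first,".getBytes(StandardCharsets.UTF_8));
        InputStream b = new ByteArrayInputStream("second".getBytes(StandardCharsets.UTF_8));
        byte[] all = drain(concat(Arrays.asList(a, b)));
        System.out.println(new String(all, StandardCharsets.UTF_8));
    }
}
```

The caller reads until end of stream and never sees the boundary between the underlying sources.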
Shawn O. Pearce [Mon, 28 Dec 2009 20:01:19 +0000 (12:01 -0800)]
Switch build to Apache Felix maven-bundle-plugin
Tycho isn't production ready for projects like JGit to be using as
their primary build driver. Some problems we ran into with Tycho
0.6.0 that are preventing us from using it are:
* Tycho can't run offline
The P2 artifact resolver cannot perform its work offline. If the
build system has no network connection, it cannot compile a
project through Tycho. This is insane for a distributed version
control system where developers are used to being offline during
development and local testing.
* Magic state in ~/.m2/repository/.meta/p2-metadata.properties
Earlier iterations of this patch tried to use a hybrid build,
where Tycho was only used for the Eclipse specific feature and P2
update site, and maven-bundle-plugin was used for the other code.
This build seemed to work, but only due to magic Tycho specific
state held in my local home directory. This means builds are not
consistently repeatable across systems, and led me to believe
I had a valid build, when in fact I did not.
* Manifest-first build produces incomplete POMs
The POM created by the manifest-first build format does not
contain the dependency chain, leading a downstream consumer to
not import the runtime dependencies necessary to execute the
bundle it has imported. In JGit's case, this means JSch isn't
included in our dependency chain.
* Manifest-first build produces POMs unreadable by Maven 2.x
JGit has existing application consumers who are relying on
Maven 2.x builds. Forcing them to step up to an alpha release
of Maven 3 is simply unacceptable.
* OSGi bundle export data management is tedious
Editing each of our pom.xml files to mark a new release is
difficult enough as it is. Editing every MANIFEST.MF file to
list our exported packages and their current version number is
something a machine should do, not a human. Yet the Tycho OSGi
way unfortunately demands that a human do this work.
* OSGi bundle import data management is tedious
There isn't a way in the MANIFEST.MF file format to reuse the
same version tags across all of our imports, but we want to have
a consistent view of our dependencies when we compile JGit.
After wasting more than 2 full days trying to get Tycho to work,
I've decided it's a lost cause right now. We need to be chasing down
bugs and critical features, not trying to bridge the gap between
the stable Maven repository format and the undocumented P2 format
used only by Eclipse.
So, switch the build to use Apache Felix's maven-bundle-plugin.
This is the same plugin Jetty uses to produce their OSGi bundle
manifests, and is the same plugin used by the Apache Felix project,
which is an open-source OSGi runtime. It has a reasonable number
of folks using it for production builds, and is running on top of
the stable Maven 2.x code base.
With this switch we get automatically generated MANIFEST.MF files
based on reasonably sane default rules, which reduces the amount
of things we have to maintain by hand. When necessary, we can add
a few lines of XML to our POMs to tweak the output.
Our build artifacts are still fully compatible with Maven 2.x, so
any downstream consumers are still able to use our build products,
without stepping up to Maven 3.x. Our artifacts are also valid as
OSGi bundles, provided they are organized on disk into a repository
that the runtime can read.
With maven-bundle-plugin the build runs offline, as much as Maven
2.x is able to run offline anyway, so we're able to return to a
distributed development environment again.
By generating MANIFEST.MF at the top level of each project (and
therefore outside of the target directory), we're still compatible
with Eclipse's PDE tooling. Our projects can be imported as standard
Maven projects using the m2eclipse plugin, but the PDE will think
they are valid plugins and make them available for plugin builds,
or while debugging another workbench.
This change also completely removes Tycho from the build.
Unfortunately, Tycho 0.6.0's pom-first dependency resolver is broken
when resolving a pom-first plugin bundle through a manifest-first
feature package, so bundle org.eclipse.jgit can't be resolved,
even though it might actually exist in the local Maven repository.
Rather than fight with Tycho any further, I'm just declaring it
plugina-non-grata and ripping it out of the build.
Since there are very few tools to build a P2 format repository, and
no documentation on how to create one without running the Eclipse
UI manually by poking buttons, I'm declaring that we are not going
to produce a P2 update site from our automated builds.
Change-Id: If7938a86fb0cc8e25099028d832dbd38110b9124 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Robin Rosenberg [Wed, 9 Dec 2009 08:54:08 +0000 (09:54 +0100)]
Recognize Git repository environment variables
This makes the jgit command line behave like the C Git implementation
in this respect.
These variables are not recognized in the core, though we add support
to do the overrides there. Hence other users of the JGit library, like
the Eclipse plugin and others, will not be affected.
GIT_DIR
The location of the ".git" directory.
GIT_WORK_TREE
The location of the work tree.
GIT_INDEX_FILE
The location of the index file.
GIT_CEILING_DIRECTORIES
A colon (semicolon on Windows) separated list of paths that
JGit will not cross when looking for the .git directory.
GIT_OBJECT_DIRECTORY
The location of the objects directory under which objects are
stored.
GIT_ALTERNATE_OBJECT_DIRECTORIES
A colon (semicolon on Windows) separated list of object directories
to search for objects.
In addition to these we support the core.worktree config setting when
the git directory is set deliberately instead of being found.
Change-Id: I2b9bceb13c0f66b25e9e3cefd2e01534a286e04c Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
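A minimal sketch of reading two of the variables listed above. This is not the code from the commit: the class GitEnvSketch and its methods are hypothetical, and only System.getenv() and File.pathSeparator (":" on Unix, ";" on Windows) are real JDK facilities.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class GitEnvSketch {
    /** Returns the directory named by GIT_DIR, or the fallback when unset. */
    static File gitDir(Map<String, String> env, File fallback) {
        String value = env.get("GIT_DIR");
        return (value == null || value.isEmpty()) ? fallback : new File(value);
    }

    /** Splits a list-valued variable on the platform path separator. */
    static List<File> pathList(Map<String, String> env, String name) {
        List<File> paths = new ArrayList<>();
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            return paths;
        }
        for (String p : value.split(Pattern.quote(File.pathSeparator))) {
            if (!p.isEmpty()) {
                paths.add(new File(p));
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("git dir:  " + gitDir(env, new File(".git")));
        System.out.println("ceilings: " + pathList(env, "GIT_CEILING_DIRECTORIES"));
    }
}
```

Passing the environment in as a Map keeps the lookup out of the core logic, which mirrors the commit's point that only the command line, not library users, should pick these variables up.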
Shawn O. Pearce [Mon, 28 Dec 2009 23:55:49 +0000 (15:55 -0800)]
Use Constants.OBJECT_ID_STRING_LENGTH instead of LEN * 2
A few locations were doing OBJECT_ID_LENGTH * 2 on their own, as
the old STR_LEN constant wasn't visible. Replace them with the
new public constant OBJECT_ID_STRING_LENGTH.
Change-Id: Id39bddb52de8c65bb097de042e9d4ed99598201f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Robin Rosenberg [Mon, 28 Dec 2009 15:54:43 +0000 (16:54 +0100)]
Get rid of a duplicate constant for SHA-1 length
Since Constants.OBJECT_ID_LENGTH is a compile time constant we
can be sure that it will always be inlined. The same goes for the
associated constant STR_LEN which is now refactored to the Constant
class and given a name better suited for wider use.
Change-Id: I03f52131e64edcd0aa74bbbf36e7d42faaf4a698 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
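The relationship between the two constants in the commits above can be illustrated with a small sketch. The constants are redeclared locally in a hypothetical class (ObjectIdHex) so the example is self-contained; the value 20 is the SHA-1 digest length in bytes, and each byte takes two hex characters.

```java
final class ObjectIdHex {
    // Local stand-ins for the JGit constants named in the commit messages.
    static final int OBJECT_ID_LENGTH = 20;
    static final int OBJECT_ID_STRING_LENGTH = OBJECT_ID_LENGTH * 2;

    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Formats a raw 20-byte object id as 40 lowercase hex characters. */
    static String toHex(byte[] id) {
        if (id.length != OBJECT_ID_LENGTH) {
            throw new IllegalArgumentException("expected " + OBJECT_ID_LENGTH + " bytes");
        }
        char[] out = new char[OBJECT_ID_STRING_LENGTH];
        for (int i = 0; i < id.length; i++) {
            out[2 * i] = DIGITS[(id[i] >> 4) & 0xf]; // high nibble
            out[2 * i + 1] = DIGITS[id[i] & 0xf];    // low nibble
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(toHex(new byte[OBJECT_ID_LENGTH]));
    }
}
```

Both fields are static final and initialized from constant expressions, so the Java compiler inlines them at each use site.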
Added -crlf attribute for DiffFormatterReflowTest test data
The test data is expected to have Unix newlines by tests, but it
is converted to CRLF on the Windows platform (with msys git). As a result
DiffFormatterReflowTest tests fail. To prevent this problem,
CRLF conversion is disabled for test data related to that test.
Nico Sallembien [Tue, 22 Dec 2009 19:02:24 +0000 (11:02 -0800)]
Fix typo in ReceivePack.java
The comment indicates that a well-behaved client should not have
sent an update for a ref that already exists, but this is in a block
that corresponds to a create command.
Igor Fedorenko [Sat, 19 Dec 2009 00:06:38 +0000 (19:06 -0500)]
Use Tycho version 0.6.0
Changed Tycho version from 0.6.0-SNAPSHOT to 0.6.0 (i.e. release).
SNAPSHOT versions are transient and should be used for testing
purposes only. Also removed the now unnecessary <pluginRepositories/>
element from JGit parent pom.xml file.
Change-Id: Ie386b2dbcba43c1ccec10465978d12d6829c6150 Signed-off-by: Igor Fedorenko <igor@ifedorenko.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>