RefDirectory was not using FileSnapshot correctly in all places. This
is fixed with this commit. Additionally the constructors for the
different types of refs have been changed to take a FileSnapshot
instead of a modification time.
Change-Id: Ifb6a59e87e8b058a398c38cdfb9d648f0bad4bf8 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Shawn O. Pearce [Fri, 13 May 2011 16:16:58 +0000 (09:16 -0700)]
DHT: Add sequence RefData
RefData now uses a sequence number as part of the field, ensuring
that updates always increase the sequence number by one whenever
a reference is modified.
Attaching a sequence number to RefData will help with storing
reference log entries during updates. As the sequence number should
be unique within the reference name space, log entries can be keyed
by the sequence number and remain unique. Making this work over
reference delete-create cycles will require an additional RefTable
API to return the oldest sequence number previously used in the
reference log to seed the recreated reference.
Change-Id: I11cfff2a96ef962e57f29925a3eef41bdbf9f9bb Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Fri, 13 May 2011 14:44:42 +0000 (07:44 -0700)]
DHT: Replace TinyProtobuf with Google Protocol Buffers
The standard Google distribution of Protocol Buffers in Java is better
maintained than TinyProtobuf, and should be faster for most uses. It
does use slightly more memory due to many of our key types being
stored as strings in protobuf messages, but this is probably worth the
small hit to memory in exchange for better maintained code that is
easier to reuse in other applications.
Exposing all of our data members to the underlying implementation
makes it easier to develop reporting and data mining tools, or to
expand out a nested structure like RefData into a flat format in a SQL
database table.
Since the C++ `protoc` tool is necessary to convert the protobuf
script into Java code, the generated files are committed as part of
the source repository to make it easier for developers who do not have
this tool installed to still build the overall JGit package and make
use of it. Reviewers will need to be careful to ensure that any edits
made to a *.proto file come in a commit that also updates the
generated code to match.
CQ: 5135
Change-Id: I53e11e82c186b9cf0d7b368e0276519e6a0b2893 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Thu, 5 May 2011 18:18:54 +0000 (11:18 -0700)]
DHT: Remove per-process ChunkCache
Performance testing has indicated the per-process ChunkCache isn't
very effective for the DHT storage implementation. If a server is
using the DHT storage backend, it is most likely part of a larger
cluster where requests are distributed in a round-robin fashion
between the member servers.
In such a scenario there is insufficient data locality between
requests to get a good hit ratio on the per-process ChunkCache. A low
hit ratio means the cache is actually hurting performance by eating up
memory that could otherwise be used for transient request data, and
increasing pressure on the GC when it needs to find free space.
Remove all of the ChunkCache code. Installations that want to cache
(to reduce database usage) should wrap their Database with a
CacheDatabase and use a network based CacheServer.
I left the ChunkCache in the original DHT storage commit because I
wanted to document in the history of the project that its probably
worth *not* having, but leave open a door for someone to revert this
change if they find otherwise at a later date.
Change-Id: I364d0725c46c5a19f7443642a40c89ba4d3fdd29 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Stefan Lay [Tue, 24 May 2011 08:38:59 +0000 (01:38 -0700)]
Add a DiffFormatter which calculates a patch-id
Adds a class which can be used to calculates a SHA1 of the diff
associated with a patch, similar to git patch-id.
In this version whitespace is not ignored.
Change-Id: I421d15ea905e23df543082786786841cbe3ef10d Signed-off-by: Stefan Lay <stefan.lay@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Let RefDirectory use FileSnapShot to handle fast updates
Since this change may affect performance and memory consumption on every
access to a loose ref I explicitly made it a RFC to collect opinions.
Previously RefDirectory.scanRef() was not detecting an update of a
loose ref when the update didn't changed the modification time of
the backing file. RefDirectory cached loose refs and the way to detect
outdated cache entries was to compare lastmodification timestamp on the
file representing the ref. If two updates to the same ref happen faster
than the filesystem-timer granularity (for linux this is 2 seconds)
there is the possiblity that we don't detect the update.
Because of this bug EGit's PushOperationTest only works with 2 second
sleeps inside.
This change let RefDirectory use FileSnapshot to detect such situations.
FileSnapshot helps to remember when a file was last read from disk and
therefore enables to decide when to load a file from disk although
modification time has not changed.
Change-Id: I03b9a137af097ec69c4c5e2eaa512d2bdd7fe080 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Bernard Leach [Sun, 22 May 2011 14:20:32 +0000 (00:20 +1000)]
Remove rebase temporary files on checkout failure
A checkout conflict during rebase setup should leave the repository
in SAFE state which means ensuring that the rebase temporary files
need to be removed.
Bug: 346813
Change-Id: If8b758fde73ed5a452a99a195a844825a03bae1a Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Bernard Leach [Fri, 13 May 2011 05:59:57 +0000 (15:59 +1000)]
Create a MergeResult for deleted/modified files
Change Ia2ab4f8dc95020f2914ff01c2bf3b1bc62a9d45d added merge
support for when OURS or THEIRS was simultaneously deleted
and modified. That changeset however did not add create an
entry in the conflicts table so clients would see a CONFLICTING
result but getConflicts() would return null.
This change creates a MergeResult for the conflicting file.
Bug: 345684
Change-Id: I52acb81c1729b49c9fb3e7a477c6448d8e55c317 Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Chris Aniszczyk [Wed, 18 May 2011 16:39:33 +0000 (11:39 -0500)]
Implement rebase ff for upstream branches with merge commits
Change Ib9898fe0f982fa08e41f1dca9452c43de715fdb6 added support for
the 'cherry-pick' fast forward case where the upstream commit history
does not include any merge commits. This change adds support for the
case where merge commits exist and the local branch has no changes.
Bug: 344779
Change-Id: If203ce5aa1b4e5d4d7982deb621b710e71f4ee10 Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Optimize MergeAlgorithm if ours or theirs is empty
Previously when merging two contents with a non-empty base and one of
the contents was empty (size == 0) and the other was modified there
was a potentially expensive calculation until we finally always come
to the same result -> the complete non-deleted content should collide
with the empty content.
This proposal adds an optimization to detect empty input content and
to produce the appropriate result immediatly.
Change-Id: Ie6a837260c19d808f0e99173f570ff96dd22acd3 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Shawn O. Pearce [Mon, 16 May 2011 18:28:23 +0000 (11:28 -0700)]
Fix diff bug on inserted line
For the following patch on the linux 2.6.32 tag:
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -685,6 +685,7 @@ static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sc
JGit produced an incorrect diff, attempting to add a new "}" instead
of the new "#endif" at the end of the hunk. This was caused by a prior
fix for bug 328895 where we wanted to "slide" a diff down in the file
when adding a new method/function and want to show the closing curly
brace as being added after the new method, rather than added onto the
end of the prior function or method just before the insertion point.
Bug: 345956
Change-Id: I32b9e24f1e2980258b1b39dd1807919ab1c5f9b2 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Change-Id: I9e54f3e7e96892b64546270cbdf0308046e1d40c Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Stefan Lay [Fri, 6 May 2011 08:42:51 +0000 (10:42 +0200)]
Fix getHumanishName broken for windows paths
Since d1718a the method getHumanishName was broken on windows since
the URIish is not normalized anymore. For a path like
"C:\gitRepositories\egit" the whole path was returned instead of
"egit".
Bug: 343519
Change-Id: I95056009072b99d32f288966302d0f8188b47836 Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Before this change any files in the conflicting set would
also be listed in the the other IndexDiff Sets which is
confusing. With this change a conflicting file will not
be included in any of the other sets.
Change-Id: Ife9f2652685220bcfddc1f9820423acdcd5acfdc Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Wed, 2 Mar 2011 23:23:30 +0000 (15:23 -0800)]
Store Git on any DHT
jgit.storage.dht is a storage provider implementation for JGit that
permits storing the Git repository in a distributed hashtable, NoSQL
system, or other database. The actual underlying storage system is
undefined, and can be plugged in by implementing 7 small interfaces:
The storage provider interface tries to assume very little about the
underlying storage system, and requires only three key features:
* key -> value lookup (a hashtable is suitable)
* atomic updates on single rows
* asynchronous operations (Java's ExecutorService is easy to use)
Most NoSQL database products offer all 3 of these features in their
clients, and so does any decent network based cache system like the
open source memcache product. Relying only on key equality for data
retrevial makes it simple for the storage engine to distribute across
multiple machines. Traditional SQL systems could also be used with a
JDBC based spi implementation.
Before submitting this change I have implemented six storage systems
for the spi layer:
* Apache HBase[1]
* Apache Cassandra[2]
* Google Bigtable[3]
* an in-memory implementation for unit testing
* a JDBC implementation for SQL
* a generic cache provider that can ride on top of memcache
All six systems came in with an spi layer around 1000 lines of code to
implement the above 7 interfaces. This is a huge reduction in size
compared to prior attempts to implement a new JGit storage layer. As
this package shows, a complete JGit storage implementation is more
than 17,000 lines of fairly complex code.
A simple cache is provided in storage.dht.spi.cache. Implementers can
use CacheDatabase to wrap any other type of Database and perform fast
reads against a network based cache service, such as the open source
memcached[4]. An implementation of CacheService must be provided to
glue this spi onto the network cache.
Shawn O. Pearce [Wed, 4 May 2011 21:55:53 +0000 (14:55 -0700)]
Fix error handling in RepositoryFilter
The filter did not correctly match smart HTTP client requests,
so it always fell back on HTTP status codes for errors. This
usually causes a smart client to retry a dumb request, which
is not what the server wants.
Change-Id: I42592378dc42fbe308ef30a2923786c690f668a9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 15 Mar 2011 01:18:49 +0000 (18:18 -0700)]
Implement the no-done capability
Smart HTTP clients may request both multi_ack_detailed and no-done in
the same request to prevent the client from needing to send a "done"
line to the server in response to a server's "ACK %s ready".
For smart HTTP, this can save 1 full HTTP RPC in the fetch exchange,
improving overall latency when incrementally updating a client that
has not diverged very far from the remote repository.
Unfortuantely this capability cannot be enabled for the traditional
bi-directional connections. multi_ack_detailed has the client sending
more "have" lines at the same time that the server is creating the
"ACK %s ready" and writing out the PACK stream, resulting in some race
conditions and/or deadlock, depending on how the pipe buffers are
implemented. For very small updates, a server might actually be able
to send "ACK %s ready", then the PACK, and disconnect before the
client even finishes sending its first batch of "have" lines. This
may cause the client to fail with a broken pipe exception. To avoid
all of these potential problems, "no-done" is restricted only to the
smart HTTP variant of the protocol.
Change-Id: Ie0d0a39320202bc096fec2e97cb58e9efd061b2d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* stable-0.12:
Fix sorting of names in RefDirectory
Make running static checks configurable in maven build
Add constants for gerrit change id configuration
RefDirectory did not correctly follow the contract of RefList. The
contract says if you use add() method of RefList builder, you MUST
sort() it afterwards, and later every other method assumes that list
is properly sorted (especially the binary search in the find() and
get() methods). Instead RefDirectory class tried to scan the refs
recursively while sorting every folder in the process before
processing and did not call sort().
For example, when scanning the contents of refs/tags project1 string
is smaller than project1-*, so it will recursively go into the folder
and add these tags first and only then will add project-* ones. This
will result in a broken list (any project1-* string is less than
project1/* one, but they all appear after them in the list), that's
why binary search will fail making loose RefList and the whole local
RefMap completely unusable.
Change-Id: Ibad90017e3b2435b1396b69a22520db4b1b022bb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Stefan Lay [Thu, 14 Apr 2011 15:25:22 +0000 (17:25 +0200)]
Add constants for gerrit change id configuration
Change-Id: I22fc46dff6cc5dfd975f6e82161d265781778cde Signed-off-by: Stefan Lay <stefan.lay@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Sat, 9 Apr 2011 23:02:29 +0000 (01:02 +0200)]
Create all test data in trash folder
This ensures that all test data is separated from project sources and
cleaned up after the test. Previously the cloned bare test repository
was created in org.eclipse.jgit.test/ and not deleted after the test
run.
Change-Id: I55110442e365fc8fe610f1c372f72a71ee6e1412 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Robin Stocker [Wed, 6 Apr 2011 19:10:16 +0000 (21:10 +0200)]
Refactor reading and writing heads in Repository
Add private methods which are used for reading and writing MERGE_HEAD
and CHERRY_PICK_HEAD files, as suggested in the comments on change
I947967fdc2f1d55016c95106b104c2afcc9797a1.
Change-Id: If4617a05ee57054b8b1fcba36a06a641340ecc0e Signed-off-by: Robin Stocker <robin@nibor.org>
Robin Stocker [Sun, 3 Apr 2011 17:37:55 +0000 (19:37 +0200)]
Add CHERRY_PICK_HEAD for cherry-pick conflicts
Add handling of CHERRY_PICK_HEAD file in .git (similar to MERGE_HEAD),
which is written in case of a conflicting cherry-pick merge.
It is used so that Repository.getRepositoryState can return the new
states CHERRY_PICKING and CHERRY_PICKING_RESOLVED. These states, as well
as CHERRY_PICK_HEAD can be used in EGit to properly show the merge tool.
Also, in case of a conflict, MERGE_MSG is written with the original
commit message and a "Conflicts" section appended. This way, the
cherry-picked message is not lost and can later be re-used in the commit
dialog.
Bug: 339092
Change-Id: I947967fdc2f1d55016c95106b104c2afcc9797a1 Signed-off-by: Robin Stocker <robin@nibor.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Stefan Lay [Wed, 6 Apr 2011 12:08:25 +0000 (14:08 +0200)]
Add parameters for timeout and branches to clone
The timeout is also used in the FetchCommand called by the
CloneCommand.
The possibility to provide a list of branches to fetch initially is a
feature offered by EGit. To implement it here is a prerequisite for
EGit to be able to use the CloneCommand.
Change-Id: I21453de22e9ca61919a7c3386fcc526024742f5f Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Stefan Lay [Wed, 6 Apr 2011 12:58:48 +0000 (14:58 +0200)]
Try to checkout branch after cloning
When no branch was specified in the clone command, HEAD pointed to a
commit after clone. Now the clone command tries to find a branch which
points to the same commit and checks out this branch.
Bug: 339354
Change-Id: Ie3844465329f213dee4a8868dbf434ac3ce23a08 Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Philipp Thun [Mon, 4 Apr 2011 16:04:21 +0000 (18:04 +0200)]
Fix DirCache.isModified()
Change I61a1b45db2d60fdcc0f87373ac6fd75ac4c4a202 fixed a possible NPE
occurring for newly created repositories - but in that case a wrong
value (false = not modified) was returned.
If a current version of the index file exists (liveFile), but there is
no snapshot, this means that there have been modifications (i.e. true
has to be returned).
Change-Id: I698f78112249f9924860fc58eb7eab7afdf87eb7 Signed-off-by: Philipp Thun <philipp.thun@sap.com>
When a new Git instance for an exisiting git repository should be
created there are two use-cases: either the application has already a
Repository instance in hand or the application knows where the
repository resides in the filesystem. Two methods are added to
explicitly support these use-cases: wrap(Repository db) and open(File
gitDir)
Change-Id: I2970e4aa8d4602cb1298f01e5b76bf0f96c492e5 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Carsten Pfeiffer [Wed, 16 Mar 2011 15:57:35 +0000 (16:57 +0100)]
Support reading first SHA-1 from large FETCH_HEAD files
When reading refs, avoid reading huge files that were put there
accidentally, but still read the top of e.g. FETCH_HEAD, which
may be longer than our limit. We're only interested in the first line
anyway.
The 'Counting objects' phase of PackWriter requires good hit rates
from the DeltaBaseCache while walking trees, the deltas need to find
their bases in the cache in order to inflate in a reasonable time.
If JGit is running in a multi-threaded server, such as Gerrit Code
Review, each thread needs its own DeltaBaseCache to prevent one thread
from evicting the other thread's relevant bases. Move the cache to be
per-ObjectReader, lazily allocated when required by a PackFile.
Change-Id: If9d5ed06728e813632ae96dcfb811f4860b276e8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Fix ReceivePack connectivity validation with alternates
If a repository has an alternate object database, the alternate has
its references advertised as ".have" lines, which permits the client
to use these as delta base candidates when generating the pack. If
setCheckReferencedObjectsAreReachable(true) is used, these additional
have lines need to be considered in addition to the advertised refs.
Change-Id: Ie39c6696f9d3ff147ef4405cd5624f6011700ce5 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>