Send a detailed event on working tree modifications
Currently there is no way to determine the precise changes done
to the working tree by a JGit command. Only the CheckoutCommand
actually provides access to the lists of modified, deleted, and
to-be-deleted files, but those lists may be inaccurate (since they
are determined up-front before the working tree is modified) if
the actual checkout then fails halfway through. Moreover, other
JGit commands that modify the working tree do not offer any way to
figure out which files were changed.
This poses problems for EGit, which may need to refresh parts of the
Eclipse workspace when JGit has done java.io file operations.
Provide the foundations for better file change tracking: the working
tree is modified exclusively in DirCacheCheckout. Make it emit a new
type of RepositoryEvent that lists all files that were modified or
deleted, even if the checkout failed halfway through. We update the
'updated' and 'removed' lists determined up-front in case of file
system problems to reflect the actual state of changes made.
EGit thus can register a listener for these events and then knows
exactly which parts of the Eclipse workspace may need to be refreshed.
Two commands manage checking out individual DirCacheEntries themselves:
checkout specific paths, and applying a stash with untracked files.
Make those two also emit such a new WorkingTreeModifiedEvent.
Furthermore, merges may modify files, and clean, rm, and stash create
may delete files.
CQ: 13969
Bug: 500106
Change-Id: I7a100aee315791fa1201f43bbad61fbae60b35cb
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
jgit.storage.dht is a storage provider implementation for JGit that
permits storing the Git repository in a distributed hashtable, NoSQL
system, or other database. The actual underlying storage system is
undefined, and can be plugged in by implementing 7 small interfaces:
* Database
* RepositoryIndexTable
* RepositoryTable
* RefTable
* ChunkTable
* ObjectIndexTable
* WriteBuffer
The storage provider interface tries to assume very little about the
underlying storage system, and requires only three key features:
* key -> value lookup (a hashtable is suitable)
* atomic updates on single rows
* asynchronous operations (Java's ExecutorService is easy to use)
Most NoSQL database products offer all 3 of these features in their
clients, and so does any decent network based cache system like the
open source memcache product. Relying only on key equality for data
retrevial makes it simple for the storage engine to distribute across
multiple machines. Traditional SQL systems could also be used with a
JDBC based spi implementation.
Before submitting this change I have implemented six storage systems
for the spi layer:
* Apache HBase[1]
* Apache Cassandra[2]
* Google Bigtable[3]
* an in-memory implementation for unit testing
* a JDBC implementation for SQL
* a generic cache provider that can ride on top of memcache
All six systems came in with an spi layer around 1000 lines of code to
implement the above 7 interfaces. This is a huge reduction in size
compared to prior attempts to implement a new JGit storage layer. As
this package shows, a complete JGit storage implementation is more
than 17,000 lines of fairly complex code.
A simple cache is provided in storage.dht.spi.cache. Implementers can
use CacheDatabase to wrap any other type of Database and perform fast
reads against a network based cache service, such as the open source
memcached[4]. An implementation of CacheService must be provided to
glue this spi onto the network cache.
[1] https://github.com/spearce/jgit_hbase
[2] https://github.com/spearce/jgit_cassandra
[3] http://labs.google.com/papers/bigtable.html
[4] http://memcached.org/
Change-Id: I0aa4072781f5ccc019ca421c036adff2c40c4295
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Per CQ 3448 this is the initial contribution of the JGit project
to eclipse.org. It is derived from the historical JGit repository
at commit 3a2dd9921c.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>