mirrors/jgit - jgit - source @ dussan.org

Gráfico de commits

Autor	SHA1	Mensagem	Data
Thomas Wolf	471ad49546	TransportHttp: shared SSLContext during fetch or push TransportHttp makes several HTTP requests. The SSLContext and socket factory must be shared over these requests, otherwise authentication information may not be propagated correctly from one request to the next. This is important for authentication mechanisms that rely on client-side state, like NEGOTIATE (either NTLM, if the underlying HTTP library supports it, or Kerberos). In particular, SPNEGO cannot authenticate on a POST request; the authentication must come from the initial GET request, which implies that the POST request must use the same SSLContext and socket factory that was used for the GET. Change the way HTTPS connections are configured. Introduce the concept of a GitSession, which is a client-side HTTP session over several HTTPS requests. TransportHttp creates such a session and uses it to configure all HTTP requests during that session (fetch or push). This gives a way to abstract away the differences between JDK and Apache HTTP connections and to configure SSL setup outside. A GitSession can maintain state and thus give all HTTP requests in a session the same socket factory. Introduce an extension interface HttpConnectionFactory2 that adds a method to obtain a new GitSession. Implement this for both existing HTTP connection factories. Change TransportHttp to use the new GitSession to configure HTTP connections. The old methods for disabling SSL verification still exist to support possibly external connection and connection factory implementations that do not make use of the new GitSession yet. Bug: 535850 Change-Id: Iedf67464e4e353c1883447c13c86b5a838e678f1 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>	3 anos atrás
Matthias Sohn	5c5f7c6b14	Update EDL 1.0 license headers to new short SPDX compliant format This is the format given by the Eclipse legal doc generator [1]. [1] https://www.eclipse.org/projects/tools/documentation.php?id=technology.jgit Bug: 548298 Change-Id: I8d8cabc998ba1b083e3f0906a8d558d391ffb6c4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	4 anos atrás
Matthias Sohn	53891d4eda	Fix javadoc in org.eclipse.jgit.http.apache Change-Id: I38a4854856b0103790a410b48c1c3d708b6500c1 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	6 anos atrás
David Pursehouse	7ac182f4e4	Enable and fix 'Should be tagged with @Override' warning Set missingOverrideAnnotation=warning in Eclipse compiler preferences which enables the warning: The method <method> of type <type> should be tagged with @Override since it actually overrides a superclass method Justification for this warning is described in: http://stackoverflow.com/a/94411/381622 Enabling this causes in excess of 1000 warnings across the entire code-base. They are very easy to fix automatically with Eclipse's "Quick Fix" tool. Fix all of them except 2 which cause compilation failure when the project is built with mvn; add TODO comments on those for further investigation. Change-Id: I5772061041fd361fe93137fd8b0ad356e748a29c Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	7 anos atrás
Matthias Sohn	3d3df3942a	Move Apache httpclient based HTTP support to a separate bundle This move avoids that all consumers of org.eclipse.jgit depend on Apache httpclient. Also add another feature to make this optional for OSGi consumers as well. Change-Id: I5ef5e00c53678b9e1d7cfd54bbca3ff6f1c1c967 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	10 anos atrás
Christian Halstrick	2290516ddb	Add an implementation for HttpConnection using Apache HttpClient This change implements the http connection abstraction with the help of org.apache.http.client.HttpClient. The default implementation used by JGit is still the JDK HttpURLConnection. But now JGit users have the possibility to switch completely to org.apache.httpclient. The reason for this is that in certain (e.g. cloud) environments you are forced to use the org.apache classes. Change-Id: I0b357f23243ed13a014c79ba179fa327dfe318b2 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	11 anos atrás
Christian Halstrick	38c4f35d8b	Introduce an abstraction for HTTP connections Previously all HTTP communication was done with the help of java.net.HttpUrlConnection. In order to make JGit usable in environments where the direct usage of such connections is not allowed but where the environment provides other means to get network connections an abstraction for connections is introduced. The idea is that new implementations of this interface will be introduced which will not use java.net.HttpUrlConnection but use e.g. org.apache.client.http.HttpClient to provide network connections. One example: certain cloud infrastructures don't allow that components in the cloud communicate directly with HttpUrlConnection. Instead they provide services where a component can ask for a connection (given a symbolic name for the destination) and where the infrastructure returns a preconfigured org.apache.http.client.HttpClient. In order to allow JGit to be running in such environments we need the abstraction introduced in this commit. Change-Id: I3b06629f90a118bd284e55bb3f6465fe7d10463d Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	11 anos atrás
Shawn Pearce	f32b861243	JGit 3.0: move internal classes into an internal subpackage This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change. Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9	11 anos atrás
Shawn O. Pearce	fa4cc2475f	DFS: A storage layer for JGit In practice the DHT storage layer has not been performing as well as large scale server environments want to see from a Git server. The performance of the DHT schema degrades rapidly as small changes are pushed into the repository due to the chunk size being less than 1/3 of the pushed pack size. Small chunks cause poor prefetch performance during reading, and require significantly longer prefetch lists inside of the chunk meta field to work around the small size. The DHT code is very complex (>17,000 lines of code) and is very sensitive to the underlying database round-trip time, as well as the way objects were written into the pack stream that was chunked and stored on the database. A poor pack layout (from any version of C Git prior to Junio reworking it) can cause the DHT code to be unable to enumerate the objects of the linux-2.6 repository in a completable time scale. Performing a clone from a DHT stored repository of 2 million objects takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row for each object being cloned. This is very difficult for some DHTs to scale, even at 5000 rows/second the lookup stage alone takes 6 minutes (on local filesystem, this is almost too fast to bother measuring). Some servers like Apache Cassandra just fall over and cannot complete the 2 million lookups in rapid fire. On a ~400 MiB repository, the DHT schema has an extra 25 MiB of redundant data that gets downloaded to the JGit process, and that is before you consider the cost of the OBJECT_INDEX table also being fully loaded, which is at least 223 MiB of data for the linux kernel repository. In the DHT schema answering a `git clone` of the ~400 MiB linux kernel needs to load 248 MiB of "index" data from the DHT, in addition to the ~400 MiB of pack data that gets sent to the client. This is 193 MiB more data to be accessed than the native filesystem format, but it needs to come over a much smaller pipe (local Ethernet typically) than the local SATA disk drive. I also never got around to writing the "repack" support for the DHT schema, as it turns out to be fairly complex to safely repack data in the repository while also trying to minimize the amount of changes made to the database, due to very common limitations on database mutation rates.. This new DFS storage layer fixes a lot of those issues by taking the simple approach for storing relatively standard Git pack and index files on an abstract filesystem. Packs are accessed by an in-process buffer cache, similar to the WindowCache used by the local filesystem storage layer. Unlike the local file IO, there are some assumptions that the storage system has relatively high latency and no concept of "file handles". Instead it looks at the file more like HTTP byte range requests, where a read channel is a simply a thunk to trigger a read request over the network. The DFS code in this change is still abstract, it does not store on any particular filesystem, but is fairly well suited to the Amazon S3 or Apache Hadoop HDFS. Storing packs directly on HDFS rather than HBase removes a layer of abstraction, as most HBase row reads turn into an HDFS read. Most of the DFS code in this change was blatently copied from the local filesystem code. Most parts should be refactored to be shared between the two storage systems, but right now I am hesistent to do this due to how well tuned the local filesystem code currently is. Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb	13 anos atrás
Shawn O. Pearce	de8946c0c2	Store Git on any DHT jgit.storage.dht is a storage provider implementation for JGit that permits storing the Git repository in a distributed hashtable, NoSQL system, or other database. The actual underlying storage system is undefined, and can be plugged in by implementing 7 small interfaces: * Database * RepositoryIndexTable * RepositoryTable * RefTable * ChunkTable * ObjectIndexTable * WriteBuffer The storage provider interface tries to assume very little about the underlying storage system, and requires only three key features: * key -> value lookup (a hashtable is suitable) * atomic updates on single rows * asynchronous operations (Java's ExecutorService is easy to use) Most NoSQL database products offer all 3 of these features in their clients, and so does any decent network based cache system like the open source memcache product. Relying only on key equality for data retrevial makes it simple for the storage engine to distribute across multiple machines. Traditional SQL systems could also be used with a JDBC based spi implementation. Before submitting this change I have implemented six storage systems for the spi layer: * Apache HBase[1] * Apache Cassandra[2] * Google Bigtable[3] * an in-memory implementation for unit testing * a JDBC implementation for SQL * a generic cache provider that can ride on top of memcache All six systems came in with an spi layer around 1000 lines of code to implement the above 7 interfaces. This is a huge reduction in size compared to prior attempts to implement a new JGit storage layer. As this package shows, a complete JGit storage implementation is more than 17,000 lines of fairly complex code. A simple cache is provided in storage.dht.spi.cache. Implementers can use CacheDatabase to wrap any other type of Database and perform fast reads against a network based cache service, such as the open source memcached[4]. An implementation of CacheService must be provided to glue this spi onto the network cache. [1] https://github.com/spearce/jgit_hbase [2] https://github.com/spearce/jgit_cassandra [3] http://labs.google.com/papers/bigtable.html [4] http://memcached.org/ Change-Id: I0aa4072781f5ccc019ca421c036adff2c40c4295 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 anos atrás
Git Development Community	1a6964c827	Initial JGit contribution to eclipse.org Per CQ 3448 this is the initial contribution of the JGit project to eclipse.org. It is derived from the historical JGit repository at commit `3a2dd9921c`. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 anos atrás

5 Commits (master)