mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Author	SHA1	Message	Date
Tomasz Zarna	a2dac2c78d	Allow to write tests with CLI syntax CQ: 6385 Bug: 365444 Change-Id: I2d5164cd92429673fe3c37e9f5f9bc565192cc12 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Markus Duft	2ba67bedae	Don't use java.nio channel for file size determination Java NIO has some problems (like files closing unexpectedly because the thread was interrupted). To avoid those problems, don't use a NIO channel to determine the size of a file, but rather ask the File itself. We have to be prepared to handle wrong/outdated information in this case too, as the inode of the File may change between opening and determining file size. Change-Id: Ic7aa6c3337480879efcce4a3058b548cd0e2cef0	12 years ago
Matthias Sohn	9dd6e6cd29	Revert "Allow to write tests with CLI syntax" This reverts commit `bf845c126d` since this change needs to go through a formal IP review and Chris missed to file a CQ for that. Change-Id: I303515d78116f0591a2911dbfb9f857738f086a9 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Tomasz Zarna	bf845c126d	Allow to write tests with CLI syntax Bug: 365444 Change-Id: I86f382913bc47665c5b9a2827b878e7dbedce7b1 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 years ago
Robin Rosenberg	ad9033e43a	Revert non-sense logic in IO.readFully This was introduced in alleged support for autocrlf, but the logic makes no sense here. Change-Id: If37f3b90f21f875cee7165e17a17adfda446384d	12 years ago
Robin Rosenberg	76dd9d1d46	Support more of AutoCRLF This patch introduces CRLF handling to the DirCacheCheckout and WorkingTreeIterator supporting the AutoCRLF for add, checkout reset and status and hopefully some other places that depende on the underlying logic of the affected API's. The patch includes test cases for the Status command provided by Tomasz Zarna for bug 353867. The core.eol and core.safecrlf options are not yet supported. Bug: 301775 Bug: 353867 Change-Id: I2280a2dc0698829475de6a662a6c6e80b1df7663	12 years ago
Shawn O. Pearce	fa4cc2475f	DFS: A storage layer for JGit In practice the DHT storage layer has not been performing as well as large scale server environments want to see from a Git server. The performance of the DHT schema degrades rapidly as small changes are pushed into the repository due to the chunk size being less than 1/3 of the pushed pack size. Small chunks cause poor prefetch performance during reading, and require significantly longer prefetch lists inside of the chunk meta field to work around the small size. The DHT code is very complex (>17,000 lines of code) and is very sensitive to the underlying database round-trip time, as well as the way objects were written into the pack stream that was chunked and stored on the database. A poor pack layout (from any version of C Git prior to Junio reworking it) can cause the DHT code to be unable to enumerate the objects of the linux-2.6 repository in a completable time scale. Performing a clone from a DHT stored repository of 2 million objects takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row for each object being cloned. This is very difficult for some DHTs to scale, even at 5000 rows/second the lookup stage alone takes 6 minutes (on local filesystem, this is almost too fast to bother measuring). Some servers like Apache Cassandra just fall over and cannot complete the 2 million lookups in rapid fire. On a ~400 MiB repository, the DHT schema has an extra 25 MiB of redundant data that gets downloaded to the JGit process, and that is before you consider the cost of the OBJECT_INDEX table also being fully loaded, which is at least 223 MiB of data for the linux kernel repository. In the DHT schema answering a `git clone` of the ~400 MiB linux kernel needs to load 248 MiB of "index" data from the DHT, in addition to the ~400 MiB of pack data that gets sent to the client. This is 193 MiB more data to be accessed than the native filesystem format, but it needs to come over a much smaller pipe (local Ethernet typically) than the local SATA disk drive. I also never got around to writing the "repack" support for the DHT schema, as it turns out to be fairly complex to safely repack data in the repository while also trying to minimize the amount of changes made to the database, due to very common limitations on database mutation rates.. This new DFS storage layer fixes a lot of those issues by taking the simple approach for storing relatively standard Git pack and index files on an abstract filesystem. Packs are accessed by an in-process buffer cache, similar to the WindowCache used by the local filesystem storage layer. Unlike the local file IO, there are some assumptions that the storage system has relatively high latency and no concept of "file handles". Instead it looks at the file more like HTTP byte range requests, where a read channel is a simply a thunk to trigger a read request over the network. The DFS code in this change is still abstract, it does not store on any particular filesystem, but is fairly well suited to the Amazon S3 or Apache Hadoop HDFS. Storing packs directly on HDFS rather than HBase removes a layer of abstraction, as most HBase row reads turn into an HDFS read. Most of the DFS code in this change was blatently copied from the local filesystem code. Most parts should be refactored to be shared between the two storage systems, but right now I am hesistent to do this due to how well tuned the local filesystem code currently is. Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb	13 years ago
Carsten Pfeiffer	11e2e746c1	Support reading first SHA-1 from large FETCH_HEAD files When reading refs, avoid reading huge files that were put there accidentally, but still read the top of e.g. FETCH_HEAD, which may be longer than our limit. We're only interested in the first line anyway. Bug: 340880 Change-Id: I11029b9b443f22019bf80bd3dd942b48b531bc11 Signed-off-by: Carsten Pfieffer <carsten.pfeiffer@gebit.de> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Marc Strapetz	e2e38792b5	Perform automatic CRLF to LF conversion during WorkingTreeIterator WorkingTreeIterator now optionally performs CRLF to LF conversion for text files. A basic framework is left in place to support enabling (or disabling) this feature based on gitattributes, and also to support the more generic smudge/clean filter system. As there is no gitattribute support yet in JGit this is left unimplemented, but the mightNeedCleaning(), isBinary() and filterClean() methods will provide reasonable places to plug that into in the future. [sp: All bugs inside of WorkingTreeIterator are my fault, I wrote most of it while cherry-picking this patch and building it on top of Marc's original work.] CQ: 4419 Bug: 301775 Change-Id: I0ca35cfbfe3f503729cbfc1d5034ad4abcd1097e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	16419dad35	Don't use interruptable pread() to access pack files The J2SE NIO APIs require that FileChannel close the underlying file descriptor if a thread is interrupted while it is inside of a read or write operation on that channel. This is insane, because it means we cannot share the file descriptor between threads. If a thread is in the middle of the FileChannel variant of IO.readFully() and it receives an interrupt, the pack will be automatically closed on us. This causes the other threads trying to use that same FileChannel to receive IOExceptions, which leads to the pack getting marked as invalid. Once the pack is marked invalid, JGit loses access to its entire contents and starts to report MissingObjectExceptions. Because PackWriter must ensure that the chosen pack file stays available until the current object's data is fully copied to the output, JGit cannot simply reopen the pack when its automatically closed due to an interrupt being sent at the wrong time. The pack may have been deleted by a concurrent `git gc` process, and that open file descriptor might be the last reference to the inode on disk. Once its closed, the PackWriter loses access to that object representation, and it cannot complete sending the object the client. Fortunately, RandomAccessFile's readFully method does not have this problem. Interrupts during readFully() are ignored. However, it requires us to first seek to the offset we need to read, then issue the read call. This requires locking around the file descriptor to prevent concurrent threads from moving the pointer before the read. This reduces the concurrency level, as now only one window can be paged in at a time from each pack. However, the WindowCache should already be holding most of the pages required to handle the working set for a process, and its own internal locking was already limiting us on the number of concurrent loads possible. Provided that most concurrent accesses are getting hits in the WindowCache, or are for different repositories on the same server, we shouldn't see a major performance hit due to the more serialized loading. I would have preferred to use a pool of RandomAccessFiles for each pack, with threads borrowing an instance dedicated to that thread whenever they needed to page in a window. This would permit much higher levels of concurrency by using multiple file descriptors (and file pointers) for each pack. However the code became too complex to develop in any reasonable period of time, so I've chosen to retrofit the existing code with more serialization instead. Bug: 308945 Change-Id: I2e6e11c6e5a105e5aef68871b66200fd725134c9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Sasa Zivkov	f3d8a8ecad	Externalize strings from JGit The strings are externalized into the root resource bundles. The resource bundles are stored under the new "resources" source folder to get proper maven build. Strings from tests are, in general, not externalized. Only in cases where it was necessary to make the test pass the strings were externalized. This was typically necessary in cases where e.getMessage() was used in assert and the exception message was slightly changed due to reuse of the externalized strings. Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>	14 years ago
Robin Rosenberg	d4e7b70606	Move pure IO utility functions to a utility class of its own. According the javadoc, and implied by the name of the class, NB is about network byte order. The purpose of moving the IO only, and non-byte order related functions to another class is to make it easier for new contributors to understand that they can use these functions in general and it's also makes it easier to understand where to put new IO related utility functions Change-Id: I4a9f6b39d5564bc8a694b366e7ff3cc758c5181b Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Alex Blewitt	4d91645e89	Remove trailing whitespace at end of line As discussed on the egit-dev mailing list, we prefer not to have trailing whitespace in our source code. Correct all currently offending lines by trimming them. Change-Id: I002b1d1980071084c0bc53242c8f5900970e6845 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Git Development Community	1a6964c827	Initial JGit contribution to eclipse.org Per CQ 3448 this is the initial contribution of the JGit project to eclipse.org. It is derived from the historical JGit repository at commit `3a2dd9921c`. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago

13 Commits (a2dac2c78d1fff0b2cedadcdaac030001eb851ac)