mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Author	SHA1	Message	Date
Robin Rosenberg	2853c61f12	Do not set core.autocrlf when creating repo core.autorlf defaults to false, but can be set in the user or "system" config files. Note that EGit/JGit may not know where the "system" config file is located. Also fix pgm's ConfigTest which depends on default repository configuration. Bug: 382067 Change-Id: I2c698a76e30d968e7f351b4f5a2195f0b124f62f Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Robin Rosenberg	4321f9353f	cleanup: Remove unused declarations Change-Id: I3b54cb9f73cb433c71a441a11ddc74cfecdaa1dc	12 years ago
Kevin Sawicki	91f5ce3a15	Only increment mod count if packed-refs file changes Previously if a packed-refs file was racily clean then there was a 2.5 second window in which each call to getPackedRefs would increment the mod count causing a RefsChangedEvent to be fired since the FileSnapshot would report the file as modified. If a RefsChangedListener called getRef/getRefs from the onRefsChanged method then a StackOverflowError could occur since the stack could be exhausted before the 2.5 second window expired and the packed-refs file would no longer report being modified. Now a SHA-1 is computed of the packed-refs file and the mod count is only incremented when the packed refs are successfully set and the id of the new packed-refs file does not match the id of the old packed-refs file. Change-Id: I8cab6e5929479ed748812b8598c7628370e79697	12 years ago
Robin Rosenberg	1953ae6aee	Smudge index entries on first write (too), as well when reading That happens when the index and a new file is created within the same second and becomes a problem if we then modify the newly created file within the same second after adding it to the index. Without smudging JGit will, on later reads, think the file is unchanged. The accompanying test passed with the smuding on read. Change-Id: I4dfecf5c93993ef690e7f0dddb3f3e6125daae15	12 years ago
Matthias Sohn	c394d05a47	Add missing @since tags to mark API added in 2.0 Change-Id: I0a86ce0e393dfde9bb27f0b29e036e76c856396e Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Chris Aniszczyk <zx@twitter.com>	12 years ago
Kevin Sawicki	17fb542e9e	Remove 86 boxing warnings Use Integer, Character, and Long valueOf methods when passing parameters to MessageFormat and other places that expect objects instead of primitives Change-Id: I5942fbdbca6a378136c00d951ce61167f2366ca4	12 years ago
Shawn O. Pearce	6c0d300a54	Fix loading packed objects >2G Parsing the size from a packed object header was incorrectly computing the total inflated length when the length exceeded the range of a Java int. The next 7 bits of size information was shifted left as an int using a shift of 25 bits, placing the higher bits of the size into the sign position. When this size was extended to a long to be added to the current size accumulator the size went negative, resulting in NegativeArraySizeException being thrown. Fix all places where this particular pattern of code is used to read a pack size field, or a binary delta header, as they both use the same variable length encoding scheme. Change-Id: I04008728ed828f18202652c3d5401cf95a441d0a	12 years ago
Kevin Sawicki	b37b7e69cd	Add command support for dropping a stashed commit This extracts the logic for writing to the reflog from RefDirectory into a new ReflogWriter class. This class creates a public API for writing reflog entries similar to ReflogReader for reading reflog entries. The new command supports rewriting the stash's log to remove a configured entry followed by updating the stash ref to the value at the bottom of the newly written log. Change-Id: Icfcbc70e838666769a742a94196eb8dc9c7efcc7 Signed-off-by: Chris Aniszczyk <zx@twitter.com>	12 years ago
Kevin Sawicki	7aeea3b27c	Compare repository format version as parsed long This allows repositoryies with a missing repositoryformatversion config value to be successfully opened but still throws exceptions when the value is a non-long or greater than zero. git-core attempts to parse this config value as a long as well and defaults to 0 if the value is missing. Bug: 368697 Change-Id: I4a93117afca37e591e8e0ab4d2f2eef4273f0cc9 Signed-off-by: Chris Aniszczyk <zx@twitter.com>	12 years ago
Dave Borowitz	9bc26efe9d	Pass a DfsRepositoryDescription to InMemoryRepository This was likely intended originally, but this class had never been used, so the mistake went unnoticed. Change-Id: I5e0e9f22ebf707c11d0581511c7a56b182188f77	12 years ago
Shawn O. Pearce	ff13648ea2	Revert "Quickfix for AutoCRLF handling" This reverts commit `88fe2836ed`. Auto CRLF isn't special enough to be screwing around with the buffers used for raw byte processing of the ObjectInserter API. If it needs a buffer to process a file that is bigger than the buffer allocated by an ObjectInserter, it needs to do its own buffer management. Change-Id: Ida4aaa80d0f9f78035f3d2a9ebdde904c980f89a	12 years ago
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 years ago
Robin Rosenberg	708febedaf	cleanup: Remove unnecessary @SuppressWarnings Change-Id: Ie22ac47e315bff76f224214bc042fc483eb01550 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	12 years ago
Robin Rosenberg	88fe2836ed	Quickfix for AutoCRLF handling CRLF only works for small files, where small is the size of the buffer, i.e. about 8K. This QD fix reallocates the buffer to be large enough. Bug: 369780 Change-Id: Ifc34ad204fbf5986b257a5c616e4a8c601e8261a	12 years ago
Kevin Sawicki	97210fd6db	Add command support for listing stashed commits Bug: 309355 Change-Id: I34a8c251b89abcdb67565ca49bee02e5e2113593 Signe-off-by: Chris Aniszczyk <zx@twitter.com>	12 years ago
Robin Rosenberg	e875c905d3	Make sure all bytes are written to files on close, or get an error. Java's BufferedOutputStream swallows any errors that occur when flushing the buffer in close(). This class overrides close to make sure an error during the final flush is reported back to the caller. Change-Id: I74a82b31505fadf8378069c5f6554f1033c28f9b Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Robin Rosenberg	76dd9d1d46	Support more of AutoCRLF This patch introduces CRLF handling to the DirCacheCheckout and WorkingTreeIterator supporting the AutoCRLF for add, checkout reset and status and hopefully some other places that depende on the underlying logic of the affected API's. The patch includes test cases for the Status command provided by Tomasz Zarna for bug 353867. The core.eol and core.safecrlf options are not yet supported. Bug: 301775 Bug: 353867 Change-Id: I2280a2dc0698829475de6a662a6c6e80b1df7663	12 years ago
Kevin Sawicki	aebfc70cc8	Provide helper for unlocking a file This will allow recovery from a LockFailedException where the file associated with an exception is passed to FileUtils.unlock to attempt an unlock on the file so the operation can be retried Change-Id: I580166d386126bfb54a318a65253070a6e325936	12 years ago
Tomasz Zarna	5d95cd9418	Add constant for default name for the Git repo configuration Change-Id: I5a6ef686c444fb1e46c9f784bad01165471ef372 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Kevin Sawicki	1dcb76739c	Support getting specific entry number in reflog The number specified is interpreted as relative to the last entry in the reflog. Change-Id: Ie4dd03370bb0d475a0e89d3015113ca98920100f	12 years ago
Kevin Sawicki	656461a991	Add exception class for when locking a file fails This will allows calling classes to handle lock failures without checking against the message and will also provide access to the file that could not be locked. Change-Id: I95bc59e1330a7af71ae3b0485c4516299193f504	12 years ago
Dave Borowitz	84c80be1dc	Fire DfsPacksChangedEvents when committing packs. Once a pack has been committed with commitPack(), we know that the pack list has changed but we don't re-scan the underlying storage. Change-Id: Ia7b35df4442a5f5dfe7e817edcc77b44b5410d08	12 years ago
Kevin Sawicki	47d1616374	Use constant for logs directory Change-Id: Ie139133bcbe1ca61c85e86b3484f858bc065821f	12 years ago
Kevin Sawicki	7ed1ef953c	Implement Serializable interface in ReflogEntry Change-Id: Idf798dd3981bef3dc9e17c13c12809f89089e96f	12 years ago
Shawn O. Pearce	2fbf296fda	Fix duplicate objects in "thin+cached" packs from DFS The DfsReader must offer every representation of an object that exists on the local repository when PackWriter asks for them. This is necessary to identify objects in the thin pack part that are also in the cached pack that will be appended onto the end of the stream. Without looking at all alternatives, PackWriter may pack the same object twice (once in the thin section, again in the cached base pack). This may cause the command line C version to go into an infinite loop when repacking the resulting repository, as it may see a delta chain cycle with one of those duplicate copies of the object. Previously the DfsReader tried to avoid looking at packs that it might not care about, but this is insufficient, as all versions must be considered during pack generation. Change-Id: Ibf4a3e8ea5c42aef16404ffc42a5781edd97b18e	12 years ago
Shawn O. Pearce	60e51251db	Do not write edge objects to the pack stream Consider two objects A->B where A uses B as a delta base, and these are in the same source pack file ordered as "A B". If cached packs is enabled and B is also in the cached pack that will be appended onto the end of the thin pack, and both A, B are supposed to be in the thin pack, PackWriter must consider the fact that A's base B is an edge object that claims to be part of the new pack, but is actually "external" and cannot be written first. If the object reuse system considered B candidates fist this bug does not arise, as B will be marked as edge due to it existing in the cached pack. When the A candidates are later examined, A sees a valid delta base is available as an edge, and will not later try to "write base first" during the writing phase. However, when the reuse system considers A candidates first they see that B will be in the outgoing pack, as it is still part of the thin pack, and arrange for A to be written first. Later when A switches from being in-pack to being an edge object (as it is part of the cached pack) the pointer in B does not get its type changed from ObjectToPack to ObjectId, so B thinks A is non-edge. We work around this case by also checking that the delta base B is non-edge before writing the object to the pack. Later when A writes its object header, delta base B's ObjectToPack will have an offset == 0, which makes isWritten() = false, and the OBJ_REF delta format will be used for A's header. This will be resolved by the client to the copy of B that appears in the later cached pack. Change-Id: Ifab6bfdf3c0aa93649468f49bcf91d67f90362ca	12 years ago
Shawn O. Pearce	1421106d76	Use long for more object counts in PackWriter Packs can contain up to 2^32-1 objects, which exceeds the range of a Java int. Try harder to accept higher object counts in some cases by using long more often when we are working with the object count value. This is a trivial refactoring, we may have to make even more changes to the object handling code to support more than 2^31-1 objects. Change-Id: I8cd8146e97cd1c738ad5b48fa9e33804982167e7	12 years ago
Shawn O. Pearce	41a18d57bc	Search for annotated tag reuse first Annotated tags are relatively rare and currently are scheduled in a pack file near the commits, decreasing the time it takes to resolve client requests reading tags as part of a history traversal. Putting them first before the commits allows the storage system to page in the tag area, and have it relatively hot in the LRU when the nearby commit area gets examined too. Later looking at the tree and blob data will pollute the cache, making it more likely the tags are not loaded and would require file IO. Change-Id: I425f1f63ef937b8447c396939222ea20fdda290f	12 years ago
Shawn O. Pearce	29997ab084	Correct progress monitor on "Getting sizes:" phase This counter always was running 1 higher, because it incremented after the queue was exhausted (and every object was processed). Move increments to be after the queue has provided a result, to ensure we do not show a higher in-progress count than total count. Change-Id: I97f815a0492c0957300475af409b6c6260008463	12 years ago
Shawn O. Pearce	4b84186b64	Refactor DfsReader selection of cached packs Make the code more clear with a simple refactoring of the boolean logic into a method that describes the condition we are looking for on each pack file. A cached pack is possible if there exists a tips collection, and the collection is non-empty. Change-Id: I4ac42b0622b39d159a0f4f223e291c35c71f672c	12 years ago
Matthias Sohn	afebe7880d	[findBugs] Prefer short-cut logic as it's more performant Change-Id: I64577f8fd19ee0d2d407479cc70e521adc367f37 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Dave Borowitz	2b584b9216	Keep track of a static collection of all PackWriter instances Stored in a weak concurrent hash map, which we clean up while iterating. Usually the weak reference behavior should not be necessary because PackWriters should be released with release(), but we still want to avoid leaks when dealing with broken client code. Change-Id: I337abb952ac6524f7f920fedf04065edf84d01d2	12 years ago
Dave Borowitz	f26b79d044	Estimate the amount of memory used by a PackWriter Memory usage is dominated by three terms: - The maximum memory allocated to each delta window. - The maximum size of a single file held in memory during delta search. - ObjectToPack instances owned by the writer. For the first two terms, rather than doing complex instrumentation of the DeltaWindows, we just overestimate based on the config parameters (though we may underestimate if the maximum size is not set). For the ObjectToPack instances, we do some rough byte accounting of the underlying Java object representation. Change-Id: I23fe3cf9d260a91f1aeb6ea22d75af8ddb9b1939	12 years ago
Dave Borowitz	16b8ebf2d1	Add an object encapsulating the state of a PackWriter Exposes essentially the same state machine to the programmer as is exposed to the client via a ProgressMonitor, using a wrapper around beginTask()/endTask(). Change-Id: Ic3622b4acea65d2b9b3551c668806981fa7293e3	12 years ago
Shawn O. Pearce	9652f16a47	Always use try/finally around DfsBlockCache.clockLock Any RuntimeException or Error in this block will leave the lock held by the caller thread, which can later result in deadlock or just cache requests hanging forever because they cannot get to the lock object. Wrap everything in try/finally to prevent the lock from hanging, even though a RuntimeException or Error should never happen in any of these code paths. Change-Id: Ibb3467f7ee4c06f617b737858b4be17b10d936e0	12 years ago
Shawn O. Pearce	a6677ef28a	DfsBlockCache: Fix NPE when evicting empty cell The cache starts with a single empty Ref that has no data, as the clock list does not support being empty. When this Ref is removed, the size has to be decremented from the associated DfsPackKey, which was previously null. Make it always be non-null. Change-Id: I2af99903e8039405ea6d67f383576ffa43839cff	12 years ago
Colby Ranger	f70ecabb30	DfsBlockCache: Update hits to not include contains() Also expose the underlying hit and miss counters, in addition to the hit ratio. Change-Id: Icea2572d62e59318133b0a88848019f34ad70975	12 years ago
Dave Borowitz	0f8e486a4d	Add a listener for changes to a DfsObjDatabase's pack files Intended for cross-request use, so only refers to DfsRepositoryDescriptions rather than DfsRepositorys. Change-Id: I2633e472c9264d91d632069f608d53d4bdd0fc09	12 years ago
Dave Borowitz	d55eb35106	Expose the reverse index size in the DfsPackDescription This is analogous to the getPackSize() and getIndexSize() methods. Change-Id: I207c0c93f9145826d84b3610eb4319fca074ee0d	12 years ago
Dave Borowitz	4fc1af6850	Add a DfsPackFile method to get the number of cached bytes The counter is actually stored in the DfsPackKey so it can be manipulated by the cache. Change-Id: I10cee76c92d65c68d1aa1a9dd0c4fd7173c4cede	12 years ago
Dave Borowitz	dff9d56b94	Expose the list of pack files in the DfsBlockCache Callers may want to inspect the contents of the cache, which this allows them to do in a read-only fashion without any locking. Change-Id: Ifd78e8ce34e26e5cc33e9dd61d70c593ce479ee0	12 years ago
Dave Borowitz	35d72ac806	Add a DFS repository description and reference it in each pack Just as DfsPackDescription describes a pack but does not imply it is open in memory, a DfsRepositoryDescription describes a repository at a basic level without it necessarily being open. Change-Id: I890b5fccdda12c1090cfabf4083b5c0e98d717f6	12 years ago
Dave Borowitz	5a38e5b440	Clarify the docstring of DfsBlockCache.reconfigure() The docstring was copied from the local filesystem cache code, which actually attempted to reconfigure the cache on the fly. The DFS cache is designed to be "reconfigured" exactly once. Change-Id: Ia0b01f5d6b6b3d3a68d65a5c229ff67c1cede5bc	12 years ago
Shawn O. Pearce	fa4cc2475f	DFS: A storage layer for JGit In practice the DHT storage layer has not been performing as well as large scale server environments want to see from a Git server. The performance of the DHT schema degrades rapidly as small changes are pushed into the repository due to the chunk size being less than 1/3 of the pushed pack size. Small chunks cause poor prefetch performance during reading, and require significantly longer prefetch lists inside of the chunk meta field to work around the small size. The DHT code is very complex (>17,000 lines of code) and is very sensitive to the underlying database round-trip time, as well as the way objects were written into the pack stream that was chunked and stored on the database. A poor pack layout (from any version of C Git prior to Junio reworking it) can cause the DHT code to be unable to enumerate the objects of the linux-2.6 repository in a completable time scale. Performing a clone from a DHT stored repository of 2 million objects takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row for each object being cloned. This is very difficult for some DHTs to scale, even at 5000 rows/second the lookup stage alone takes 6 minutes (on local filesystem, this is almost too fast to bother measuring). Some servers like Apache Cassandra just fall over and cannot complete the 2 million lookups in rapid fire. On a ~400 MiB repository, the DHT schema has an extra 25 MiB of redundant data that gets downloaded to the JGit process, and that is before you consider the cost of the OBJECT_INDEX table also being fully loaded, which is at least 223 MiB of data for the linux kernel repository. In the DHT schema answering a `git clone` of the ~400 MiB linux kernel needs to load 248 MiB of "index" data from the DHT, in addition to the ~400 MiB of pack data that gets sent to the client. This is 193 MiB more data to be accessed than the native filesystem format, but it needs to come over a much smaller pipe (local Ethernet typically) than the local SATA disk drive. I also never got around to writing the "repack" support for the DHT schema, as it turns out to be fairly complex to safely repack data in the repository while also trying to minimize the amount of changes made to the database, due to very common limitations on database mutation rates.. This new DFS storage layer fixes a lot of those issues by taking the simple approach for storing relatively standard Git pack and index files on an abstract filesystem. Packs are accessed by an in-process buffer cache, similar to the WindowCache used by the local filesystem storage layer. Unlike the local file IO, there are some assumptions that the storage system has relatively high latency and no concept of "file handles". Instead it looks at the file more like HTTP byte range requests, where a read channel is a simply a thunk to trigger a read request over the network. The DFS code in this change is still abstract, it does not store on any particular filesystem, but is fairly well suited to the Amazon S3 or Apache Hadoop HDFS. Storing packs directly on HDFS rather than HBase removes a layer of abstraction, as most HBase row reads turn into an HDFS read. Most of the DFS code in this change was blatently copied from the local filesystem code. Most parts should be refactored to be shared between the two storage systems, but right now I am hesistent to do this due to how well tuned the local filesystem code currently is. Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb	13 years ago
Matthias Sohn	0db0476542	Fire IndexChangedEvent on DirCache.commit() Since we replaced GitIndex by DirCache JGit didn't fire IndexChangedEvents anymore. For EGit this still worked with a high latency since its RepositoryChangeScanner which is scheduled to run each 10 seconds fires the event in case the index changes. This scanner is meant to detect index changes induced by a different process e.g. by calling "git add" from native git. When the index is changed from within the same process we should fire the event synchronously. Compare the index checksum on write to index checksum when index was read earlier to determine if index really changed. Use IndexChangedListener interface to keep DirCache decoupled from Repository. Change-Id: Id4311f7a7859ffe8738863b3d86c83c8b5f513af Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	46771e9e88	Remove use of GitIndex to detect index changes We can detect index changes using FileSnapshot. This is more efficient and removes usage of a deprecated class. Change-Id: I4a679102c9a1bd8e82b9ca93eb9dbbde445e9be4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Shawn O. Pearce	1b6a549ff3	PackWriter: Export more statistics Export the shallow pack information, and also a handy function to sum up the total times. Include the time writing out the index file, if it was created. Change-Id: I7f60ae6848455a357b25feedb23743bbf6c153cf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Robin Rosenberg	eadc26c0a0	Add a helper for parsing branch switch info out of a reflog entry Change-Id: I91c7e08c4afd2562df2226887a933d93c78a0371 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Roberto Tyley	602b2f0d45	Tolerate zlib deflation with window size < 32Kb JGit currently identifies loose objects as 'corrupt' if they've been deflated using a window size less than 32Kb, because the isStandardFormat() function doesn't recognise the header byte as a zlib header. This patch makes the method tolerant of all valid window sizes (15-bit to 8-bit) - but doesn't sacrifice it's accuracy in distingushing the standard loose-object format from the experimental (now abandoned) format. It's based on a patch which has been merged into C-Git master branch: https://github.com/git/git/commit/7f684a2aff636f44a506 On memory constrained systems zlib may use a much smaller window size - working on Agit, I found that Android uses a 4KB window; giving a header byte of 0x48, not 0x78. Consequently all loose objects generated by the Android platform appear 'corrupt' :( It might appear that this patch changes isStandardFormat() to the point where it could incorrectly identify the experimental format as the standard one, but the two criteria (bitmask & checksum) can only give a false result for an experimental object where both of the following are true: 1) object size is exactly 8 bytes when uncompressed (bitmask) 2) [single-byte in-pack git type&size header] * 256 + [1st byte of the following zlib header] % 31 = 0 (checksum) As it happens, for all possible combinations of valid object type (1-4) and window bits (0-7), the only time when the checksum will be divisible by 31 is for 0x1838 - ie object type 1, a Commit - which, due the fields all Commit objects must contain, could never be as small as 8 bytes in size. Given this, the combination of the two criteria (bitmask & checksum) always correctly determines the buffer format, and is more tolerant than the previous version. References: Android uses a 4KB window for deflation: http://android.git.kernel.org/?p=platform/libcore.git;a=blob;f=luni/src/main/native/java_util_zip_Deflater.cpp;h=c0b2feff196e63a7b85d97cf9ae5bb2583409c28;hb=refs/heads/gingerbread#l53 Code snippet searching for false positives with the zlib checksum: https://gist.github.com/1118177 Change-Id: Ifd84cd2bd6b46f087c9984fb4cbd8309f483dec0	13 years ago
Matt Fischer	9952223e06	Implement server support for shallow clones This implements the server side of shallow clones only (i.e. git-upload-pack), not the client side. CQ: 5517 Bug: 301627 Change-Id: Ied5f501f9c8d1fe90ab2ba44fac5fa67ed0035a4 Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	14 years ago

1 2 3 4 5

243 Commits (2853c61f12f44bdacac404a58cebd191f98299b1)