mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Author	SHA1	Message	Date
Luca Milanesio	d6e00d2015	Remember the cause for invalidating a packfile Keep track of the original cause for a packfile invalidation. It is needed for the sysadmin to understand if there is a real underlying filesystem problem and repository corruption or if it is simply a consequence of a concurrency of Git operations (e.g. repack or GC). Change-Id: I06ddda9ec847844ec31616ab6d17f153a5a34e33 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	5 years ago
Luca Milanesio	bf3d1ded35	Check for packfile validity and fd before reading When reading from a packfile, make sure that is valid and has a non-null file-descriptor. Because of concurrency between a thread invalidating a packfile and another trying to read it, the read() may result into a NPE that won't be able to be automatically recovered. Throwing a PackInvalidException would instead cause the packlist to be refreshed and the read to eventually succeed. Bug: 544199 Change-Id: I27788b3db759d93ec3212de35c0094ecaafc2434 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>	5 years ago
Luca Milanesio	afef866a44	Move throw of PackInvalidException outside the catch When a packfile is invalid, throw an exception explicitly outside any catch scope, so that is not accidentally caught by the generic catch-all cause, which would set the packfile as valid again. Flagging an invalid packfile as valid again would have dangerous consequences such as the corruption of the in-memory packlist. Bug: 544199 Change-Id: If7a3188a68d7985776b509d636d5ddf432bec798 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>	5 years ago
Luca Milanesio	2d116cd0ab	Use FileSnapshot to get lastModified on PackFile Do not redundantly call File.lastModified() for extracting the timestamp of the PackFile but rather use consistently the FileSnapshot which reads all file attributes in a single bulk call. Change-Id: I932675ae4fe56dcd3833dac249816f097303bb09 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	5 years ago
Luca Milanesio	fef782128d	Do not reuse packfiles when changed on filesystem The pack reload mechanism from the filesystem works only by name and does not check the actual last modified date of the packfile. This lead to concurrency issues where multiple threads were loading and removing from each other list of packfiles when one of those was failing the checksum. Rely on FileSnapshot rather than directly checking lastModified timestamp so that more checks can be performed. Bug: 544199 Change-Id: I173328f29d9914007fd5eae3b4c07296ab292390 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>	5 years ago
Christian Halstrick	1ed1e40387	Fix exception handling for opening bitmap index files When creating a new PackFile instance it is specified whether this pack has an associated bitmap index file or not. This information is cached and the public method getBitmapIndex() will always assume a bitmap index file must exist if the cached data tells so. But it may happen that the packfiles are repacked during a gc in a different process causing the packfile, bitmap-index and index file to be deleted. Since JGit still has an open FileHandle on the packfile this file is not really deleted and can still be accessed. But index and bitmap index file are deleted. Fix getBitmapIndex() to invalidate the cached packfile instance if such a situation occurs. This problem showed up when a gerrit server was serving repositories which where garbage collected with native git regularly. Fetch and clone commands for certain repositories failed permanently after a native git gc had deleted old bitmap index files. Change-Id: I8e620bec74dd3f310ba42024f9a657062f868f0e Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	6 years ago
Matthias Sohn	aaf3c5154e	Only mark packfile invalid if exception signals permanent problem Add NoPackSignatureException and UnsupportedPackVersionException to explicitly mark permanent unrecoverable problems with a pack Assume problem with a pack is permanent only if we are sure the exception signals a non-transient problem we can't recover from: - AccessDeniedException: we lack permissions - CorruptObjectException: we detected corruption - EOFException: file ended unexpectedly - NoPackSignatureException: pack has no pack signature - NoSuchFileException: file has gone missing - PackMismatchException: pack no longer matches its index - UnpackException: unpacking failed - UnsupportedPackIndexVersionException: unsupported pack index version - UnsupportedPackVersionException: unsupported pack version Do not attempt to handle Errors since they are thrown for serious problems applications should not try to recover from. Change-Id: I2c416ce2b0e23255c4fb03a3f9a0ee237f7a484a Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	7 years ago
Luca Milanesio	363a3657b1	Don't flag a packfile invalid if opening existing file failed A packfile random file open operation may fail with a FileNotFoundException even if the file exists, possibly for the temporary lack of resources. Instead of managing the FileNotFoundException as any generic IOException it is best to rethrow the exception but prevent the packfile for being flagged as invalid until it is actually opened and read successfully or unsuccessfully. Bug: 514170 Change-Id: Ie37edba2df77052bceafc0b314fd1d487544bf35 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	7 years ago
Luca Milanesio	4c558225dc	Don't remove pack when FileNotFoundException is transient The FileNotFoundException is typically raised in three conditions: 1. file doesn't exist 2. incompatible read vs. read/write open modes 3. filesystem locking 4. temporary lack of resources (e.g. too many open files) 1. is already managed, 2. would never happen as packs are not overwritten while with 3. and 4. it is worth logging the exception and retrying to read the pack again. Log transient errors using an exponential backoff strategy to avoid flooding the logs with the same error if consecutive retries to access the pack fail repeatedly. Bug: 513435 Change-Id: I03c6f6891de3c343d3d517092eaa75dba282c0cd Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	7 years ago
David Pursehouse	7ac182f4e4	Enable and fix 'Should be tagged with @Override' warning Set missingOverrideAnnotation=warning in Eclipse compiler preferences which enables the warning: The method <method> of type <type> should be tagged with @Override since it actually overrides a superclass method Justification for this warning is described in: http://stackoverflow.com/a/94411/381622 Enabling this causes in excess of 1000 warnings across the entire code-base. They are very easy to fix automatically with Eclipse's "Quick Fix" tool. Fix all of them except 2 which cause compilation failure when the project is built with mvn; add TODO comments on those for further investigation. Change-Id: I5772061041fd361fe93137fd8b0ad356e748a29c Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	7 years ago
David Pursehouse	b20f7d610e	Organize imports Change-Id: I97044f69d220fc2d3f9fe890fdfec542454f02d2 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	7 years ago
James Melvin	d980a3fa85	Fix keep pack filename Previously it was looking for a keep file with the name of a pack file (extenstion included) appended with a '.keep'. However, the keep file name should be the pack file name with a '.keep' extension Change-Id: I9dc4c7c393ae20aefa0b9507df8df83610ce4d42 Signed-off-by: James Melvin <jmelvin@codeaurora.org>	7 years ago
Andrey Loskutov	79a7dd026f	[performance] Remove synthetic access$ methods in pack and file packages Java compiler must generate synthetic access methods for private methods and fields of the enclosing class if they are accessed from inner classes and vice versa. While invisible in the code, those synthetic access methods exist in the bytecode and seem to produce some extra execution overhead at runtime (compared with the direct access to this fields or methods), see https://git.eclipse.org/r/58948/. By removing the "private" access modifier from affected methods and fields we help compiler to avoid generation of synthetic access methods and hope to improve execution performance. To validate changes, one can either use javap or use Bytecode Outline plugin in Eclipse. In both cases one should look for "synthetic access$<number>" methods at the end of the class and inner class files in question - there should be none. NB: don't mix this "synthetic access$" methods up with "public synthetic bridge" methods generated to allow generic method override return types. Change-Id: If53ec94145bae47b74e2561305afe6098012715c Signed-off-by: Andrey Loskutov <loskutov@gmx.de>	8 years ago
Andrey Loskutov	b5941c74e5	Set "potentialNullReference" to "error" level and fixed all issues There should be no functional change, the logic updated only to make code simple so that compiler can understand what is going for. Removed all @SuppressWarnings("null") annotations since they cannot be used if "org.eclipse.jdt.core.compiler.problem.potentialNullReference" option is set to the "error" level. Bug: 470647 Change-Id: Ie93c249fa46e792198d362e531d5cbabaf41fdc4 Signed-off-by: Andrey Loskutov <loskutov@gmx.de>	8 years ago
Saša Živkov	d9062145b8	Don't invalidate pack file on InterruptedIOException If the thread reading a pack file is interrupted don't invalidate that pack file. This could happen when Gerrit invoked JGit for computing a diff in one thread and waited for the call to finish from another thread, with a timeout. When the timeout was reached the "diff" thread was interrupted. If it happened to be in an IO operation, reading a pack file, an InterruptedIOException was thrown and the pack file was marked as invalid and removed from the pack list. Invalidating the pack in that case could cause the project disappearing in Gerrit as discussed in [1] and [2]. [1] https://groups.google.com/forum/#!topic/repo-discuss/CYYoHfDxCfA [2] https://groups.google.com/forum/#!topic/repo-discuss/ZeGWPyyJlrM Change-Id: I2eb1f98370936b5be541d96d70c3973cbfc39238 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>	9 years ago
Shawn Pearce	6e5c71b358	Remove validate support when reusing cached pack Cached packs are only used when writing over the network or to a bundle file and reuse validation is always disabled in these two contexts. The client/consumer of the stream will be SHA-1 checksumming every object. Reuse validation is most critical during local GC to avoid silently ignoring corruption by stopping as soon as a problem is found and leaving everything alone for the end-user to debug and salvage. Cached packs are not supported during local GC as the bitmap rebuild logic does not support including a cached pack in the result. Strip out the validation and force PackWriter to always disable the cached pack feature if reuseValidation is enabled. Change-Id: If0d7baf2ae1bf1f7e71bf773151302c9f7887039	9 years ago
Matthias Sohn	4dd4d7e12a	Silence false null pointer access warnings in PackFile Change-Id: Ia39085557b38840dfaa9b4995e6f6c40e19042cb Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Shawn Pearce	7ab6ffdf50	Remove AutoCloseable from internal PackFile and friends PackFile is held by the block cache and cannot be auto closed in a try-with-resources statement. Remove the interface as JGit does explicit management of the instances. ObjectDatabase and RefDatabase are internal details of Repository and are managed with the Repository. Marking them AutoCloseable provides no value to the library or an application using the API. Change-Id: Ibee19eadd66233e6666b601583daa1834a7778f1	9 years ago
Matthias Sohn	57644f23a1	Provide more details in exceptions thrown when packfile is invalid Mention packfile path in exceptions thrown when we detect that a packfile is invalid and make excplicit that corrupt packs are removed from the pack list. Change-Id: I454ada5f8e69307d3f34d1c1b8f3cb87607ddf35 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Matthias Sohn	27ae8bc655	Implement AutoClosable interface to support try-with-resources block Bug: 428039 Change-Id: I41880862db5303b5bea4b2184ba7844d69c997b5 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Shawn Pearce	94c4d7eee8	Cleanup use of java.util.Inflater, fixing rare infinite loops The native implementation of inflate() can set finished to return true at the same time as it copies the last bytes into the buffer. Check for finished on each iteration, terminating as soon as libz knows the stream was completely inflated. If not finished, it is likely input is required before the next native call could do any useful work. Most invocations are passing in a buffer large enough to store the entire result. A partial return from inflate() will need more input before it can continue. Checking right away that needsInput() is true saves a native call to determine no bytes can be inflated without more input. This should fix a rare infinite loop condition inside of inflation when an object ends exactly at the end of a block boundary, and the next block contains only the 20 byte trailing SHA-1. When the stream is finished each new attempt to inflate() returns n == 0, as no additional bytes were output. The needsInput() test tries to add the length of the footer block to itself, but then loops back around an reloads the same block as the block is smaller than a full block size. A zero length input is set to the inflater, which triggers needsInput() condition again. Change-Id: I95d02bfeab4bf995a254d49166b4ae62d1f21346	9 years ago
Doug Kelly	62697c8d33	Remove streaming delta support from JGit Streaming packed deltas is so slow that it never feasibly completes (it will take hours for it to stream a few hundred megabytes on relatively fast systems with a large amount of storage). This was indicated as a "failed experiment" by Shawn in the following mailing list post: http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg01674.html Change-Id: Idc12f59e37b122f13856d7b533a5af9d8867a8a5 Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>	10 years ago
Colby Ranger	570bba5e7a	Ignore bitmap indexes that do not match the pack checksum If `git gc` creates a new pack with the same file name, the pack checksum may not match that in the .bitmap. Fix the PackFile implementaion to silently ignore invalid bitmap indexes. Fixes Issue https://code.google.com/p/gerrit/issues/detail?id=2131 Change-Id: I378673c00de32385ba90f4b639cb812f9574a216	10 years ago
Shawn Pearce	f32b861243	JGit 3.0: move internal classes into an internal subpackage This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change. Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9	11 years ago
Shawn Pearce	913cccd5c4	Do not attempt to read bitmap from invalid pack If a pack file has been marked invalid due to a prior IOException accessing its contents, do not offer its bitmap index to callers. The pack cannot be used so its bitmap should be off limits from any reader trying to work from a bitmap. Change-Id: Ia44e46558abdddee560bb184158b1e0af9437eee	11 years ago
Colby Ranger	3b325917a5	Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f	11 years ago
Colby Ranger	1512d0ab4e	Remove the unused method PackFile.hasExt(). It will be used in a future change, so just include it with that change. Change-Id: I7db28d86f8e8b282a403acd9a4c4defaae828f94	11 years ago
Colby Ranger	4a317a1790	Include supported extensions in PackFile constructor. Previously a PackFile class was assumed to only support a .pack and .idx file. Update the constructor to enumerate the supported extensions for the pack file. This will allow the bitmap code to only be executed if the bitmap extension file is known to exist. Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf	11 years ago
Colby Ranger	698705c754	Rename PackConstants to PackExt, a typed pack file extension. PackConstants previously contained string values for the pack and pack index extension. Change PackConstant to be PackExt, a typed wrapper around the string pack file extension. Change-Id: I86ac4db6da8f33aa42d6f37cfcc119e819444318	11 years ago
Colby Ranger	82ecfb3e31	Remove packIndex field from FileObjDatabase openPack method. Previously, the FileObjDatabase required both the pack file path and index file path to be passed to openPack(). A future change to add a bitmap index will add a .bitmap file parallel to the pack file (similar to the .idx file). Update the PackFile to support automatically loading pack index extensions based on the pack file path. Change-Id: Ifc8fc3e57f4afa177ba5a88df87334dbfa799f01	11 years ago
Robin Rosenberg	c310fa0c80	Mark non-externalizable strings as such A few classes such as Constanrs are marked with @SuppressWarnings, as are toString() methods with many liternal, but otherwise $NLS-n$ is used for string containing text that should not be translated. A few literals may fall into the gray zone, but mostly I've tried to only tag the obvious ones. Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590	11 years ago
Christian Halstrick	4c3f017a77	Garbage collector for FileRepositories Implements a garbage collector for FileRepositories. Main ideas are copied from the garbage collector for DFS based repos (DfsGarbageCollector). Added functionalities are - pruning loose objects - handling of the index - packing refs - handling of reflogs (objects referenced from reflog will not be pruned/) These are features of a GC which are not handled in this change and which should come with subsequent changes: - unpacking packed objects into loose objects (to support that pruning packed objects doesn't delete them until they are older than two weeks) - expiration of reflogs - support for configuration parameters (e.g. gc.pruneExpire) Change-Id: I14ea5cb7e0fd1b5c50b994fd77f4e05bfbb9d911 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>	12 years ago
Kevin Sawicki	17fb542e9e	Remove 86 boxing warnings Use Integer, Character, and Long valueOf methods when passing parameters to MessageFormat and other places that expect objects instead of primitives Change-Id: I5942fbdbca6a378136c00d951ce61167f2366ca4	12 years ago
Shawn O. Pearce	6c0d300a54	Fix loading packed objects >2G Parsing the size from a packed object header was incorrectly computing the total inflated length when the length exceeded the range of a Java int. The next 7 bits of size information was shifted left as an int using a shift of 25 bits, placing the higher bits of the size into the sign position. When this size was extended to a long to be added to the current size accumulator the size went negative, resulting in NegativeArraySizeException being thrown. Fix all places where this particular pattern of code is used to read a pack size field, or a binary delta header, as they both use the same variable length encoding scheme. Change-Id: I04008728ed828f18202652c3d5401cf95a441d0a	12 years ago
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 years ago
Shawn O. Pearce	53fb027284	Make DeltaBaseCache per-ObjectReader The 'Counting objects' phase of PackWriter requires good hit rates from the DeltaBaseCache while walking trees, the deltas need to find their bases in the cache in order to inflate in a reasonable time. If JGit is running in a multi-threaded server, such as Gerrit Code Review, each thread needs its own DeltaBaseCache to prevent one thread from evicting the other thread's relevant bases. Move the cache to be per-ObjectReader, lazily allocated when required by a PackFile. Change-Id: If9d5ed06728e813632ae96dcfb811f4860b276e8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	48fb404a3f	PackFile: Cache the packName string Instead of computing this on every request, compute it once and hold onto the result. This improves performance for LocalCachedPack which does a lot of tests against the pack name string. Change-Id: I3803745e3a5dda7b5f0faf39aae9423e2c777e7f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	4e187d898a	PackFile: Fix copy as-is for small objects When I disabled validation I broke the code that handled copying small objects whose contents were below 8192 bytes in size but spanned over the end of one window and into the next window. These objects did not ever populate the temporary write buffer, resulting in garbage writing into the output stream instead of valid object contents. Change-Id: Ie26a2aaa885d0eee4888a9b12c222040ee4a8562 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	a468cb57c2	PackWriter: Validate reused cached packs If object reuse validation is enabled, the output pack is going to probably be stored locally. When reusing an existing cached pack to save object enumeration costs, ensure the cached pack has not been corrupted by checking its SHA-1 trailer. If it has, writing will abort and the output pack won't be complete. This prevents anyone from trying to use the output pack, and catches corruption before it can be carried any further. Change-Id: If89d0d4e429d9f4c86f14de6c0020902705153e6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	1b2062fe37	PackWriter: Avoid CRC-32 validation when feeding IndexPack There is no need to validate the object contents during copyObjectAsIs if the result is going to be parsed by unpack-objects or index-pack. Both programs will compute the SHA-1 of the object, and also validate most of the pack structure. For git daemon like servers, this work is already done on the client end of the connection, so the server doesn't need to repeat that work itself. Disable object validation for the 3 transport cases where we know the remote side will handle object validation for us (push, bundle creation, and upload pack). This improves performance on the server side by reducing the work that must be done. Change-Id: Iabb78eec45898e4a17f7aab3fb94c004d8d69af6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	461b012e95	PackWriter: Support reuse of entire packs The most expensive part of packing a repository for transport to another system is enumerating all of the objects in the repository. Once this gets to the size of the linux-2.6 repository (1.8 million objects), enumeration can take several CPU minutes and costs a lot of temporary working set memory. Teach PackWriter to efficiently reuse an existing "cached pack" by answering a clone request with a thin pack followed by a larger cached pack appended to the end. This requires the repository owner to first construct the cached pack by hand, and record the tip commits inside of $GIT_DIR/objects/info/cached-packs: cd $GIT_DIR root=$(git rev-parse master) tmp=objects/.tmp-$$ names=$(echo $root \| git pack-objects --keep-true-parents --revs $tmp) for n in $names; do chmod a-w $tmp-$n.pack $tmp-$n.idx touch objects/pack/pack-$n.keep mv $tmp-$n.pack objects/pack/pack-$n.pack mv $tmp-$n.idx objects/pack/pack-$n.idx done (echo "+ $root"; for n in $names; do echo "P $n"; done; echo) >>objects/info/cached-packs git repack -a -d When a clone request needs to include $root, the corresponding cached pack will be copied as-is, rather than enumerating all of the objects that are reachable from $root. For a linux-2.6 kernel repository that should be about 376 MiB, the above process creates two packs of 368 MiB and 38 MiB[1]. This is a local disk usage increase of ~26 MiB, due to reduced delta compression between the large cached pack and the smaller recent activity pack. The overhead is similar to 1 full copy of the compressed project sources. With this cached pack in hand, JGit daemon completes a clone request in 1m17s less time, but a slightly larger data transfer (+2.39 MiB): Before: remote: Counting objects: 1861830, done remote: Finding sources: 100% (1861830/1861830) remote: Getting sizes: 100% (88243/88243) remote: Compressing objects: 100% (88184/88184) Receiving objects: 100% (1861830/1861830), 376.01 MiB \| 19.01 MiB/s, done. remote: Total `1861830` (delta 4706), reused `1851053` (delta `1553844`) Resolving deltas: 100% (1564621/1564621), done. real 3m19.005s After: remote: Counting objects: 1601, done remote: Counting objects: 1828460, done remote: Finding sources: 100% (50475/50475) remote: Getting sizes: 100% (18843/18843) remote: Compressing objects: 100% (7585/7585) remote: Total `1861830` (delta 2407), reused `1856197` (delta 37510) Receiving objects: 100% (1861830/1861830), 378.40 MiB \| 31.31 MiB/s, done. Resolving deltas: 100% (1559477/1559477), done. real 2m2.938s Repository owners can periodically refresh their cached packs by repacking their repository, folding all newer objects into a larger cached pack. Since repacking is already considered to be a normal Git maintenance activity, this isn't a very big burden. [1] In this test $root was set back about two weeks. Change-Id: Ib87131d5c4b5e8c5cacb0f4fe16ff4ece554734b Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	165358bc99	Use heap based stack for PackFile deltas Instead of using the current thread's stack to recurse through the delta chain, use a linked list that is stored in the heap. This permits the any thread to load a deep delta chain without running out of thread stack space. Despite needing to allocate a stack entry object for each delta visited along the chain being loaded, the object allocation count is kept the same as in the prior version by removing the transient ObjectLoaders from the intermediate objects accessed in the chain. Instead the byte[] for the raw data is passed, and null is used as a magic value to signal isLarge() and enter the large object code path. Like the old version, this implementation minimizes the amount of memory that must be live at once. The current delta instruction sequence, the base it applies onto, and the result are the only live data arrays. As each level is processed, the prior base is discarded and replaced with the new result. Each Delta frame on the stack is slightly larger than the standard ObjectLoader.SmallObject type that was used before, however the Delta instances should be smaller than the old method stack frames, so total memory usage should actually be lower with this new implementation. Change-Id: I6faca2a440020309658ca23fbec4c95aa637051c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	e0a9961b78	Avoid unnecessary decoding of length in PackFile If the object type is a whole object and all we want is the type, there is no need to skip the length header. The type is already known and can be returned as-is. Instead skip the length header only for the two delta formats, where the delta base must itself be scanned. Change-Id: I87029258e88924b3e5850bdd6c9006a366191d10 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	d29b5db695	Remove unused 'shift' variable from PackFile This variable was not used for anything, but Eclipse's JDT failed to notice because of the "shift += " operation within the body of the while loop. Here we don't need the shift because we do not decode the length, but we do have to skip over the bytes that store the length to locate the delta base. Bug: 331319 Change-Id: I200a874fd7e39e3adf2640b8cd0f53dcf91ef4c9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Remy Suen <remysuen@ca.ibm.com>	13 years ago
Robin Stocker	96bea14c7b	Use readFully() instead of read() Fixes the "Method ignores results of InputStream.read()" warning. This is the only place where read() was used instead of readFully() and the return value was not checked. So it was either an oversight or should be documented. This change assumes it was an oversight. Change-Id: I859404a7d80449c538a552427787f3e57d7c92b4	13 years ago
Matthias Sohn	ffc010fda4	[findbugs] Static comparator made final Fixing FindBugs warning MS_SHOULD_BE_FINAL. Change-Id: Ic69e6f6425e0a8950ce809eb3894f48a33e860aa Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Shawn O. Pearce	7ba31474a3	Increase core.streamFileThreshold default to 50 MiB Projects like org.eclipse.mdt contain large XML files about 6 MiB in size. So does the Android project platform/frameworks/base. Doing a clone of either project with JGit takes forever to checkout the files into the working directory, because delta decompression tends to be very expensive as we need to constantly reposition the base stream for each copy instruction. This can be made worse by a very bad ordering of offsets, possibly due to an XML editor that doesn't preserve the order of elements in the file very well. Increasing the threshold to the same limit PackWriter uses when doing delta compression (50 MiB) permits a default configured JGit to decompress these XML file objects using the faster random-access arrays, rather than re-seeking through an inflate stream, significantly reducing checkout time after a clone. Since this new limit may be dangerously close to the JVM maximum heap size, every allocation attempt is now wrapped in a try/catch so that JGit can degrade by switching to the large object stream mode when the allocation is refused. It will run slower, but the operation will still complete. The large stream mode will run very well for big objects that aren't delta compressed, and is acceptable for delta compressed objects that are using only forward referencing copy instructions. Copies using prior offsets are still going to be horrible, and there is nothing we can do about it except increase core.streamFileThreshold. We might in the future want to consider changing the way the delta generators work in JGit and native C Git to avoid prior offsets once an object reaches a certain size, even if that causes the delta instruction stream to be slightly larger. Unfortunately native C Git won't want to do that until its also able to stream objects rather than malloc them as contiguous blocks. Change-Id: Ief7a3896afce15073e80d3691bed90c6a3897307 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Shawn O. Pearce	eb64ccad6d	Correctly name DeltaBaseCache This class is used only to cache the unpacked form of an object that was used as a base for another object. The theory goes that if an object is used as a delta base for A, it will probably also be a delta base for B, C, D, E, etc. and therefore having an unpacked copy of it on hand will make delta resolution for the others very fast. However since objects are usually only accessed once, we don't want to cache everything we unpack, just things that we are likely to need again. The only things we need again are the delta bases. Hence, its a delta base cache. This gets us the class name UnpackedObjectCache back, so we can use it to actually create a cache of unpacked object information. Change-Id: I121f356cf4eca7b80126497264eac22bd5825a1d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	7a9edb3662	Fix reuse from pack file for REF_DELTA types We miscomputed the CRC32 checksum for a REF_DELTA type of object, by not including the full 20 byte ObjectId of the delta base in the CRC code we use when the delta is too large to go through our two faster small reuse code paths. This resulted in a corruption error during packing, where the PackFile erroneously suspected the data was wrong on the local filesystem and aborted writing, because the CRC didn't match what we had read from the index. Change-Id: I7d12cdaeaf2c83ddc11223ce0108d9bd6886e025 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	a5c18fcfc7	Fully implement SHA-1 abbreviations ObjectReader implementations are now responsible for creating the unique abbreviation of an ObjectId, or for resolving an abbreviation back to its full form. In this latter case the reader can offer up multiple candidates to the caller, who may be able to disambiguate them based on context. Repository.resolve() doesn't take multiple candidates into account right now, but it could in the future by looking for a remaining ^0 or ^{commit} suffix and take an expansion if there is only one commit that matches the input abbreviation. It could also use the distance from an annotated tag to resolve "tag-NNN-gcommit" style strings that are often output by `git describe`. Change-Id: Icd3250adc8177ae05278b858933afdca0cbbdb56 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago

30 Commits (b23735107bb6af0413855e64f6eb9fe946a8f659)