mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Author	SHA1	Message	Date
Luca Milanesio	d6e00d2015	Remember the cause for invalidating a packfile Keep track of the original cause for a packfile invalidation. It is needed for the sysadmin to understand if there is a real underlying filesystem problem and repository corruption or if it is simply a consequence of a concurrency of Git operations (e.g. repack or GC). Change-Id: I06ddda9ec847844ec31616ab6d17f153a5a34e33 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	5 years ago
David Pursehouse	280b2bfe52	DfsPackFile: Refactor getBitmapIndex to open ReadableChannel in try-with-resource Refactor getBitmapIndex to open ReadableChannel in try-with-resource instead of closing the channel in the finally block. The same cannot be done in copyPackThroughCache, so just suppress the warning with an explanatory comment. Change-Id: I9b95373d350728e85a159423d5ca80e8b215914d Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	6 years ago
David Pursehouse	5c70be0085	Open auto-closeable resources in try-with-resource When an auto-closeable resource is not opened in try-with-resource, the warning "should be managed by try-with-resource" is emitted by Eclipse. Fix the ones that can be silenced simply by moving the declaration of the variable into a try-with-resource. In cases where we explicitly call the close() method, for example in tests where we are testing specific behavior caused by the close(), suppress the warning. Leave the ones that will require more significant refactoring to fix. They can be done in separate commits that can be reviewed and tested in isolation. Change-Id: I9682cd20fb15167d3c7f9027cecdc82bc50b83c4 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	6 years ago
David Pursehouse	0f1c160aa7	Don't unnecessarily explicitly call CorruptObjectException#initCause CorruptObjectException has a constructor that takes Throwable and calls initCause with it. Use that instead of instantiating the exception and explicitly calling initCause. Change-Id: I1f2747d6c4cc5249e93401b9787eb4ceb50cb995 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	6 years ago
David Pursehouse	365c6cb387	Use new StoredObjectRepresentationNotAvailableException constructor In `5e7eed4` a new StoredObjectRepresentationNotAvailableException constructor was added, that takes a Throwable to initialize the exception cause. Update more call sites to use this constructor instead of first instantiating it and explicitly calling initCause(). All callers now use the new constructor, so annotate the other one as deprecated. Change-Id: I6d2a7e289a95f0360ddebf904cfd8b6c18fef10c Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	6 years ago
Matthias Sohn	a224b78675	Fix javadoc in org.eclipse.jgit dfs package Change-Id: I1f5e3dc3ba34b323ee7244dbefee207ce19e6021 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	6 years ago
David Pursehouse	5e7eed41f6	StoredObjectRepresentationNotAvailableException: Add constructor that takes cause If the cause can be passed into the constructor, callers don't need to instantiate it and then explicitly call initCause. Note that the constructors in this class cause "non-API parameter type" warnings because ObjectToPack is internal, however it's probably OK since the only non-internal reference to it is in the pgm.debug package. Change-Id: Ia4eab24e79f9afe6214ea8160137d941d4048319 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>	6 years ago
Shawn Pearce	1a7b8a11df	dfs: expose DfsReftable from DfsObjDatabase Reftable storage in DFS is related to pack storage. Reftables are stored in the same namespace, but with PackExt.REFTABLE. Include the set of DfsReftable instances in the PackList and export some helpers to access the tables. Change-Id: I6a4f5f953ed6b0ff80a7780f4c6cbcc5eda0da3e	7 years ago
Shawn Pearce	1222f34506	dfs: support reading reftables through DfsBlockCache DfsBlockCache directly shares its internal byte[] with ReftableReader, avoiding copying between the DfsBlockCache and the BlockReader instances used by ReftableReader. Change-Id: Icaa4f40052b26f952681414653a8b5314b7c2c23	7 years ago
Shawn Pearce	d4cfa95ba3	dfs: optionally store blockSize in DfsPackDescription Allow a DFS implementation to report blockSize to DfsPackFile, bypassing alignment errors and corrections in the DfsBlockCache when the blockSize of a specific file differs from the cache's configured blockSize. Change-Id: Ic376314d4a86a0bd528c033e169d93eef035b233	7 years ago
Shawn Pearce	da7671fcd5	dfs: Fix caching of index, bitmap index, reverse index When `07f98a8b71` ("Derive DfsStreamKey from DfsPackDescription") stopped caching DfsPackFile in the DfsBlockCache, the DfsPackFile began to always load the idx, bitmap, or compute reverse index, as the cache handles were no longer populated by prior requests. Rework caching to lookup the objects from the DfsBlockCache if the local DfsPackFile handle is invalid. This allows the DfsPackFile to be more of a flyweight instance across requests. Change-Id: Ic7b42ce2d90692cccea36deb30c2c76ccc81638b	7 years ago
Shawn Pearce	b1bdeeb0ee	dfs: Use special ForReverseIndex DfsStreamKey wrapper instead of derive While implementing a custom subclass of DfsStreamKey it became obvious the required derive(String) was making it impossible to construct an efficient key in all cases. Instead, use a special wrapper type ForReverseIndex around the INDEX's own DfsStreamKey to denote the reverse index stream in the DfsBlockCache. This adds a smaller layer of boxing, but eliminates weird issues for DFS implementors using specialized DfsStreamKey implementations for space efficiency reasons. Now that DfsStreamKey is reasonably light-weight, avoid allocating the index and reverse index keys until necessary. DfsPackFile mostly holds the DfsBlockCache.Ref handle to the object, and only needs the DfsStreamKey when it's looking up the handle. Change-Id: Icea78e8f7f1514087b94ef5f525d9573ea2913f2	7 years ago
Shawn Pearce	07f98a8b71	Derive DfsStreamKey from DfsPackDescription By making this a deterministic function, DfsBlockCache can stop retaining a map of every DfsPackDescription it has ever seen. This fixes a long standing memory leak in DfsBlockCache. This refactoring also simplifies the idea of setting up more lightweight objects around streams. Change-Id: I051e7b96f5454c6b0a0e652d8f4a69c0bed7f6f4	7 years ago
Shawn Pearce	84c71ac933	Extract BlockBasedFile base class for DfsPackFile This new base class has the minimum set of properties and methods necessary for DfsBlockCache to manage blocks of a file in the cache. Subclasses can use DfsBlockCache for any content. This refactoring opens the door for additional PackExt types other than PACK to be stored on a block-by-block basis by the DfsBlockCache. Change-Id: I307228fc805c3ff0c596783beb24fd52bec35ba8	7 years ago
Shawn Pearce	8c566be72f	Use separate DfsStreamKey for PackIndex Instead of overloading the pack's DfsStreamKey with negative positions for the idx, reverse idx and bitmap, assign a unique DfsStreamKey for each of these related streams. Change-Id: Ie048036c74a1d1bbf5ea7e888452dc0c1adf992f	7 years ago
Shawn Pearce	e924de5295	Rename DfsPackKey to DfsStreamKey This renaming supports reusing DfsStreamKey in a future commit to index other PackExt type streams inside of the DfsBlockCache. Change-Id: Ib52d374e47724ccb837f4fbab1fc85c486c5b408	7 years ago
Shawn Pearce	5fdbcc1081	Use read ahead during copyPackThroughCache If a block is missing from the block cache, open the pack stream, retain the ReadableChannel, and turn on read-ahead. This should help to load a medium sized pack into a cold cache more quickly from a slower IO stream, as the pack is scanned sequentially and missing blocks are more likely to be available through the read-ahead. Change-Id: I3300d936b9299be6d9eb642992df7c04bb439cde	7 years ago
Shawn Pearce	562de51239	Track read IO for DfsReader Compute how much disk IO a DfsReader is performing, and how long the sum of those operations took on this reader instance. Implementations of DFS and interested applications can get the stats by calling the new DfsReader.getIoStats() method at or after close(). Change-Id: If585741301f29182617933d6406d4a70497f2ca7	7 years ago
Jonathan Nieder	061d24f6d5	Remove @since tags from internal packages These packages don't use @since tags because they are not part of the stable public API. Some @since tags snuck in, though. Remove them to make the convention easier to find for new contributors and the expectations clearer for users. Change-Id: I6c17d3cfc93657f1b33cf5c5708f2b1c712b0d31	7 years ago
Zhen Chen	d6b354f60f	Skip pack header bytes in DfsPackFile The 12 bytes `PACK...` header is written in PackWriter before reading CachedPack files. In DfsPackFile#copyPackBypassCache, the header was not skipped when the first block is not in cache. Change-Id: Ibbe2e564d36b79922a936657f286addb1044d237 Signed-off-by: Zhen Chen <czhen@google.com>	7 years ago
Terry Parker	2708b11b6c	Filter corrupt objects from DfsReader.selectObjectRepresentation() PackWriter.writeObject() can get into an infinite loop when corrupt packs are present. When it finds a pack file with an object that can be reused it calls DfsPackFile.copyAsIs(). If that method sees an invalid CRC, it adds the object to the DfsPackFile's corrupt object list and throws a CorruptObjectException, which it later catches as an IOException and wraps in a StoredObjectRepresentationNotAvailableException. PackWriter.writeObjectImpl() catches that SORNAE and retries the operation by calling DfsReader.selectObjectRepresentation(). But currently that method returns the same object which was just seen to be corrupt. Change DfsPackFile.isCorrupt() from private to package private, and use that method in DfsReader.selectObjectRepresentation() to filter out corrupt objects. The stack traces that show the problem are: org.eclipse.jgit.errors.CorruptObjectException.<init>(CorruptObjectException.java:113) org.eclipse.jgit.internal.storage.dfs.DfsPackFile.copyAsIs(DfsPackFile.java:624) org.eclipse.jgit.internal.storage.dfs.DfsReader.copyObjectAsIs(DfsReader.java:491) org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjectImpl(PackWriter.java:1478) org.eclipse.jgit.internal.storage.pack.PackWriter.writeObject(PackWriter.java:1455) org.eclipse.jgit.internal.storage.dfs.DfsPackFile.getPackIndex(DfsPackFile.java:228) org.eclipse.jgit.internal.storage.dfs.DfsReader.findAllFromPack(DfsReader.java:476) org.eclipse.jgit.internal.storage.dfs.DfsReader.selectObjectRepresentation(DfsReader.java:455) org.eclipse.jgit.internal.storage.pack.PackWriter.writeObjectImpl(PackWriter.java:1492) org.eclipse.jgit.internal.storage.pack.PackWriter.writeObject(PackWriter.java:1455) Change-Id: Iad7bbcaed1f11a6aa3b4f5af911a73a34c0fabfd Signed-off-by: Terry Parker <tparker@google.com>	8 years ago
Andrey Loskutov	b5941c74e5	Set "potentialNullReference" to "error" level and fixed all issues There should be no functional change; the logic was updated only to make the code simple enough that the compiler can understand what is going on. Removed all @SuppressWarnings("null") annotations since they cannot be used if the "org.eclipse.jdt.core.compiler.problem.potentialNullReference" option is set to the "error" level. Bug: 470647 Change-Id: Ie93c249fa46e792198d362e531d5cbabaf41fdc4 Signed-off-by: Andrey Loskutov <loskutov@gmx.de>	9 years ago
Matthias Sohn	bb48901ddf	Silence false potential null pointer access warnings Change-Id: I30c074a1b3436e47316b2eb2f083708161182d32	9 years ago
Shawn Pearce	6e5c71b358	Remove validate support when reusing cached pack Cached packs are only used when writing over the network or to a bundle file and reuse validation is always disabled in these two contexts. The client/consumer of the stream will be SHA-1 checksumming every object. Reuse validation is most critical during local GC to avoid silently ignoring corruption by stopping as soon as a problem is found and leaving everything alone for the end-user to debug and salvage. Cached packs are not supported during local GC as the bitmap rebuild logic does not support including a cached pack in the result. Strip out the validation and force PackWriter to always disable the cached pack feature if reuseValidation is enabled. Change-Id: If0d7baf2ae1bf1f7e71bf773151302c9f7887039	9 years ago
Shawn Pearce	a3476ced1f	Rename copyThroughCache to shouldCopyThroughCache Sensible suggestion from Terry Parker as a late comment on commit `f2efcdc6f7`. Change-Id: I225775bfb6d3d91ae066ff00f9d80a9c02a422c2	9 years ago
Shawn Pearce	1553a6ff55	Add readAheadBuffer hint to ReadableChannel This hint allows an underlying implementation to read more bytes when possible and buffer them locally for future read calls to consume. Change-Id: Ia986a1bb8640eecb91cfbd515c61fa1ff1574a6f	9 years ago
Shawn Pearce	c761c8bb5c	Avoid storing large packs in block cache during reuse When a large pack (> 30% of the block cache) is being reused by copying it pollutes the block cache with noise by storing blocks that are never referenced again. Avoid this by streaming the file directly from its channel onto the output stream. Change-Id: I2e53de27f3dcfb93de68b1fad45f75ab23e79fe7	9 years ago
Shawn Pearce	7ab6ffdf50	Remove AutoCloseable from internal PackFile and friends PackFile is held by the block cache and cannot be auto closed in a try-with-resources statement. Remove the interface as JGit does explicit management of the instances. ObjectDatabase and RefDatabase are internal details of Repository and are managed with the Repository. Marking them AutoCloseable provides no value to the library or an application using the API. Change-Id: Ibee19eadd66233e6666b601583daa1834a7778f1	9 years ago
David Pletcher	cc8c270c9b	Allow public access to PackIndex The index provides access to a list of objects in a pack. This will be helpful for repository integrity checking. Change-Id: I435eeeb3fe1b1f5632d40528936416e97491d412 Signed-off-by: David Pletcher <dpletcher@google.com>	9 years ago
Matthias Sohn	57644f23a1	Provide more details in exceptions thrown when packfile is invalid Mention packfile path in exceptions thrown when we detect that a packfile is invalid and make explicit that corrupt packs are removed from the pack list. Change-Id: I454ada5f8e69307d3f34d1c1b8f3cb87607ddf35 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Matthias Sohn	27ae8bc655	Implement AutoCloseable interface to support try-with-resources block Bug: 428039 Change-Id: I41880862db5303b5bea4b2184ba7844d69c997b5 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Colby Ranger	f3b80f1a74	Do not update the ref hot bit when checking isIndexLoaded DfsPackFile.isIndexLoaded() uses the DfsBlockCache.Ref.get() method to check if the index is loaded. However, using the get() method marks a hot bit in the cache, which can cause the index to never be unloaded and seem hotter than it really is. Add a has() method which only checks if the value is not null and does not update the hot bit. Change-Id: I7e9ed216f6e273e8f5d79ae573973197654419b4	10 years ago
Shawn Pearce	56497be34d	Delete broken DFS read-ahead support This implementation has been proven to deadlock in production server loads. Google has been running with it disabled for quite a while, as the bugs have been difficult to identify and fix. Instead of suggesting it works and is useful, drop the code. JGit should not advertise support for functionality that is known to be broken. In a few of the places where read-ahead was enabled by DfsReader there is more information about what blocks should be loaded when. During object representation selection, or size lookup, or sending object as-is to a PackWriter, or sending an entire pack as-is the reader knows exactly which blocks are required in the cache, and it also can compute when those will be needed. The broken read-ahead code was stupid and just read a fixed amount ahead of the current offset, which can waste IOs if more precise data was available. DFS systems are usually slow to respond so read-ahead is still a desired feature, but it needs to be rebuilt from scratch and make better use of the offset information. Change-Id: Ibaed8288ec3340cf93eb269dc0f1f23ab5ab1aea	11 years ago
Shawn Pearce	d72416afbb	Optimize DFS object reuse selection code Rewrite this complicated logic to examine each pack file exactly once. This reduces thrashing when there are many large pack files present and the reader needs to locate each object's header. The intermediate temporary list is now smaller, it is bounded to the same length as the input object list. In the prior version of this code the list contained one entry for every representation of every object being packed. Only one representation object is allocated, reducing the overall memory footprint to be approximately one reference per object found in the current pack file (the pointer in the BlockList). This saves considerable working set memory compared to the prior version that made and held onto a new representation for every ObjectToPack. Change-Id: I2c1f18cd6755643ac4c2cf1f23b5464ca9d91b22	11 years ago
Shawn Pearce	f32b861243	JGit 3.0: move internal classes into an internal subpackage This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change. Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9	11 years ago
Shawn Pearce	4e9fe58bb5	Avoid looking at UNREACHABLE_GARBAGE for client have lines Clients send a bunch of unknown objects to UploadPack on each round of negotiation. Many of these are not known to the server, which leads the implementation to be looking at indexes for garbage packs. Disable examining the index of a garbage pack, allowing servers to avoid reading them from disk during negotiation. The effect of this change is the server will only ACK a have line if the object was reachable during the last garbage collection, or was recently added to the repository. For most repositories there is no impact in this behavior change. If a repository rewinds a branch, runs GC, and then resets the branch back to where it was before, the now current tip is going to be skipped by this change. A client that has the commit may wind up getting a slightly larger data transfer from the server as an older common ancestor will be chosen during negotiation. This is fixable on the server side by running GC again to correct the layout of objects in pack files. Change-Id: Icd550359ef70fc7b701980f9b13d923fd13c744b	11 years ago
Shawn Pearce	913cccd5c4	Do not attempt to read bitmap from invalid pack If a pack file has been marked invalid due to a prior IOException accessing its contents, do not offer its bitmap index to callers. The pack cannot be used so its bitmap should be off limits from any reader trying to work from a bitmap. Change-Id: Ia44e46558abdddee560bb184158b1e0af9437eee	11 years ago
Shawn Pearce	88c962484f	Rename DfsPackFile getBitmap method to match PackFile There is no reason for these to differ in name. Match the shorter name used by PackFile. Change-Id: I2d3a299069acc5ce276b1b5439ff2258903c6ff3	11 years ago
Colby Ranger	3b325917a5	Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f	11 years ago
Colby Ranger	698705c754	Rename PackConstants to PackExt, a typed pack file extension. PackConstants previously contained string values for the pack and pack index extension. Change PackConstants to be PackExt, a typed wrapper around the string pack file extension. Change-Id: I86ac4db6da8f33aa42d6f37cfcc119e819444318	11 years ago
Colby Ranger	3f0176aea6	Remove getReverseIndexSize() from DfsPackDescription. The method is used in only one location (DfsPackFile). Furthermore, PackIndex already does an explicit computation of the size in DfsPackFile. Simplify the DfsPackDescription by removing the method and do the calculation similar to PackIndex. Change-Id: I1391fdaaf7c2c3226d96ada1ae8647bcdff4794e	11 years ago
Colby Ranger	510a605546	Use file extension with DfsPackDescription get/set file size. Previously the size getters and setters had explicit methods for index and pack. Update the api to be based on the file extension. This will make it possible to support other extensions in the future, such as the forthcoming bitmap extensions. Change-Id: Iab9d4abe0af65b2fc71ad71ef1db0feb6b3b5c58	11 years ago
Colby Ranger	5d3c2b3def	Update DfsObjDatabase API to open/write by pack extension. Previously, the DfsObjDatabase had hardcoded getPackFile() and getPackIndex() methods, which open a .pack and .idx file, respectively. A future change to add a bitmap index will need to be stored in a parallel .bitmap file. Update the DfsObjDatabase to support opening and writing of files for any pack extension. Change-Id: I7c403b501e242096a2d435f6865d6025a9f86108	11 years ago
Matthias Sohn	05a7113002	Add missing @since tags to mark API added in 2.2 Change-Id: I458167739210214fa54c4b3d62fac5abc82f96f7 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Chris Aniszczyk <zx@twitter.com>	11 years ago
Colby Ranger	034ff96674	Add an event and listener for a dfs PackIndex being loaded. The DfsPackFile will fire any static repository listeners on the event just before the PackIndex is loaded. Change-Id: Ie51098106bd5a1a32feae7d2dd068abf02b030ee	11 years ago
Colby Ranger	f6f8bcd9df	Add isIndexLoaded() to DfsPackFile. The method reports whether the index file for the pack has been loaded and cached in memory. Change-Id: Ifa8d63f737458e102cb3d28579c9711d46693d17	11 years ago
Shawn O. Pearce	3534fa9c61	Expose some DFS APIs as public or protected Expose class DfsReader and method DfsPackFile.hasObject() as public. Applications may want to be able to inquire about some details of the storage of a repository. Make this possible by exposing some simple accessor methods. Expose method DfsObjDatabase.clearCache() as protected, allowing implementing subclasses to dump the cache if necessary, and force it to reload on a future request. Change-Id: Ic592c82d45ace9f2fa5f8d7e4bacfdce96dea969	12 years ago
Shawn O. Pearce	6c0d300a54	Fix loading packed objects >2G Parsing the size from a packed object header was incorrectly computing the total inflated length when the length exceeded the range of a Java int. The next 7 bits of size information was shifted left as an int using a shift of 25 bits, placing the higher bits of the size into the sign position. When this size was extended to a long to be added to the current size accumulator the size went negative, resulting in NegativeArraySizeException being thrown. Fix all places where this particular pattern of code is used to read a pack size field, or a binary delta header, as they both use the same variable length encoding scheme. Change-Id: I04008728ed828f18202652c3d5401cf95a441d0a	12 years ago
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 years ago
Dave Borowitz	d55eb35106	Expose the reverse index size in the DfsPackDescription This is analogous to the getPackSize() and getIndexSize() methods. Change-Id: I207c0c93f9145826d84b3610eb4319fca074ee0d	12 years ago
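
Note on commit 6c0d300a54 ("Fix loading packed objects >2G"): the overflow described in that message can be reproduced with a minimal sketch of the variable-length size encoding (4 size bits in the first byte, then 7 bits per continuation byte). This is not JGit's code. The class and method names below are hypothetical and chosen only for illustration; the only difference between the two decoders is whether the 7-bit group is widened to long before shifting.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only: hypothetical names, not taken from JGit.
final class PackSizeDecode {
    // Builds a header: low 4 size bits plus type in the first byte,
    // then 7 size bits per following byte, high bit set on every byte but the last.
    static byte[] encodeSize(int type, long size) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int c = (type << 4) | (int) (size & 0x0f);
        size >>>= 4;
        while (size != 0) {
            out.write(c | 0x80);
            c = (int) (size & 0x7f);
            size >>>= 7;
        }
        out.write(c);
        return out.toByteArray();
    }

    // Pattern described as broken: the shift happens in int arithmetic,
    // so at shift 25 the top bits land in the sign position.
    static long decodeSizeBuggy(byte[] hdr) {
        int p = 0;
        int c = hdr[p++] & 0xff;
        long sz = c & 0x0f;
        int shift = 4;
        while ((c & 0x80) != 0) {
            c = hdr[p++] & 0xff;
            sz += (c & 0x7f) << shift; // int shift, then sign-extended to long
            shift += 7;
        }
        return sz;
    }

    // Same loop, but the 7-bit group is widened to long before shifting.
    static long decodeSize(byte[] hdr) {
        int p = 0;
        int c = hdr[p++] & 0xff;
        long sz = c & 0x0f;
        int shift = 4;
        while ((c & 0x80) != 0) {
            c = hdr[p++] & 0xff;
            sz += ((long) (c & 0x7f)) << shift;
            shift += 7;
        }
        return sz;
    }

    public static void main(String[] args) {
        byte[] hdr = encodeSize(3, 3L << 30); // a 3 GiB object of type 3 (blob)
        System.out.println("buggy: " + decodeSizeBuggy(hdr)); // negative value
        System.out.println("fixed: " + decodeSize(hdr));
    }
}
```

A negative size like the one the int-shift variant returns is what would lead to the NegativeArraySizeException mentioned in the commit message once it is used to allocate a buffer.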

41 Commits (710d00b85e9419c7feaed71a1945351dab486f71)