mirrors/jgit - jgit - source @ dussan.org

Graphique des révisions

Auteur	SHA1	Message	Date
Michael Keppler	2fc00af44e	refactor: simplify collection.toArray() On recent VMs, collection.toArray(new T[0]) is faster than collection.toArray(new T[collection.size()]). Since it is also more readable, it should now be the preferred way of collection to array conversion. https://shipilev.net/blog/2016/arrays-wisdom-ancients/ Change-Id: I80388532fb4b2b0663ee1fe8baa94f5df55c8442 Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de>	il y a 5 ans
Matthias Sohn	5480da5999	Fix javadoc in org.eclipse.jgit storage/file package Change-Id: Ieb2f66aef2cab7e2a6d8e35c5f5047da881994dd Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	il y a 6 ans
Hector Caballero	4334b27d3c	ObjectDirectory: Add pack directory getter So far, in order to get the pack directory it was necessary to resolve it from the object directory. This resolution is already done when creating the object directory, so simplify the call by just adding a getter to the pack directory. Change-Id: I69e783141dc6739024e8b3d5acc30843edd651a7 Signed-off-by: Hector Caballero <hector.caballero@ericsson.com>	il y a 6 ans
Shawn Pearce	6e5c71b358	Remove validate support when reusing cached pack Cached packs are only used when writing over the network or to a bundle file and reuse validation is always disabled in these two contexts. The client/consumer of the stream will be SHA-1 checksumming every object. Reuse validation is most critical during local GC to avoid silently ignoring corruption by stopping as soon as a problem is found and leaving everything alone for the end-user to debug and salvage. Cached packs are not supported during local GC as the bitmap rebuild logic does not support including a cached pack in the result. Strip out the validation and force PackWriter to always disable the cached pack feature if reuseValidation is enabled. Change-Id: If0d7baf2ae1bf1f7e71bf773151302c9f7887039	il y a 9 ans
Shawn Pearce	f32b861243	JGit 3.0: move internal classes into an internal subpackage This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change. Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9	il y a 11 ans
Shawn Pearce	3760e4319b	Remove cached_packs support in favor of bitmaps The bitmap code in PackWriter knows exactly when to use a pack as a "cached pack". It enables cached pack usage only when the pack has a bitmap and its entire closure of objects needs to be sent. This is a much simpler code path to maintain, and JGit actually has a way to write the necessary index. Change-Id: I2645d482f8733fdf0c4120cc59ba9aa4d4ba6881	il y a 11 ans
Colby Ranger	dafcb8f6db	Support creating pack bitmap indexes in PackWriter. Update the PackWriter to support writing out pack bitmap indexes, a parallel ".bitmap" file to the ".pack" file. Bitmaps are selected at commits every 1 to 5,000 commits for each unique path from the start. The most recent 100 commits are all bitmapped. The next 19,000 commits have a bitmaps every 100 commits. The remaining commits have a bitmap every 5,000 commits. Commits with more than 1 parent are prefered over ones with 1 or less. Furthermore, previously computed bitmaps are reused, if the previous entry had the reuse flag set, which is set when the bitmap was placed at the max allowed distance. Bitmaps are used to speed up the counting phase when packing, for requests that are not shallow. The PackWriterBitmapWalker uses a RevFilter to proactively mark commits with RevFlag.SEEN, when they appear in a bitmap. The walker produces the full closure of reachable ObjectIds, given the collection of starting ObjectIds. For fetch request, two ObjectWalks are executed to compute the ObjectIds reachable from the haves and from the wants. The ObjectIds needed to be written are determined by taking all the resulting wants AND NOT the haves. For clone requests, we get cached pack support for "free" since it is possible to determine if all of the ObjectIds in a pack file are included in the resulting list of ObjectIds to write. On my machine, the best times for clones and fetches of the linux kernel repository (with about 2.6M objects and 300K commits) are tabulated below: Operation Index V2 Index VE003 Clone 37530ms (524.06 MiB) 82ms (524.06 MiB) Fetch (1 commit back) 75ms 107ms Fetch (10 commits back) 456ms (269.51 KiB) 341ms (265.19 KiB) Fetch (100 commits back) 449ms (269.91 KiB) 337ms (267.28 KiB) Fetch (1000 commits back) 2229ms ( 14.75 MiB) 189ms ( 14.42 MiB) Fetch (10000 commits back) 2177ms ( 16.30 MiB) 254ms ( 15.88 MiB) Fetch (100000 commits back) 14340ms (185.83 MiB) 1655ms (189.39 MiB) Change-Id: Icdb0cdd66ff168917fb9ef17b96093990cc6a98d	il y a 11 ans
Robin Rosenberg	c310fa0c80	Mark non-externalizable strings as such A few classes such as Constanrs are marked with @SuppressWarnings, as are toString() methods with many liternal, but otherwise $NLS-n$ is used for string containing text that should not be translated. A few literals may fall into the gray zone, but mostly I've tried to only tag the obvious ones. Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590	il y a 11 ans
Shawn O. Pearce	9f5bbb5dd4	PackWriter: Speed up pruning of objects from cached packs During object enumeration for the thin pack, very few objects come out that are duplicated with the cached pack. Typically these are only cases where a blob or tree was cherry-picked forward, got a copy or rename, or was reverted... all relatively infrequent events. Speed up pruning of the thin pack object list by combining the phase with the object representation selection. Implementers should already be offering to reuse the object from the cached pack if it is stored there, at which point the implementation can perform a very fast type of containment test using the cached pack's identity rather than yet another index lookup. For the local disk case this is probably not a big improvement, but it does help on the DHT implementation where the two passes combined into one reduces latency. Change-Id: I6a07fc75d9075bf6233e967360b6546f9e9a2b33 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 13 ans
Shawn O. Pearce	a468cb57c2	PackWriter: Validate reused cached packs If object reuse validation is enabled, the output pack is going to probably be stored locally. When reusing an existing cached pack to save object enumeration costs, ensure the cached pack has not been corrupted by checking its SHA-1 trailer. If it has, writing will abort and the output pack won't be complete. This prevents anyone from trying to use the output pack, and catches corruption before it can be carried any further. Change-Id: If89d0d4e429d9f4c86f14de6c0020902705153e6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 13 ans
Shawn O. Pearce	461b012e95	PackWriter: Support reuse of entire packs The most expensive part of packing a repository for transport to another system is enumerating all of the objects in the repository. Once this gets to the size of the linux-2.6 repository (1.8 million objects), enumeration can take several CPU minutes and costs a lot of temporary working set memory. Teach PackWriter to efficiently reuse an existing "cached pack" by answering a clone request with a thin pack followed by a larger cached pack appended to the end. This requires the repository owner to first construct the cached pack by hand, and record the tip commits inside of $GIT_DIR/objects/info/cached-packs: cd $GIT_DIR root=$(git rev-parse master) tmp=objects/.tmp-$$ names=$(echo $root \| git pack-objects --keep-true-parents --revs $tmp) for n in $names; do chmod a-w $tmp-$n.pack $tmp-$n.idx touch objects/pack/pack-$n.keep mv $tmp-$n.pack objects/pack/pack-$n.pack mv $tmp-$n.idx objects/pack/pack-$n.idx done (echo "+ $root"; for n in $names; do echo "P $n"; done; echo) >>objects/info/cached-packs git repack -a -d When a clone request needs to include $root, the corresponding cached pack will be copied as-is, rather than enumerating all of the objects that are reachable from $root. For a linux-2.6 kernel repository that should be about 376 MiB, the above process creates two packs of 368 MiB and 38 MiB[1]. This is a local disk usage increase of ~26 MiB, due to reduced delta compression between the large cached pack and the smaller recent activity pack. The overhead is similar to 1 full copy of the compressed project sources. With this cached pack in hand, JGit daemon completes a clone request in 1m17s less time, but a slightly larger data transfer (+2.39 MiB): Before: remote: Counting objects: 1861830, done remote: Finding sources: 100% (1861830/1861830) remote: Getting sizes: 100% (88243/88243) remote: Compressing objects: 100% (88184/88184) Receiving objects: 100% (1861830/1861830), 376.01 MiB \| 19.01 MiB/s, done. remote: Total `1861830` (delta 4706), reused `1851053` (delta `1553844`) Resolving deltas: 100% (1564621/1564621), done. real 3m19.005s After: remote: Counting objects: 1601, done remote: Counting objects: 1828460, done remote: Finding sources: 100% (50475/50475) remote: Getting sizes: 100% (18843/18843) remote: Compressing objects: 100% (7585/7585) remote: Total `1861830` (delta 2407), reused `1856197` (delta 37510) Receiving objects: 100% (1861830/1861830), 378.40 MiB \| 31.31 MiB/s, done. Resolving deltas: 100% (1559477/1559477), done. real 2m2.938s Repository owners can periodically refresh their cached packs by repacking their repository, folding all newer objects into a larger cached pack. Since repacking is already considered to be a normal Git maintenance activity, this isn't a very big burden. [1] In this test $root was set back about two weeks. Change-Id: Ib87131d5c4b5e8c5cacb0f4fe16ff4ece554734b Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 13 ans
Shawn O. Pearce	ea21c111cb	Move PackWriter over to storage.pack.PackWriter Similar to what we did with the file code, move the pack writer into its own package so the related classes and their package private methods are hidden from the rest of the library. Change-Id: Ic1b5c7c8c8d266e90c910d8d68dfc8e93586854f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 14 ans
Shawn O. Pearce	86547022f0	Tighten up local packed object representation during packing Rather than making a loader, and then using that to fill the object representation, parse the header and set up our data directly. This saves some time, as we don't waste cycles on information we won't use right now. The weight computed for a representation is now its actual stored size in the pack file, rather than its inflated size. This accounts for changes made when the compression level is modified on the repository. It is however more costly to determine the weight of the object, since we have to find its length in the pack. To try and recover that cost we now cache the length as part of our ObjectToPack record, so it doesn't have to be found during the output phase. A LocalObjectToPack now costs us (assuming 32 bit pointers): (32 bit) (64 bit) vm header: 8 bytes 8 bytes ObjectId: 20 bytes 20 bytes PackedObjectInfo: 12 bytes 12 bytes ObjectToPack: 8 bytes 12 bytes LocalOTP: 20 bytes 24 bytes ----------- --------- 68 bytes 74 bytes Change-Id: I923d2736186eb2ac8ab498d3eb137e17930fcb50 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 14 ans
Shawn O. Pearce	ad5238dc67	Move FileRepository to storage.file.FileRepository This move isolates all of the local file specific implementation code into a single package, where their package-private methods and support classes are properly hidden away from the rest of the core library. Because of the sheer number of files impacted, I have limited this change to only the renames and the updated imports. Change-Id: Icca4884e1a418f83f8b617d0c4c78b73d8a4bd17 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 14 ans
Shawn O. Pearce	bf4ffff07f	Redo PackWriter object reuse selection The new selection implementation uses a public API on the ObjectReader, allowing the storage library to enumerate its candidates and select the best one for this packer without needing to build a temporary list of the candidates first. Change-Id: Ie01496434f7d3581d6d3bbb9e33c8f9fa649b6cd Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 14 ans
Shawn O. Pearce	8a9844b2af	Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 14 ans
Git Development Community	1a6964c827	Initial JGit contribution to eclipse.org Per CQ 3448 this is the initial contribution of the JGit project to eclipse.org. It is derived from the historical JGit repository at commit `3a2dd9921c`. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	il y a 14 ans

5 Révisions (2fc00af44e5658cf34200ad20d777a75333a0564)