mirrors/jgit - jgit - source @ dussan.org

Commit graph

Author	SHA1	Message	Date
Christian Halstrick	88b25a58f0	When marking commits as uninteresting don't care if the tree exists. When commits are marked as uninteresting during an ObjectWalk, we should tolerate the situation where the commit exists in the repo but the referenced tree does not. Since commit `c4797fe986` we have been throwing MissingObjectException in such a case. This semantic differs from native git behaviour and may cause push operations to fail that would work in native git. See: http://dev.eclipse.org/mhonarc/lists/egit-dev/msg03585.html Bug: 445744 Change-Id: Ib7dec10fd2ef1adbb8adbabb9d3d5a64e554286a	9 years ago
Robin Rosenberg	05896dabfc	Drop warnings about unchecked casts in a few stable select places Change-Id: Ie163a4940f0d13bbdefd8c4643c0944c71800544	10 years ago
Saša Živkov	c4797fe986	Let ObjectWalk.markUninteresting also mark the root tree as uninteresting. Using the ObjectWalk and marking a commit as uninteresting didn't mark its root tree as uninteresting. This caused the "missing tree ..." error in Gerrit under special circumstances. For example, if patch-set 2 changes only the commit message, then patch-set 1 and patch-set 2 share the same root-tree (ps1 -> root-tree <- ps2). The transported pack will contain the ps2 commit but not the root-tree object. When using BaseReceivePack.setCheckReferencedObjectsAreReachable, JGit will check the reachability of all referenced objects not provided in the transported pack. Since ps1 was advertised it will properly be marked as uninteresting. However, the root-tree was reachable because ObjectWalk.markUninteresting failed to mark it as uninteresting. JGit was then rejecting the pack with the "missing tree ..." exception. Gerrit-issue: https://code.google.com/p/gerrit/issues/detail?id=1582 Change-Id: Iff2de8810f14ca304e6655fc8debeb8f3e20712b Signed-off-by: Saša Živkov <sasa.zivkov@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Dave Borowitz	7eb0b702fd	Don't set REWRITE flag unless parent rewriting is requested Change-Id: I65e3702ceb6c8854a2c358cfc2c2e3a9fb9486ff	10 years ago
Dave Borowitz	eb69cef35c	Rename RewriteTreeFilter to TreeRevFilter and make it public. The current behavior of passing a TreeFilter to RevWalk has limited usefulness, since the RevFilter derived from the TreeFilter is always ANDed together with any other RevFilters. It is also tied fairly tightly to the parent rewriting mechanism. Make TreeRevFilter a generic RevFilter that matches modified paths against any TreeFilter. This allows for more complex logic like (modified this path OR authored by this person). Leave the rewrite flag logic in this class, since it's closely tied to the parent comparison code, but hidden behind a protected constructor. Change-Id: Ia72ef591a99415e6f340c5f64583a49c91f1b82f	10 years ago
Dave Borowitz	dbf922ce91	RevWalk: Allow disabling parent rewriting. Previously, setting any TreeFilter on a RevWalk triggered parent rewriting, which in the current StartGenerator implementation ends up buffering the entire commit history in memory. Aside from causing poor performance on large histories, this does not match the default behavior of `git rev-list`, which does not rewrite parent SHAs unless asked to via --parents/--children. Add a new method setRewriteParents() to RevWalk to disable this behavior. Continue rewriting parents by default to maintain backwards compatibility. Change-Id: I1f38e05526071c75ca58095e312663de5e6f334d	10 years ago
Robin Rosenberg	32ff57a2b2	Cleanup javadocs so they pass the java8 doclint checks Bug: 431552 Change-Id: I469316f5645205016e1fa6b0fbd2ff3b509b14bc Signed-off-by: Robin Stocker <robin@nibor.org>	10 years ago
Gustaf Lundh	7d5e1f8497	Fixed RevWalk.isMergedInto() returning wrong results. Under certain circumstances isMergedInto() returned false even though base is reachable from the tip. This will hinder pushes and receives by falsely detecting "non fast forward" merges. Consider a history in which one branch forks at commit 2, a second branch containing commit A forks at the later commit 1, and M merges both branches. If M (tip) was compared to 1 (base), the method isMergedInto() could still return false, since two mergeBases will be detected and the return statement will only look at one of them: return next() == base; In most cases this would pass, but if "A" is a commit with an old timestamp, the Generator would walk down to "2" before completing the walk past "A" and finding the other merge base "1". In this case, the first call to next() returns 2, which compared to base evaluates as false. This is fixed by iterating merge bases and returning true if base is found among them. Change-Id: If2ee1f4270f5ea4bee73ecb0e9c933f8234818da Signed-off-by: Gustaf Lundh <gustaf.lundh@sonymobile.com> Signed-off-by: Sven Selberg <sven.selberg@sonymobile.com>	10 years ago
Shawn Pearce	f716ad6d54	Fix StackOverflowError in RevCommit.carryFlags on deep side graphs. Copying flags through a graph with deep side branches can cause StackOverflowError. The recursive step to visit the 2nd parent of a merge commit can overflow the stack if these are themselves very deep histories with many branches. Rewrite the loop to iterate up to 500 recursive steps deep before unwinding the stack and running the remaining parts of the graph using a dynamically allocated FIFORevQueue. This approach still allows simple graphs that mostly merge short lived topic branches into master to copy flags with no dynamic memory allocation, relying only on temporary stack extensions. Larger more complex graphs only pay the allocation penalties if copying has to extend outwards "very far" in the graph, which is unlikely for many coloring based algorithms. Change-Id: I1882e6832c916e27dd5f6b7602d9caf66fb39c84	10 years ago
Robin Stocker	2852b6a07d	Fix RevWalkUtils.findBranchesReachableFrom not finding some branches. The "cut off" optimization causes it to not include branches that contain the specified commit but happen to share commits with a branch that does not contain the commit. An example: B (branch foo) is the parent of A, which is the parent of C (branch master). findBranchesReachableFrom for commit A with both branches as input may not return master (depending on the order of the input). The reason is that A is not contained in foo, and therefore the old code would put B in the cutOff set. When the master commits are then walked and B is checked, it is found in the cutOff set and the walk is aborted, causing master not to be returned even though it should be. Bug: 425674 Change-Id: I2c0c406ce5fcc9a03538b483473af930d4895d30 Signed-off-by: Robin Stocker <robin@nibor.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	10 years ago
Christian Halstrick	8352d1729c	Add a missing since tag. Otherwise you get errors if you want to edit JGit in Eclipse Change-Id: I840d4388f159e2db27845a17030b511fc5708f43	10 years ago
Shawn Pearce	b0174a089c	Fix serving fetch of existing shallow client. In certain cases a JGit server updating an existing shallow client selected a common ancestor that was behind the shallow edge of the client. This allowed the server to assume the client had some objects it did not have and allowed creation of pack deltas the client could never inflate. Any commit the client has advertised as shallow must be treated by the UploadPack server as though it has no parents. With no parents the walker cannot visit graph history the client does not have, and PackWriter cannot consider delta base candidates the client is lacking. Change-Id: I4922b9354df9f490966a586fb693762e897345a2	10 years ago
Robin Rosenberg	ec2202f563	Recognize CRLF when parsing the short message of a commit or tag Bug: 400707 Change-Id: I9b09bb88528af465018fc0278f5441f7e6b75986	11 years ago
Robin Rosenberg	a2b33a8ac3	Add NON-NLS comments for some obviously untranslatable strings Change-Id: I2d1076b46695dac84961b8ae663bfc5cb123b3a3 Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	11 years ago
Robin Rosenberg	b0ffacf122	Recognize CRLF when parsing the short message of a commit or tag Bug: 400707 Change-Id: I9b09bb88528af465018fc0278f5441f7e6b75986	11 years ago
Dave Borowitz	b0326235e1	Remove unused repository field from RevWalk. The comment about legacy Tag and Object types no longer applies, though prior to Idb273d5a92849b42935ac14eed73b796b80aad50 the field was still being used by RewriteTreeFilter. Change-Id: I9ee5da8f8a3b61c9cf543817c03117ee0609dd8f	11 years ago
Dave Borowitz	0bdf030b26	Require a DiffConfig when creating a FollowFilter. The various rename detection options are an inherent part of the filter, similar to the path being followed. This fixes a potential NPE when a RevWalk with a FollowFilter is created without a Repository, since the old code path tried to get the DiffConfig from the RevWalk's possibly-missing repository. Change-Id: Idb273d5a92849b42935ac14eed73b796b80aad50	11 years ago
Robin Stocker	3699ea648e	Document RevTag#getObject() that returned object is unparsed Change-Id: I238d388e40362721eecf37f64ad7d48a399ff129	11 years ago
Tomasz Zarna	48f30b8614	Fix @since tags in JGit, version 2.4 never existed Change-Id: Iaca88ec28b412e6b58e7b39a0762ba54b25f9471 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	11 years ago
Colby Ranger	dafcb8f6db	Support creating pack bitmap indexes in PackWriter. Update the PackWriter to support writing out pack bitmap indexes, a parallel ".bitmap" file to the ".pack" file. Bitmaps are selected at commits every 1 to 5,000 commits for each unique path from the start. The most recent 100 commits are all bitmapped. The next 19,000 commits have a bitmap every 100 commits. The remaining commits have a bitmap every 5,000 commits. Commits with more than 1 parent are preferred over ones with 1 or less. Furthermore, previously computed bitmaps are reused, if the previous entry had the reuse flag set, which is set when the bitmap was placed at the max allowed distance. Bitmaps are used to speed up the counting phase when packing, for requests that are not shallow. The PackWriterBitmapWalker uses a RevFilter to proactively mark commits with RevFlag.SEEN, when they appear in a bitmap. The walker produces the full closure of reachable ObjectIds, given the collection of starting ObjectIds. For fetch requests, two ObjectWalks are executed to compute the ObjectIds reachable from the haves and from the wants. The ObjectIds needed to be written are determined by taking all the resulting wants AND NOT the haves. For clone requests, we get cached pack support for "free" since it is possible to determine if all of the ObjectIds in a pack file are included in the resulting list of ObjectIds to write. 
On my machine, the best times for clones and fetches of the linux kernel repository (with about 2.6M objects and 300K commits) are tabulated below. Operation | Index V2 | Index VE003: Clone | 37530ms (524.06 MiB) | 82ms (524.06 MiB); Fetch (1 commit back) | 75ms | 107ms; Fetch (10 commits back) | 456ms (269.51 KiB) | 341ms (265.19 KiB); Fetch (100 commits back) | 449ms (269.91 KiB) | 337ms (267.28 KiB); Fetch (1000 commits back) | 2229ms (14.75 MiB) | 189ms (14.42 MiB); Fetch (10000 commits back) | 2177ms (16.30 MiB) | 254ms (15.88 MiB); Fetch (100000 commits back) | 14340ms (185.83 MiB) | 1655ms (189.39 MiB). Change-Id: Icdb0cdd66ff168917fb9ef17b96093990cc6a98d	12 years ago
Gustaf Lundh	212fb3071c	Fix while boundaries in DateRevQueue.add(). In add(), "low" will never equal "first". This fact should be reflected in the code. Change-Id: I5cab51374e67bd2d3301e5d9dac47c4259b5e562	11 years ago
Gustaf Lundh	84afea9179	Performance fixes in DateRevQueue. When a lot of commits are added to DateRevQueue, the sort-on-insertion approach is very heavy on CPU cycles. One approach to fix this was made by Dave Borowitz: https://git.eclipse.org/r/#/c/5491/ But using Java's PriorityQueue seems to have brought some extra overhead, and the desired performance could not be reached. This fix takes another approach to the insertion problem, without changing the expected behaviour or bringing extra memory overhead: If we detect over 1000 commits in the DateRevQueue, a "seek-index" is rebuilt every 1000th added commit. The index keeps track of every 100th commit in the DateRevQueue. During insertions, it will be used for a preliminary scanning (binary search) of the queue, with the intention of helping add() find a good starting point to start walking from. After finding this starting point, add() will step commit-by-commit until the correct insertion place in the queue is found (today, the queue is expected to be sorted at all times). When applied to repositories with many refs, this approach has proven to bring huge performance gains and scales quite well. For instance, in a repository with close to 80000 refs, we could cut down the time a typical Gerrit replication of 1 commit would take (just a push from JGit's point of view) from 32sec down to 3.5sec. Below you see some typical times to add a specific amount of commits (with random commit times) to the DateRevQueue and the difference the preliminary seek-index makes. Commits | Index | No Index: 1024 | 8ms | 8ms; 2048 | 13ms | 9ms; 4096 | 5ms | 59ms; 8192 | 11ms | 595ms; 16384 | 22ms | 3058ms; 32768 | 64ms | 13811ms; 65536 | 201ms | 62677ms; 131072 | 783ms | 331585ms. Only one extra reference is needed for every 100 inserted commits (and only when we see more than 1000 commits in the queue), so the memory overhead should be negligible. Various index-stepping values were tested, and 100 seemed to scale very well and be effective from start. 
In the future, it should probably be dynamic and based on the number of refs in the queue, but this should serve well as a starting point. Note: While other fundamentally different data structures may be more suitable, the DateRevQueue is extremely central to many of the Git core operations. This approach was chosen, since the effect of the patch is easy to predict in conjunction with the current implementation. A totally new data structure will make it harder to predict behaviour in many common and uncommon cases (in terms of breaking ties, memory usage, cost when using few elements, object creation/disposing overhead, etc). Change-Id: Ie7b99f40eacf6324bfb4716d82073adeda64d10f	11 years ago
Robin Rosenberg	c310fa0c80	Mark non-externalizable strings as such. A few classes such as Constants are marked with @SuppressWarnings, as are toString() methods with many literals, but otherwise $NLS-n$ is used for strings containing text that should not be translated. A few literals may fall into the gray zone, but mostly I've tried to only tag the obvious ones. Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590	11 years ago
Marc Strapetz	67edd3eda7	RevWalk support for shallow clones. StartGenerator now processes .git/shallow to have the RevWalk stop for shallow commits. See RevWalkShallowTest for tests. Bug: 394543 CQ: 6908 Change-Id: Ia5af1dab3fe9c7888f44eeecab1e1bcf2e8e48fe Signed-off-by: Chris Aniszczyk <zx@twitter.com>	11 years ago
Robin Stocker	b7f5ed5612	Add Javadoc description for packages. These appear as descriptions in the index, see here (currently empty): http://download.eclipse.org/jgit/docs/latest/apidocs/ Change-Id: If7996deef30ae688bade8b3ad6b19547ca3d8b50 Signed-off-by: Chris Aniszczyk <zx@twitter.com>	11 years ago
Robin Stocker	c8ed4a5006	RevWalk: Add link to parseHeaders/parseBody in Javadoc of lookupCommit Change-Id: I7765d1a69d19968ebad603025a9c686f17633ebd	11 years ago
Robin Rosenberg	833fd09442	Ignore non-commit refs in RevWalkUtils.findBranchesReachableFrom. This method is for finding branches only. Change-Id: Ic68b5295ff814401890f0592ae95851554706ca6	11 years ago
Robin Rosenberg	d4fed9cb4b	Refactored method to find branches from which a commit is reachable. The method uses some heuristics to obtain much better performance than isMergeBase. Since I wrote the relevant code in the method I approve the license change from EPL to EDL implied by the move. Change-Id: Ic4a7584811a2b0bf24e4f6b3eab2a7c022eabee8 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	12 years ago
Tomasz Zarna	2656ac1b5a	Add "--squash" option to MergeCommand CQ: 6570 Bug: 351806 Change-Id: I5e47810376419264ecf4247b5a333af5c8945080 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	59d2ef9470	Enable loading history until a given commit. This is needed to allow jumping to a selected commit when loading history incrementally. Change-Id: Id3b97d88d3b4b2d67561b11f8810cb88fe040823 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Matthias Sohn	b14aa4df99	Enable loading history until a given commit. This is needed to allow jumping to a selected commit when loading history incrementally. Change-Id: Id3b97d88d3b4b2d67561b11f8810cb88fe040823 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Kevin Sawicki	17fb542e9e	Remove 86 boxing warnings. Use Integer, Character, and Long valueOf methods when passing parameters to MessageFormat and other places that expect objects instead of primitives Change-Id: I5942fbdbca6a378136c00d951ce61167f2366ca4	12 years ago
Tomasz Zarna	c75aa1aed2	LogCommand#setMaxCount affects all commits Bug: 370132 Change-Id: I9f5ff3640a4f69c0b48c97609728d7672e63e6ab Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Chris Aniszczyk <zx@twitter.com>	12 years ago
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 years ago
Robin Stocker	69d7d1b0b6	Add RevWalkUtils with count(start, end) method. It returns the number of commits that are in start and not in end. Useful for calculating how much a branch is ahead of another one. Change-Id: I09f7d9b049beea417da7ff32c9f8bf0d4ed46a7f Signed-off-by: Robin Stocker <robin@nibor.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Tomasz Zarna	1a2ca5b811	Skip a number of commits before starting to show the commit output Change-Id: Id2666d897d29b6371f7a6cf241cfda02964b4971 Signed-off-by: Kevin Sawicki <kevin@github.com>	12 years ago
Tomasz Zarna	248959146a	Limit the number of commits in LogCommand output Bug: 316680 Change-Id: I88cf7aac6b5763cc94421433dd4bbd42f81e0e69	12 years ago
Carsten Pfeiffer	98d4bd6d36	Allow detecting which files were renamed during a revwalk. The egit history view shows the files associated with a commit by using a PathFilter. When following renames with a FollowFilter, the PathFilter cannot be configured anymore because the affected files are simply not known. Thus, it should be possible to find out which files were renamed. Bug: 302549 Change-Id: I4761e9f5cfb4f0ef0b0e1e38991401a1d5003bea	12 years ago
Kevin Sawicki	86e96b41e2	Correct typo in RevWalk.parseBody comment Change-Id: I0e65a5a6809a8d32d256322dbcae94b6aa603e5e Signed-off-by: Kevin Sawicki <kevin@github.com>	12 years ago
Matt Fischer	9952223e06	Implement server support for shallow clones. This implements the server side of shallow clones only (i.e. git-upload-pack), not the client side. CQ: 5517 Bug: 301627 Change-Id: Ied5f501f9c8d1fe90ab2ba44fac5fa67ed0035a4 Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	14 years ago
Shawn O. Pearce	53db854185	Speed up ObjectWalk by 6235 objects/sec. The "Counting objects" phase of packing is the most time consuming part for any server providing access to Git repositories. Scanning through the entire project history, including every revision of every tree that has ever existed is expensive and takes an incredible amount of CPU time. Inline the tree parsing logic, unroll a number of loops, and set up to better handle the common case of seeing another occurrence of an object that was already marked SEEN. This change boosts the "Counting objects" phase when JGit is acting as a server and is packing the linux-2.6 repository for its client. Compared to CGit on the same hardware, a JGit daemon server is now 21883 objects/sec faster. CGit: Counted 2058062 objects in 38981 ms at 52796.54 objects/sec; Counted 2058062 objects in 38920 ms at 52879.29 objects/sec; Counted 2058062 objects in 39059 ms at 52691.11 objects/sec. JGit (before): Counted 2058062 objects in 31529 ms at 65275.21 objects/sec; Counted 2058062 objects in 30359 ms at 67790.84 objects/sec; Counted 2058062 objects in 30033 ms at 68526.69 objects/sec. JGit (this commit): Counted 2058062 objects in 28726 ms at 71644.57 objects/sec; Counted 2058062 objects in 27652 ms at 74427.24 objects/sec; Counted 2058062 objects in 27528 ms at 74762.50 objects/sec. Above the first run was a "cold server". For JGit the JVM had just started up with `jgit daemon`, and for CGit we hadn't touched the repository "recently" (but it was certainly in kernel buffer cache). The second and third runs were against the running JGit JVM, allowing timing tests to better reflect the benefits of JGit's pack and index caching, as well as any optimizations the JIT may have performed. The timings are fair. CGit is opening, checking and mmap'ing both the pack and index during the timer. JGit is opening, checking and malloc+read'ing the pack and index data into its Java heap during the timer. 
Both processes are walking the same graph space, and are computing the "path hash" necessary to sort objects in the object table for delta compression. Since this commit only impacts the "Counting objects" phase, delta compression was obviously not included in the timings and JGit may still be performing delta compression slower than CGit, resulting in an overall slower server experience for clients. Change-Id: Ieb184bfaed8475d6960a494b1f3c870e0382164a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	bd970007be	ObjectIdOwnerMap: More lightweight map for ObjectIds. OwnerMap is about 200 ms faster than SubclassMap, more friendly to the GC, and uses less storage. Testing the "Counting objects" part of PackWriter on 1886362 objects: ObjectIdSubclassMap: load factor 50%; table: 4194304 (wasted 2307942); ms spent: 36998 36009 34795 34703 34941 35070 34284 34511 34638 34256; ms avg: 34800 (last 9 runs). ObjectIdOwnerMap: load factor 100%; table: 2097152 (wasted 210790); directory: 1024; ms spent: 36842 35112 34922 34703 34580 34782 34165 34662 34314 34140; ms avg: 34597 (last 9 runs). The major difference with OwnerMap is entries must extend from ObjectIdOwnerMap.Entry, where the OwnerMap has injected its own private "next" field into each object. This allows the OwnerMap to use a singly linked list for chaining collisions within a bucket. By putting collisions in a linked list, we gain the entire table back for the SHA-1 bits to index their own "private" slot. Unfortunately this means that each object can appear in at most ONE OwnerMap, as there is only one "next" field within the object instance to thread into the map. For types that are very object map heavy like RevWalk (entity RevObject) and PackWriter (entity ObjectToPack) this is sufficient; these entity types are only put into one map by their container. By introducing a new map type, we don't break existing applications that might be trying to use ObjectIdSubclassMap to track RevCommits they obtained from a RevWalk. The OwnerMap uses less memory. Each object uses 1 reference more (so we're up 1,886,362 references), but the table is 1/2 the size (2^20 rather than 2^21). The table itself wastes only 210,790 slots, rather than 2,307,942. So OwnerMap is wasting 200k fewer references. OwnerMap is more friendly to the GC, because it hardly ever generates garbage. As the map reaches its 100% load factor target, it doubles in size by allocating additional segment arrays of 2048 entries. 
(So the first grow allocates 1 segment, second 2 segments, third 4 segments, etc.) These segments are hooked into the pre-allocated directory of 1024 spaces. This permits the map to grow to 2 million objects before the directory itself has to grow. By using segments of 2048 entries, we are asking the GC to acquire 8,204 bytes in a 32 bit JVM. This is easier to satisfy than 2,307,942 bytes (for the 512k table that is just an intermediate step in the SubclassMap). By reusing the previously allocated segments (they are re-hashed in-place) we don't release any memory during a table grow. When the directory grows, it does so by discarding the old one and using one that is 4x larger (so the directory goes to 4096 entries on its first grow). A directory of size 4096 can handle up to 8 million objects. The second directory grow (16384) goes to 33 million objects. At that point we're starting to really push the limits of the JVM heap, but at least it's many small arrays. Previously SubclassMap would need a table of 67108864 entries to handle that object count, which needs a single contiguous allocation of 256 MiB. That's hard to come by in a 32 bit JVM. Instead OwnerMap uses 8192 arrays of about 8 KiB each. This is much easier to fit into a fragmented heap. Change-Id: Ia4acf5cfbf7e9b71bc7faa0db9060f6a969c0c50 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	e757975fcd	RevWalk: Don't release during inMergeBase(). In `bc1af8459e` ("RevWalk: Don't reset ObjectReader when stopping") we stopped releasing the reader when the current log traversal is over. This should have also been applied to the merge base logic that is buried within MergeGenerator, but got missed. Change-Id: I8328f43f02cba06fd545e22134872e781b9d4d36 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	24c1c530db	RevWalk: Avoid unnecessary re-parsing of commit bodies. If the RevFilter doesn't actually require the commit body, we shouldn't reparse it if the body was disposed. This happens often inside of UploadPack during common ancestor negotiation: the RevWalk is reset and re-run over roughly the same commit space, but the bodies are discarded because the commit message is not relevant to the process. Change-Id: I87b6b6a5fb269669867047698abf718d366bd002 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	bc1af8459e	RevWalk: Don't reset ObjectReader when stopping. Applications like UploadPack reset() and reuse the same RevWalk multiple times in very rapid succession. Releasing the ObjectReader's internal state on each use, only to allocate it again on the next cycle, kills performance if the ObjectReader has internal caches, or even if the Inflater gets returned and pulled from the InflaterCache too frequently. Make releasing the ObjectReader the application's responsibility when it is done with the RevWalk, which most already do by wrapping their loop in a try/finally block. Change-Id: I3ad188a719e8d7f6bf27d1a7ca16d465534713f4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	5664fb3bfb	UploadPack: Donate parsed commits to PackWriter. When UploadPack has computed the merge base between the client's have set and the want set, it's already loaded and parsed all of the interesting commits that PackWriter needs to transmit to the client. Switching the RevWalk and its object pool over to be an ObjectWalk saves PackWriter from needing to re-parse these same commits from the ObjectDatabase, reducing the startup latency for the enumeration phase of packing. UploadPack doesn't want to use an ObjectWalk for the okToGiveUp() tests because it's slower: during each commit popped it needs to cache the tree into the pendingObjects list, and during each reset() it discards a bunch of ObjectWalk specific state and reallocates some internal collections. ObjectWalk was never meant to be rapidly reset() like UploadPack does, so it's perhaps somewhat cleaner to allow "upgrading" a RevWalk to an ObjectWalk. Bug: 301639 Change-Id: I97ef52a0b79d78229c272880aedb7f74d0f7532f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	d6b7139cd8	UploadPack: Avoid walking the entire project history. If the client presents a common commit on a side branch, and there is a want for a disconnected branch, UploadPack was walking back on the entire history of the disconnected branch because it never would find the common commit. Limit our search back along any given want to be no earlier than the oldest common commit received via a "have" line from our client. This prevents us from looking at all of the project history. Bug: 301639 Change-Id: Iffaaa2250907150d6efa1cf2f2fcf59851d5267d Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Shawn O. Pearce	13bcf05a9e	PackWriter: Make thin packs more efficient. There is no point in pushing all of the files within the edge commits into the delta search when making a thin pack. This floods the delta search window with objects that are unlikely to be useful bases for the objects that will be written out, resulting in lower data compression and higher transfer sizes. Instead observe the path of a tree or blob that is being pushed into the outgoing set, and use that path to locate up to WINDOW ancestor versions from the edge commits. Push only those objects into the edgeObjects set, reducing the number of objects seen by the search window. This allows PackWriter to only look at ancestors for the modified files, rather than all files in the project. Limiting the search to WINDOW size makes sense, because more than WINDOW edge objects will just skip through the window search as none of them need to be delta compressed. To further improve compression, sort edge objects into the front of the window list, rather than randomly throughout. This puts non-edges later in the window and gives them a better chance at finding their base, since they search backwards through the window. These changes make a significant difference in the thin-pack. Before: remote: Counting objects: 144190, done; remote: Finding sources: 100% (50275/50275); remote: Getting sizes: 100% (101405/101405); remote: Compressing objects: 100% (7587/7587); Receiving objects: 100% (50275/50275), 24.67 MiB | 9.90 MiB/s, done; Resolving deltas: 100% (40339/40339), completed with 2218 local objects; real 0m30.267s. After: remote: Counting objects: 61549, done; remote: Finding sources: 100% (50275/50275); remote: Getting sizes: 100% (18862/18862); remote: Compressing objects: 100% (7588/7588); Receiving objects: 100% (50275/50275), 11.04 MiB | 3.51 MiB/s, done; Resolving deltas: 100% (43160/43160), completed with 5014 local objects; 
real 0m22.170s. The resulting pack is 13.63 MiB smaller, even though it contains the same exact objects. 82,543 fewer objects had to have their sizes looked up, which saved about 8s of server CPU time. 2,796 more objects from the client were used as part of the base object set, which contributed to the smaller transfer size. Change-Id: Id01271950432c6960897495b09deab70e33993a9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Shawn O. Pearce	c2ab3421a2	ObjectWalk: Fix reset for non-commit objects. Non-commits are added to a pending queue, but duplicates are removed by checking a flag. During a reset that flag must be stripped off the old roots, otherwise the caller cannot reuse the old roots after the reset. RevWalk already does this correctly for commits, but ObjectWalk failed to handle the non-commit case itself. Change-Id: I99e1832bf204eac5a424fdb04f327792e8cded4a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	065a0a8122	Revert "Teach PackWriter how to reuse an existing object list". This reverts commit `f5fe2dca3c`. I regret adding this feature to the public API. Caches aren't always the best idea, as they require work to maintain. Here the cache is redundant information that must be computed, and when it grows stale must be removed. The redundant information takes up more disk space, about the same size as the pack-*.idx files are. For the linux-2.6 repository, that's more than 40 MB for a 400 MB repository. So the cache is a 10% increase in disk usage. The entire point of this cache is to improve PackWriter performance, and only PackWriter performance, and only when sending an initial clone to a new client. There may be better ways to optimize this, and until we have a solid solution, we shouldn't be using a separate cache in JGit.	13 years ago
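
The fetch computation described in commit dafcb8f6db (the objects to write are "all the resulting wants AND NOT the haves") can be illustrated with a minimal sketch. It assumes every reachable object is identified by a bit position and uses java.util.BitSet as a stand-in for JGit's own bitmap types; the class name BitmapDiffSketch and the method objectsToSend are hypothetical and are not part of JGit.

```java
import java.util.BitSet;

// Hypothetical illustration only: BitSet stands in for a pack bitmap,
// and bit i means "object number i is reachable".
public class BitmapDiffSketch {

    /** Returns the objects reachable from the wants but not from the haves. */
    static BitSet objectsToSend(BitSet wants, BitSet haves) {
        BitSet result = (BitSet) wants.clone(); // copy so the inputs stay untouched
        result.andNot(haves);                   // wants AND NOT haves
        return result;
    }

    public static void main(String[] args) {
        BitSet wants = new BitSet();
        wants.set(0, 6);   // objects 0..5 are reachable from the wants
        BitSet haves = new BitSet();
        haves.set(0, 3);   // objects 0..2 are already on the client
        System.out.println(objectsToSend(wants, haves));
    }
}
```

With bitmaps, the set difference is a single word-wise pass over the two bit arrays, which is presumably why the counting phase in the commit's timings no longer scales with the full object walk.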

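The isMergedInto() fix in commit 7d5e1f8497 replaces a check of only the first merge base with a scan over all of them. The sketch below models the walk as a plain iterator of commit names; MergedIntoSketch, firstOnly and anyMatch are hypothetical names chosen for illustration and are not JGit API.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only: the iterator plays the role of the
// merge-base walk, yielding merge bases in whatever order the walk finds them.
public class MergedIntoSketch {

    /** Old behaviour: only the first merge base is compared to base. */
    static <T> boolean firstOnly(Iterator<T> mergeBases, T base) {
        return mergeBases.hasNext() && mergeBases.next().equals(base);
    }

    /** Fixed behaviour: true if base appears anywhere among the merge bases. */
    static <T> boolean anyMatch(Iterator<T> mergeBases, T base) {
        while (mergeBases.hasNext()) {
            if (mergeBases.next().equals(base)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The walk happens to emit "2" before "1", as in the commit message.
        List<String> bases = Arrays.asList("2", "1");
        System.out.println(firstOnly(bases.iterator(), "1"));
        System.out.println(anyMatch(bases.iterator(), "1"));
    }
}
```

The scan costs at most one extra step per additional merge base, so the fix trades a negligible amount of work for a result that no longer depends on commit timestamps.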

86 Commits (stable-3.5)