Ivan Frade [Tue, 4 Jan 2022 19:10:04 +0000 (11:10 -0800)]
PackConfig: add entry for minimum size to index
The object size index can have up to #(blobs-in-repo) entries, taking
a relevant amount of memory. Let operators configure the threshold size
to include objects in the size index.
The index will include objects with size *at or above* this
value (with -1 for none). This is more effective for the
filter-by-size case.
Lowering the threshold adds more objects to the index. This improves
performance at the cost of memory/storage space. For the object-size
case, more calls will use the index instead of reading IO. For the
filter-by-size case, lower threshold means better granularity (if
ObjectReader#isSmallerThan is implemented based only on the index).
Ivan Frade [Wed, 15 Dec 2021 00:36:54 +0000 (16:36 -0800)]
PackObjectSizeIndex: interface and impl for the object-size index
Operations like "clone --filter=blob:limit=N" or the "object-info"
command need to read the size of the objects from the storage. An
index would provide those sizes at once rather than having to seek in
the packfile.
Introduce an interface for the Object-size index. This index returns
the inflated size of an object. Not all objects could be indexed (to
limit memory usage).
This implementation indexes only blobs (no trees, nor
commits) *above* certain size threshold (configurable). Lower
threshold adds more objects to the index, consumes more memory and
provides better performance. 0 means "all blobs" and -1 "disabled".
If we don't index everything, for the filter use case is more
efficient to index the biggest objects first: the set is small and
most objects are filtered by NOT being in the index. For the
object-size, the more objects in the index the better, regardless
their size. All together, it is more helpful to index above threshold.
Ivan Frade [Fri, 10 Feb 2023 20:21:01 +0000 (12:21 -0800)]
UInt24Array: Array of unsigned ints encoded in 3 bytes.
The object size index stores positions of objects in the main
index (when ordered by sha1). These positions are per-pack and usually
a pack has <16 million objects (there are exceptions but rather
rare). It could save some memory storing these positions in three bytes
instead of four. Note that these positions are sorted and always positive.
Implement a wrapper around a byte[] to access and search "ints" while
they are stored as unsigned 3 bytes.
Ivan Frade [Tue, 17 Jan 2023 18:01:29 +0000 (10:01 -0800)]
PackIndex: expose the position of an object-id in the index
The primary index returns the offset in the pack for an
objectId. Internally it keeps the object-ids in lexicographical order,
but doesn't expose an API to find the position of an object-id in that
list. This is needed for the object-size index, that we want to store
as "position-in-idx, size".
Add a #findPosition(object-id) method to the PackIndex interface to
know where an object-id sits in the ordered list of ids in the pack.
Note that this index position is over the list of ordered object-ids,
while reverse-index position is over the list of objects in packed
order.
Xing Huang [Mon, 6 Feb 2023 20:18:59 +0000 (14:18 -0600)]
DfsPackFile/DfsGC: Write commit graphs and expose in pack
JGit knows how to read/write commit graphs but the DFS stack is not
using it yet.
The DFS garbage collector generates a commit-graph with commits
reachable from any ref. The pack is stored as extra stream in the GC
pack. DfsPackFile mimicks how other indices are loaded storing the
reference in DFS cache.
Xing Huang [Mon, 6 Feb 2023 20:18:16 +0000 (14:18 -0600)]
ObjectReader: Allow getCommitGraph to throw IOException
ObjectReader#getCommitGraph doesn't report errors loading the
commit graph. The caller should be aware of the situation and
ultimately decide what to do.
Add IOException to ObjectReader#getCommitGraph signature. RevWalk
defaults to an empty commit-graph on IO errors.
This change only affects inCore repositories.
Before this change, any file that wasn't part of the patch
wasn't read, and therefore wasn't part of the output tree.
Change-Id: I246ef957088f17aaf367143f7a0b3af0f8264ffb
Bug: Google b/267270348
Matthias Sohn [Wed, 1 Feb 2023 00:31:32 +0000 (01:31 +0100)]
Merge branch 'master' into stable-6.5
* master:
Shortcut during git fetch for avoiding looping through all local refs
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
Silence API warnings introduced by I466dcde6
Allow the exclusions of refs prefixes from bitmap
PackWriterBitmapPreparer: do not include annotated tags in bitmap
BatchingProgressMonitor: avoid int overflow when computing percentage
[pgm] Fetch-CLI: add support for shallow
Speedup GC listing objects referenced from reflogs
Re-add servlet-api 4.0 to the target platform
Upgrade maven plugins
Cache trustFolderStat/trustPackedRefsStat value per-instance
Refresh 'objects' dir and retry if a loose object is not found
FileSnapshotTest: Add more MISSING_FILE coverage
Matthias Sohn [Wed, 1 Feb 2023 00:21:11 +0000 (01:21 +0100)]
Merge branch 'stable-6.4'
* stable-6.4:
Shortcut during git fetch for avoiding looping through all local refs
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
Silence API warnings introduced by I466dcde6
Allow the exclusions of refs prefixes from bitmap
PackWriterBitmapPreparer: do not include annotated tags in bitmap
BatchingProgressMonitor: avoid int overflow when computing percentage
Speedup GC listing objects referenced from reflogs
FileSnapshotTest: Add more MISSING_FILE coverage
Matthias Sohn [Wed, 1 Feb 2023 00:12:06 +0000 (01:12 +0100)]
Merge branch 'stable-6.3' into stable-6.4
* stable-6.3:
Shortcut during git fetch for avoiding looping through all local refs
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
Silence API warnings introduced by I466dcde6
Allow the exclusions of refs prefixes from bitmap
PackWriterBitmapPreparer: do not include annotated tags in bitmap
BatchingProgressMonitor: avoid int overflow when computing percentage
Speedup GC listing objects referenced from reflogs
FileSnapshotTest: Add more MISSING_FILE coverage
Matthias Sohn [Tue, 31 Jan 2023 23:59:32 +0000 (00:59 +0100)]
Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
Shortcut during git fetch for avoiding looping through all local refs
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
Silence API warnings introduced by I466dcde6
Allow the exclusions of refs prefixes from bitmap
PackWriterBitmapPreparer: do not include annotated tags in bitmap
BatchingProgressMonitor: avoid int overflow when computing percentage
Speedup GC listing objects referenced from reflogs
FileSnapshotTest: Add more MISSING_FILE coverage
Matthias Sohn [Tue, 31 Jan 2023 23:44:41 +0000 (00:44 +0100)]
Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
Shortcut during git fetch for avoiding looping through all local refs
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
Silence API warnings introduced by I466dcde6
Allow the exclusions of refs prefixes from bitmap
PackWriterBitmapPreparer: do not include annotated tags in bitmap
BatchingProgressMonitor: avoid int overflow when computing percentage
Speedup GC listing objects referenced from reflogs
FileSnapshotTest: Add more MISSING_FILE coverage
Matthias Sohn [Tue, 31 Jan 2023 23:38:52 +0000 (00:38 +0100)]
Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
Shortcut during git fetch for avoiding looping through all local refs
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
Silence API warnings introduced by I466dcde6
Allow the exclusions of refs prefixes from bitmap
PackWriterBitmapPreparer: do not include annotated tags in bitmap
BatchingProgressMonitor: avoid int overflow when computing percentage
Speedup GC listing objects referenced from reflogs
FileSnapshotTest: Add more MISSING_FILE coverage
Matthias Sohn [Tue, 31 Jan 2023 23:30:52 +0000 (00:30 +0100)]
Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
Shortcut during git fetch for avoiding looping through all local refs
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
Silence API warnings introduced by I466dcde6
Allow the exclusions of refs prefixes from bitmap
PackWriterBitmapPreparer: do not include annotated tags in bitmap
BatchingProgressMonitor: avoid int overflow when computing percentage
Speedup GC listing objects referenced from reflogs
FileSnapshotTest: Add more MISSING_FILE coverage
Luca Milanesio [Wed, 18 May 2022 12:31:30 +0000 (13:31 +0100)]
Shortcut during git fetch for avoiding looping through all local refs
The FetchProcess needs to verify that all the refs received point
to objects that are reachable from the local refs, which could be
very expensive but is needed to avoid missing objects exceptions
because of broken chains.
When the local repository has a lot of refs (e.g. millions) and the
client is fetching a non-commit object (e.g. refs/sequences/changes in
Gerrit) the reachability check on all local refs can be very expensive
compared to the time to fetch the remote ref.
Example for a 2M refs repository:
- fetching a single non-commit object: 50ms
- checking the reachability of local refs: 30s
A ref pointing to a non-commit object doesn't have any parent or
successor objects, hence would never need to have a reachability check
done. Skipping the askForIsComplete() altogether would save the 30s
time spent in an unnecessary phase.
Matthias Sohn [Tue, 4 Oct 2022 13:42:25 +0000 (15:42 +0200)]
FetchCommand: fix fetchSubmodules to work on a Ref to a blob
FetchCommand#fetchSubmodules assumed that FETCH_HEAD can always be
parsed as a tree. This isn't true if it refers to a Ref referring to a
BLOB. This is e.g. used in Gerrit for Refs like refs/sequences/changes
which are used to implement sequences stored in git.
Luca Milanesio [Tue, 20 Dec 2022 21:50:19 +0000 (21:50 +0000)]
Allow the exclusions of refs prefixes from bitmap
When running a GC.repack() against a repository with over one
thousands of refs/heads and tens of millions of ObjectIds,
the calculation of all bitmaps associated with all the refs
would result in an unreasonable big file that would take up to
several hours to compute.
Test scenario: repo with 2500 heads / 10M obj Intel Xeon E5-2680 2.5GHz
Before this change: 20 mins
After this change and 2300 heads excluded: 10 mins (90s for bitmap)
Having such a large bitmap file is also slow in the runtime
processing and have negligible or even negative benefits, because
the time lost in reading and decompressing the bitmap in memory
would not be compensated by the time saved by using it.
It is key to preserve the bitmaps for those refs that are mostly
used in clone/fetch and give the ability to exlude some refs
prefixes that are known to be less frequently accessed, even
though they may actually be actively written.
Example: Gerrit sandbox branches may even be actively
used and selected automatically because its commits are very
recent, however, they may bloat the bitmap, making it ineffective.
A mono-repo with tens of thousands of developers may have
a relatively small number of active branches where the
CI/CD jobs are continuously fetching/cloning the code. However,
because Gerrit allows the use of sandbox branches, the
total number of refs/heads may be even tens to hundred
thousands.
Luca Milanesio [Wed, 28 Dec 2022 01:09:52 +0000 (01:09 +0000)]
PackWriterBitmapPreparer: do not include annotated tags in bitmap
The annotated tags should be excluded from the bitmap associated
with the heads-only packfile. However, this was not happening
because of the check of exclusion of the peeled object instead
of the objectId to be excluded from the bitmap.
When creating a bitmap for the above commit graph, before this
change all the commits are included (3 bitmaps), which is
incorrect, because all commits reachable from annotated tags
should not be included.
The heads-only bitmap should include only commit0 and commit1
but because PackWriterBitPreparer was checking for the peeled
pointer of tag1 to be excluded (commit2) which was not found in
the list of tags to exclude (annotated-tag1), the commit2 was
included, even if it wasn't reachable only from the head.
Add an additional check for exclusion of the original objectId
for allowing the exclusion of annotated tags and their pointed
commits. Add one specific test associated with an annotated tag
for making sure that this use-case is covered also.
Example repository benchmark for measuring the improvement:
# refs: 400k (2k heads, 88k tags, 310k changes)
# objects: 11M (88k of them are annotate tags)
# packfiles: 2.7G
Before this change:
GC time: 5h
clone --bare time: 7 mins
After this change:
GC time: 20 mins
clone --bare time: 3 mins
Harald Weiner [Mon, 24 Oct 2022 21:07:27 +0000 (23:07 +0200)]
[pgm] Fetch-CLI: add support for shallow
This adds support for shallow cloning. The CloneCommand and the
FetchCommand now have the new options --depth, --shallow-since and
--shallow-exclude to tell the server that the client doesn't want to
download the complete history.
Matthias Sohn [Wed, 18 Jan 2023 16:39:19 +0000 (17:39 +0100)]
Speedup GC listing objects referenced from reflogs
GC needs to get a ReflogReader for all existing refs to list all objects
referenced from reflogs. The existing Repository#getReflogReader method
accepts the ref name and then resolves the Ref to create a ReflogReader.
GC calling that for a huge number of Refs one by one is very slow. GC
first gets all Refs in bulk and then calls getReflogReader for each of
them.
Fix this by adding another getReflogReader method to Repository which
accepts a Ref directly.
This speeds up running JGit gc on a mirror clone of the Gerrit
repository from 15:36 min to 1:08 min. The repository used in this test
had 45k refs, 275k commits and 1.2m git objects.
Michael Keppler [Fri, 6 Jan 2023 17:33:00 +0000 (18:33 +0100)]
Upgrade maven plugins
Remove tycho-extras-version, because Tycho and Tycho Extras are
meanwhile in a single repository and maintained together.
Update
- build-helper-maven-plugin to 3.3.0
- eclipse-jarsigner-plugin to 1.3.5
- jacoco-maven-plugin to 0.8.8
- japicmp to 0.17.1
- maven-antrun-plugin to 3.1.0
- maven-clean-plugin to 3.2.0
- maven-compiler-plugin to 3.10.1
- maven-dependency-plugin to 3.5.0
- maven-deploy-plugin to 3.0.0
- maven-enforcer-plugin to 3.1.0
- maven-install-plugin to 3.1.0
- maven-jar-plugin to 3.3.0
- maven-javadoc-plugin to 3.4.1
- maven-jxr-plugin to 3.3.0
- maven-pmd-plugin to 3.20.0
- maven-project-info-reports-plugin to 3.4.2
- maven-resources-plugin to 3.3.0
- maven-shade-plugin to 3.4.1
- maven-site-plugin to 4.0.0-M4
- maven-surefire-plugin to 3.0.0-M8
- spotbugs-maven-plugin to 4.7.3.0
- spring-boot-maven-plugin to 2.7.7
Nasser Grainawi [Tue, 10 Jan 2023 23:15:42 +0000 (16:15 -0700)]
Cache trustFolderStat/trustPackedRefsStat value per-instance
Instead of re-reading the config every time the methods using these
values were called, cache the config value at the time of instance
construction. Caching the values improves performance for each of the
method calls. These configs are set based on the filesystem storing the
repository and unlikely to change while an application is running.
Refresh 'objects' dir and retry if a loose object is not found
A new loose object may not be immediately visible on a NFS
client if it was created on another client. Refreshing the
'objects' dir and trying again can help work around the NFS
behavior.
Here's an E2E problem that this change can help fix. Consider
a Gerrit multi-primary setup with repositories based on NFS.
Add a new patch-set to an existing change and then immediately
fetch the new patch-set of that change. If the fetch is handled
by a Gerrit primary different that the one which created the
patch-set, then we sometimes run into a MissingObjectException
that causes the fetch to fail.
Matthias Sohn [Wed, 11 Jan 2023 14:46:02 +0000 (15:46 +0100)]
Update Orbit to S20230101190934
and update
- com.google.gson to 2.10.0.v20221207-1049"
- org.apache.commons.compress to 1.22.0.v20221207-1049
- org.apache.httpcomponents.httpclient to 4.5.14.v20221207-1049
- org.apache.httpcomponents.httpcore to 4.4.16.v20221207-1049
RevWalk: integrate commit-graph with commit parsing
RevWalk#createCommit() will inspect the commit-graph file to find the
specified object's graph position and then return a new RevCommitCG
instance.
RevCommitGC is a RevCommit with an additional "pointer" (the position)
to the commit-graph, so it can load the headers and metadata from there
instead of the pack. This saves IO access in walks where the body is not
needed (i.e. #isRetainBody is false and #parseBody is not invoked).
RevWalk uses automatically the commit-graph if available, no action
needed from callers. The commit-graph is fetched on first access from
the reader (that internally can keep it loaded and reuse it between
walks).
The startup cost of reading the entire commit graph is small. After
testing, reading a commit-graph with 1 million commits takes less than
50ms. If we use RepositoryCache, it will not be initialized util the
commit-graph is rewritten.
kylezhao [Fri, 6 Jan 2023 07:24:14 +0000 (15:24 +0800)]
GC: disable writing commit-graph for shallow repos
In shallow repos, GC writes to the commit-graph that shallow commits
do not have parents. This won't be true after a "git fetch --unshallow"
(and before another GC).
Do not write the commit-graph from shallow clones of a repo. The
commit-graph must have the real metadata of commits and that is not
available in a shallow view of the repo.
Currently, we always read packed-refs file when 'trustFolderStat'
is false. Introduce a new config 'trustPackedRefsStat' which takes
precedence over 'trustFolderStat' when reading packed refs. Possible
values for this new config are:
* always: Trust packed-refs file attributes
* after_open: Same as 'always', but refresh the file attributes of
packed-refs before trusting it
* never: Always read the packed-refs file
* unset: Fallback to 'trustFolderStat' to determine if the file
attributes of packed-refs can be trusted
Folks whose repositories are on NFS and have traditionally been
setting 'trustFolderStat=false' can now get some performance improvement
with 'trustPackedRefsStat=after_open' as it refreshes the file
attributes of packed-refs (at least on some NFS clients) before
considering it.
For example, consider a repository on NFS with ~500k packed-refs. Here
are some stats which illustrate the improvement with this new config
when reading packed refs on NFS:
Matthias Sohn [Tue, 5 Oct 2021 02:08:49 +0000 (04:08 +0200)]
Add TernarySearchTree
A ternary search tree is a type of tree where nodes are arranged in a
manner similar to a binary search tree, but with up to three children
rather than the binary tree's limit of two.
Each node of a ternary search tree stores a single character, a
reference to a value object and references to its three children named
equal kid, lo kid and hi kid. The lo kid pointer must point to a node
whose character value is less than the current node. The hi kid pointer
must point to a node whose character is greater than the current
node.[1] The equal kid points to the next character in the word. Each
node in a ternary search tree represents a prefix of the stored strings.
All strings in the middle subtree of a node start with that prefix.
Like other prefix trees, a ternary search tree can be used as an
associative map with the ability for incremental string search. Ternary
search trees are more space efficient compared to standard prefix trees,
at the cost of speed.
They allow efficient prefix search which is important to implement
searching refs by prefix in a RefDatabase.
Searching by prefix returns all keys if the prefix is an empty string.
Thomas Wolf [Fri, 16 Dec 2022 22:23:03 +0000 (23:23 +0100)]
PatchApplier: fix handling of last newline in text patch
If the last line came from the patch, use the patch to determine whether
or not there should be a trailing newline. Otherwise use the old text.
Add test cases for
- no newline at end, last line not in patch hunk
- no newline at end, last line in patch hunk
- patch removing the last newline
- patch adding a newline at the end of file not having one
all for core.autocrlf false, true, and input.
Add a test case where the "no newline" indicator line is not the last
line of the last hunk. This can happen if the patch ends with removals
at the file end.
Bug: 581234
Change-Id: I09d079b51479b89400ad300d0662c1dcb50deab6 Also-by: Yuriy Mitrofanov <a2terminator@mail.ru> Signed-off-by: Thomas Wolf <twolf@apache.org>
CommitGraph: add commit-graph for FileObjectDatabase
This change makes JGit can read .git/objects/info/commit-graph file
and then get CommitGraph.
Loading a new commit-graph into memory requires additional time. After
testing, loading a copy of the Linux's commit-graph(1039139 commits)
is under 50ms.
Thomas Wolf [Fri, 16 Dec 2022 22:40:42 +0000 (23:40 +0100)]
Reformat PatchApplier and PatchApplierTest
Some lines were too long, unnecessary fully qualified class names,
and an assertEquals(actual, expected) when it should have been
assertEquals(expected, actual).
Change-Id: I3b3c46c963afe2fb82a79c1e93970e73778877e5 Signed-off-by: Thomas Wolf <twolf@apache.org>
Anna Papitto [Fri, 2 Dec 2022 23:56:56 +0000 (15:56 -0800)]
IO#readFully: provide overload that fills the full array
IO#readFully is often called with the intent to fill the destination
array from beginning to end. The redundant arguments for where to start
and stop filling are opportunities for bugs if specified incorrectly or
if not changed to match a changed array length.
Provide a overloaded method for filling the full destination array.
Change-Id: I964f18f4a061189cce1ca00ff0258669277ff499 Signed-off-by: Anna Papitto <annapapitto@google.com>
If 'core.commitGraph' and 'gc.writeCommitGraph' are both true, then gc
will rewrite the commit-graph file when 'git gc' is run. Defaults to
false while the commit-graph feature matures.
Git introduced a new file storing the topology and some metadata of
the commits in the repo (commitGraph). With this data, git can browse
commit history without parsing the pack, speeding up e.g.
reachability checks.
This change teaches JGit to read commit-graph-format file, following
the upstream format([1]).
JGit can read a commit-graph file from a buffered stream, which means
that we can provide this feature for both FileRepository and
DfsRepository.
Anna Papitto [Thu, 10 Nov 2022 22:48:22 +0000 (14:48 -0800)]
Gc#deleteOrphans: avoid dependence on PackExt alphabetical ordering
Deleting orphan files depends on .pack and .keep being reverse-sorted
to before the corresponding index files that could be orphans. The new
reverse index file extension (.rev) will break that frail dependency.
Rewrite Gc#deleteOrphans to avoid that dependence by tracking which pack
names have a .pack or .keep file and then deleting any index files that
without a corresponding one. This approach takes linear time instead of
the O(n logn) time needed for sorting.
Change-Id: If83c378ea070b8871d4b01ae008e7bf8270de763 Signed-off-by: Anna Papitto <annapapitto@google.com>
Ivan Frade [Tue, 6 Dec 2022 23:59:03 +0000 (15:59 -0800)]
GraphCommits: Remove unused getter by position
CommitGraphWriter uses the GraphCommits in for-each loops and doesn't
need the access by position anymore. This was a left-over from
https://git.eclipse.org/r/c/jgit/jgit/+/182832
Errorprone run in the bazel build raised this exception:
org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/commitgraph/CommitGraphWriter.java:105:
error: [UnusedException] This catch block catches an exception and
re-throws another, but swallows the caught exception rather than setting
it as a cause. This can make debugging harder.
} catch (InterruptedIOException e) {
^
(see https://errorprone.info/bugpattern/UnusedException)
Did you mean 'throw new
IOException(JGitText.get().commitGraphWritingCancelled, e);'?
Matthias Sohn [Tue, 13 Dec 2022 13:40:27 +0000 (14:40 +0100)]
Update jetty to 10.0.13
Since Oomph's p2 repo for jetty 10.0.13 doesn't have source bundles, we
remove them. Eclipse platform doesn't create p2 repos for jetty anymore
and we aren't yet ready to use maven dependencies like the platform
does.
Xing Huang [Fri, 9 Dec 2022 22:15:50 +0000 (16:15 -0600)]
PackExt: Add a commit graph extension.
There is no commit graph PackExt because the non-DFS stack is not writing using PackExt mechanism. The extension is needed in DFS to determine the stream to write the commit-graph.
Add a commit graph extension that matches the one in cgit
(https://git-scm.com/docs/commit-graph#_file_layout)
in preparation for adding DFS support for reading and writing commit graphs.
Sergey [Wed, 7 Dec 2022 11:49:47 +0000 (15:49 +0400)]
BatchRefUpdate: Consistent switch branches in ref update
The expression RefUpdate ru = newUpdate(cmd) is eagerly evaluated before the switch statement.
But it is not used in some switch cases and thus is calculated uselessly.
Move expression evaluation to the switch case where it is actually used.
After such a move, several cases became identical and thus were squashed.
Sergey [Wed, 7 Dec 2022 11:49:47 +0000 (15:49 +0400)]
RefWriter#writePackedRefs: Remove a redundant "if" check
After checking the variable, the same variable was checked again inside
the "if" block, and after the first check, this variable does not
change. Remove the second unnecessary check.
Dmitrii Filippov [Mon, 27 Jun 2022 18:03:53 +0000 (20:03 +0200)]
Fix crashes on rare combination of file names
The NameConflictTreeWalk class is used in merge for iterating over
entries in commits. The class uses a separate iterator for each
commit's tree. In rare cases it can incorrectly report the same entry
twice. As a result, duplicated entries are added to the merge result
and later jgit throws an exception when it tries to process merge
result.
The problem appears only when there is a directory-file conflict for
the last item in trees. Example from the bug:
Commit 1:
* subtree - file
* subtree-0 - file
Commit 2:
* subtree - directory
* subtree-0 - file
Here the names are ordered like this:
"subtree" file <"subtree-0" file < "subtree" directory.
The NameConflictTreeWalk handles similar cases correctly if there are
other files after subtree... in commits - this is processed in the
AbstractTreeIterator.min function. Existing code has a special
optimization for the case, when all trees are pointed to the same
entry name - it skips additional checks. However, this optimization
incorrectly skips checks if one of trees reached the end.
The fix processes a situation when some trees reached the end, while
others are still point to an entry.
Matthias Sohn [Wed, 23 Nov 2022 15:54:34 +0000 (16:54 +0100)]
Merge branch 'master' into stable-6.4
* master:
[pgm] Add options --name-only, --name-status to diff, log, show
Update Orbit to R20221123021534 for 2022-12
RBE: Update toolchain with bazel-toolchains 5.1.2 release
SshTestGitServer: : ensure UploadPack is closed to fix resource leak
UploadPackTest: ensure UploadPack is closed to fix resource leak
[pgm] Ensure UploadPack is closed to fix resource leak
UploadPackServlet#doPost use try-with-resource to ensure up is closed
Fix warnings in PatchApplierTest
Fix boxing warnings in TransportTest
Silence warnings about unclosed BasePackPushConnection
Fix warning about non-externalized String
Remove unused imports
Suppress non-externalized String warnings
Remove unused API problem filters
Silence API errors
Silence API errors
Silence API warnings
Add 4.26 target platform
Use "releases" repository for 4.25 target platform
Update Apache Mina SSHD to 2.9.2
Update Orbit to S20221118032057
DfsBlockCache: Report IndexEventConsumer metrics for reverse indexes.
DfsStreamKey: Replace ForReverseIndex to separate metrics.
RawText.isBinary(): handle complete buffer correctly
PackExt: Add a reverse index extension.