Dariusz Luksza [Mon, 20 Nov 2023 11:00:51 +0000 (11:00 +0000)]
Fix handling of missing pack index file
As demonstrated in
`UploadPackHandleDeletedPackFile.testV2IdxFileRemovedDuringUploadPack`
the fetch operation will fail when the pack index file is removed.
This is due to a wrapping of `FileNotFoundException` (which is a
subclass of `IOExeption`) in an `IOException` at PackIndex L#68. This
is then changing the behaviour of error handling in
`Pack.file.getBitmapIndex()` where the `FileNotFoundException` is
swallowed and allows the fetch process to continue. With FNFE being
wrapped in IOE, this blows up and breaks the fetch operation.
Simply rethrowing `FileNotFoundException` from `PackFile.open()` fixes
the broken fetch operation. This will also mark the whole pack as
invalid in the `IOException` handler in `Pack.idx()` method.
Dmitrii Naumenko [Wed, 22 Nov 2023 18:11:17 +0000 (19:11 +0100)]
CherryPick: add ability to customise cherry-picked commit message
Originally I wanted to support a feature similar to `-x` options from
https://git-scm.com/docs/git-cherry-pick#_options.
The idea was to append original commit hash in this format:
```
my original commit message
(cherry picked from commit 75355897dc28e9975afed028c1a6d8c6b97b2a3c)
```
This can be useful information in some integrations.
I decided to make it in a more generic way
and pass custom `CherryPickCommitMessageProvider` implementation.
One of the two default implementations can append original commit hash
Matthias Sohn [Sat, 20 Jan 2024 00:03:52 +0000 (01:03 +0100)]
Merge branch 'stable-6.8'
* stable-6.8:
Introduce a PriorityQueue sorting RevCommits by commit timestamp
Remove org.eclipse.jgit.benchmark/.factorypath
Update jmh to 1.37 for org.eclipse.jgit.benchmark
Matthias Sohn [Fri, 19 Jan 2024 23:40:42 +0000 (00:40 +0100)]
Merge branch 'stable-6.7' into stable-6.8
* stable-6.7:
Introduce a PriorityQueue sorting RevCommits by commit timestamp
Remove org.eclipse.jgit.benchmark/.factorypath
Update jmh to 1.37 for org.eclipse.jgit.benchmark
Matthias Sohn [Fri, 19 Jan 2024 23:18:25 +0000 (00:18 +0100)]
Merge branch 'stable-6.6' into stable-6.7
* stable-6.6:
Introduce a PriorityQueue sorting RevCommits by commit timestamp
Remove org.eclipse.jgit.benchmark/.factorypath
Update jmh to 1.37 for org.eclipse.jgit.benchmark
Luca Milanesio [Mon, 13 Jun 2022 22:09:55 +0000 (23:09 +0100)]
Introduce a PriorityQueue sorting RevCommits by commit timestamp
The DateRevQueue uses a tailor-made algorithm to keep
RevCommits sorted by reversed commit timestamp, which has a O(n*n/2)
complexity and caused the explosion of the Git fetch times to
tens of seconds.
The standard Java PriorityQueue provides a O(n*log(n)) complexity
and scales much better with the increase of the number of
RevCommits.
Introduce a new implementation DateRevPriorityQueue of the DateRevQueue
based on PriorityQueue.
Enable usage of the new DateRevPriorityQueue implementation by setting
the system property REVWALK_USE_PRIORITY_QUEUE=true. By default the old
implementation DateRevQueue is used.
Dariusz Luksza [Fri, 17 Nov 2023 19:28:53 +0000 (19:28 +0000)]
Add tests for handling pack files removal during fetch
Although this could sound like a corner case, it really can occur out
there in the real world. Especially in the Gerrit world where the
repositories could be GC'ed on a separate process or system.
The `FileNotFoundException` seems to be handled correctly in
`PackFile#doOpen` (line 671) and it will mark the pack as invalid. But
triggering that code path was not an easy task.
First of all, we need to add a new commit to the `master` branch of the
test repository after `UploadPack` object is created.
Secondly, in the refspec for fetch, commit id instead of "regular"
refspec must be used.
With both in place, we can see a warning log statement about deleted
pack file. And the fetch succeeds!
Also, tests for the removal of *.idx and *.bitmap files were added.
This unveiled a corner for the *.idx file deletion while fetching, as
the test will fail with "Unreachable pack index" IOException only
when the HEAD commit is empty.
PackWriterBitmapPreparer: Set limit on excessive branch count
If there are too many branches then the bitmap
indexing selects only the tip commits of the least active
branches to reduce the amount of bitmaps to load on request.
This can still be a problem if the number of inactive branches
rival or exceed the total number of commits selected
for the active branches.
Limit the number of branches that receive only-tip bitmaps.
This reduces the memory pressure of loading all the bitmaps,
and allows us to model the size of the bitmap index without
considering the number of branches.
Bitmaps are generated for branches in order of most recent commit,
and follow these rules:
* The first {@code DEFAULT_BITMAP_EXCESSIVE_BRANCH_COUNT} most active
branches have full bitmap coverage.
* The {@code DEFAULT_BITMAP_EXCESSIVE_BRANCH_COUNT} to {@code
DEFAULT_BITMAP_EXCESSIVE_BRANCH_TIP_COUNT} most active branches have
only the tip commit covered.
* The remaining branches have no bitmap coverage.
To prevent effecting existing repositories, the default value is set
at Integer.MAX_VALUE.
Revert commit 170244d05977491271a1cc234583d2e5ba75145d
"Checkout: better directory handling" which is the downport of the
original fix Ie12864c54c9f901a2ccee7caddec73027f353111 which was done
on stable-6.6. Merging this up to stable-6.6 would be a lot of work and
these branches aren't maintained anymore hence revert this change here.
This way the fix is available on stable-5.13 for those who still need
Java 8 and everybody else should upgrade to 6.6.1 or higher.
FooterLines: handle extraction from messages without headers
Prior to this change, long subjects of messages with no headers were
treated as headers, and therefore were skipped. In a message of the
form `<long subject>\n\n<footers>`, the footers would then get parsed
as a message, meaning no footers were returned.
After this change, the first lines are skipped only if they match any
of the known headers. The first line ofter the optional headers is then
assumed to be the subject line.
`FooterLineTest` had a few test cases for extracting footers from
messages with no headers. However, there were all with short messages,
so the "skip this line" logic in `RawParseUtils` was never triggered.
Added test case to catch this issue.
Change-Id: I971a1dddf1a9aea094360c3c8fc3b9a8b011bbf9
Issue: Google b/287891316
Michael Keppler [Sat, 23 Dec 2023 19:31:11 +0000 (20:31 +0100)]
Remove invalid/unnecessary Maven settings
* Remove jgit.target POM and remove it from the module list. This was
only necessary when the target file had to be referenced as an artifact.
Meanwhile we reference it directly by its path, and can remove the Maven
build around it.
* Remove tycho configuration options that are no longer valid (resolved
was removed very early, probably before 1.0; includePackedArtifacts was
removed in 3.0). Also remove duplicate version specification.
Matthias Sohn [Fri, 22 Dec 2023 23:39:07 +0000 (00:39 +0100)]
Update maven plugins
- com.github.siom79.japicmp:japicmp-maven-plugin to 0.18.3
- com.github.spotbugs:spotbugs-maven-plugin to 4.8.2.0
- io.github.git-commit-id:git-commit-id-maven-plugin to 7.0.0
- org.apache.maven.plugins:maven-clean-plugin to 3.3.2
- org.apache.maven.plugins:maven-compiler-plugin to 3.12.0
- org.apache.maven.plugins:maven-dependency-plugin to 3.6.1
- org.apache.maven.plugins:maven-enforcer-plugin to 3.4.1
- org.apache.maven.plugins:maven-javadoc-plugin to 3.6.3
- org.apache.maven.plugins:maven-jxr-plugin to 3.3.1
- org.apache.maven.plugins:maven-pmd-plugin to 3.21.2
- org.apache.maven.plugins:maven-project-info-reports-plugin to 3.5.1
- org.apache.maven.plugins:maven-shade-plugin to 3.5.1
- org.apache.maven.plugins:maven-site-plugin to 4.0.0-M13
- org.apache.maven.plugins:maven-surefire-plugin to 3.2.3
- org.codehaus.mojo:build-helper-maven-plugin to 3.5.0
- org.cyclonedx:cyclonedx-maven-plugin to 2.7.10
- org.eclipse.cbi.maven.plugins:eclipse-jarsigner-plugin to 1.4.3
- org.jacoco:jacoco-maven-plugin to 0.8.11
Dariusz Luksza [Mon, 18 Dec 2023 18:24:49 +0000 (18:24 +0000)]
BasePackFetchConnection: Skip object/ref lookups if local repo is empty
When cloning repository some of the operations in
`BasePackFetchConnection` can be skipped. We don't need to advertise
packs, compute "wanted time" or wanted refs to send. All of those
operations will try to read objects from an empty repository which
always results in a missing object.
This saves 99.9% of `LooseObjects.open()` calls which dramatically
speeds up object negotiation in V2 protocol.
In testing on JGit (v6.8.0.202311291450-r) repository, which contains
564 refs, the number of calls to `LooseObjects.open()` was reduced from
1187 to 1.
Skipping a call to `markRefsAdvertised()` initially reduced be above
number to 623. Then assuming "0" "want time" an on empty repository
pushed the calls down to 312. Finally, skipping objects reachability on
empty repository set calls down to 1.
The last call is performed from `FetchProcess.asForIsComplete()` which
probably needs to stay in place.
Dariusz Luksza [Sun, 17 Dec 2023 16:43:56 +0000 (16:43 +0000)]
LooseObjects: Use File#exists when possible
When `trustFolderStat` flag is enabled we can use `File.exist()`
instead of rethrowing `FileNotFoundException`. This improves performance
of cloning and fetching.
A simple benchmark that generates a random `ObjectId` instance and then
tries to parse that object id, shows about 30% improvement with this
change.
The benchmark scenario was based on the stacktrace reported in jgit-5.
Where `RevWalk.parse()` call will eventually call `LooseObjects.open()`
and finally `LoseObjects.getOpenLoader()`.
Michael Keppler [Sun, 17 Dec 2023 14:45:18 +0000 (15:45 +0100)]
Remove invalid spotbugs configuration
* findbugsXmlOutput was renamed to spotbugsXmlOutput long ago
* spotbugsXmlOutput has a default value of true and is deprecated,
therefore removing the entire line seems most reasonable
See https://spotbugs.github.io/spotbugs-maven-plugin/check-mojo.html#spotbugsXmlOutput
Ivan Frade [Thu, 14 Dec 2023 17:25:06 +0000 (09:25 -0800)]
DfsReader: give subclasses visiblity over the pack bitmap index
Subclasses intercept many methods in DfsReader to capture metrics, but
they cannot record stats from PackBitmapIndex, as it is wrapped inside
a BitmapIndex.
Move the creation of the BitmapIndex to a protected method. Subclasses
can override it to e.g. read metrics from the index or set listeners to
the BitmapIndex.
Ivan Frade [Fri, 10 Nov 2023 20:05:51 +0000 (12:05 -0800)]
PackBitmapIndex/StoredBitmap: Expose size and counts
PackBitmapIndex holds a collection of StoredBitmaps. StoredBitmaps can
be either base bitmaps (ready) or an XOR over another bitmap. XOR
bitmaps are replaced with a resolved version on demand. Bitmaps
can use a significant amount of memory but we don't have detailed
visibility about it.
Add methods to PackBitmapIndex to know how many xor/bases we have and
their sizes.
Ivan Frade [Tue, 12 Dec 2023 21:31:03 +0000 (13:31 -0800)]
PackWriter/Statistics: Remove the bitmapt hit stats
The request uses bitmaps for reachability and to decide what to
pack. Setting the listener in the PackWriter only covers the second
case.
Remove the listener from the PackWriter. It makes more sense to set it
in the reader and at the moment the BitmapIndex only supports a single
listener.
This was introduced after the 6.8 tag, so it should be safe to remove.
Fabio Ponciroli [Wed, 6 Dec 2023 13:38:21 +0000 (14:38 +0100)]
Make sure ref to prune is in packed refs
RefDirectory:pack might raise an NPE when deleting loose
refs as final part of the RefDirectory.pack().
This is what the code does:
1) packed ref update: update the list of refs which will be
persisted in packed-refs
2) persit packed-refs: flush on file the refs computed in #1
3) prune loose refs: prune loose refs that have been packed in #2
The code correctly locks the packed-refs file during phases 1 to 3.
However, it makes the wrong assumption of considering
the loose refs set as immutable between phases 1 and 3.
The number and values of loose refs on the filesystem can mutate
at any time whilst the RefDirectory.pack() is in progress.
Assuming the contrary can lead to an NPE when retrieving refs
from the mutable loose refs list during phase #3.
Make sure that the ref is not null before dereferencing its
object-id value.
Kamil Musin [Tue, 5 Dec 2023 15:22:08 +0000 (16:22 +0100)]
FooterLine: Protect from ill-formed message
A raw commit message has some headers and then the actual
message. RawParseUtils.commitMessage returns the start position of the
actual message, or -1 when the message is not raw. FooterLine is not
handling this -1 and throws an IndexOutOfBounds exception.
Consider than msgB can be -1 when looking for the beginning of the
last paragraph.
FooterLine javadoc and parameter talk only about "raw" but previous
code accepted non-raw messages (used mostly in unit tests) so we need
to keep this behavior.
Ronald Bhuleskar [Wed, 25 Oct 2023 20:10:33 +0000 (13:10 -0700)]
StartGenerator: Fix parent rewrite with non-default RevFilter
StartGenerator is responsible for propagating the RevWalk's
parent rewrite setting, but it currently only does so when a
non-default TreeFilter is set, when it should also do so if
the default TreeFilter is used with a non-default RevFilter.
Adding a new if condition within StartGenerator to enable parent
rewrite with non-default RevFilter.
TreeRevFilter relied on the old buggy functionality and has
been modified to explicitly refrain from rewriting parents.
Ivan Frade [Fri, 17 Nov 2023 18:44:56 +0000 (10:44 -0800)]
PackWriter: store the objects with bitmaps in the statistics
We want to know what objects had bitmaps in the walk of the
request. We can check their position in the history and evaluate
our bitmap selection algorithm.
Use the listener interface of the BitmapWalker to get the objects
walked with bitmaps and store them in the statistics.
Ivan Frade [Wed, 29 Nov 2023 18:51:04 +0000 (10:51 -0800)]
FooterLine: First line cannot be a footer
The first line of the commit message cannot be a footer line. This
restriction was dropped in commit [1] while adding multiline
footers. This affects at least the git-numberer gerrit plugin, that
even have a test for it [2].
Reintroduce the restriction that the first line of the commit message
cannot be a footer and bring the test from git-numberer to jgit.
This breaks a test in the git_numberer gerrit plugin used by chromium
[1]. The test checks that first line is never a footer, which sounds
right. That test should be included in FooterLineTest.
Matthias Sohn [Mon, 27 Nov 2023 22:34:02 +0000 (23:34 +0100)]
Merge branch 'master' into stable-6.8
* master:
Adapt to type parameter added in commons-compress 1.25.0
Improve footer parsing to allow multiline footers.
Make the tests buildable by bazel test
BitmapIndex: Add interface to track bitmaps found (or not)
BitmapWalker: Remove BitmapWalkListener
Kamil Musin [Fri, 24 Nov 2023 14:17:26 +0000 (15:17 +0100)]
Improve footer parsing to allow multiline footers.
According to the https://git-scm.com/docs/git-interpret-trailers the
CGit supports multiline trailers. Subsequent lines of such multiline
trailers have to start with a whitespace.
We also rewrite the original parsing code to make it easier to work
with. The old code had pointers moving both backwards and forwards at
the same time. In the rewritten code we first find the start of the last
paragraph and then do all the parsing.
Since all the getters of the FooterLine return String, I've considered
rewriting the parsing code to operate on strings. However the original
code seems to be written with the idea, that the data is only lazily
copied in getters and no extra allocations should be performed during
original parsing (ex. during RevWalk). The changed code keeps to this
idea.
Bug: Google b/312440626
Change-Id: Ie1e3b17a4a5ab767b771c95f00c283ea6c300220
Ivan Frade [Mon, 20 Nov 2023 20:01:12 +0000 (12:01 -0800)]
BitmapIndex: Add interface to track bitmaps found (or not)
We want to know what objects had bitmaps in the walk of the
request. We can check their position in the history and evaluate our
bitmap selection algorithm.
Introduce a listener interface to the BitmapIndex to report which
getBitmap() calls returned a bitmap (or not) and a method to the
bitmap index to set the listener.
florian.signoret [Mon, 16 Oct 2023 14:08:14 +0000 (16:08 +0200)]
Fix branch ref exist check
When a tag with the same name as the branch exists, the branch creation
process should work too. We should detect that the branch already
exists, and allow to force create it when the force option is used.
Ivan Frade [Fri, 17 Nov 2023 18:11:17 +0000 (10:11 -0800)]
Adjust javadoc to pass errorprone checks
bazel build //org.eclipse.jgit.test:all doesn't build due to
errorprone errors. This leds to disabling the checks and find the
errors later in the jenkins builder.
Fix javadoc errors reported by errorprone. This doesn't fix completely
the build but it is a step towards it.