Dariusz Luksza [Fri, 3 Nov 2023 11:52:03 +0000 (11:52 +0000)]
Optimise Git protocol v2 `ref-prefix` scanning
Currenty JGit will go over all refs in the repository for each
`ref-prefix`. This means that refs will be read multiple
times, which leads to subpar performance.
Native git, uses a different approach, where all refs are read once
and then for each ref, all `ref-prefix` filter values are checked in
one pass.
This change implements this approach in JGit. And makes `ref-prefix`
filtering ~28% faster for a repository with fully packed refs
and ~5% when RefTable is used instead of refdir.
Different implementations were tested on a synthetic file repository
with 300k refs. Different implementations were tested for unpacked and
fully packed refs (results are in seconds).
Ivan Frade [Tue, 24 Oct 2023 16:37:13 +0000 (09:37 -0700)]
UploadPackTest: Cover using wanted-refs as advertised set
Parent change introduced using "wanted-refs" as advertised set during
fetchV2, but it tested only a request without wantIds (only
wanted-refs).
Add a second where the request has wanted-refs AND wantId (which
disables the optimization). Change the test to measure the amount of
refs considered advertised, instead of relying in calls to the
refdb.
Patrick Hiesel [Fri, 13 May 2022 11:49:02 +0000 (13:49 +0200)]
UploadPack: use want-refs as advertised set in fetch v2
Protocol v2 introduced refs-in-wants and ls-remote with
prefixes. UploadPack already uses prefixes provided by the client
during a v2 ref advertisement (ls-refs). However, when the client
consequently sends another request to fetch a previously advertised
ref (with want-ref lines), the server uses the whole set of advertised
refs to compute reachability.
In repos with many refs, this slows down the reachability checks
setting up and walking through unnecessary refs. For gerrit it can
also break valid requests because in gerrit "all" means "recent" and
the wanted-ref could fall out of the "recent" range when reloading all
refs at fetch time.
Treat wanted-refs like a ref-prefix when calculating the advertised
refs on v2 fetch command. Less refs means a faster setup and less walk
for the reachability checks. Note that wanted-refs filters only over
the refs visible to the user, so this doesn't give any extra
visibility to the caller.
If the request contains also "want <oid>" lines, we cannot use this
optimization. Those objects could be reachable from any visible
branch, not necessarily in the wanted-refs.
Ronald Bhuleskar [Thu, 21 Sep 2023 23:12:06 +0000 (16:12 -0700)]
BasePackFetchConnection: Avoid full clone with useNegotiationTip
With the useNegotiationTip flag (introduced in change 738dacb), the client sends to the server only the tips of the wanted refs for the negotiation. Some wanted refs may not exist in the client (yet) and our implementation ignores them. So when only non-existing refs are wanted, jgit doesn't send any tips and the server understands it is a full clone.
In useNegotiationTip, send ALL_REFS if any of the wanted refs does not exists locally.
Matthias Sohn [Tue, 17 Oct 2023 09:25:12 +0000 (11:25 +0200)]
Silence API warnings for API added in 5.13.3
This was added in
- f103a1d5c605 "Add support for git config repack.packKeptObjects"
- f5f4bf0ad97f "Do not exclude objects in locked packs from bitmap
processing"
Thomas Wolf [Sat, 14 Oct 2023 21:25:19 +0000 (23:25 +0200)]
FileBasedConfig: in-process synchronization for load() and save()
On Windows reading and replacing a file via renaming concurrently may
fail either in the reader or in the thread renaming the file. For
renaming, FileUtils.rename() has a last-case fallback in which it
deletes the target file before attempting the rename. If a reader reads
at that moment, it will produce an empty config, and the snapshot and
hash may be wrong because the concurrently running save() may set them.
It's not really possible to do all this in a thread-safe manner without
some synchronization. Add a read-write lock to synchronize readers and
writers to avoid at least that JGit steps on its own feet.
Bug: 451508
Change-Id: I7e5f0f26e02f34ba02dc925a445044d3e21389b4 Signed-off-by: Thomas Wolf <twolf@apache.org>
Thomas Wolf [Sun, 8 Oct 2023 20:03:22 +0000 (22:03 +0200)]
FileUtils.rename(): better retry handling
When the atomic move fails on Windows, it may be because some other
thread is currently reading the destination. If we delete the file
then, that reader may get an exception, and conclude the file didn't
exist, even though the rename() would re-create it right away.
Try to avoid this from happening frequently by only deleting the
destination on the last retry. Also don't sleep after the last attempt.
Bug: 451508
Change-Id: I95bb4ec59d6e7efb4a7fc8d67f5df301f690257a Signed-off-by: Thomas Wolf <twolf@apache.org>
Thomas Wolf [Sun, 8 Oct 2023 19:50:34 +0000 (21:50 +0200)]
DeleteBranchCommand: update config only at the end
When multiple branches were to be removed, the git config was updated
after each and every branch. Newly do so only once at the end, after all
branches have been deleted.
Because there may be an exception after some branches have already been
deleted, take care to update the config even if an exception is thrown.
Bug: 451508
Change-Id: I645be8a1a59a1476d421e46933c3f7cbd0639fec Signed-off-by: Thomas Wolf <twolf@apache.org>
Thomas Wolf [Sun, 8 Oct 2023 19:47:05 +0000 (21:47 +0200)]
Config.removeSection() telling whether it changed the config
Add a variant of unsetSection() that returns whether it did indeed
change the config. This can be used in to skip saving the config if
it was not changed.
Also fix the iteration over the entries: lastWasMatch was never reset,
and thus all empty lines after a match would be removed.
Change-Id: Iea9e84aa74b1e4bb3c89efe3936fa3a8a09532e5 Signed-off-by: Thomas Wolf <twolf@apache.org>
Matthias Sohn [Fri, 13 Oct 2023 19:31:00 +0000 (21:31 +0200)]
Merge branch 'stable-6.7'
* stable-6.7:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Fri, 13 Oct 2023 08:15:08 +0000 (10:15 +0200)]
Use net.i2p.crypto.eddsa 0.3.0 from new Orbit build
consuming it directly from Maven Central.
The bundle net.i2p.crypto.eddsa 0.3.0 contains bad OSGi metadata,
earlier it was repackaged in Orbit tweaking its mandatory dependency to
sun.security.x509 to an optional dependency.
This project seems to be orphaned, probably because Java 15 added
support for eddsa with JEP339 [1].
This repackaged bundle is no longer available after Orbit was renovated
[2] to consume the vast majority of bundles directly from Maven Central
without repacking them. Hence we have to workaround this (probably
false) mandatory dependency. For that export an empty dummy package
"sun.security.x509" to satisfy OSGi.
Matthias Sohn [Fri, 13 Oct 2023 07:06:21 +0000 (09:06 +0200)]
Merge branch 'stable-6.6' into stable-6.7
* stable-6.6:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Fri, 13 Oct 2023 06:46:11 +0000 (08:46 +0200)]
Merge branch 'stable-6.5' into stable-6.6
* stable-6.5:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Thu, 12 Oct 2023 23:52:43 +0000 (01:52 +0200)]
Merge branch 'stable-6.4' into stable-6.5
* stable-6.4:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Thu, 12 Oct 2023 23:33:56 +0000 (01:33 +0200)]
Merge branch 'stable-6.3' into stable-6.4
* stable-6.3:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Thu, 12 Oct 2023 22:51:46 +0000 (00:51 +0200)]
Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Thu, 12 Oct 2023 22:44:33 +0000 (00:44 +0200)]
Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Thu, 12 Oct 2023 22:33:19 +0000 (00:33 +0200)]
Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Matthias Sohn [Thu, 12 Oct 2023 22:22:11 +0000 (00:22 +0200)]
Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
PackConfig: fix @since tags
Remove unused API problem filters
Add support for git config repack.packKeptObjects
Do not exclude objects in locked packs from bitmap processing
Antonio Barone [Thu, 5 Oct 2023 19:58:13 +0000 (21:58 +0200)]
Add support for git config repack.packKeptObjects
Change Ide3445e652 introduced the `--pack-kept-objects` option to GC for
including the objects contained in the locked packfiles during the
repack phase.
Whilst this allowed to explicitly pass a command line argument to the
jgit gc program, it did not allow the option to be read from
configuration.
Allow the pack kept objects option to be configured exactly as C-Git
documents [1], by introducing a new `repack.packKeptObjects`
configuration.
`repack.packKeptObjects` defaults to `true`, when the
`pack.buildBitmaps` is `true` (which is the default case), `false`
otherwise.
Luca Milanesio [Fri, 11 Aug 2023 19:15:17 +0000 (20:15 +0100)]
Do not exclude objects in locked packs from bitmap processing
Packfiles having an equivalent .keep file are associated with in-flight
pushes that haven't been completed, with potentially a set of git
objects not yet referenced by a ref.
If the Git client is not up-to-date, it may result in pushing a
packfile, generating a <packfile>.keep on the server, which
may also contain existing commits due to the lack of Git protocol
negotiation in the git-receive-pack.
The Git protocol negotiation is the phase where the client and the
server exchange the list of refs they have for trying to find a common
base and minimise the amount of objects to be transferred.
The repack phase in GC was previously skipping all objects that were
contained in all packfiles having a <packfile>.keep file associated
(aka "locked packfiles"), which did not take into consideration the
fact that excluding the existing commits would have resulted in the
generation of an invalid bitmap file.
The code for excluding the objects in the locked packfiles was written
well before the bitmap was introduced, hence could not consider a use
case that did not exist at that time.
However, when the bitmap was introduced, the exclusion of locked
packfiles was not changed, hence creating a potential problem.
The issue went unnoticed for many years because the bitmap generation
was disabled when JGit noticed any locked packfiles; however, the
bitmaps are enabled again since Id722e68d9f , and the the issue is now
visible and is impacting the GC repack phase.
Introduce the '--pack-kept-objects' option in GC for including the
objects contained in the locked packfiles during the repack phase,
which is not an issue because of the following:
- If there are any existing commits duplicated in the packfiles
they will be just considered once anyway because the repack doesn't
generate duplicates in the output packfile.
- If there are any new commits that do not have any ref pointing to
them, they will be automatically excluded from the output repacked
packfile.
The same identical solution is adopted in the C implementation of git
in repack.c.
Because the locked packfile is not pruned, any new commits not pointed
by any refs will remain in the repository and there will not be any
accidental pruning or object loss as it is today before this change.
As a side-effect of this change, it is now potentially possible to still
have duplicate BLOBs after GC when the keep packfile contained existing
objects. However, it is way better to keep the duplication until the
next GC phase rather than omitting existing objects from repacking and,
therefore generating an invalid bitmap and incorrect packfile.
David Ostrovsky [Thu, 5 Oct 2023 21:04:00 +0000 (23:04 +0200)]
TestRepository: Add getInstant method
Error Prone is flagging Date-API as obsolete and recommends to migrate
to Instant and LocalDate. Given that more JGit users starting to migrate
to new Time API, offer getInstant method.
Ivan Frade [Thu, 28 Sep 2023 22:06:40 +0000 (15:06 -0700)]
UploadPack: Delay freeing refs in sendPack()
Change [1] set refs to null at the beggining of sendPack claiming they
are not needed anymore, but they are still used few lines below to
hoist referenced objects to the front of the pack. With refs nullified,
the hoist doesn't happened. This hasn't caused any problem so far,
probably because it is just an optimization and the objects are in the
pack anyway.
Move the nullification after the hoisting to keep the optimization and
save the memory.
Matthias Sohn [Sat, 23 Sep 2023 23:41:13 +0000 (01:41 +0200)]
WorkingTreeIterator: directly filter input stream
This way we can avoid to access the byte buffers backing array.
Implement a ByteBufferInputStream to wrap a byte buffer which we can use
to expose the filter result as an input stream.
See
https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/CopyOnWriteArrayList.html
https://errorprone.info/bugpattern/ModifyCollectionInEnhancedForLoop
Matthias Sohn [Sat, 16 Sep 2023 20:19:41 +0000 (22:19 +0200)]
[errorprone] AddCommand#filepattern: use a more specific type
Variable type can use a more specific type to convey more information to
callers.
private Collection<String> filepatterns;
^
(see https://errorprone.info/bugpattern/PreferredInterfaceType)
Ivan Frade [Fri, 8 Sep 2023 20:23:09 +0000 (13:23 -0700)]
CommitGraphWriter: Decouple Stats from computing bloom filters
The public stats object is created only to be populated by the computation of
bloom filters.
Make the computation return its numbers with the results and copy them
to the stats when needed. This eliminates the side effects from the
computation and makes it easier to add more data to the stats later.
Ronald Bhuleskar [Mon, 18 Sep 2023 18:45:11 +0000 (11:45 -0700)]
Documentation: Move writeChangedPaths flag from commitGraph to gc section.
The flag commitGraph.writeChangedPaths is not correct. The option was introduced in commit 3b77e33 (Jul 19th) under gc section so updating the documentation to reflect it under correct section.
Matthias Sohn [Fri, 15 Sep 2023 09:48:05 +0000 (11:48 +0200)]
[errorprone] Fix wrong comparison which always evaluated to false
org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/commitgraph/GraphObjectIndex.java:59:
error: [ComparisonOutOfRange] ints may have a value in the range
-2147483648 to 2147483647; therefore, this comparison to
Integer.MAX_VALUE will always evaluate to false
if (table[k] > Integer.MAX_VALUE) {
^
See https://errorprone.info/bugpattern/ComparisonOutOfRange
We need to check if variable `uint` of type `long` exceeds the maximum
possible int value before casting it to `int` below.
This was introduced in Ib5c0d6678cb242870a0f5841bd413ad3885e95f6
Matthias Sohn [Fri, 15 Sep 2023 09:44:09 +0000 (11:44 +0200)]
[errorprone] Remove unnecessary comparison
Raised by errorprone:
org.eclipse.jgit/src/org/eclipse/jgit/lib/CommitConfig.java:406: error:
[ComparisonOutOfRange] chars may have a value in the range 0 to 65535;
therefore, this comparison to 0 will always evaluate to true
if (ch >= 0 && ch < inUse.length) {
^
see https://errorprone.info/bugpattern/ComparisonOutOfRange
Matthias Sohn [Fri, 8 Sep 2023 20:57:05 +0000 (22:57 +0200)]
Use ShutdownHook to gracefully handle JVM shutdown
in all classes which already registered their own shutdown hook
- CloneCommand
- GC#PidLock
- FS#FileStoreAttributes
- LocalDiskRepositoryTestCase#Cleanup
Matthias Sohn [Fri, 8 Sep 2023 00:14:20 +0000 (02:14 +0200)]
Add ShutdownHook to cleanup FileLocks on graceful JVM shutdown
This should avoid stale lock files if the JVM is terminated gracefully.
Implement a ShutdownHook which can register/unregister listeners which
need to do some cleanup during graceful JVM shutdown. This hook is
registered as a Java shutdown hook and when the JVM shuts down
calls #onShutdown of registered listeners using a parallel stream
to let them run concurrently.
See https://docs.oracle.com/javase/8/docs/technotes/guides/lang/hook-design.html