BatchRefUpdate: Skip saving conflicting ref names and prefixes in memory
Rather than getting all ref names and prefixes and saving them
in memory to perform the check for conflicting names, rely on
RefDirectory.isNameConflicting as it is no longer an expensive
call after it was optimized in Ie994fc.
The old optimization of saving ref names and prefixes in memory was
targeted at making clones faster. With this change, clone performance
is unaffected; tests were done with repos containing many (~500k) refs.
Here are a few recorded elapsed times for creating 10 branches
using BatchRefUpdate on NFS-based repositories with varying
loose ref counts. As seen here, this change helps improve the
BatchRefUpdate performance from O(n^2) to O(1).
loose_refs_count   with_change   without_change
              50       241 ms         310 ms
             300       263 ms        1502 ms
              1k       181 ms        4241 ms
              2k       204 ms        6440 ms
              9k       158 ms       25930 ms
             20k       154 ms       60443 ms
             50k       171 ms      135199 ms
            110k       157 ms      329450 ms
            160k       209 ms      396328 ms
This update improves Gerrit notedb migration performance, since the
migration uses BatchRefUpdate to write change meta refs in a way
similar to the test performed above.
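For illustration, here is a minimal sketch (not the benchmark code; the repository, ref names and the commit id are placeholders) of creating refs in bulk with BatchRefUpdate, roughly the pattern exercised by the timings above:

  import java.io.IOException;
  import org.eclipse.jgit.lib.BatchRefUpdate;
  import org.eclipse.jgit.lib.NullProgressMonitor;
  import org.eclipse.jgit.lib.ObjectId;
  import org.eclipse.jgit.lib.Repository;
  import org.eclipse.jgit.revwalk.RevWalk;
  import org.eclipse.jgit.transport.ReceiveCommand;

  class BranchCreator {
    // Create 10 branches pointing at newCommitId in a single batch.
    static void createBranches(Repository repo, ObjectId newCommitId)
        throws IOException {
      BatchRefUpdate batch = repo.getRefDatabase().newBatchUpdate();
      for (int i = 1; i <= 10; i++) {
        batch.addCommand(new ReceiveCommand(ObjectId.zeroId(), newCommitId,
            "refs/heads/branch-" + i, ReceiveCommand.Type.CREATE));
      }
      try (RevWalk rw = new RevWalk(repo)) {
        batch.execute(rw, NullProgressMonitor.INSTANCE);
      }
    }
  }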
Update tests to record the number of events fired post-setup and only
assert on events fired during BatchRefUpdate.execute. For tests which
use writeLooseRef to set up refs, create new tests which assert on the
number of RefsChangedEvent(s) rather than updating the existing ones
to call RefDirectory.exactRef, as that changes the code path.
Avoid having to scan over ALL loose refs to determine whether the
name is nested within, or is a container of, an existing reference.
This can get really expensive when there are many loose refs.
Instead, use exactRef and getRefsByPrefix, which scan based on a
prefix.
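A rough sketch of that prefix-based check (exactRef and getRefsByPrefix are real RefDatabase methods; the surrounding helper is illustrative, not the actual RefDirectory code):

  import java.io.IOException;
  import org.eclipse.jgit.lib.RefDatabase;

  class ConflictCheck {
    // True if "name" nests inside an existing ref or contains one,
    // determined via prefix lookups instead of scanning all loose refs.
    static boolean conflicts(RefDatabase refDb, String name) throws IOException {
      // e.g. "refs/heads/foo/bar" conflicts if "refs/heads/foo" exists ...
      for (int slash = name.lastIndexOf('/'); slash > 0;
          slash = name.lastIndexOf('/', slash - 1)) {
        if (refDb.exactRef(name.substring(0, slash)) != null) {
          return true;
        }
      }
      // ... or if any existing ref starts with "refs/heads/foo/bar/"
      return !refDb.getRefsByPrefix(name + "/").isEmpty();
    }
  }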
With a simple shell script (like the one below) using the jgit client to
create 1k refs in a new repository on NFS, this change brings the time
down from 12 mins to 7 mins.

  for ref in $(seq 1 1000); do
    jgit branch "$ref"
  done
Here are a few recorded elapsed times to create a new branch on NFS-based
repositories with varying loose ref counts. As we see here, this change
improves the name-conflict check from O(n^2) to O(1).
loose_refs_count   with_change   without_change
              50        44 ms         164 ms
             300        45 ms        1193 ms
              1k        38 ms        2610 ms
              2k        44 ms        6003 ms
              9k        46 ms       27860 ms
             20k        45 ms       48591 ms
             50k        51 ms      135471 ms
            110k        43 ms      294252 ms
            160k        52 ms      430976 ms
Matthias Sohn [Mon, 10 May 2021 22:56:57 +0000 (00:56 +0200)]
Merge branch 'stable-5.9' into stable-5.10
* stable-5.9:
LockFile: create OutputStream only when needed
Remove ReftableNumbersNotIncreasingException
Fix stamping to produce stable file timestamps
Thomas Wolf [Tue, 4 May 2021 21:48:56 +0000 (23:48 +0200)]
LockFile: create OutputStream only when needed
Don't create the stream eagerly in lock(); that may cause JGit to
exceed OS or JVM limits on open file descriptors if many locks need
to be created, for instance when creating many refs. Instead create
the output stream only when one really needs to write something.
Bug: 573328
Change-Id: If9441ed40494d46f594a896d34a5c4f56f91ebf4 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
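A minimal sketch of the lazy-initialization pattern described above (class and field names are illustrative, not the actual LockFile code):

  import java.io.IOException;
  import java.io.OutputStream;
  import java.nio.file.Files;
  import java.nio.file.Path;
  import java.nio.file.StandardOpenOption;

  class LazyLockWriter {
    private final Path lockPath;
    private OutputStream out; // not opened when the lock is taken

    LazyLockWriter(Path lockPath) {
      this.lockPath = lockPath;
    }

    void write(byte[] content) throws IOException {
      if (out == null) {
        // open the file descriptor only when something is actually written
        out = Files.newOutputStream(lockPath, StandardOpenOption.CREATE_NEW);
      }
      out.write(content);
    }

    void close() throws IOException {
      if (out != null) {
        out.close();
      }
    }
  }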
In a distributed setting, one can have multiple datacenters use
reftables for serving, while the ground truth for the Ref database is
administered centrally. In this setting, replication delays combined
with compaction can cause update-index ranges to overlap.
Such a setting is used at Google, and the JGit code already handles
this correctly (modulo a bugfix applied in change I8f8215b99a).
Remove the restriction that was applied in FileReftableDatabase.
Matthias Sohn [Fri, 2 Oct 2020 19:54:53 +0000 (21:54 +0200)]
Don't install 3rd party dependency bundles via features
Instead, provide them only in the p2 repository. This way they are
available when installing from the jgit p2 repository, but we do not
enforce the version we ship and can also use the version available in
Eclipse if it matches our requirements.
Bug: 514326
Bug: 566475
Change-Id: I3e8d0bad12cfb0c1003ade3e6f13e9af35626f14 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Wed, 2 Dec 2020 14:49:35 +0000 (15:49 +0100)]
Merge branch 'master' into stable-5.10
* master:
Prepare 5.3.10-SNAPSHOT builds
JGit v5.3.9.202012012026-r
Prepare 5.1.16-SNAPSHOT builds
JGit v5.1.15.202012011955-r
Fix PackInvalidException when fetch and repack run concurrently
Upgrade maven-pmd-plugin to 3.14.0
Update Orbit to R20201130205003 for 2020-12
Use new protocol version constants
PacketLineInTest: test for END and DELIM being distinguishable
Add constants for parsing git wire protocol version
Ignore missing javadoc tags in test bundles
Bazel: Allow to build and run the tests with JDK 15
[releng] japicmp: update last release version
Add support for reading symrefs from pack capabilities
Change-Id: I5afbbb912f502991f0cf9c2501b024f5f00144ba Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Petr Hrebejk [Mon, 30 Nov 2020 19:19:53 +0000 (20:19 +0100)]
Fix PackInvalidException when fetch and repack run concurrently
We are running several servers with jGit. We need to run repack from
time to time to keep the repos performant. I.e., after a push we check
how many small packs are in the repo and, when a threshold is reached,
we run the repack.
After upgrading the jGit version we found that if someone does a clone
while repack is running, the clone sometimes (not always) fails
because the repack removes a .pack file used by the clone. Server
exception and client error are attached.
I've tracked down the cause: it seems to have been introduced between
jGit 5.2 (which we upgraded from) and 5.3, caused by this commit:
Move throw of PackInvalidException outside the catch -
https://github.com/eclipse/jgit/commit/afef866a44cd65fef292c174cad445b3fb526400
The problem is that when the throw was inside the try block, the last
catch block caught the exception and called the openFail(false) method.
It is true that it called it with invalidate = false, which is wrong.
The real problem, though, is that with the throw outside of the try
block openFail is not called at all, and the fields activeWindows and
activeCopyRawData are not set to 0. This affects later checks
like: if (++activeCopyRawData == 1 && activeWindows == 0).
The fix for this is relatively simple: keep the throw outside of the
try block while still having the invalid field set to true. I did
exhaustive testing of the change, running concurrent clones and pushes
indefinitely; with the patch applied it never fails, while without the
patch it takes a relatively short time to hit the error.
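For illustration, a rough sketch of the fixed pattern (class, field and method names follow the description above, not the exact JGit sources): the counters are reset via openFail() even though the PackInvalidException is thrown outside of the try/catch.

  import java.io.File;
  import java.io.IOException;
  import java.io.RandomAccessFile;
  import org.eclipse.jgit.errors.PackInvalidException;

  class PackHandle {
    private final File packFile;
    private boolean invalid;
    private int activeWindows;
    private int activeCopyRawData;
    private RandomAccessFile fd;

    PackHandle(File packFile) {
      this.packFile = packFile;
    }

    void doOpen() throws IOException {
      if (invalid) {
        // throw stays outside the try block, but the counters are still reset
        openFail(true);
        throw new PackInvalidException(packFile);
      }
      try {
        fd = new RandomAccessFile(packFile, "r");
      } catch (IOException e) {
        openFail(false);
        throw e;
      }
    }

    private void openFail(boolean invalidate) {
      activeWindows = 0;
      activeCopyRawData = 0;
      fd = null;
      if (invalidate) {
        invalid = true;
      }
    }
  }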
David Ostrovsky [Sun, 29 Nov 2020 12:06:00 +0000 (13:06 +0100)]
Add constants for parsing git wire protocol version
This would allow other JGit users to access and reuse the constants.
Change-Id: I1608802f45586af5f8582afa592e26679e9cebe3 Signed-off-by: David Ostrovsky <david@ostrovsky.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Thu, 19 Nov 2020 00:55:35 +0000 (01:55 +0100)]
Ignore missing javadoc tags in test bundles
It seems Eclipse 4.18 reports them as errors, whereas earlier versions
ignored this, maybe because we don't require javadoc comments for all
the test bundles.
Change-Id: I3f4d42ce681ea5c2b4b302991d2641290ac8561d Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Ostrovsky [Mon, 23 Nov 2020 19:04:07 +0000 (20:04 +0100)]
Bazel: Allow to build and run the tests with JDK 15
To avoid JDK-specific bugs in the future, like Bug 568950, and given
that the upcoming Bazel release 4.0.0 added support for a JDK 15 java
toolchain, add a definition for remote JDK 15 to the WORKSPACE file and
add build and test instructions.
To build and execute the tests with JDK 15 on Linux run:
Lee Worrall [Sat, 10 Oct 2020 03:31:19 +0000 (11:31 +0800)]
Add support for reading symrefs from pack capabilities
A SymbolicRef is added to the advertised refs for any symref in
capabilities whose target is an advertised ref; this may replace an
existing entry, such as HEAD.
When cloning, if any advertised HEAD is symbolic then use the target
rather than looking for an advertised ref with a matching objectId.
Add --symref option to LsRemote command.
Bug: 514052
Change-Id: Idfb48e6f6e8dcfe57a6896883fe6d84d533aa9d0 Signed-off-by: Lee Worrall <worrall.la@gmail.com>
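A hedged usage example (the repository URL is a placeholder): with this change an advertised HEAD can show up as a SymbolicRef, so its target is directly visible via LsRemoteCommand.

  import java.util.Map;
  import org.eclipse.jgit.api.Git;
  import org.eclipse.jgit.lib.Ref;

  class LsRemoteSymref {
    static void printHeadTarget() throws Exception {
      Map<String, Ref> refs = Git.lsRemoteRepository()
          .setRemote("https://example.com/repo.git") // placeholder URL
          .callAsMap();
      Ref head = refs.get("HEAD");
      if (head != null && head.isSymbolic()) {
        // e.g. "HEAD -> refs/heads/main" when the server advertises the symref
        System.out.println("HEAD -> " + head.getTarget().getName());
      }
    }
  }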
Matthias Sohn [Wed, 25 Nov 2020 16:00:09 +0000 (17:00 +0100)]
Ensure that GC#deleteOrphans respects pack lock
If pack or index files are guarded by a pack lock (.keep file)
deleteOrphans() should not touch the respective files protected by the
lock file. Otherwise it may interfere with PackInserter concurrently
inserting a new pack file and its index.
The problem was caused by the following race.
All mentioned files are located in "objects/pack/".
File endings relevant in "pack" dir:
.pack
.keep
.idx
.bitmap
When ReceivePack receives a pack file it executes the following steps:

ReceivePack.service():
  receivePackAndCheckConnectivity():
    receivePack():
      receive the pack
      parse the pack, returns packLock (.keep file)
      PackInserter.flush():
        write tmpPack file: "insert_<random>.pack"
        write tmpIdx file: "insert_<random>.idx"
        real pack name: "pack-<SHA1>.pack"
        real index name: "pack-<SHA1>.idx"
        atomic rename tmpPack to realPack
        atomic rename tmpIdx to realIdx
  execute commands
  unlock pack by removing .keep file
  trigger auto gc if enabled
When PackInserter.flush() renames the temporary pack to the final
"pack-xxx.pack" file the temporary pack index file "insert_xxx.idx"
has no matching .pack file with the same base name for a short interval.
If deleteOrphans() ran during that interval it deduced the pack index
file was orphaned. Subsequently the missing pack index caused
MissingObjectExceptions since objects contained in the pack couldn't be
looked up anymore.
Bug: https://bugs.chromium.org/p/gerrit/issues/detail?id=13544
Change-Id: I559c81e4b1d7c487f92a751bd78b987d32c98719 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
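A hedged sketch of the guard described above (illustrative, not the actual GC#deleteOrphans code): a pack or index file whose base name has a matching .keep file is considered locked and must not be deleted.

  import java.nio.file.Files;
  import java.nio.file.Path;

  class PackLockGuard {
    // True if e.g. "pack-<SHA1>.idx" is protected by "pack-<SHA1>.keep".
    static boolean guardedByPackLock(Path packDir, String fileName) {
      int dot = fileName.lastIndexOf('.');
      String base = dot > 0 ? fileName.substring(0, dot) : fileName;
      return Files.exists(packDir.resolve(base + ".keep"));
    }
  }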
Matthias Sohn [Wed, 25 Nov 2020 16:39:25 +0000 (17:39 +0100)]
Merge branch 'master' into stable-5.10
* master:
Update Orbit to S20201118210000 and add target for 4.18
PacketLineIn: ensure that END != DELIM
PacketLineIn: ensure that END != DELIM
Allow to resolve a conflict by checking out a file
Update Orbit to I20201111205634
Document that setLastModified sets time of symlink target
Fix bug in PerformanceLogContext
Fix IOException occurring during gc
Change-Id: I2980552381d1ae61b9b2d81d7289de37d6bf4cae Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Thomas Wolf [Thu, 19 Nov 2020 12:23:05 +0000 (13:23 +0100)]
PacketLineIn: ensure that END != DELIM
Just allocate the two string objects directly. The previously used
new StringBuilder(0).toString() returns the same object for both END
and DELIM when run on Java 15, which breaks the wire protocol since
then END and DELIM cannot be distinguished anymore.
Bug: 568950
Change-Id: I9d54d7bf484948c24b51a094256bd9d38b085f35 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit 7da0f0a8f37e35e9c3108588f1e6f7a7381d8f77)
Thomas Wolf [Thu, 19 Nov 2020 12:23:05 +0000 (13:23 +0100)]
PacketLineIn: ensure that END != DELIM
Just allocate the two string objects directly. The previously used
new StringBuilder(0).toString() returns the same object for both END
and DELIM when run on Java 15, which breaks the wire protocol since
then END and DELIM cannot be distinguished anymore.
Bug: 568950
Change-Id: I9d54d7bf484948c24b51a094256bd9d38b085f35 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit 7da0f0a8f37e35e9c3108588f1e6f7a7381d8f77)
Thomas Wolf [Thu, 19 Nov 2020 12:23:05 +0000 (13:23 +0100)]
PacketLineIn: ensure that END != DELIM
Just allocate the two string objects directly. The previously used
new StringBuilder(0).toString() returns the same object for both END
and DELIM when run on Java 15, which breaks the wire protocol since
then END and DELIM cannot be distinguished anymore.
Bug: 568950
Change-Id: I9d54d7bf484948c24b51a094256bd9d38b085f35 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
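For illustration, a minimal sketch of the sentinel identity issue (END and DELIM mirror the message; the holder class is illustrative): each sentinel must be a distinct object because the parser compares with ==.

  final class Sentinels {
    // new String() always yields a fresh object, so END != DELIM is guaranteed;
    // new StringBuilder(0).toString() may hand back one shared "" instance on Java 15.
    static final String END = new String();
    static final String DELIM = new String();

    static boolean isEnd(String line) {
      return line == END; // reference equality, not equals()
    }

    static boolean isDelimiter(String line) {
      return line == DELIM;
    }
  }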
Thomas Wolf [Sun, 15 Nov 2020 17:45:12 +0000 (18:45 +0100)]
Allow to resolve a conflict by checking out a file
DirCacheEditor unconditionally applied a PathEdit to all stages in the
index. This gives wrong results if one wants to check out a file from
some commit to resolve a conflict: JGit would update the working tree
file multiple times (once per stage), and set all stages to point to
the checked-out blob.
C git replaces the stages by the entry for the checked-out file.
To support this, add a DirCacheEntry.setStage() method so that
CheckoutCommand can force the stage to zero. In DirCacheEditor, keep
only the zero stage if the PathEdit re-set the stage.
Bug: 568038
Change-Id: Ic7c635bb5aaa06ffaaeed50bc5e45702c56fc6d1 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
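A hedged usage example (repository path, branch and file names are placeholders): resolve a conflict by checking the file out from a branch, which with this change leaves a single stage-0 index entry.

  import java.io.File;
  import org.eclipse.jgit.api.Git;

  class ConflictResolution {
    static void takeTheirVersion() throws Exception {
      try (Git git = Git.open(new File("/path/to/repo"))) { // placeholder path
        git.checkout()
            .setStartPoint("refs/heads/theirs") // commit/branch to take the file from
            .addPath("conflicted.txt")          // conflicted path to resolve
            .call();
      }
    }
  }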
Matthias Sohn [Mon, 17 Aug 2020 12:23:29 +0000 (14:23 +0200)]
Document that setLastModified sets time of symlink target
Due to Java bug JDK-8220793 [1], Java cannot set timestamps of a symlink,
only of the symlink target. This bug was fixed in Java 13. Since we
don't have a use case for setting the timestamp of the symlink itself,
simply document the current behavior of the setLastModified methods.
Alexa Panfil [Fri, 23 Oct 2020 17:23:23 +0000 (17:23 +0000)]
Fix bug in PerformanceLogContext
PerformanceLogContext threw a NullPointerException when multiple threads
tried to add an event to the PerformanceLogContext. The cause is that
the ThreadLocal was initialized only in the class constructor, i.e. only
for the first thread; for subsequent threads it was null.
To fix this, initialize eventRecords for each thread.
Change-Id: I18ef67dff8f0488e3ad28c9bbc18ce73d5168cf9 Signed-off-by: Alexa Panfil <alexapizza@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
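A minimal sketch of the fix (the holder class is illustrative; eventRecords and PerformanceLogRecord come from the message): ThreadLocal.withInitial gives every thread its own list instead of only the thread that ran the constructor.

  import java.util.ArrayList;
  import java.util.List;
  import org.eclipse.jgit.logging.PerformanceLogRecord;

  class PerThreadEvents {
    // each thread lazily gets its own list on first access
    private static final ThreadLocal<List<PerformanceLogRecord>> eventRecords =
        ThreadLocal.withInitial(ArrayList::new);

    static void addEvent(PerformanceLogRecord record) {
      eventRecords.get().add(record); // never null, even on new threads
    }
  }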
Nail Samatov [Tue, 27 Oct 2020 13:56:24 +0000 (16:56 +0300)]
Fix IOException occurring during gc
Fix IOException occurring when calling GC on a repository with an
absent objects/pack folder.
Change-Id: I5be1333a0726f4d7491afd25ddac85451686c30a Signed-off-by: Nail Samatov <sanail@yandex.ru> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
I had misunderstood how protocol V2 works. This implementation only
works if the negotiation during fetch is done in one round.
Fixing this is substantial work in BasePackFetchConnection. Basically
I think I'd have to change negotiate back to the V0 version, and have
a doFetch() that does

  if protocol V2
    doFetchV2()
  else
    doFetchV0()

with doFetchV0 the old code, and doFetchV2 completely new.
Plus there would need to be an HTTP test case requiring several
negotiation rounds.
This is a couple of days' work at least, and I don't know when I will
have the time to revisit this. So although the rest of the code is
fine, I prefer to back this out completely rather than leave an only
half-working implementation in the code for an indeterminate time.
Bug: 553083
Change-Id: Icbbbb09882b3b83f9897deac4a06d5f8dc99d84e Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>