BatchRefUpdate: Skip saving conflicting ref names and prefixes in memory
Rather than getting all ref names and prefixes and saving them
in memory to perform the check for conflicting names, rely on
RefDirectory.isNameConflicting as it is no longer an expensive
call after it was optimized in Ie994fc.
The old optimization to save ref names and prefixes in memory
was targeted towards making clones faster. In tests on repos
containing many (~500k) refs, clone performance is unaffected by
this change.
Here are a few recorded elapsed times for creating 10 branches
using BatchRefUpdate on NFS based repositories with varying
loose refs count. As seen here, this change helps improve the
BatchRefUpdate performance from O(n^2) to O(1).
loose_refs_count  with_change  without_change
50                241 ms       310 ms
300               263 ms       1502 ms
1k                181 ms       4241 ms
2k                204 ms       6440 ms
9k                158 ms       25930 ms
20k               154 ms       60443 ms
50k               171 ms       135199 ms
110k              157 ms       329450 ms
160k              209 ms       396328 ms
This update improves the Gerrit notedb migration performance
as it uses BatchRefUpdate to write change meta refs similar to
the test performed above.
Change-Id: I853ac6c7feb4b39c3156c01876b38cbd182accfe
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
Avoid having to scan over ALL loose refs to determine if the
name is nested within or is a container of an existing reference.
This can get really expensive if there are too many loose refs.
Instead use exactRef and getRefsByPrefix which scan based on a
prefix.
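The idea can be sketched as follows. This is a minimal illustration over
an in-memory sorted set, not the RefDirectory code: the class name
RefNameConflictSketch and its methods are hypothetical, and the sorted
set merely stands in for the exactRef (exact lookup) and getRefsByPrefix
(prefix lookup) calls named above.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical stand-in for a ref store; not JGit's RefDirectory. */
final class RefNameConflictSketch {
    private final NavigableSet<String> refs = new TreeSet<>();

    void add(String name) {
        refs.add(name);
    }

    /** True if an existing ref contains the name or is nested below it. */
    boolean isNameConflicting(String name) {
        // Exact lookups only: is any leading path of the name a ref?
        int slash = name.lastIndexOf('/');
        while (slash > 0) {
            if (refs.contains(name.substring(0, slash))) {
                return true;
            }
            slash = name.lastIndexOf('/', slash - 1);
        }
        // Prefix lookup only: is any existing ref nested below the name?
        String prefix = name + '/';
        String next = refs.ceiling(prefix);
        return next != null && next.startsWith(prefix);
    }
}
```

Neither step iterates over all refs; the cost depends on the depth of
the name and on one prefix lookup.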
With a simple shell script (like below) using the jgit client to create
1k refs in a new repository on NFS, this change brings down the time
from 12 mins to 7 mins.
  for ref in $(seq 1 1000); do
    jgit branch "$ref"
  done
Here are a few recorded elapsed times to create a new branch on NFS
based repositories with varying loose refs count. As we see here,
this change improves the name conflicting check from O(n^2) to O(1).
loose_refs_count  with_change  without_change
50                44 ms        164 ms
300               45 ms        1193 ms
1k                38 ms        2610 ms
2k                44 ms        6003 ms
9k                46 ms        27860 ms
20k               45 ms        48591 ms
50k               51 ms        135471 ms
110k              43 ms        294252 ms
160k              52 ms        430976 ms
Change-Id: Ie994fc184b8f82811bfb37b111eb9733dbe3e6e0
Signed-off-by: Kaushik Lingarkar <quic_kaushikl@quicinc.com>
Don't create the stream eagerly in lock(); that may cause JGit to
exceed OS or JVM limits on open file descriptors if many locks need
to be created, for instance when creating many refs. Instead create
the output stream only when one really needs to write something.
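A minimal sketch of the lazy approach, assuming a hypothetical helper
class LazyFileOutputStream (this is not the LockFile implementation):
no file descriptor is consumed until the first byte is written.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: opens the underlying file only on first write. */
final class LazyFileOutputStream extends OutputStream {
    private final Path path;
    private OutputStream out; // stays null until something is written

    LazyFileOutputStream(Path path) {
        this.path = path;
    }

    boolean isOpen() {
        return out != null;
    }

    private OutputStream stream() throws IOException {
        if (out == null) {
            out = Files.newOutputStream(path);
        }
        return out;
    }

    @Override
    public void write(int b) throws IOException {
        stream().write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        stream().write(b, off, len);
    }

    @Override
    public void flush() throws IOException {
        if (out != null) {
            out.flush();
        }
    }

    @Override
    public void close() throws IOException {
        if (out != null) {
            out.close();
            out = null;
        }
    }
}
```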
Bug: 573328
Change-Id: If9441ed40494d46f594a896d34a5c4f56f91ebf4
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
In a distributed setting, one can have multiple datacenters use
reftables for serving, while the ground truth for the Ref database is
administered centrally. In this setting, replication delays combined
with compaction can cause update-index ranges to overlap.
Such a setting is used at Google, and the JGit code already handles
this correctly (modulo a bugfix that was applied in change I8f8215b99a).
Remove the restriction that was applied at FileReftableDatabase.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Change-Id: I6f9ed0fbd7fbc5220083ab808b22a909215f13a9
Fix PackInvalidException when fetch and repack run concurrently
We are running several servers with JGit. We need to run repack from
time to time to keep the repos performant, i.e. after a push we check how
many small packs are in the repo, and when a threshold is reached we run
the repack.
After upgrading the JGit version we found that if someone clones while
a repack is running, the clone sometimes (not always) fails
because the repack removes a .pack file used by the clone. Server
exception and client error attached.
I've tracked down the cause; it seems to have been introduced between
JGit 5.2 (which we upgraded from) and 5.3, and to be caused by this commit:
Move throw of PackInvalidException outside the catch -
afef866a44
The problem is that when the throw was inside the try block, the last
catch block caught the exception and called the openFailed(false) method.
It is true that it called it with invalidate = false, which is wrong.
The real problem though is that with the throw outside the try block,
openFailed is not called at all and the fields activeWindows and
activeCopyRawData are not set to 0. This affects checks performed
later, such as: if (++activeCopyRawData == 1 && activeWindows == 0).
The fix for this is relatively simple: keeping the throw outside of the
try block while still having the invalid field set to true. I did
exhaustive testing of the change, running concurrent clones and pushes
indefinitely; with the patch applied it never fails, while without the
patch it takes a relatively short time to hit the error.
See: https://www.eclipse.org/lists/jgit-dev/msg04014.html
Bug: 569349
Change-Id: I9dbf8801c8d3131955ad7124f42b62095d96da54
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
GC#deleteOrphans: handle failure to list files in pack directory
- log an error
- either there is no list or it is incomplete hence return immediately
Change-Id: Ieee5378ca06304056b9ccc30c1acd5f52360052d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If pack or index files are guarded by a pack lock (.keep file)
deleteOrphans() should not touch the respective files protected by the
lock file. Otherwise it may interfere with PackInserter concurrently
inserting a new pack file and its index.
The problem was caused by the following race.
All mentioned files are located in "objects/pack/".
File endings relevant in "pack" dir:
.pack
.keep
.idx
.bitmap
When ReceivePack receives a pack file it executes the following steps:
  ReceivePack.service():
    receivePackAndCheckConnectivity():
      receivePack():
        receive the pack
        parse the pack, returns packLock (.keep file)
          PackInserter.flush():
            write tmpPack file: "insert_<random>.pack"
            write tmpIdx file:  "insert_<random>.idx"
            real pack name:  "pack-<SHA1>.pack"
            real index name: "pack-<SHA1>.idx"
            atomic rename tmpPack to realPack
            atomic rename tmpIdx to realIdx
    execute commands
    unlock pack by removing .keep file
    trigger auto gc if enabled
When PackInserter.flush() renames the temporary pack to the final
"pack-xxx.pack" file the temporary pack index file "insert_xxx.idx"
has no matching .pack file with the same base name for a short interval.
If deleteOrphans() ran during that interval it deduced the pack index
file was orphaned. Subsequently the missing pack index caused
MissingObjectExceptions since objects contained in the pack couldn't be
looked up anymore.
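The guarding rule can be illustrated with a small sketch. It is not the
GC#deleteOrphans code: the class OrphanSketch, the method findOrphans
and the matching by base name are assumptions made for illustration.
An index or bitmap file is only reported as orphaned if no .pack and no
.keep file with the same base name exists.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration; not JGit's GC#deleteOrphans. */
final class OrphanSketch {
    /** Returns the .idx/.bitmap file names that may be deleted. */
    static List<String> findOrphans(List<String> fileNames) {
        Set<String> packs = new HashSet<>();
        Set<String> kept = new HashSet<>();
        for (String n : fileNames) {
            if (n.endsWith(".pack")) {
                packs.add(base(n));
            } else if (n.endsWith(".keep")) {
                kept.add(base(n));
            }
        }
        List<String> orphans = new ArrayList<>();
        for (String n : fileNames) {
            if (!n.endsWith(".idx") && !n.endsWith(".bitmap")) {
                continue;
            }
            String b = base(n);
            // A file guarded by a pack lock is never treated as orphaned.
            if (!packs.contains(b) && !kept.contains(b)) {
                orphans.add(n);
            }
        }
        return orphans;
    }

    private static String base(String name) {
        return name.substring(0, name.lastIndexOf('.'));
    }
}
```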
Bug: https://bugs.chromium.org/p/gerrit/issues/detail?id=13544
Change-Id: I559c81e4b1d7c487f92a751bd78b987d32c98719
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Just allocate the two string objects directly. The previously used
new StringBuilder(0).toString() returns the same object for both END
and DELIM when run on Java 15, which breaks the wire protocol since
then END and DELIM cannot be distinguished anymore.
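A small sketch of why direct allocation works for identity-compared
markers. The class SentinelSketch and the method describe are
hypothetical; only the names END and DELIM come from the text above.

```java
/** Hypothetical illustration of identity-compared marker strings. */
final class SentinelSketch {
    // Each "new String()" is guaranteed to be a distinct object, even
    // though both are empty and therefore equal by content.
    static final String END = new String();
    static final String DELIM = new String();

    /** Classifies a line by object identity, not by content. */
    static String describe(String line) {
        if (line == END) {
            return "END";
        }
        if (line == DELIM) {
            return "DELIM";
        }
        return "data";
    }
}
```

An ordinary empty string read from the wire is a different object from
both markers, so it is still classified as data.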
Bug: 568950
Change-Id: I9d54d7bf484948c24b51a094256bd9d38b085f35
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit 7da0f0a8f3)
The current mechanism for updating the unpack error handler requires
that the error handler is replaced entirely, including communicating
the error to the user. Adding a getter means that delegating
implementations can be constructed so that the error can be processed
before sending to the user, for example for logging.
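The delegation pattern this enables might look as follows. All types
here (ErrorHandler, Receiver, addLogging) are hypothetical stand-ins
defined for the sketch; they are not the JGit classes.

```java
import java.util.List;

/** Hypothetical stand-ins showing delegation via a getter. */
final class DelegatingHandlerSketch {
    interface ErrorHandler {
        void handle(Throwable t);
    }

    static final class Receiver {
        private ErrorHandler handler = t -> { /* default: report to user */ };

        ErrorHandler getErrorHandler() {
            return handler;
        }

        void setErrorHandler(ErrorHandler h) {
            handler = h;
        }
    }

    /** Wraps the current handler so the error is recorded first. */
    static void addLogging(Receiver r, List<String> log) {
        ErrorHandler original = r.getErrorHandler();
        r.setErrorHandler(t -> {
            log.add(String.valueOf(t.getMessage()));
            original.handle(t); // still communicates the error to the user
        });
    }
}
```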
Change-Id: I4b6f78a041d0f6f5b4076a9a5781565ca3857817
Signed-off-by: Jack Wickham <jwickham@palantir.com>
ObjectWritingException and FileNotFoundException are subclasses
of IOException, which is already declared. Error does not need
to be explicitly declared.
Change-Id: I879820a33e10ec3a7ef676adc9c9148d2b3c4b27
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
ObjectDirectory: Further clean up insertUnpackedObject
- The code to move the file is repeated. Split it out into a
utility method.
- Remove the catch block for AtomicMoveNotSupportedException which
is redundant because it's handled in exactly the same way as the
IOException further down. The only exception we need to explicitly
handle differently in this block is NoSuchFileException.
- Improve the comments.
Change-Id: Ifc5490953ffb25ecd1c48a06289eccb3f19910c6
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Add Git#shutdown for releasing resources held by JGit process
The shutdown method releases
- ThreadLocal held by NLS
- GlobalBundleCache used by NLS
- Executor held by WorkQueue
Bug: 437855
Bug: 550529
Change-Id: Icfdccd63668ca90c730ee47a52a17dbd58695ada
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
ApplyCommand: use context lines to determine hunk location
If a hunk does not apply at the position stated in the hunk header
try to determine its position using the old lines (context and
deleted lines).
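As a rough sketch of the approach (not the ApplyCommand code): try the
stated position first, then look for the old lines elsewhere. The class
HunkLocatorSketch and the outward search order are assumptions made for
illustration.

```java
import java.util.List;

/** Hypothetical illustration; not JGit's ApplyCommand. */
final class HunkLocatorSketch {
    /** Returns the 0-based index where oldLines match in file, or -1. */
    static int locate(List<String> file, List<String> oldLines, int stated) {
        if (matches(file, oldLines, stated)) {
            return stated;
        }
        // Assumed strategy: search outwards from the stated position.
        int limit = Math.abs(stated) + file.size();
        for (int d = 1; d <= limit; d++) {
            if (matches(file, oldLines, stated - d)) {
                return stated - d;
            }
            if (matches(file, oldLines, stated + d)) {
                return stated + d;
            }
        }
        return -1;
    }

    private static boolean matches(List<String> file, List<String> oldLines,
            int at) {
        if (at < 0 || at + oldLines.size() > file.size()) {
            return false;
        }
        return file.subList(at, at + oldLines.size()).equals(oldLines);
    }
}
```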
This is still a far cry from a full git apply: it doesn't do binary
patches, it doesn't handle git's whitespace options, and it's perhaps
not the fastest on big patches. C git hashes the lines and uses these
hashes to speed up matching hunks (and to do its whitespace magic).
Bug: 562348
Change-Id: Id0796bba059d84e648769d5896f497fde0b787dd
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Fix ProtectedMembersInFinalClass warning flagged by error prone
A recent Error Prone version complains about this code:
CharacterHead.java:22: error: [ProtectedMembersInFinalClass] Make
members of final classes package-private: <init>
protected CharacterHead(char expectedCharacter) {
^
(see https://errorprone.info/bugpattern/ProtectedMembersInFinalClass)
Did you mean 'CharacterHead(char expectedCharacter) {'
Bug: 562756
Change-Id: Ic46a0b07e46235592f6e63db631f583303420b73
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
On the first attempt to move the temp file, NoSuchFileException can
be raised if the destination folder does not exist. Instead of handling
this implicitly in the catch of IOException and then continuing to
create the destination folder and try again, explicitly catch it and
create the destination folder. If any other IOException occurs, treat
it as an unexpected error and return FAILURE.
Subsequently, on the second attempt to move the temp file, if ANY kind
of IOException occurs, also consider this an unexpected error and
return FAILURE.
In both catch blocks for IOException, add logging at ERROR level.
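The resulting control flow can be sketched like this. It is an
illustration, not the ObjectDirectory code: MoveSketch, moveIntoPlace
and the Result enum are hypothetical names, and System.err stands in
for logging at ERROR level.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical illustration; not JGit's ObjectDirectory. */
final class MoveSketch {
    enum Result { INSERTED, FAILURE }

    static Result moveIntoPlace(Path tmp, Path dst) {
        try {
            Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
            return Result.INSERTED;
        } catch (NoSuchFileException e) {
            // Expected if the destination folder is missing; create it
            // below and try once more.
        } catch (IOException e) {
            System.err.println("Unexpected error on first move: " + e);
            return Result.FAILURE;
        }
        try {
            Path parent = dst.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
            return Result.INSERTED;
        } catch (IOException e) {
            // Any failure on the second attempt is unexpected.
            System.err.println("Unexpected error on second move: " + e);
            return Result.FAILURE;
        }
    }
}
```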
Change-Id: I9de9ee3d2b368be36e02ee1c0daf8e844f7e46c8
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
ObjectDirectory: Fail immediately when atomic move is not supported
If atomic move is not supported, AtomicMoveNotSupportedException will
be thrown on the first attempt to move the temp file. There is no
point attempting the move operation a second time because it will only
fail for the same reason.
Add an immediate return of FAILURE on the first occasion. Remove the
unnecessary handling of the exception in the second block.
Change-Id: I4658a8b37cfec2d7ef0217c8346e512968d0964c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
A recent Error Prone version complains about this code:
RefDatabase.java:444: error: [InvalidInlineTag] Tag name `linkObjectId`
is unknown.
* Includes peeled {@linkObjectId}s. This is the inverse lookup of
^
(see https://errorprone.info/bugpattern/InvalidInlineTag)
Bug: 562756
Change-Id: If91da51d5138fb753c0550eeeb9e3883a394123d
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Motivation: JSch serves as the 'default' implementation of the SSH
transport. If a client application does not use it then there is no need
to pull in this dependency.
Move the classes depending on JSch to an OSGi fragment extending the
org.eclipse.jgit bundle and keep them in the same package as before
since moving them to another package would break API. Defer moving them
to a separate package to the next major release.
Add a new feature org.eclipse.jgit.ssh.jsch to enable
installation. With that users can now decide which of the ssh client
integrations (JCraft JSch or Apache Mina SSHD) they want to install.
We will remove the JCraft JSch integration in a later step due to the
reasons discussed in bug 520927.
Bug: 553625
Change-Id: I5979c8a9dbbe878a2e8ac0fbfde7230059d74dc2
Also-by: Michael Dardis <git@md-5.net>
Signed-off-by: Michael Dardis <git@md-5.net>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Motivation: BouncyCastle serves as the 'default' implementation of
the GPG Signer. If a client application does not use it there is no need
to pull in this dependency, especially since BouncyCastle is a large
library.
Move the classes depending on BouncyCastle to an OSGi fragment extending
the org.eclipse.jgit bundle. They are moved to a distinct internal
package in order to avoid split packages. This doesn't break public API
since these classes were already in an internal package before this
change.
Add a new feature org.eclipse.jgit.gpg.bc to enable installation. With
that users can now decide if they want to install it.
Attempts to sign a commit if org.eclipse.jgit.gpg.bc isn't available
will result in ServiceUnavailableException being thrown.
Bug: 559106
Change-Id: I42fd6c00002e17aa9a7be96ae434b538ea86ccf8
Also-by: Michael Dardis <git@md-5.net>
Signed-off-by: Michael Dardis <git@md-5.net>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
If the determination of the user home directory produces a Java File
object with an invalid path, spurious exceptions may occur at the
most inopportune moments anytime later. In the case in the linked bug
report, start-up of EGit failed, leading to numerous user-visible
problems in Eclipse.
So validate the return value of FS.userHomeImpl(). If converting that
File to a Path throws an exception, log the problem and fall back to
Java system property user.home. If that also is not valid, use null.
(A null user home directory is allowed by FS, and calling in Java
new File(null, "some_string") is fine and produces a File relative
to the current working directory.)
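A minimal sketch of the validation and fallback chain, assuming a
hypothetical helper class UserHomeSketch (this is not the FS code).
System.err stands in for logging, and the fallback property name is
passed in so that the sketch can be exercised without touching
user.home.

```java
import java.io.File;
import java.nio.file.InvalidPathException;

/** Hypothetical illustration; not JGit's FS. */
final class UserHomeSketch {
    /**
     * Returns the candidate if it converts to a Path, else the value of
     * the fallback system property if that is valid, else null.
     */
    static File validate(File candidate, String fallbackProperty) {
        if (isValid(candidate)) {
            return candidate;
        }
        System.err.println("Invalid user home: " + candidate);
        String fallback = System.getProperty(fallbackProperty);
        File f = fallback == null ? null : new File(fallback);
        return isValid(f) ? f : null;
    }

    private static boolean isValid(File f) {
        if (f == null) {
            return false;
        }
        try {
            f.toPath(); // throws InvalidPathException for an invalid path
            return true;
        } catch (InvalidPathException e) {
            return false;
        }
    }
}
```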
Bug: 563739
Change-Id: If9eec0f9a31a45bd815231706285c71b09f8cf56
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Make it possible to programmatically suppress the JMX bean
registration. In EGit it is not needed but can be rather costly
because it occurs during plug-in activation and accesses the
git user config.
Bug: 563740
Change-Id: I07ef7ae2f0208d177d2a03862846a8efe0191956
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
RawTextComparator.WS_IGNORE_CHANGE must not compare whitespace
Only the presence or absence of whitespace is significant, not the
actual whitespace characters. Don't compare whitespace bytes.
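The comparison rule can be sketched on plain strings as follows. This
is an illustration, not the RawTextComparator code, which works on byte
ranges; the class WsIgnoreChangeSketch and its methods are hypothetical.
Trailing whitespace is ignored, and any run of whitespace matches any
other run.

```java
/** Hypothetical illustration; not JGit's RawTextComparator. */
final class WsIgnoreChangeSketch {
    static boolean equalsIgnoringWsChange(String a, String b) {
        int endA = trimTrailing(a);
        int endB = trimTrailing(b);
        int i = 0;
        int j = 0;
        while (i < endA && j < endB) {
            char ca = a.charAt(i);
            char cb = b.charAt(j);
            boolean wa = isWs(ca);
            boolean wb = isWs(cb);
            if (wa != wb) {
                return false; // whitespace present on one side only
            }
            if (wa) {
                // Both sides are in a whitespace run: skip both runs
                // without comparing the whitespace characters.
                while (i < endA && isWs(a.charAt(i))) {
                    i++;
                }
                while (j < endB && isWs(b.charAt(j))) {
                    j++;
                }
            } else {
                if (ca != cb) {
                    return false;
                }
                i++;
                j++;
            }
        }
        return i == endA && j == endB;
    }

    private static int trimTrailing(String s) {
        int end = s.length();
        while (end > 0 && isWs(s.charAt(end - 1))) {
            end--;
        }
        return end;
    }

    private static boolean isWs(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }
}
```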
Compare the C git implementation at [1].
[1] https://github.com/git/git/blob/0d0e1e8/xdiff/xutils.c#L173
Bug: 563570
Change-Id: I2d0522b637ba6b5c8b911b3376a9df5daa9d4c27
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Revert "PackBitmapIndex: Not buffer inflated bitmap in BasePackBitmapIndex"
This reverts commit 3aee92478c2cbc67cd921533437b824e43ed9798, which
increased fetch latency significantly.
Change-Id: Id31a94dff83bf7ab2121718ead819bd08306a0b6
Signed-off-by: Yunjie Li <yunjieli@google.com>
A builder API provides a more convenient way to define a customized
SshdSessionFactory by hiding the subclassing.
Also provide a new interface SshConfigStore to abstract away the
specifics of reading a ssh config file, and provide a way to customize
the concrete ssh config implementation to be used. This facilitates
using an alternate ssh config implementation that may or may not be
based on files.
Change-Id: Ib9038e8ff2a4eb3a9ce7b3554d1450befec8e1e1
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
TransportHttp: abort on time-out or on SocketException
Avoid trying other authentication methods on SocketException or on
InterruptedIOException. SocketException is rather fatal, such as
nothing listening on the peer's port, connection reset, or it could
be a connection time-out.
Time-outs enforced by Timeout{Input,Output}Stream may result in
InterruptedIOException being thrown.
In both cases, it makes no sense to try other authentication methods,
and doing so may wrongly report "authentication not supported" or
"cannot open git-upload-pack" or some such instead of reporting a
time-out.
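The intended behavior can be sketched with a generic fallback loop. It
is not the TransportHttp code: AuthRetrySketch, Attempt and
connectWithFallback are hypothetical names used for illustration.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketException;
import java.util.List;

/** Hypothetical illustration; not JGit's TransportHttp. */
final class AuthRetrySketch {
    interface Attempt {
        void connect() throws IOException;
    }

    /** Tries each attempt in turn; returns the index that succeeded. */
    static int connectWithFallback(List<Attempt> attempts)
            throws IOException {
        IOException last = null;
        for (int i = 0; i < attempts.size(); i++) {
            try {
                attempts.get(i).connect();
                return i;
            } catch (SocketException | InterruptedIOException e) {
                // Fatal or timed out: another auth method cannot help.
                throw e;
            } catch (IOException e) {
                last = e; // e.g. this method was rejected: try the next
            }
        }
        throw last != null ? last
                : new IOException("no authentication method available");
    }
}
```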
Bug: 563138
Change-Id: I0191b1e784c2471035e550205abd06ec9934fd00
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Config core.eol is to be ignored if core.autocrlf is true or input.[1]
JGit didn't do so when core.autocrlf=input was set.
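The precedence rule can be written down as a tiny sketch. The enums and
the method checkoutEol are hypothetical and ignore gitattributes
entirely; they only illustrate that core.eol is consulted when
core.autocrlf is false.

```java
/** Hypothetical illustration of core.autocrlf overriding core.eol. */
final class EolSketch {
    enum AutoCrlf { FALSE, TRUE, INPUT }

    enum Eol { LF, CRLF, NATIVE }

    /** Line ending used on checkout for text files. */
    static Eol checkoutEol(AutoCrlf autoCrlf, Eol coreEol) {
        switch (autoCrlf) {
        case TRUE:
            return Eol.CRLF; // core.eol is ignored
        case INPUT:
            return Eol.LF; // core.eol is ignored
        default:
            return coreEol;
        }
    }
}
```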
[1] https://git-scm.com/docs/git-config#Documentation/git-config.txt-coreeol
Bug: 561877
Change-Id: I5e62e0510d160b5113c1090319af09c2bc1bcb59
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Attributes: fix handling of text=auto in combination with eol
In Git 2.10.0 the interpretation of gitattributes changed or was fixed
such that "* text=auto eol=crlf" would indeed still do auto-detection
of text vs. binary content.[1] Previously this was identical to
"* text eol=crlf", i.e., treating all files as text.
JGit still did the latter, which caused surprises because it changed
binary files.
[1] https://github.com/git/git/blob/master/Documentation/RelNotes/2.10.0.txt#L248
Bug: 561341
Change-Id: I5b6fb97b5e86fd950a98537b6b8574f768ae30e5
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Update the Bouncy Castle dependency to 1.65.
Add the IssuerFingerprint as a hashed sub-packet in the signature. If
added unhashed, GPG ignores it.
Bug: 553206
Change-Id: I6807e8e2385e6ec5790f388e4753a44aa9474ebb
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Suppress API error for new method BitmapIndex.Bitmap#retrieveCompressed
OSGi semantic versioning allows breaking implementers in a minor
release.
Change-Id: Ib55dc43dd3b50b0ef39a7094190f230210aee4b6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>