Thomas Wolf [Sun, 25 Jul 2021 13:44:35 +0000 (15:44 +0200)]
[test] Create keystore with the keytool of the running JDK
Call keytool with the absolute path of "java.home". Otherwise a keytool
for a different, maybe even newer Java version might be picked up, and
then the keystore may not be readable by the JVM used to run the tests.
BatchRefUpdate: Skip saving conflicting ref names and prefixes in memory
Rather than getting all ref names and prefixes and saving them
in memory to perform the check for conflicting names, rely on
RefDirectory.isNameConflicting as it is no longer an expensive
call after it was optimized in Ie994fc.
The old optimization to save ref names and prefixes in memory
was targeted towards making clones faster. With this change,
the clone performance is unaffected when tests were done with
repos containing many(~500k) refs.
Here are few recorded elapsed times for creating 10 branches
using BatchRefUpdate on NFS based repositories with varying
loose refs count. As seen here, this change helps improve the
BatchRefUpdate performance from O(n^2) to O(1).
loose_refs_count with_change without_change
50 241 ms 310 ms
300 263 ms 1502 ms
1k 181 ms 4241 ms
2k 204 ms 6440 ms
9k 158 ms 25930 ms
20k 154 ms 60443 ms
50k 171 ms 135199 ms
110k 157 ms 329450 ms
160k 209 ms 396328 ms
This update improves the Gerrit notedb migration performance
as it uses BatchRefUpdate to write change meta refs similar to
the test performed above.
Update tests to record the number of events fired post-setup and only
assert for events fired during BatchRefUpdate.execute. For tests which
use writeLooseRef to setup refs, create new tests which assert the
number of RefsChangedEvent(s) rather than updating the existing ones
to call RefDirectory.exactRef as it changes the code path.
Avoid having to scan over ALL loose refs to determine if the
name is nested within or is a container of an existing reference.
This can get really expensive if there are too many loose refs.
Instead use exactRef and getRefsByPrefix which scan based on a
prefix.
With a simple shell script(like below) using jgit client to create
1k refs in a new repository on NFS, this change brings down the time
from 12mins to 7mins.
for ref in $(seq 1 1000); do
jgit branch "$ref"
done
Here are few recorded elapsed times to create a new branch on NFS
based repositories with varying loose refs count. As we see here,
this change improves the name conflicting check from O(n^2) to O(1).
loose_refs_count with_change without_change
50 44 ms 164 ms
300 45 ms 1193 ms
1k 38 ms 2610 ms
2k 44 ms 6003 ms
9k 46 ms 27860 ms
20k 45 ms 48591 ms
50k 51 ms 135471 ms
110k 43 ms 294252 ms
160k 52 ms 430976 ms
Thomas Wolf [Tue, 4 May 2021 21:48:56 +0000 (23:48 +0200)]
LockFile: create OutputStream only when needed
Don't create the stream eagerly in lock(); that may cause JGit to
exceed OS or JVM limits on open file descriptors if many locks need
to be created, for instance when creating many refs. Instead create
the output stream only when one really needs to write something.
Bug: 573328
Change-Id: If9441ed40494d46f594a896d34a5c4f56f91ebf4 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Petr Hrebejk [Mon, 30 Nov 2020 19:19:53 +0000 (20:19 +0100)]
Fix PackInvalidException when fetch and repack run concurrently
We are running several servers with jGit. We need to run repack from
time to time to keep the repos performant. I.e. after push we test how
many small packs are in the repo and when a threshold is reached we run
the repack.
After upgrading jGit version we've found that if someone does the clone
at the time repack is running the clone sometimes (not always) fails
because the repack removes .pack file used by the clone. Server
exception and client error attached.
I've tracked down the cause and it seems to be introduced between jGit
5.2 (which we upgraded from) and 5.3 and being caused by this commit:
Move throw of PackInvalidException outside the catch -
https://github.com/eclipse/jgit/commit/afef866a44cd65fef292c174cad445b3fb526400
The problem is that when the throw was inside of the try block the last
catch block catched the exception and called openFailed(false) method.
It is true that it called it with invalidate = false, which is wrong.
The real problem though is that with the throw outside of the try block
the openFail is not called at all and the fields activeWindows and
activeCopyRawData are not set to 0. Which affects the later called tests
like: if (++activeCopyRawData == 1 && activeWindows == 0).
The fix for this is relatively simple keeping the throw outside of the
try block and still having the invalid field set to true. I did
exhaustive testing of the change running concurrent clones and pushes
indefinitely and with the patch applied it never fails while without the
patch it takes relatively short to get the error.
Matthias Sohn [Wed, 25 Nov 2020 16:00:09 +0000 (17:00 +0100)]
Ensure that GC#deleteOrphans respects pack lock
If pack or index files are guarded by a pack lock (.keep file)
deleteOrphans() should not touch the respective files protected by the
lock file. Otherwise it may interfere with PackInserter concurrently
inserting a new pack file and its index.
The problem was caused by the following race.
All mentioned files are located in "objects/pack/".
File endings relevant in "pack" dir:
.pack
.keep
.idx
.bitmap
When ReceivePack receives a pack file it executes the following steps:
ReceivePack.service():
receivePackAndCheckConnectivity():
receivePack():
receive the pack
parse the pack, returns packLock (.keep file)
PackInserter.flush():
write tmpPck file: "insert_<random>.pack"
write tmpIdx file: "insert_<random>.idx"
real pack name: "pack-<SHA1>.pack"
real index name: "pack-<SHA1>.idx"
atomic rename tmpPack to realPack
atomic rename tmpIdx to tmpIdx
execute commands
unlock pack by removing .keep file
trigger auto gc if enabled
When PackInserter.flush() renames the temporary pack to the final
"pack-xxx.pack" file the temporary pack index file "insert_xxx.idx"
has no matching .pack file with the same base name for a short interval.
If deleteOrphans() ran during that interval it deduced the pack index
file was orphaned. Subsequently the missing pack index caused
MissingObjectExceptions since objects contained in the pack couldn't be
looked up anymore.
Bug: https://bugs.chromium.org/p/gerrit/issues/detail?id=13544
Change-Id: I559c81e4b1d7c487f92a751bd78b987d32c98719 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Michael Keppler [Tue, 14 Apr 2020 07:31:50 +0000 (09:31 +0200)]
Remove double blank from sentence start
Multiple whitespaces are not normalized when reading properties files,
therefore leading to unwanted space/indentation in console or UI output.
Change-Id: I1f5224fe359e0cac493e0237872afc75dc8b9fbe Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
(cherry picked from commit ebbc3efce73278d6e0dbb1acd099db2446b1bed9)
Thomas Wolf [Thu, 2 Apr 2020 19:15:18 +0000 (21:15 +0200)]
FS.runInShell(): handle quoted filters and hooksPath containing blanks
Revert commit 2323d7a. Using $0 in the shell command call results in
the command string being taken literally. That was introduced to fix
a problem with backslashes, but is actually not correct.
First, the problem with backslashes occurred only on Win32/Cygwin,
and has been properly fixed in commit 6f268f8.
Second, this is used only for hooks (which don't have backslashes in
their names) and filter commands from the git config, where the user
is responsible for properly quoting or escaping such that the commands
work.
Third, using $0 actually breaks correctly quoted filter commands
like in the bug report. The shell really takes the command literally,
and then doesn't find the command because of quotes.
So revert this change.
At the same time there's a related problem with hooks. If the path to
the hook contains blanks, runInShell() would also fail to find the
hook. In this case, the command doesn't come from user input but is
just a Java File object with an absolute path containing blanks. (Can
occur if core.hooksPath points to such a path with blanks, or if the
repository has such a path.)
The path to the hook as obtained from the file system must be quoted.
Thomas Wolf [Wed, 25 Mar 2020 08:13:20 +0000 (09:13 +0100)]
Handle non-normalized index also for executable files
Commit 60cf85a4 corrected the handling of check-in for files where
the index version is non-normalized, i.e., contains CR-LF line endings.
However, it did so only for regular files, not executable files.
Bug: 561438
Change-Id: I372cc990c5efeb00315460f36459c0652d5d1e77 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Alex Blewitt [Tue, 25 Feb 2020 21:14:59 +0000 (21:14 +0000)]
Expose FileStoreAttributes.setBackground()
The FS.setAsyncFileStoreAttributes() static method calls
FileStoreAttributes.setBackground() as its implementation, but there are
other public attributes on this inner class already and there isn't a
real reason why this needs to be private.
By making it public we allow callers to be able to invoke it directly.
Although it doesn't appear that it would make a difference, by calling a
static method on the FS class, all static fields and the transitive
closure of class dependencies must be loaded and initialised, which can
be non-trivial.
Callers referring to FS.setAsyncFileStoreAttributes() may be replaced
with FS.FileStoreAttributes.setBackground() with no change of behaviour
other than improved performance due to less class loading required.
Bug: 560527
Change-Id: I9538acc90da8d18f53fd60d74eb54496857f93a5 Signed-off-by: Alex Blewitt <alex.blewitt@gmail.com>
Han-Wen Nienhuys [Tue, 18 Feb 2020 19:44:10 +0000 (20:44 +0100)]
Update reftable storage repo layout
The change Ic0b974fa (c217d33, "Documentation/technical/reftable:
improve repo layout") defines a new repository layout, which was
agreed with the git-core mailing list.
It addresses the following problems:
* old git clients will not recognize reftable-based repositories, and
look at encompassing directories.
* Poorly written tools might write directly into
.git/refs/heads/BRANCH.
Since we consider JGit reftable as experimental (git-core doesn't
support it yet), we have no backward compatibility. If you created a
repository with reftable between mid-Nov 2019 and now, you can do the
following to convert:
Thomas Wolf [Thu, 27 Feb 2020 19:04:47 +0000 (20:04 +0100)]
Cygwin expects forward slashes for commands to be run via sh.exe
FS_Win32_Cygwin replaces backslashes by / as a side-effect of
relativize(). When support for core.hooksPath was added, paths were
relativized in a different place using Path.resolve(), which doesn't
do that transformation. As a result hooks could not be run on Cygwin
in some cases.
Do the transformation in FS_Win32_Cygwin.runInShell(). In all other
places, File or Path objects are used, which give no guarantee about
the file separator (typically the system-dependent default separator),
so doing the transformation earlier still wouldn't guarantee that
sh.exe indeed gets a command string using forward slashes.
Bug: 558577
Change-Id: I3c07eb85f0ac7c5628a2e92f990e5cdb7ecf532f Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
David Pursehouse [Thu, 27 Feb 2020 11:27:31 +0000 (20:27 +0900)]
Move array designators from the variable to the type
As reported by Sonar Lint:
Array designators should always be located on the type for better code
readability. Otherwise, developers must look both at the type and the
variable name to know whether or not a variable is an array.
Change-Id: If6b41fed3483d0992d402d8680552ab4bef89ffb Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Resolving the hostname comes with a performance penalty. We no longer
store the timestamp resolution in the global git config which might be
copied around to other machines but in a dedicated jgit config meant for
automatically determined options like timestamp resolution. Hence there
is no strong reason anymore to have a hardware specific identifier in
the subsection name of file timestamp resolution options.
Bug: 560414
Change-Id: If8dcabe981eb1792db84643850faa6033f14b1cf Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Han-Wen Nienhuys [Tue, 21 Jan 2020 18:06:30 +0000 (19:06 +0100)]
Simplify ReftableCompactor
The ReftableCompactor supported a byteLimit, but this is currently
unused. The FileReftableStack has a more sophisticated strategy that
amortizes compaction costs.
Rename min/maxUpdateIndex to reflogExpire{Min,Max}UpdateIndex to
reflect their purpose more accurately.
Since reflogs are generally pruned chronologically (oldest entries are
expired first), one can only prune entries on full compaction, so they
should not be set by default.
Rephrase the function Reader#minUpdateIndex and maxUpdateIndex. These
vars are documented to affect log entries, but semantically, they are
about ref entries. Since ref entries have their timestamps
delta-compressed, it is important for the min/maxUpdateIndex values to
be coherent between different tables.
The logical timestamps for log entries do not have to be coherent in
different tables, as the timestamps of a log entry is part of the key.
For example, a table written at update index 20 may contain a tombstone
log entry at timestamp 1.
Therefore, we set ReftableWriter's min/maxUpdateIndex from the merged
tables we are compacting, rather than from the compaction settings
(which should only control reflog expiry.)
The previous behavior could drop log entries erroneously, especially
in the presence of tombstone log entries. Unfortunately, testing this
properly requires both an API for adding log tombstones, and a more
refined API for controlling automatic compaction. Hence, no test.
Thomas Wolf [Mon, 3 Feb 2020 20:00:46 +0000 (21:00 +0100)]
Restore behavior of CloneCommand
Commit 6216b0de changed the behavior of the setMirror(),
setCloneAllBranches(), and setBranchesToClone() operations. Before
that commit, these could be set and reset independently and only in
call() it would be determined what exactly to do. Since that commit,
the last of these calls would determine the operation. This means
that the sequence
cloneCommand.setCloneAllBranches(true);
cloneCommand.setBranchesToClone(/* some list of refs */);
would formerly do a "clone all" giving a fetch refspec with wildcards
+refs/heads/*:refs/remotes/origin/*
which picks up new upstream branches, whereas since commit 6216b0de
individual non-wildcard fetch refspecs would be generated and new
upstream branches would not be fetched anymore.
Undo this behavioral change. Make the operations independently settable
and resettable again, and determine the exact operation only in call():
mirror=true > cloneAll=true > specific refs, where ">" means "takes
precedence over", and if none is set assume cloneAll=true.
Note that mirror=true implies setBare(true).
Bug: 559796
Change-Id: I7162b60e99de5e3e512bf27ff4113f554c94f5a6 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Sat, 1 Feb 2020 00:57:03 +0000 (01:57 +0100)]
Merge branch 'stable-5.5' into stable-5.6
* stable-5.5:
Fix string format parameter for invalidRefAdvertisementLine
WindowCache: add metric for cached bytes per repository
pgm daemon: fallback to user and system config if no config specified
WindowCache: add option to use strong refs to reference ByteWindows
Replace usage of ArrayIndexOutOfBoundsException in treewalk
Add config constants for WindowCache configuration options
Matthias Sohn [Sat, 1 Feb 2020 00:41:52 +0000 (01:41 +0100)]
Merge branch 'stable-5.4' into stable-5.5
* stable-5.4:
Fix string format parameter for invalidRefAdvertisementLine
WindowCache: add metric for cached bytes per repository
pgm daemon: fallback to user and system config if no config specified
WindowCache: add option to use strong refs to reference ByteWindows
Replace usage of ArrayIndexOutOfBoundsException in treewalk
Add config constants for WindowCache configuration options
Matthias Sohn [Sat, 1 Feb 2020 00:28:40 +0000 (01:28 +0100)]
Merge branch 'stable-5.3' into stable-5.4
* stable-5.3:
Fix string format parameter for invalidRefAdvertisementLine
WindowCache: add metric for cached bytes per repository
pgm daemon: fallback to user and system config if no config specified
WindowCache: add option to use strong refs to reference ByteWindows
Replace usage of ArrayIndexOutOfBoundsException in treewalk
Add config constants for WindowCache configuration options
Matthias Sohn [Sat, 1 Feb 2020 00:14:01 +0000 (01:14 +0100)]
Merge branch 'stable-5.2' into stable-5.3
* stable-5.2:
Fix string format parameter for invalidRefAdvertisementLine
WindowCache: add metric for cached bytes per repository
pgm daemon: fallback to user and system config if no config specified
WindowCache: add option to use strong refs to reference ByteWindows
Replace usage of ArrayIndexOutOfBoundsException in treewalk
Add config constants for WindowCache configuration options
David Pursehouse [Fri, 31 Jan 2020 00:12:08 +0000 (09:12 +0900)]
Merge branch 'stable-5.1' into stable-5.2
* stable-5.1:
Fix string format parameter for invalidRefAdvertisementLine
WindowCache: add metric for cached bytes per repository
pgm daemon: fallback to user and system config if no config specified
WindowCache: add option to use strong refs to reference ByteWindows
Change-Id: I741059a1d0d5950ab5bc16ec70352655ee926a24 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Pursehouse [Thu, 30 Jan 2020 23:38:24 +0000 (08:38 +0900)]
Fix string format parameter for invalidRefAdvertisementLine
The externalized error message added in f4fc640 ("BasePackConnection:
Check for expected length of ref advertisement", Dec 18, 2019) uses a
malformed string format. Since there is only one formatting argument,
it should be referenced with '{0}' rather than '{1}'.
Change-Id: Ibda864dfb0bb902fe07ae4bba73117b212046e8a Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Fri, 3 Jan 2020 17:16:42 +0000 (18:16 +0100)]
WindowCache: add metric for cached bytes per repository
Since ObjectDatabase and PackFile don't know their repository use the
packfile's grand-grand-parent directory as an identifier for the
repository the packfile resides in.
Remove metric for a repository if the number of cached bytes for the
repository drops to 0 in order to ensure the map of cached bytes per
repository doesn't contain repositories which have no data cached in the
WindowCache.
Change-Id: I969ab8029db0a292e7585cbb36ca0baa797da20b Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>