Thomas Wolf [Sun, 25 Jul 2021 13:44:35 +0000 (15:44 +0200)]
[test] Create keystore with the keytool of the running JDK
Call keytool with the absolute path of "java.home". Otherwise a keytool
for a different, maybe even newer Java version might be picked up, and
then the keystore may not be readable by the JVM used to run the tests.
Antonio Barone [Wed, 2 Jun 2021 15:13:17 +0000 (18:13 +0300)]
Retry loose object read upon "Stale file handle" exception
When reading loose objects over NFS it is possible that the OS syscall
would fail with ESTALE errors: This happens when the open file
descriptor no longer refers to a valid file.
Notoriously it is possible to hit this scenario when git data is shared
among multiple clients, for example by multiple gerrit instances in HA.
If one of the two clients performs a GC operation that would cause the
packing and then the pruning of loose objects, the other client might
still hold a reference to those objects, which would cause an exception
to bubble up the stack.
The Linux NFS FAQ[1] (at point A.10), suggests that the proper way to
handle such ESTALE scenarios is to:
"[...] close the file or directory where the error occurred, and reopen
it so the NFS client can resolve the pathname again and retrieve the new
file handle."
In case of a stale file handle exception, we now attempt to read the
loose object again (up to 5 times), until we either succeed or encounter
a FileNotFoundException, in which case the search can continue to
Packfiles and alternates.
The limit of 5 provides an arbitrary upper bounds that is consistent to
the one chosen when handling stale file handles for packed-refs
files (see [2] for context).
andrewxian2000 [Mon, 14 Jun 2021 21:58:52 +0000 (09:58 +1200)]
Fix garbage collection failing to delete pack file
The loosen() method has opened pack file and the open pack file handle
may prevent it from being deleted e.g. on Windows. Fix this by closing
the pack file only after loosen() finished.
Bug: 574178
Change-Id: Icd59931a218d84c9c97b450eea87b21ed01248ff Signed-off-by: andrew.xian2000@gmail.com Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Tue, 1 Jun 2021 14:36:43 +0000 (16:36 +0200)]
Merge branch 'master' into stable-5.12
* master:
Update Orbit to S20210518003616 and ant to 1.10.10.v20210426-1926
Skip detecting content renames for binary files
RepoCommand: Retry commit on LockFailure
Finish upgrading eclipse-jarsigner-plugin to 1.3.1
Upgrade maven-javadoc-plugin to 3.3.0
Update maven-project-info-reports-plugin to 3.1.2
Update spotbugs-maven-plugin to 4.2.3
This is similar to change Idbc2c29bd that skipped detecting content
renames for large files. With this change, we added a new option in
RenameDetector called "skipContentRenamesForBinaryFiles", that when set,
causes binary files with any slight modification to be identified as
added/deleted. The default for this boolean is false, so preserving
current behaviour.
* changes:
Finish upgrading eclipse-jarsigner-plugin to 1.3.1
Upgrade maven-javadoc-plugin to 3.3.0
Update maven-project-info-reports-plugin to 3.1.2
Update spotbugs-maven-plugin to 4.2.3
Ivan Frade [Mon, 24 May 2021 16:53:08 +0000 (09:53 -0700)]
RepoCommand: Retry commit on LockFailure
When the target repository is receiving commits from other sources,
the repo command commit can fail with a LOCK_FAILURE. We could let
callers retry, but then the command needs to redo all the work (opening
all subrepos to recreate the tree).
Retry the commit in LOCK_FAILURE inside the command. The commit
rewrites the whole tree, so it shouldn't have merge errors. Use an
exponential delay with jitter for the retries.
Change-Id: I517b6f2afd16a4b695e6cf471b5d6cf492024ec4 Signed-off-by: Ivan Frade <ifrade@google.com>
Matthias Sohn [Wed, 26 May 2021 15:21:02 +0000 (17:21 +0200)]
Merge branch 'master' into stable-5.12
* master:
RepoCommand: Do not set 'branch' if the revision is a tag
pgm: rewrite parents when --parents flag is passed
ApplyCommand: fix "no newline at end" detection
ApplyCommand: handle completely empty context lines in text patches
ApplyCommand: use byte arrays for text patches, not strings
ApplyCommand: support binary patches
ApplyCommand: add a stream to apply a delta patch
ApplyCommand: add streams to read/write binary patch hunks
ApplyCommand: add a base-85 codec
ApplyCommand: convert to git internal format before applying patch
SSH config: fix whitespace handling
SSH config: fix negated patterns
Fix @since tag for introduction of PUBKEY_ACCEPTED_ALGORITHMS
Prepare 5.11.2-SNAPSHOT builds
JGit v5.11.1.202105131744-r
Add a cgit interoperability test for LockFile
Add TemporaryBuffer.toString(int limit)
LockFile: create OutputStream only when needed
Add git config for conflict style merge/diff3
Change-Id: If7751ff99079eaea31ed1fce811d141ecf209727 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Ivan Frade [Tue, 25 May 2021 00:31:25 +0000 (17:31 -0700)]
RepoCommand: Do not set 'branch' if the revision is a tag
The "branch" field in the .gitmodules is the signal for gerrit to keep
the superproject autoupdated. Tags are immutable and there is no need to
track them, plus the cgit client requires the field to be a "remote
branch name" but not a tag.
Do not set the "branch" field if the revision is a tag. Keep those tags
in another field ("ref") as they help other tools to find the commit in
the destination repository.
We can still have false negatives when a refname is not fully qualified,
but this check covers e.g. the most common case in android.
Note that the javadoc of #setRecordRemoteBranch already mentions that
"submodules that request a tag will not have branch name recorded".
Change-Id: Ib1c321a4d3b7f8d51ca2ea204f72dc0cfed50c37 Signed-off-by: Ivan Frade <ifrade@google.com>
Matthias Sohn [Wed, 26 May 2021 12:12:17 +0000 (08:12 -0400)]
Merge changes from topic "apply"
* changes:
ApplyCommand: fix "no newline at end" detection
ApplyCommand: handle completely empty context lines in text patches
ApplyCommand: use byte arrays for text patches, not strings
ApplyCommand: support binary patches
ApplyCommand: add a stream to apply a delta patch
ApplyCommand: add streams to read/write binary patch hunks
ApplyCommand: add a base-85 codec
ApplyCommand: convert to git internal format before applying patch
kylezhao [Tue, 30 Mar 2021 03:04:12 +0000 (11:04 +0800)]
pgm: rewrite parents when --parents flag is passed
According to [1], we should rewrite parents in RevWalkTextBuiltin
when variable parents is true.
[1]
https://git-scm.com/docs/git-rev-list#Documentation/git-rev-list.txt---parents
Thomas Wolf [Wed, 10 Mar 2021 18:26:39 +0000 (19:26 +0100)]
ApplyCommand: fix "no newline at end" detection
Check the last line of the last hunk of a file, not the last line of
the whole patch.
Note that C git only checks that this line starts with "\ " and is at
least 12 characters long because of possible different texts when non-
English messages are used.
Change-Id: I0db81699eb3e99ed7b536a3e2b8dc97df1f58a89 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Wed, 10 Mar 2021 17:04:25 +0000 (18:04 +0100)]
ApplyCommand: handle completely empty context lines in text patches
C git treats completely empty lines as empty context lines (which
traditionally have a single blank). Apparently newer GNU diff may
produce such lines; see [1]. ("Newer" meaning "since 2006"...)
Thomas Wolf [Wed, 10 Mar 2021 13:25:37 +0000 (14:25 +0100)]
ApplyCommand: use byte arrays for text patches, not strings
Instead of converting the patch bytes to strings apply the patch on
byte level, like C git does. Converting the input lines and the hunk
lines from bytes to strings and then applying the patch based on
strings may give surprising results if a patch converts a text file
from one encoding to another. Moreover, in the end we don't know which
encoding to use to write the result.
Previous code just wrote the result as UTF-8, which forcibly changed
the encoding if the original input had some other encoding (even if the
patch had the same non-UTF-8 encoding). It was also wrong if the input
was UTF-8, and the patch should have changed the encoding to something
else.
So use ByteBuffers instead of Strings. This has the additional advantage
that all these ByteBuffers can share the underlying byte arrays of the
input and of the patch, so it also reduces memory consumption.
Change-Id: I450975f2ba0e7d0bec8973e3113cc2e7aea187ee Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sun, 7 Mar 2021 17:50:23 +0000 (18:50 +0100)]
ApplyCommand: support binary patches
Implement applying binary patches. Handles both literal and delta
patches. Note that C git also runs binary files through the clean
and smudge filters. Implement the same safeguards against corrupted
patches as in C git: require the full OIDs to be present in the patch
file, and apply a binary patch only if both pre- and post-image hashes
match.
Add tests for applying literal and delta patches.
Bug: 371725
Change-Id: I71dc214fe4145d7cc8e4769384fb78c7d0d6c220 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sun, 7 Mar 2021 17:49:01 +0000 (18:49 +0100)]
ApplyCommand: add a stream to apply a delta patch
Add a new BinaryDeltaInputStream that applies a delta provided by
another InputStream to a given base. Because delta application needs
random access to the base, the base itself cannot be yet another
InputStream. But at least this enables streaming of the result.
Add a simple test using delta hunks generated by C git.
Bug: 371725
Change-Id: Ibd26fa2f49860737ad5c5387f7f4870d3e85e628 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Thomas Wolf [Fri, 5 Mar 2021 23:00:15 +0000 (00:00 +0100)]
ApplyCommand: add streams to read/write binary patch hunks
Add streams that can encode or decode git binary patch data on the fly.
Git writes binary patches base-85 encoded, at most 52 un-encoded bytes,
with the unencoded data length prefixed in a one-character encoding, and
suffixed with a newline character.
Add a test for both the new input and the output stream. The test
roundtrips binary data of different lengths in different ways.
Bug: 371725
Change-Id: Ic3faebaa4637520f5448b3d1acd78d5aaab3907a Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Fri, 5 Mar 2021 22:55:18 +0000 (23:55 +0100)]
ApplyCommand: add a base-85 codec
Add an implementation for base-85 encoding and decoding [1]. Git binary
patches use this format.
Base-85 encoding assembles bytes as 32-bit MSB values, then converts
these values to base-85 numbers (always 5 bytes) encoded as printable
ASCII characters. Decoding base-85 is the reverse operation. Note
that decoding may overflow on invalid input as 85^5 > 2^32. Encodings
always have a length that is a multiple of 5. If input length is not
divisible by 4, padding bytes are (logically) added, which are ignored
when decoding. The encoding for n bytes has thus always exactly length
(n + 3) / 4 * 5 in integer arithmetic (truncating division).
Includes tests.
[1] https://datatracker.ietf.org/doc/html/rfc1924
Bug: 371725
Change-Id: Ib5b9a503cd62cf70e080a4fb38c8cd1eeeaebcfe Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
BatchRefUpdate: Skip saving conflicting ref names and prefixes in memory
Rather than getting all ref names and prefixes and saving them
in memory to perform the check for conflicting names, rely on
RefDirectory.isNameConflicting as it is no longer an expensive
call after it was optimized in Ie994fc.
The old optimization to save ref names and prefixes in memory
was targeted towards making clones faster. With this change,
the clone performance is unaffected when tests were done with
repos containing many(~500k) refs.
Here are few recorded elapsed times for creating 10 branches
using BatchRefUpdate on NFS based repositories with varying
loose refs count. As seen here, this change helps improve the
BatchRefUpdate performance from O(n^2) to O(1).
loose_refs_count with_change without_change
50 241 ms 310 ms
300 263 ms 1502 ms
1k 181 ms 4241 ms
2k 204 ms 6440 ms
9k 158 ms 25930 ms
20k 154 ms 60443 ms
50k 171 ms 135199 ms
110k 157 ms 329450 ms
160k 209 ms 396328 ms
This update improves the Gerrit notedb migration performance
as it uses BatchRefUpdate to write change meta refs similar to
the test performed above.
Update tests to record the number of events fired post-setup and only
assert for events fired during BatchRefUpdate.execute. For tests which
use writeLooseRef to setup refs, create new tests which assert the
number of RefsChangedEvent(s) rather than updating the existing ones
to call RefDirectory.exactRef as it changes the code path.
Avoid having to scan over ALL loose refs to determine if the
name is nested within or is a container of an existing reference.
This can get really expensive if there are too many loose refs.
Instead use exactRef and getRefsByPrefix which scan based on a
prefix.
With a simple shell script(like below) using jgit client to create
1k refs in a new repository on NFS, this change brings down the time
from 12mins to 7mins.
for ref in $(seq 1 1000); do
jgit branch "$ref"
done
Here are few recorded elapsed times to create a new branch on NFS
based repositories with varying loose refs count. As we see here,
this change improves the name conflicting check from O(n^2) to O(1).
loose_refs_count with_change without_change
50 44 ms 164 ms
300 45 ms 1193 ms
1k 38 ms 2610 ms
2k 44 ms 6003 ms
9k 46 ms 27860 ms
20k 45 ms 48591 ms
50k 51 ms 135471 ms
110k 43 ms 294252 ms
160k 52 ms 430976 ms
Thomas Wolf [Tue, 2 Mar 2021 08:53:08 +0000 (09:53 +0100)]
ApplyCommand: convert to git internal format before applying patch
Applying a patch on Windows failed if the patch had the (normal)
single-LF line endings, but the file on disk had the usual Windows
CR-LF line endings.
Git (and JGit) compute diffs on the git-internal blob, i.e., after
CR-LF transformation and clean filtering. Applying patches to files
directly is thus incorrect and may fail if CR-LF settings don't
match, or if clean/smudge filtering is involved.
Change ApplyCommand to run the file content through the check-in
filters before applying the patch, and run the result through the
check-out filters. This makes patch application succeed even if the
patch has single-LFs, but the file has CR-LF and core.autocrlf is
true.
Add tests for various combinations of line endings in the file and in
the patch, and a test to verify the clean/smudge handling.
See also [1].
Running the file though clean/smudge may give strange results with
LFS-managed files. JGit's DiffFormatter has some extra code and
applies the smudge filter again after having run the file through
the check-in filters (CR-LF and clean). So JGit can actually produce
a diff on LFS-managed files using the normal diff machinery. (If it
doesn't run out of memory, that is. After all, LFS is intended for
_large_ files.) How such a diff would be applied with either C git
or JGit is entirely unclear; neither has any code for this special
case. Compare also [2].
Note that C git just doesn't know about LFS and always diffs after
the check-in filter chain, so for LFS files, it'll produce a diff
of the LFS pointers.
Thomas Wolf [Sat, 15 May 2021 16:13:04 +0000 (18:13 +0200)]
SSH config: fix negated patterns
Negated patterns were handled wrongly. According to the OpenBSD
ssh_config man page,[1] a negated pattern never matches. Negated
patterns make only sense if there are positive patterns; the
negated pattern then can define exceptions for the positive
patterns.
OpenSshConfigFile did this wrongly. It handled "!foo" as "matching
everything but foo", but actually the semantics is "if the input is
"foo", this entry doesn't apply. If the input is anything else,
other patterns determine whether the entry may apply.".
[1] https://man.openbsd.org/ssh_config
Change-Id: I50f6e46581b7ece4c949eddf62f4a265573ec29e Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Mon, 10 May 2021 22:56:57 +0000 (00:56 +0200)]
Merge branch 'stable-5.9' into stable-5.10
* stable-5.9:
LockFile: create OutputStream only when needed
Remove ReftableNumbersNotIncreasingException
Fix stamping to produce stable file timestamps
Thomas Wolf [Tue, 4 May 2021 21:48:56 +0000 (23:48 +0200)]
LockFile: create OutputStream only when needed
Don't create the stream eagerly in lock(); that may cause JGit to
exceed OS or JVM limits on open file descriptors if many locks need
to be created, for instance when creating many refs. Instead create
the output stream only when one really needs to write something.
Bug: 573328
Change-Id: If9441ed40494d46f594a896d34a5c4f56f91ebf4 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>