Thomas Wolf [Tue, 2 Mar 2021 08:53:08 +0000 (09:53 +0100)]
ApplyCommand: convert to git internal format before applying patch
Applying a patch on Windows failed if the patch had the (normal)
single-LF line endings, but the file on disk had the usual Windows
CR-LF line endings.
Git (and JGit) compute diffs on the git-internal blob, i.e., after
CR-LF transformation and clean filtering. Applying patches to files
directly is thus incorrect and may fail if CR-LF settings don't
match, or if clean/smudge filtering is involved.
Change ApplyCommand to run the file content through the check-in
filters before applying the patch, and run the result through the
check-out filters. This makes patch application succeed even if the
patch has single-LFs, but the file has CR-LF and core.autocrlf is
true.
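The round trip described above can be illustrated with a minimal, self-contained sketch. It is not JGit's implementation: the class and method names are hypothetical, the "patch" is reduced to replacing a single line, and the filters are reduced to plain CR-LF conversion (assuming core.autocrlf is true and no clean/smudge filters are configured).

```java
/** Illustrative sketch only; none of these names exist in JGit. */
final class CrLfApplySketch {

    /** Check-in direction: CR-LF becomes LF, as with core.autocrlf=true. */
    static String toInternal(String workingTreeText) {
        return workingTreeText.replace("\r\n", "\n");
    }

    /** Check-out direction: every LF becomes CR-LF. */
    static String toWorkingTree(String internalText) {
        return internalText.replace("\n", "\r\n");
    }

    /** Toy patch: replace the first line equal to oldLine with newLine. */
    static String applyLineReplacement(String text, String oldLine, String newLine) {
        String[] lines = text.split("\n", -1); // -1 keeps the trailing empty element
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].equals(oldLine)) {
                lines[i] = newLine;
                return String.join("\n", lines);
            }
        }
        throw new IllegalStateException("hunk does not apply");
    }

    /** Convert to the internal format, apply the patch, convert back. */
    static String patchFile(String onDisk, String oldLine, String newLine) {
        String internal = toInternal(onDisk);
        return toWorkingTree(applyLineReplacement(internal, oldLine, newLine));
    }

    public static void main(String[] args) {
        String onDisk = "one\r\ntwo\r\n";
        System.out.println(patchFile(onDisk, "two", "2").equals("one\r\n2\r\n"));
    }
}
```

Calling applyLineReplacement() directly on the CR-LF text fails in this sketch, because the line on disk is "two\r" while the patch expects "two"; that is the kind of mismatch the commit addresses.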
Add tests for various combinations of line endings in the file and in
the patch, and a test to verify the clean/smudge handling.
See also [1].
Running the file through clean/smudge may give strange results with
LFS-managed files. JGit's DiffFormatter has some extra code and
applies the smudge filter again after having run the file through
the check-in filters (CR-LF and clean). So JGit can actually produce
a diff on LFS-managed files using the normal diff machinery. (If it
doesn't run out of memory, that is. After all, LFS is intended for
_large_ files.) How such a diff would be applied with either C git
or JGit is entirely unclear; neither has any code for this special
case. Compare also [2].
Note that C git just doesn't know about LFS and always diffs after
the check-in filter chain, so for LFS files, it'll produce a diff
of the LFS pointers.
Thomas Wolf [Sat, 15 May 2021 16:13:04 +0000 (18:13 +0200)]
SSH config: fix negated patterns
Negated patterns were handled incorrectly. According to the OpenBSD
ssh_config man page,[1] a negated pattern never produces a positive
match by itself. Negated patterns only make sense alongside positive
patterns; the negated patterns then define exceptions to the positive
ones.
OpenSshConfigFile got this wrong. It treated "!foo" as "matches
everything but foo", but the actual semantics are: if the input is
"foo", this entry doesn't apply; if the input is anything else, the
other patterns determine whether the entry applies.
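The semantics described above can be sketched in a few lines. This is a hedged illustration, not the OpenSshConfigFile code: the class and method names are hypothetical, and only the '*' and '?' wildcards are handled.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative sketch only; these names are not part of JGit. */
final class HostPatternSketch {

    /** Translate an ssh_config-style glob ('*' and '?') into a regex. */
    static Pattern glob(String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /**
     * An entry applies if at least one positive pattern matches and no
     * negated pattern matches. Negated patterns alone never match.
     */
    static boolean entryApplies(List<String> patterns, String host) {
        boolean positiveMatch = false;
        for (String p : patterns) {
            if (p.startsWith("!")) {
                if (glob(p.substring(1)).matcher(host).matches()) {
                    return false; // explicit exception: entry does not apply
                }
            } else if (glob(p).matcher(host).matches()) {
                positiveMatch = true;
            }
        }
        return positiveMatch;
    }

    public static void main(String[] args) {
        List<String> patterns = List.of("*.example.com", "!dev.example.com");
        System.out.println(entryApplies(patterns, "www.example.com"));
        System.out.println(entryApplies(patterns, "dev.example.com"));
    }
}
```

With these rules, an entry whose only pattern is "!foo" applies to no host at all, which is the behaviour the man page describes.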
[1] https://man.openbsd.org/ssh_config
Change-Id: I50f6e46581b7ece4c949eddf62f4a265573ec29e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Mon, 10 May 2021 22:56:57 +0000 (00:56 +0200)]
Merge branch 'stable-5.9' into stable-5.10
* stable-5.9:
LockFile: create OutputStream only when needed
Remove ReftableNumbersNotIncreasingException
Fix stamping to produce stable file timestamps
Thomas Wolf [Tue, 4 May 2021 21:48:56 +0000 (23:48 +0200)]
LockFile: create OutputStream only when needed
Don't create the stream eagerly in lock(); that may cause JGit to
exceed OS or JVM limits on open file descriptors if many locks need
to be created, for instance when creating many refs. Instead create
the output stream only when one really needs to write something.
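A minimal sketch of the idea, assuming a hypothetical StreamFactory callback that opens the real stream; it is not the LockFile code, only an illustration of deferring the open until the first write:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Illustrative sketch only; these names are not part of JGit. */
final class LazyStreamSketch {

    /** Hypothetical callback that opens the underlying stream on demand. */
    interface StreamFactory {
        OutputStream open() throws IOException;
    }

    static final class LazyOutputStream extends OutputStream {
        private final StreamFactory factory;
        private OutputStream delegate; // stays null until the first write

        LazyOutputStream(StreamFactory factory) {
            this.factory = factory;
        }

        boolean isOpen() {
            return delegate != null;
        }

        private OutputStream stream() throws IOException {
            if (delegate == null) {
                delegate = factory.open(); // descriptor is acquired only here
            }
            return delegate;
        }

        @Override
        public void write(int b) throws IOException {
            stream().write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            stream().write(b, off, len);
        }

        @Override
        public void flush() throws IOException {
            if (delegate != null) {
                delegate.flush();
            }
        }

        @Override
        public void close() throws IOException {
            if (delegate != null) {
                delegate.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        LazyOutputStream out = new LazyOutputStream(() -> sink);
        System.out.println(out.isOpen());
        out.write(42);
        System.out.println(out.isOpen());
    }
}
```

In a real lock implementation the factory would open a file stream on the lock file; a lock that is created and released without writing would then never consume a file descriptor for the stream.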
Bug: 573328
Change-Id: If9441ed40494d46f594a896d34a5c4f56f91ebf4
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Mon, 12 Apr 2021 21:50:54 +0000 (23:50 +0200)]
Implement ours/theirs content conflict resolution
Git has different conflict resolution strategies:
* There is a tree merge strategy "ours" which just ignores any changes
from theirs ("-s ours"). JGit also has the mirror strategy "theirs"
ignoring any changes from "ours". (This doesn't exist in C git.)
Adapt StashApplyCommand and CherrypickCommand to be able to use those
tree merge strategies.
* For the resolve/recursive tree merge strategies, there are content
conflict resolution strategies "ours" and "theirs", which resolve
any conflict hunks by taking the "ours" or "theirs" hunk. In C git
those correspond to "-Xours" or "-Xtheirs". Implement that in
MergeAlgorithm, and add API to set and pass through such a strategy
for resolving content conflicts.
* The "ours/theirs" content conflict resolution strategies also apply
for binary files. Handle these cases in ResolveMerger.
Note that the content conflict resolution strategies ("-X ours/theirs")
do _not_ apply to modify/delete or delete/modify conflicts. Such
conflicts are always reported as conflicts by C git. They do apply,
however, if one side completely clears a file's content.
Bug: 501111
Change-Id: I2c9c170c61c440a2ab9c387991e7a0c3ab960e07
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Allow file mode conflicts in virtual base commit on recursive merge.
Similar to https://git.eclipse.org/r/c/jgit/jgit/+/175166, ignore
paths that have conflicts on attributes, so that the virtual base can
be used by RecursiveMerger.
Change-Id: I99c95445a305558d55bbb9c9e97446caaf61c154
Signed-off-by: Marija Savtchouk <mariasavtchouk@google.com>
Thomas Wolf [Mon, 22 Mar 2021 11:20:52 +0000 (12:20 +0100)]
sshd: don't lock the known_hosts files on reading
Similar to git config file reading, lock the file only when writing.
There may still be lock conflicts on writing, but in the worst case
those result in an entry not being added, so that the user is asked
about it again later.
Because the OpenSshServerKeyDatabase and its HostKeyFiles may be (and
usually are) shared between different SSH sessions, we still need to
ensure in-process mutual exclusion.
Bug: 559548
Change-Id: I4af97628deff9eaac2520576917c856949f2680d
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sat, 20 Mar 2021 17:54:17 +0000 (18:54 +0100)]
Allow info messages in UsernamePasswordCredentialsProvider
o.e.j.ssh.apache produces passphrase prompts containing
InformationalMessage items to show the fingerprint of the key
the passphrase is being asked for. Allow this so that the credentials
provider can be used with o.e.j.ssh.apache.
Change-Id: Ibc2ffd3a987d3118952726091b9b80442972dfd8
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Apache MINA sshd has an implementation of this, but it doesn't comply
with RFC 8308 [1] and it is buggy. (See SSHD-1141 [2].)
Add a simpler KexExtensionHandler and if the server sends extension
server-sig-algs, use its value to re-order the chosen signature
algorithms such that the algorithms the server announced as supported
are at the front.
If the server didn't tell us anything, don't do anything. RFC 8308
suggests for RSA to default to ssh-rsa, but says once rsa-sha2-* was
"widely enough" adopted, defaulting to that might be OK.
Currently we seem to be in a transition phase; Fedora 33 has already
disabled ssh-rsa by default, and openssh is about to do so. Whatever
we might do without info from the server, it'd be good for some servers
and bad for others. So don't do anything and let the user re-order via
ssh config PubkeyAcceptedAlgorithms on a case-by-case basis.
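The re-ordering step can be sketched as follows. This is an illustration under assumptions, not the KexExtensionHandler code: the names are hypothetical, and keeping the client's relative order within each group is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these names are not part of JGit. */
final class SigAlgOrderSketch {

    /**
     * Move the algorithms announced via server-sig-algs to the front.
     * If the server announced nothing, the configured order is kept.
     */
    static List<String> reorder(List<String> configured, List<String> serverSigAlgs) {
        if (serverSigAlgs == null || serverSigAlgs.isEmpty()) {
            return new ArrayList<>(configured); // no information: do nothing
        }
        List<String> supported = new ArrayList<>();
        List<String> others = new ArrayList<>();
        for (String algorithm : configured) {
            if (serverSigAlgs.contains(algorithm)) {
                supported.add(algorithm);
            } else {
                others.add(algorithm);
            }
        }
        supported.addAll(others);
        return supported;
    }

    public static void main(String[] args) {
        List<String> configured = List.of("rsa-sha2-512", "rsa-sha2-256", "ssh-rsa");
        System.out.println(reorder(configured, List.of("ssh-rsa", "ssh-ed25519")));
    }
}
```

Algorithms the server did not announce stay in the list, only later, so they can still be tried.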
Matthias Sohn [Fri, 26 Mar 2021 08:55:58 +0000 (09:55 +0100)]
Merge branch 'stable-5.11'
* stable-5.11:
Refactor CommitCommand to improve readability
CommitCommand: fix formatting
CommitCommand: remove unncessary comment
Ensure post-commit hook is called after index lock was released
sshd: try all configured signature algorithms for a key
sshd: modernize ssh config file parsing
sshd: implement ssh config PubkeyAcceptedAlgorithms
Change-Id: Ic3235ffd84c9d7537a1fe5ff4f216578e6e26724
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Thomas Wolf [Fri, 19 Mar 2021 08:35:34 +0000 (09:35 +0100)]
sshd: try all configured signature algorithms for a key
For RSA keys, there may be several configured signature algorithms:
rsa-sha2-512, rsa-sha2-256, and ssh-rsa. Upstream sshd has bug
SSHD-1105 [1] and always and unconditionally uses only the first
configured algorithm. With the default order, this means that it cannot
connect to a server that knows only ssh-rsa, like for instance Apache
MINA sshd servers older than 2.6.0.
This affects, for instance, bitbucket.org and AWS CodeCommit.
Re-introduce our own pubkey authenticator that fixes this.
Note that a server may impose a penalty (back-off delay) for subsequent
authentication attempts with signature algorithms unknown to the server.
In such cases, users can re-order the signature algorithm list via the
PubkeyAcceptedAlgorithms (formerly PubkeyAcceptedKeyTypes) ssh config.
Apache MINA sshd 2.6.0 appears to use only the first appropriate
public key signature algorithm for a particular key. See [1]. For
RSA keys, that is rsa-sha2-512. This breaks authentication at servers
that only know the older (and deprecated) ssh-rsa algorithm.
With PubkeyAcceptedAlgorithms, users can re-order algorithms in
the ssh config file per host, if needed. Setting
    PubkeyAcceptedAlgorithms ^ssh-rsa
will put "ssh-rsa" at the front of the list of algorithms, and then
authentication at such servers with RSA keys works again.
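The effect of the '^' prefix can be sketched like this. It is a simplified illustration with hypothetical names: only the '^' form and plain replacement lists are handled, while the '+' and '-' prefixes and wildcards that OpenSSH also supports are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; these names are not part of JGit. */
final class PubkeyAlgorithmsSketch {

    /**
     * Apply a PubkeyAcceptedAlgorithms value to the default list.
     * "^a,b" puts a and b at the head of the defaults; a plain list
     * replaces the defaults. Other prefixes are not handled here.
     */
    static List<String> apply(List<String> defaults, String configValue) {
        if (configValue.startsWith("^")) {
            Set<String> ordered =
                    new LinkedHashSet<>(Arrays.asList(configValue.substring(1).split(",")));
            ordered.addAll(defaults); // names already present keep their front position
            return new ArrayList<>(ordered);
        }
        return new ArrayList<>(Arrays.asList(configValue.split(",")));
    }

    public static void main(String[] args) {
        List<String> defaults = List.of("rsa-sha2-512", "rsa-sha2-256", "ssh-rsa");
        System.out.println(apply(defaults, "^ssh-rsa"));
    }
}
```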
Adithya Chakilam [Tue, 23 Feb 2021 19:58:03 +0000 (13:58 -0600)]
Optimize RevWalkUtils.findBranchesReachableFrom()
In [1], improved RevWalk.getMergedInto() is introduced to avoid repeated
work while performing RevWalk.isMergedInto() on many refs. Modify
findBranchesReachableFrom() to use it.
In cases where we need to determine if a given commit is merged
into many refs, using isMergedInto(base, tip) for each ref would
cause multiple unwanted walks.
getMergedInto() marks the unreachable commits as uninteresting
which would then avoid walking that same path again.
Using the same API, also introduce isMergedIntoAny() and
isMergedIntoAll().
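The benefit of sharing state across many refs can be sketched on a toy commit graph. This is a simplified, memoized reachability check and not the RevWalk implementation: commits are plain strings, the parent map is supplied by the caller, and the method signatures are hypothetical even though the names echo the ones above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the RevWalk API. */
final class MergedIntoSketch {
    private final Map<String, List<String>> parents;
    private final String target;
    // Shared across all tips: commits already known to reach (or not reach) the target.
    private final Map<String, Boolean> reachesTarget = new HashMap<>();

    MergedIntoSketch(Map<String, List<String>> parents, String target) {
        this.parents = parents;
        this.target = target;
    }

    /** True if the target commit is reachable from tip. Recursive, so only for small graphs. */
    boolean isMergedInto(String tip) {
        if (tip.equals(target)) {
            return true;
        }
        Boolean known = reachesTarget.get(tip);
        if (known != null) {
            return known; // this path was walked before; do not walk it again
        }
        boolean result = false;
        for (String parent : parents.getOrDefault(tip, List.of())) {
            if (isMergedInto(parent)) {
                result = true;
                break;
            }
        }
        reachesTarget.put(tip, result);
        return result;
    }

    List<String> getMergedInto(List<String> tips) {
        List<String> merged = new ArrayList<>();
        for (String tip : tips) {
            if (isMergedInto(tip)) {
                merged.add(tip);
            }
        }
        return merged;
    }

    boolean isMergedIntoAny(List<String> tips) {
        return tips.stream().anyMatch(this::isMergedInto);
    }

    boolean isMergedIntoAll(List<String> tips) {
        return tips.stream().allMatch(this::isMergedInto);
    }
}
```

Because the memo is kept per instance, checking a second ref that shares history with the first one stops as soon as it hits a commit whose answer is already known.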
Change-Id: I65de9873dce67af9c415d1d236bf52d31b67e8fe
Signed-off-by: Adithya Chakilam <quic_achakila@quicinc.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
There are two code paths for detecting renames: one on tree diffs
(using DiffFormatter#scan) and the other on single file diffs (using
DiffFormatter#format). The latter skips binary and large files
during rename detection (see [1]), but the former doesn't.
This change skips content rename detection for the tree diffs case for
large files. This is essential to avoid expensive computations while
reading the file, especially for callers who don't want to pay that
cost. Content renames are those which involve files with slightly
modified content. Exact renames will still be identified.
The default threshold for file sizes is reused from
PackConfig.DEFAULT_BIG_FILE_THRESHOLD: 50 MB.
Thomas Wolf [Tue, 9 Mar 2021 21:23:14 +0000 (22:23 +0100)]
HTTP cookies: do tilde expansion on http.cookieFile
Git config http.cookieFile must have ~ expansion, compare [1].
It also should be an absolute path. While a relative path is allowed,
C git just passes the value on to libcurl, so it'll be relative to the
current working directory and thus not work in all directories.
Log a warning if the path is relative.
(Alternatives would be to throw an exception, or to resolve the path
relative to the .git directory, or relative to the working tree root,
or relative to the config file it occurs in. But C git does not seem
to do any of these.)
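A minimal sketch of the two checks, assuming hypothetical helper names and handling only the "~" and "~/" forms (not "~user/"); it is not the code added by this change:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch only; these names are not part of JGit. */
final class CookieFilePathSketch {

    /** Expand a leading "~" or "~/" against the given home directory. */
    static Path expandTilde(String value, Path home) {
        if (value.equals("~")) {
            return home;
        }
        if (value.startsWith("~/")) {
            return home.resolve(value.substring(2));
        }
        return Paths.get(value);
    }

    /** A relative path depends on the current working directory, so warn. */
    static boolean needsWarning(Path cookieFile) {
        return !cookieFile.isAbsolute();
    }

    public static void main(String[] args) {
        Path home = Paths.get(System.getProperty("user.home"));
        Path cookieFile = expandTilde("~/.gitcookies", home);
        System.out.println(cookieFile + " warn=" + needsWarning(cookieFile));
    }
}
```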
Matthias Sohn [Tue, 9 Mar 2021 17:00:55 +0000 (18:00 +0100)]
Merge branch 'master' into stable-5.11
* master:
Manually set status of jmh dependencies
Update DEPENDENCIES report for 5.11.0
Add dependency to dash-licenses
PackFile: Add id + ext based constructors
GC: deleteOrphans: Use PackFile
PackExt: Convert to Enum
Restore preserved packs during missing object seeks
Pack: Replace extensions bitset with bitmapIdx PackFile
PackDirectory: Use PackFile to ensure we find preserved packs
GC: Use PackFile to de-dup logic
Create a PackFile class for Pack filenames
Matthias Sohn [Sun, 7 Mar 2021 17:41:05 +0000 (18:41 +0100)]
Manually set status of jmh dependencies
The following jmh dependencies were approved as works-with:
- jmh-core/1.21 has GPL-2.0 license and was approved in CQ20517
- jmh-generator-annprocess/1.21 has GPL-2.0 license and was approved in
CQ20518
Nasser Grainawi [Thu, 4 Mar 2021 21:14:43 +0000 (14:14 -0700)]
PackFile: Add id + ext based constructors
Add new constructors to PackFile to improve a common use case where
callers know the directory, id, and extension, but previously needed to
construct a valid file name (with prefix, '.', etc) to create a
PackFile. Most callers can use the variant that has id as an ObjectId,
but provide an id as String variant too.
Martin Fick [Tue, 15 Dec 2020 21:20:44 +0000 (14:20 -0700)]
Restore preserved packs during missing object seeks
Provide a recovery path for objects being referenced during the pack
pruning race. Due to the pack pruning race, it is possible for objects
to become referenced after a pack has been deemed safe to prune, but
before it actually gets pruned. If this happened previously, the newly
referenced objects would be missing and potentially result in a
corrupted ref.
Add the ability to recover from this situation when an object is missing
but happens to still be available in a pack in the "preserved"
directory. This is likely only useful when used in conjunction with the
--preserve-old-packs GC option, which prunes packs by hard-linking to
the preserved directory. If an object is missing and found in a pack in
the preserved directory, immediately recover that pack and its
associated files (idx, bitmaps...) by moving them back to the original
pack directory, and then retry the operation that would have failed due
to the missing object. This retry can now succeed and the repository
may avoid corruption. This approach should drastically reduce the
chance of a corrupt repository during pack pruning at very little extra
cost. This extra cost should only be incurred when objects are missing
and a failure would normally occur.
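The "move the pack and its associated files back" step can be sketched with plain file operations. This is an illustration under assumptions, not the JGit code: the helper name is hypothetical, preserved files are assumed to live in a "preserved" subdirectory of the pack directory and to carry an "old-" prefix in their extension, and the retry of the failed operation is left to the caller.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these names are not part of JGit. */
final class PreservedPackSketch {

    /**
     * Move all preserved files of one pack (for example pack-abc.old-pack
     * and pack-abc.old-idx) back into the pack directory under their
     * original names. Returns the number of files restored.
     */
    static int restore(Path packDir, String packName) throws IOException {
        Path preserved = packDir.resolve("preserved");
        if (!Files.isDirectory(preserved)) {
            return 0;
        }
        // Collect first, then move, so the directory is not modified while listing it.
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> files =
                Files.newDirectoryStream(preserved, packName + ".*")) {
            files.forEach(candidates::add);
        }
        for (Path file : candidates) {
            String ext = file.getFileName().toString().substring(packName.length() + 1);
            if (ext.startsWith("old-")) {
                ext = ext.substring("old-".length());
            }
            Files.move(file, packDir.resolve(packName + "." + ext));
        }
        return candidates.size();
    }
}
```

A caller would invoke restore() only after an object lookup has failed and the object was found in a preserved pack, and would then retry the lookup once.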
Change-Id: I2a704e3276b88cc892159d9bfe2455c6eec64252
Signed-off-by: Martin Fick <quic_mfick@quicinc.com>
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
Nasser Grainawi [Thu, 11 Feb 2021 06:26:17 +0000 (23:26 -0700)]
Pack: Replace extensions bitset with bitmapIdx PackFile
The only extension that was ever consulted from the bitmap was the
bitmap index. We can simplify the Pack code as well as the code of
all the callers if we focus on just that usage.
Nasser Grainawi [Thu, 11 Feb 2021 06:33:43 +0000 (23:33 -0700)]
PackDirectory: Use PackFile to ensure we find preserved packs
Update scanPacksImpl and listPackDirectory (renamed to
getPackFilesByExtById) to use the new PackFile functionality to
validate file names and complete pack file sets (.pack, .idx, etc).
Most importantly, this allows a later change to rely on scanPacks() to
complete a packList that contains packs with the 'old-' prefix in their
extension.
This also eliminates duplication of logic for how to identify and
construct pack files.
Nasser Grainawi [Thu, 11 Feb 2021 05:51:05 +0000 (22:51 -0700)]
Create a PackFile class for Pack filenames
The PackFile class is intended to be a central place to do all
common pack filename manipulation and parsing to help reduce repeated
code and bugs. Use the PackFile class in the Pack class and in many
tests to ensure it works well in a variety of situations. Later changes
will expand use of PackFiles to even more areas.
Change-Id: I921b30f865759162bae46ddd2c6d669de06add4a
Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Thomas Wolf [Mon, 1 Mar 2021 07:30:09 +0000 (08:30 +0100)]
HTTP: cookie file stores expiration in seconds
A cookie file stores the expiration in seconds since the Unix epoch,
not in milliseconds. Correct reading and writing of cookie files, with
a backwards-compatibility hack to read files that contain a millisecond
timestamp.
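One way to read both formats is a magnitude check, sketched below. The names and the threshold are assumptions of this sketch and may differ from what the change actually does: any value of at least 10^11 is treated as milliseconds, since 10^11 seconds lies thousands of years in the future while 10^11 milliseconds is in 1973.

```java
import java.time.Instant;

/** Illustrative sketch only; names and threshold are assumptions. */
final class CookieExpirySketch {

    // Assumed heuristic: values this large cannot be plausible second counts.
    static final long MILLIS_THRESHOLD = 100_000_000_000L;

    /** Parse an expiration field that is normally in seconds since the epoch. */
    static Instant parseExpiration(String field) {
        long value = Long.parseLong(field.trim());
        if (value >= MILLIS_THRESHOLD) {
            return Instant.ofEpochMilli(value); // legacy file written in milliseconds
        }
        return Instant.ofEpochSecond(value);
    }

    /** Always write seconds. */
    static String formatExpiration(Instant expires) {
        return Long.toString(expires.getEpochSecond());
    }

    public static void main(String[] args) {
        System.out.println(parseExpiration("1614582609"));
        System.out.println(parseExpiration("1614582609000"));
    }
}
```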
Add a test, and fix tests not to rely on the actual current time so
that they will also run successfully after 2030-01-01 noon.
Bug: 571574
Change-Id: If3ba68391e574520701cdee119544eedc42a1ff2
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Fri, 29 Jan 2021 22:03:44 +0000 (23:03 +0100)]
LFS: handle invalid pointers better
Make sure that SmudgeFilter calls LfsPointer.parseLfsPointer() with
a stream that supports mark/reset, and make sure that parseLfsPointer()
resets the stream properly if it decides that the stream content is not
an LFS pointer.
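The mark/reset handling can be sketched as follows. This is a hedged illustration, not SmudgeFilter or LfsPointer code: the names are hypothetical, the check only looks at the leading version line of the pointer format, and readNBytes(int) requires Java 11 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative sketch only; these names are not part of JGit. */
final class LfsPointerPeekSketch {

    private static final byte[] VERSION_LINE =
            "version https://git-lfs.github.com/spec/v1".getBytes(StandardCharsets.US_ASCII);

    /** Ensure the stream supports mark/reset, wrapping it if necessary. */
    static InputStream markable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /**
     * Peek at the start of the stream. If it does not look like a pointer,
     * reset the stream so the caller can still read the full content.
     */
    static boolean startsWithPointerHeader(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(VERSION_LINE.length);
        byte[] head = in.readNBytes(VERSION_LINE.length);
        boolean pointer = Arrays.equals(head, VERSION_LINE);
        if (!pointer) {
            in.reset(); // give the consumed bytes back
        }
        return pointer;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "just some text".getBytes(StandardCharsets.US_ASCII);
        InputStream in = markable(new ByteArrayInputStream(data));
        System.out.println(startsWithPointerHeader(in));
        System.out.println(Arrays.equals(in.readAllBytes(), data));
    }
}
```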
Add a test.
Bug: 570758
Change-Id: I2593d67cff31b2dfdfaaa48e437331f0ed877915
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
In a distributed setting, one can have multiple datacenters use
reftables for serving, while the ground truth for the Ref database is
administered centrally. In this setting, replication delays combined
with compaction can cause update-index ranges to overlap.
Such a setting is used at Google, and the JGit code already handles
this correctly (modulo a bugfix that was applied in change I8f8215b99a).
Remove the restriction that was applied at FileReftableDatabase.
Matthias Sohn [Sun, 27 Dec 2020 01:11:47 +0000 (02:11 +0100)]
Fix errorprone configuration for maven-compiler-plugin with javac
See https://errorprone.info/docs/installation.
Add a new profile jdk8 to enable running errorprone with javac on Java 8
and Java 11. Remove the errorprone configuration from the benchmark
module; I didn't find a way to make it work, and this module does not
contain any production code.
Change-Id: I6a84195af05e6cea9e7c04ad5cd4c79742e80cb3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Wed, 24 Feb 2021 13:54:52 +0000 (14:54 +0100)]
Merge branch 'master' into stable-5.11
* master: (35 commits)
[releng] japicmp: update last release version
IgnoreNode: include path to file for invalid .gitignore patterns
FastIgnoreRule: include bad pattern in log message
init: add config option to set default for the initial branch name
init: allow specifying the initial branch name for the new repository
Fail clone if initial branch doesn't exist in remote repository
GPG: fix reading unprotected old-format secret keys
Update Orbit to S20210216215844
Add missing bazel dependency for o.e.j.gpg.bc.test
GPG: handle extended private key format
dfs: handle short copies
[GPG] Provide a factory for the BouncyCastleGpgSigner
Fix boxing warnings
GPG: compute the keygrip to find a secret key
GPG signature verification via BouncyCastle
Post commit hook failure should not cause commit failure
Allow to define additional Hook classes outside JGit
GitHook: use default charset for output and error streams
GitHook: use generic OutputStream instead of PrintStream
Update jetty to 9.4.36.v20210114
...
Thomas Wolf [Tue, 23 Feb 2021 17:10:08 +0000 (18:10 +0100)]
IgnoreNode: include path to file for invalid .gitignore patterns
Include the full file path of the .gitignore file and the line number
of the invalid pattern. Also include the pattern itself.
.gitignore files inside the repository are reported with their
repository-relative path; files outside (from git config
core.excludesFile or .git/info/exclude) are reported with their
full absolute path.
Bug: 571143
Change-Id: Ibe5969679bc22cff923c62e3ab9801d90d6d06d1
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Tue, 23 Feb 2021 12:11:56 +0000 (13:11 +0100)]
FastIgnoreRule: include bad pattern in log message
When a .gitignore pattern cannot be parsed, include the pattern in the
log message. Just reporting "not closed bracket" isn't helpful if the
user doesn't know in which pattern the problem occurred.
Even better would be to include the full path of the .gitignore file
that contained the offending pattern. This is not implemented in this
change; it may need new API and needs more thought.
Bug: 571143
Change-Id: Id5b16d9cf550544ba3ad409a02041946fa8516ab
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Mon, 25 Jan 2021 01:43:18 +0000 (02:43 +0100)]
init: add config option to set default for the initial branch name
We introduced the option --initial-branch=<branch-name> to allow
initializing a new repository with a different initial branch.
To allow users to override the initial branch name more permanently
(i.e. without having to specify the name manually for each 'git init'),
introduce the 'init.defaultBranch' option.
This option was added to git in 2.28.0.
See https://git-scm.com/docs/git-config#Documentation/git-config.txt-initdefaultBranch
Bug: 564794
Change-Id: I679b14057a54cd3d19e44460c4a5bd3a368ec848
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Mon, 25 Jan 2021 00:54:03 +0000 (01:54 +0100)]
init: allow specifying the initial branch name for the new repository
Add option --initial-branch/-b to InitCommand and the CLI init command.
This is the first step to implement support for the new option
init.defaultBranch. Both were added to git in release 2.28.
See https://git-scm.com/docs/git-init#Documentation/git-init.txt--bltbranch-namegt
Bug: 564794
Change-Id: Ia383b3f90b5549db80f99b2310450a7faf6bce4c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>