Matthias Sohn [Sun, 31 Oct 2021 23:53:10 +0000 (00:53 +0100)]
Make BinaryBlobException stackless
We use BinaryBlobException to signal a binary blob was found and never
make use of its stack trace. Suppress filling in the stack trace to
avoid the performance penalty coming with that.
See https://shipilev.net/blog/2014/exceptional-performance/
Thomas Wolf [Sat, 13 Nov 2021 17:10:13 +0000 (18:10 +0100)]
ssh: Handle "ProxyJump none" from SSH config file
Since OpenSSH 7.8, the ProxyJump directive accepts the value "none"[1]
to override and clear a setting that might otherwise be contributed by
another (wildcard) host entry.
Thomas Wolf [Sat, 13 Nov 2021 12:09:58 +0000 (13:09 +0100)]
ssh: use a single SecureRandom instance for hashing hostnames
According to Spotbugs, that's better practice. It's questionable
whether it makes a big difference, though, especially since the
hash is the cryptographically weak SHA1.
Change-Id: Id293de2bad809d9cc19230bd720184786dc6c226 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sat, 13 Nov 2021 12:09:01 +0000 (13:09 +0100)]
OpenSshConfigFile: line comments and quoted strings
Bring our SSH config parser up-to-date with respect to changes in
OpenSSH. In particular, they fixed[1] the handling of line comments
such that #-characters inside strings are not considered. This means
that we have to parse strings with escaped quotes correctly.
Thomas Wolf [Wed, 27 Oct 2021 12:03:10 +0000 (14:03 +0200)]
[sshd agent] Introduce ConnectorDescriptor
Once a factory supports different SSH agents on the same platform,
which is planned for Windows once we use Apache MINA sshd 2.8.0,
client code may need to have a way to specify which SSH agent shall
be used when the SSH config doesn't define anything.
Add a mechanism by which a ConnectorFactory can tell what Connectors
it may provide. Client code can use this to set the identityAgent
parameter of ConnectorFactory.create() to the wanted default if it
would be null otherwise.
A ConnectorDescriptor is a pair of strings: an internal name, and a
display name. The latter is included because client code might want to
communicate agent names to the user, be it in error messages or in some
chooser dialog where a user could define which of several alternative
SSH agents should be used as default. The internal name is intended to
be used in the IdentityAgent directive in ~/.ssh/config.
Also make the ConnectorFactory discovered via the ServiceLoader
accessible and overrideable. Provide static get/setDefault() methods,
similar to the SshSessionFactory itself.
Change-Id: Ie3d077395d32dfddc72bc8627e92b23636938182 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Wed, 10 Nov 2021 13:51:29 +0000 (14:51 +0100)]
Update Orbit to S20211108222137
and update dependencies:
- com.google.gson to 2.8.8.v20211029-0838
- com.googlecode.javaewah to 1.1.13.v20211029-0839
- net.i2p.crypto.eddsa to 0.3.0.v20210923-1401
- org.apache.ant to 1.10.12.v20211102-1452
- org.apache.commons.compress to 1.21.0.v20211103-2100
- org.bouncycastle.bcprov to 1.69.0.v20210923-1401
- org.junit to 4.13.2.v20211018-1956
Thomas Wolf [Sun, 31 Oct 2021 16:29:04 +0000 (17:29 +0100)]
Simplify SshdFtpChannel
Apache MINA sshd has simpler API for reading directories, and it has a
functional interface suitable for us. So no need to use our own
interface, or to deal with low-level abstractions like CloseableHandle.
Change-Id: Ic125c587535670504983f157a696b41ed6a76bb7 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Wed, 13 Oct 2021 20:21:52 +0000 (22:21 +0200)]
[test] test OpenSshConfigFile directly, not via the JSch config
This is a prerequisite for removing the JSch support bundle; otherwise
OpenSshConfigFile would be left without tests.
Copy OpenSshConfigTest from the JSch support bundle and adapt all tests
to perform the equivalent checks on OpenSshConfigFile directly. Add a
new lookupDefault() method to the SshConfigStore interface and implement
it so that it behaves the same and the tests work identically.
Thomas Wolf [Tue, 2 Nov 2021 17:48:25 +0000 (18:48 +0100)]
sshd: add support for ssh-agent
Add a simple SSH agent connector using JNA. Include com.sum.jna and
com.sun.jna.platform in the target platform.
JNA is used to communicate through Unix domain sockets with ssh-agent,
and if on Windows, to communicate via shared memory with Pageant.
The new bundle o.e.j.ssh.apache.agent is an OSGi fragment so that
the java.util.ServiceLoader can find the provided factory without
further ado in OSGi environments.
Adapt both maven and bazel builds to include the new bundle.
Manually tested on OS X, CentOS 7, and Win10 with Pageant 0.76. Tested
by installing JGit built from this change into freshly downloaded
Eclipse 2021-12 M1, and then doing git fetches via SSH with different
~/.ssh/config settings (explicit IdentityFile, without any but a key in
the agent, with no keys and a key in the agent and IdentitiesOnly=yes
(must fail)).
Bug: 541274
Bug: 541275
Change-Id: I34e85467293707dbad1eb44d1f40fc2e70ba3622 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Wed, 20 Oct 2021 07:51:43 +0000 (09:51 +0200)]
sshd: prepare for using an SSH agent
Add interfaces Connector and ConnectorFactory. A "connector" is just
something that knows how to connect to an ssh-agent and then can make
simple synchronous RPC-style requests (request-reply).
Add a way to customize an SshdSessionFactory with a ConnectorFactory.
Provide a default setup using the Java ServiceLoader mechanism to
discover an ConnectorFactory.
Implement an SshAgentClient in the internal part. Unfortunately we
cannot re-use the implementation in Apache MINA sshd: it's hard-wired
to Apache Tomcat APR, and it's also buggy.
No behavior changes yet since there is nothing that would provide an
actual ConnectorFactory. So for Apache MINA sshd, the SshAgentFactory
remains null as before.
Change-Id: I963a3d181357df2bdb66298bc702f2b9a6607a30 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sun, 31 Oct 2021 13:57:30 +0000 (14:57 +0100)]
[doc] Add README and package-info to the SSH bundles
Explain in the JSch bundle that it is essentially unmaintained. Add
descriptions in both bundles explaining how to use it, or how to use
an alternate implementation.
Change-Id: Idaf46c33b14543279f78a55cb7c6bd42b06ee6b8 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sat, 30 Oct 2021 23:35:52 +0000 (01:35 +0200)]
Binary and CR-LF detection: lone CRs -> binary
C git considers not only files containing NUL bytes as binary but also
files containing lone CRs. Implement this also for JGit.
C git additionally counts printable vs. non-printable characters and
considers files that have non_printable_count > printable_count / 128
also as binary. This is not implemented because such counting probably
only makes sense if one looks at the full file or blob content. The
Auto[CR]LF* streams in JGit look only at the first few KiB of a stream
in order not to buffer too much.
Thomas Wolf [Sat, 30 Oct 2021 17:37:44 +0000 (19:37 +0200)]
Factor out parsing git-style size numbers to StringUtils
Move the code to parse numbers with an optional 'k', 'm', or 'g' suffix
from the config file handling to StringUtils. This enables me to re-use
it in EGit, which has duplicate code in StorageSizeFieldEditor.
As this is generally useful functionality, providing it in the library
makes sense.
Change-Id: I86e4f5f62e14f99b35726b198ba3bbf1669418d9 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Tue, 12 Oct 2021 22:28:38 +0000 (00:28 +0200)]
Make the buffer size for text/binary detection configurable
The various streams used in JGit for text/binary and CR-LF detection
used different buffer sizes. Most used 8000, but one used 8KiB, and one
used 8096 (SIC!) bytes.
Considering only the first 8kB of a file/blob is not sufficient; it
may give behavior incompatible with C git. C git considers the whole
blob; since it uses memory-mapped files it can do so with acceptable
performance. Doing this in JGit would most likely incur a noticeable
performance penalty. But 8kB is a bit small; in the file in bug 576971
the limit was hit before the first CR-LF, which occurred on line 155
at offset 9759 in the file.
Make RawText.FIRST_FEW_BYTES only a default and minimum setting, and
set it to 8KiB. Make the actual buffer size configurable: provide
static methods getBufferSize() and setBuffersize(), and use
getBufferSize() throughout instead of the constant.
This enables users of the JGit library to set their own possibly larger
buffer size.
Bug: 576971
Change-Id: I447762c9a5147a521f73d2864ba59ed89f555d54 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Wed, 27 Oct 2021 12:48:15 +0000 (14:48 +0200)]
Merge branch 'master' into stable-6.0
* master:
Fix checkout of files with mixed line endings on text=auto eol=crlf
Don't rely on an implicit default character set
Fix bad indentation in pom.xml
Minor code-clean-up in OpenSshConfigFile
Remove use of deprecated getAllRefs() in UploadPack
DFS block cache: fix lock issue and support parallel index loading
JSch: fix service publication for ServiceLoader
Set JSch global config values only if not set already
Fix missing peel-part in lsRefsV2 for loose annotated tags
DFS block cache: allow multiple passes for blocks before eviction
Fix RevWalk.getMergedInto() ignores annotated tags
Optimize RevWalk.getMergedInto()
GarbageCollectCommand: add numberOfBitmaps to statistics
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Add org.bouncycastle.bcutil to p2 repository
Thomas Wolf [Tue, 12 Oct 2021 20:14:33 +0000 (22:14 +0200)]
Fix checkout of files with mixed line endings on text=auto eol=crlf
Add tests for files having been checked in with mixed LF and CR-LF
line endings, and then being checked out with text or text=auto and
eol=lf or eol=crlf. These test cases were missing, and thus commit efd1cc05 missed that AutoCRLFOutputStream needs the same detection
as AutoLFOutputStream.
Fix AutoCRLFOutputStream to not convert line endings if the blob in
the repository contains CR-LF.
Bug: 575393
Change-Id: Id0c7ae772e282097e95fddcd3f1f9d82aae31e43 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Fri, 3 Sep 2021 17:49:27 +0000 (19:49 +0200)]
Don't rely on an implicit default character set
JEP 400 (Java 18) will change the default character set to UTF-8
unconditionally.[1] Introduce SystemReader.getDefaultCharset() that
provides the locale-dependent charset the way JEP 400 recommends.
Change all code locations using Charset.defaultCharset() to use the
new SystemReader method instead.
[1] https://openjdk.java.net/jeps/400
Change-Id: I986f97a410d2fc70748b6f93228a2d45ff100b2c Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
DFS block cache: fix lock issue and support parallel index loading
This change is a fix to http://git.eclipse.org/r/c/jgit/jgit/+/183562
that was reverted in http://git.eclipse.org/r/c/jgit/jgit/+/184978
due to deadlocks. Separate locks in DfsBlockFile are removed to rely
on getting value from DfsBlockCache with region locking in place.
With this change bitmap index creation is not blocked on index and
reverse index full initialization in DfsPackFile. Now bitmap index
and index could be read from storage in parallel in separate threads.
Matthias Sohn [Fri, 15 Oct 2021 21:10:12 +0000 (23:10 +0200)]
Merge branch 'stable-5.13'
* stable-5.13:
Fix missing peel-part in lsRefsV2 for loose annotated tags
Fix RevWalk.getMergedInto() ignores annotated tags
Optimize RevWalk.getMergedInto()
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Add org.bouncycastle.bcutil to p2 repository
Matthias Sohn [Fri, 15 Oct 2021 20:58:21 +0000 (22:58 +0200)]
Merge branch 'stable-5.12' into stable-5.13
* stable-5.12:
Fix missing peel-part in lsRefsV2 for loose annotated tags
Fix RevWalk.getMergedInto() ignores annotated tags
Optimize RevWalk.getMergedInto()
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Matthias Sohn [Fri, 15 Oct 2021 20:48:01 +0000 (22:48 +0200)]
Merge branch 'stable-5.11' into stable-5.12
* stable-5.11:
Fix missing peel-part in lsRefsV2 for loose annotated tags
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Matthias Sohn [Fri, 15 Oct 2021 20:45:18 +0000 (22:45 +0200)]
Merge branch 'stable-5.10' into stable-5.11
* stable-5.10:
Fix missing peel-part in lsRefsV2 for loose annotated tags
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Matthias Sohn [Fri, 15 Oct 2021 20:32:00 +0000 (22:32 +0200)]
Merge branch 'stable-5.9' into stable-5.10
* stable-5.9:
Fix missing peel-part in lsRefsV2 for loose annotated tags
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Unfortunately, the existing UploadPackTest didn't reveal this issue
although it provided the test case needed to do so: testV2LsRefsPeel.
This is because The UploadPackTest uses InMemoryRepository which
internally uses Dfs* implementations. The issue is only reproducible
when using the FileRepository.
It is a non-trivial task to refactor the UploadPackTest to work against
both InMemoryRepository and FileRepository and this change is not trying
to do that. This change creates a new test:
UploadPackLsRefsFileRepositoryTest and copies the necesssary code from
the UploadPackTest.
DFS block cache: allow multiple passes for blocks before eviction
Let certain pack extensions that are expensive to load from storage
(e.g. pack index, bitmap index) stay in DFS block cache longer than
others by overriding default cache count through DfsBlockCacheConfig
Don't change default behavior when cache override map is empty. Use int
cacheCount instead of boolean hot for Ref<T>
Find out which branches is 2 merged into:
After we calculated that master contains 2, we can mark 6 as TEMP_MARK
to avoid unwanted walks.
When we want to figure out if 2 is merge into the topic, the traversal
path would be [7, 6] instead of [7, 6, 5, 3, 2].
Test:
This change can significantly improve performance for tags.
On a copy of the Linux repository, the command 'git tag --contains
<commit>' had the following performance improvement:
The current version ignores tags, even though the tag is a type of
the ref.
Follow-up commits I'll fix it.
Change-Id: Ie6295ca4d16070499912af462239e679a97cce47 Signed-off-by: kylezhao <kylezhao@tencent.com> Reviewed-by: Christian Halstrick <christian.halstrick@sap.com> Reviewed-by: Martin Fick <mfick@codeaurora.org>
Matthias Sohn [Wed, 29 Sep 2021 00:10:34 +0000 (02:10 +0200)]
Delete old target platforms and corresponding Orbit releases
We updated the baseline target platform to jgit-4.17 corresponding to
Eclipse 4.17 (2020-09). Delete all target platforms older than this
version and the corresponding Orbit releases.
Matthias Sohn [Wed, 29 Sep 2021 00:03:02 +0000 (02:03 +0200)]
Update tycho to 2.5.0 and target platform to jgit-4.17
jgit-4.17 corresponds to Eclipse 2020-09, which is the first Eclipse
version that required Java 11, and which is JGit's new baseline as of
JGit 6.0.
Remove pack200 which is deprecated in Java 11 and isn't supported by
tycho 2.5.0 anymore. It was specified in JSR 200 (J2SE 1.5) [1],
deprecated in JEP 336 (Java SE 11) [2] and removed in JEP 367 (Java SE
14) [3].
Matthias Sohn [Tue, 21 Sep 2021 21:24:47 +0000 (23:24 +0200)]
Enable using JMH annotation processor on Java>=9
See
- https://issues.apache.org/jira/browse/MCOMPILER-369
- https://stackoverflow.com/questions/53927874/jmh-doesnt-run-inside-a-java-module-unable-to-find-the-resource-meta-inf-ben
Matthias Sohn [Mon, 28 Dec 2020 17:51:57 +0000 (18:51 +0100)]
Enable compiler option --release
This ensures the compiler compiles against the public, supported and
documented API for a specific VM version (here 11) [1]. This also means
that
we don't need EE descriptors in Eclipse anymore in order to ensure that
only supported APIs of the selected Java version can be used.
According to [2] if option --release is used --source and --target
options can't be used.
While we are at it also add default value for all new jdt core options
added in Eclipse 4.21.
The reftable format is a block based format, but allows for variably
sized blocks. This obviously happens for reflog blocks (which are zlib
compressed), but is also accepted for index blocks: In the spec, this
is motivated as
To achieve constant O(1) disk seeks for lookups the index must be
a single level, which is permitted to exceed the file's
configured block size, but not the format's max block size of
15.99 MiB.
Hence, when parsing a block, one cannot be sure of its exact size:
after reading a default-size block (eg. 4kb), the block header may
state that the block is in fact larger.
Before, the code would mark the block as `truncated`, noting
// Its OK during sequential scan for an index block to have been
// partially read and be truncated in-memory. This happens when
// the index block is larger than the file's blockSize. Caller
// will break out of its scan loop once it sees the blockType.
This looks like either
* a remnant of never-implemented functionality. There is no reason to
ever sequentially scan an index block.
* alluding to sequential scan of the data blocks before the index
blocks (eg. scanning refs, which ends when we find the first ref index
block, and we can then ignore the index block).
This comment is followed by code that populates the
restartTbl/restartCnt fields relative to the (possibly truncated)
buffer. If the buffer is truncated, this essentially reads garbage,
leading to OOB array access when using the index block.
Fix this by dropping the truncated logic and issuing a second read if
the first read was short.
Add a test.
We have never observed this failure scenario at Google. We use 64kb
blocksize, which requires us to need fewer index entries. The reftable
spec mentions an Android repo of size 36M. With 64kb blocks, that's
just 562 index entries. Even with historical growth, we are long from
requiring an index whose size exceeds a single block.
When adding the analogous test for seeking refs, there was no failure.
This points to another possibility which is that the code tries to
avoid writing large index blocks for refs.
Matthias Sohn [Thu, 9 Sep 2021 12:41:30 +0000 (14:41 +0200)]
Merge branch 'master' into next
* master: (38 commits)
Revert "DFS block cache: Refactor to enable parallel index loading"
GitServlet: allow to override default error handlers
Silence API error for new interface method ProtocolV2Hook#onObjectInfo
transport: add object-info capability
Ignore IllegalStateException if JVM is already shutting down
Update orbit to R20210825222808 for 2021-09
Update spotbugs-maven-plugin to 4.3.0
Update ant to 1.10.11 also in pom.xml
DFS block cache: add additional stats to DfsReaderIoStats
Update Orbit to S20210817231813
[gpg] Better GPG home directory determination
FS: cleanup use of final modifier
Ensure FS#searchPath only selects executable files
RevWalk: getMergedInto's result is wrong on the second call
DFS block cache: Refactor to enable parallel index loading
[test] Create keystore with the keytool of the running JDK
[gpg] Update to Bouncy Castle 1.69
[test] Create keystore with the keytool of the running JDK
[sshd] Minor code clean-up
Support commit.template config property
...
Matthias Sohn [Wed, 1 Sep 2021 14:31:13 +0000 (16:31 +0200)]
Merge branch 'master' into stable-5.13
* master:
GitServlet: allow to override default error handlers
Silence API error for new interface method ProtocolV2Hook#onObjectInfo
transport: add object-info capability
Ignore IllegalStateException if JVM is already shutting down
Update orbit to R20210825222808 for 2021-09
RevWalk: getMergedInto's result is wrong on the second call
Antonio Barone [Wed, 1 Sep 2021 10:04:07 +0000 (12:04 +0200)]
GitServlet: allow to override default error handlers
GitServlet delegates repository access over HTTP to the GitFilter
servlet.
GitServlet, in turn, can be extended by jgit consumers to provide custom
logic when handling such operations.
This is the case, for example, with Gerrit Code Review, which provides
custom behavior with a GitOverHttpServlet [1].
Among possible customizations, the ability of specifying a custom error
handler for UploadPack and ReceivePack was already introduced in
GitFilter by Idd3b87d6b and I9c708aa5a2, respectively.
However the `setUploadPackErrorHandler` and `setReceivePackErrorHandler`
methods were never added to the GitServlet.
Expose the `setUploadPackErrorHandler` and `setReceivePackErrorHandler`
methods to the GitServlet, so that consumers of the jgit library might
specify custom error handlers.
Sometimes it is useful to obtain metadata associated with an object
without the need to first download it locally. This is specially useful
when using partial clones.
This change implements the object-info capability that allows clients to
query the remote server for object metadata (currently only size). This
is a backport of the same capability that was recently added to the Git
project a2ba162cda (object-info: support for retrieving object info,
2021-04-20).
Signed-off-by: Bruno Albuquerque <bga@google.com>
Change-Id: I4dc9828e1c247f08b0976b8810be92d124366165