Matthias Sohn [Fri, 11 Nov 2022 16:54:06 +0000 (17:54 +0100)]
Add option to allow using JDK's SHA1 implementation
The change If6da9833 moved the computation of SHA1 from the JVM's
JCE to a pure Java implementation with collision detection.
The extra security for public sites comes with a cost of slower
SHA1 processing compared to the native implementation in the JDK.
When JGit is used internally and not exposed to any traffic from
external or untrusted users, the extra cost of the pure Java SHA1
implementation can be avoided, falling back to the previous
native MessageDigest implementation.
Matthias Sohn [Thu, 27 Oct 2022 18:31:31 +0000 (20:31 +0200)]
Ignore IllegalStateException if JVM is already shutting down
Trying to register/unregister a shutdown hook when the JVM is already in
shutdown throws an IllegalStateException. Ignore this exception since we
can't do anything about it.
Matthias Sohn [Wed, 29 Jun 2022 12:58:17 +0000 (14:58 +0200)]
UploadPack: don't prematurely terminate timer in case of error
In uploadWithExceptionPropagation don't prematurely terminate timer in
case of error to enable reporting it to the client. Expose a close
method so that callers can terminate it at the appropriate time.
If the timer is already terminated when trying to report it to the
client this failed with the error java.lang.IllegalStateException:
"Timer already terminated".
Do not create reflog for remote tracking branches during clone
When using JGit on a non-bare repository, the CloneCommand
it previously created local reflogs for all branches including remote
tracking ones, causing the generation of a potentially large
number of files on the local filesystem.
The creation of the remote-tracking branches (refs/remotes/*) during
clone is not an issue for the local filesystem because all of them are
stored in a single packed-refs file. However, the creation of a large
number of ref logs on a local filesystem IS an issue because it
may not be tuned or initialised in term of inodes to contain a very
large number of files.
When a user (or a CI system) performs the CloneCommand against
a potentially large repository (e.g., millions of branches), it is
interested in working or validating a single branch or tag and is
unlikely to work with all the remote-tracking branches.
The eager creation of a reflogs for all the remote-tracking branches is
not just a performance issue but may also compromise the ability to
use JGit for cloning a large repository.
The behaviour implemented in this change is also consistent with the
optimisation done in the C code-base [1].
We differentiate between clone and fetch commands using --branch
<initialBranch> option, that is only available in clone command,
and is set as HEAD per default.
Luca Milanesio [Thu, 19 May 2022 10:12:28 +0000 (11:12 +0100)]
UploadPack: do not check reachability of visible SHA1s
When JGit needs to serve a Git client requesting SHA1s
during the want phase, it needs to make a full reachability
check from the advertised refs to the ones requested to
keep all objects in the correct scope of confidentiality
allowed by the avertised refs.
The check is also performed when the SHA1 corresponds to
one of the tips of the advertised refs which is a waste of
resources.
eric.steele [Wed, 1 Jun 2022 01:03:17 +0000 (18:03 -0700)]
AmazonS3: Add support for AWS API signature version 4
Updating the AmazonS3 class to support AWS Signature version 4 because
version 2 is no longer supported in all AWS regions. The version can be
selected with the new 'aws.api.signature.version' property (defaults to
2 for backwards compatibility). When set to '4', the user must also
specify the AWS region via the 'region' property. The 'region' property
must match the region that the 'domain' property resolves to.
Saša Živkov [Fri, 3 Jun 2022 14:36:43 +0000 (16:36 +0200)]
Fix connection leak for smart http connections
SmartHttpPushConnection: close InputStream and OutputStream after
processing. Wrap IOExceptions which aren't TransportExceptions already
as a TransportException.
Also-By: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I8e11d899672fc470c390a455dc86367e92ef9076
Remove stray files (probes or lock files) created by background threads
NOTE: port back from master branch.
On process exit, it was possible that the filesystem timestamp
resolution measurement left behind .probe files or even a lock file
for the jgit.config.
Ensure the SAVE_RUNNER is shut down when the process exits (via
System.exit() or otherwise). Move lf.lock() into the try-finally
block when saving the config file.
Delete .probe files on JVM shutdown -- they are created in daemon
threads that may terminate abruptly, not executing the "finally"
clause that normally removes these files.
BasePackConnection::readAdvertisedRefsImpl was creating an exception by
calling `noRepository`, and then blindly calling `initCause` on it. As
`noRepository` can be overridden, it's not guaranteed to be missing a
cause.
BasePackPushConnection overrides `noRepository` and initiates a fetch,
which may throw a `NoRemoteRepositoryException` with a cause.
In this case calling `initCause` threw an `IllegalStateException`.
In order to throw the correct exception, we now return the
BasePackPushConnection exception and suppress the one thrown by
BasePackConnection
Marcin Czech [Wed, 22 Dec 2021 16:42:36 +0000 (17:42 +0100)]
UploadPack v2 protocol: Stop negotiation for orphan refs
The fetch of a single orphan ref (for example Gerrit meta ref:
refs/changes/21/21/meta) did not stop the negotiation so client
had to advertise all refs. This impacts the fetch performance
on repositories with a large number of refs (for example on
Gerrit repository it takes 20 seconds to fetch meta ref
comparing to 1.2 second to fetch ref with parent).
To avoid this issue UploadPack, used on the server side,
now checks if all `want` refs have parents, if not this
means that client doesn't need any extra objects, hence
the server responds with `ready` and finishes the
negotiation phase.
Matthias Sohn [Thu, 16 Dec 2021 17:42:45 +0000 (18:42 +0100)]
Use slf4j-simple instead of log4j for logging
JGit uses slf4j-api as logging API.
The libraries
- org.eclipse.jgit.http.test
- org.eclipse.jgit.pgm
- org.eclipse.jgit.ssh.apache.test
- org.eclipse.jgit.test
used the outdated log4j 1.2.15 which is EOL since years.
Since both jgit command line and also the tests don't need sophisticated
logging features replace log4j with the much simpler slf4j-simple log
implementation. The org.slf4j.binding.simple 1.7.30 archive has only
25kB instead of 429kB for log4j 1.2.15
Applications using jgit are free to choose any other log implementation
supporting slf4j API.
Matthias Sohn [Mon, 13 Dec 2021 23:07:10 +0000 (00:07 +0100)]
Update orbit to R20211213173813
and update
- com.google.gson to 2.8.8.v20211029-0838
- javaewah to 1.1.13.v20211029-0839
- net.i2p.crypto.eddsa to 0.3.0.v20210923-1401
- org.apache.ant to 1.10.12.v20211102-1452
- org.apache.commons.compress to 1.21.0.v20211103-2100
- org.bouncycastle.bcprov to 1.69.0.v20210923-1401
- org.junit to 4.13.2.v20211018-1956
Matthias Sohn [Fri, 3 Dec 2021 21:45:05 +0000 (22:45 +0100)]
Update maven plugins
- build-helper-maven-plugin to 3.2.0
- jacoco-maven-plugin to 0.8.7
- maven-antrun-plugin to 3.0.0
- maven-dependency-plugin to 3.2.0
- maven-enforcer-plugin to 3.0.0
- maven-jar-plugin to 3.2.0
- maven-javadoc-plugin to 3.3.1
- maven-jxr-plugin to 3.1.1
- maven-pmd-plugin to 3.15.0
- maven-project-info-reports-plugin to 3.1.2
- maven-resources-plugin to 3.2.0
- maven-shade-plugin to 3.2.4
- maven-site-plugin to 3.9.1
- maven-source-plugin to 3.2.1
- maven-surefire-plugin to 3.0.0-M5
- spotbugs-maven-plugin to 4.3.0
- tycho and tycho-extras to 1.7.0
RefDirectory.scanRef: Re-use file existence check done in snapshot creation
Return immediately in scanRef if the loose ref was identified as
missing when a snapshot was attempted for the ref. This will help
performance of scanRef when the ref is packed but has a corresponding
empty dir in 'refs/'.
For example, consider the case where we create 50k sharded refs in
a new namespace called 'new-refs' using an atomic 'BatchRefUpdate'.
The refs are named like 'refs/new-refs/01/1/1', 'refs/new-refs/01/1/2',
'refs/new-refs/01/1/3' and so on. After the refs are created, the
'new-refs' namespace looks like below:
$ find refs/new-refs -type f | wc -l
0
$ find refs/new-refs -type d | wc -l
5101
At this point, an 'exactRef' call on each of the 50k refs without
this change takes ~2.5s, where as with this change it takes ~1.5s.
FileSnapshot: Lazy load file store attributes cache
Doing a getFileStoreAttributes call even when the file doesn't
exist is unnecessary. This call is particularly slow on some
filesystems. Instead, do it only when the file exists and load
the appropriate cache.
This update can help speed up RefDirectory.exactRef when the ref
is packed, but has a corresponding empty dir for it under 'refs/'.
This scenario can happen when an atomic 'BatchRefUpdate' creates
new sharded refs.
For example, consider the case where we create 50k sharded refs in
a new namespace called 'new-refs' using an atomic 'BatchRefUpdate'.
The refs are named like 'refs/new-refs/01/1/1', 'refs/new-refs/01/1/2',
'refs/new-refs/01/1/3' and so on. After the refs are created, the
'new-refs' namespace looks like below:
$ find refs/new-refs -type f | wc -l
0
$ find refs/new-refs -type d | wc -l
5101
At this point, an 'exactRef' call on each of the 50k refs without
this change takes ~30s, where as with this change it takes ~2.5s.
Thomas Wolf [Sun, 28 Nov 2021 10:42:20 +0000 (11:42 +0100)]
FS: debug logging only if system config file cannot be found
The command 'git config --system --show-origin --list -z' fails if
the system config doesn't exist. Use debug logging instead of a
warning for failures of that command. Typically the user cannot do
anything about it anyway, and JGit will just work without system
config.
Bug: 577492
Change-Id: If628ab376182183aea57a385c169e144d371bbb2 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Tue, 19 Oct 2021 07:07:14 +0000 (09:07 +0200)]
Set JSch global config values only if not set already
Bug: 576604
Change-Id: I04415f07bf2023bc8435c097d1d3c65221d563f1 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit f8b0c00e6a8f5628babff6dd37254a21589b6e44)
Thomas Wolf [Fri, 19 Nov 2021 18:15:46 +0000 (19:15 +0100)]
Better git system config finding
We've used "GIT_EDITOR=edit git config --system --edit" to determine
the location of the git system config for a long time. But git 2.34.0
always expects this command to have a tty, but there isn't one when
called from Java. If there isn't one, the Java process may get a
SIGTTOU from the child process and hangs.
Arguably it's a bug in C git 2.34.0 to unconditionally assume there
was a tty. But JGit needs a fix *now*, otherwise any application using
JGit will lock up if git 2.34.0 is installed on the machine.
Therefore, use a different approach if the C git found is 2.8.0 or
newer: parse the output of
git config --system --show-origin --list -z
"--show-origin" exists since git 2.8.0; it prefixes the values with
the file name of the config file they come from, which is the system
config file for this command. (This works even if the first item in
the system config is an include.)
Bug: 577358
Change-Id: I3ef170ed3f488f63c3501468303119319b03575d Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit 9446e62733da5005be1d5182f0dce759a3052d4a)
Matthias Sohn [Fri, 15 Oct 2021 20:58:21 +0000 (22:58 +0200)]
Merge branch 'stable-5.12' into stable-5.13
* stable-5.12:
Fix missing peel-part in lsRefsV2 for loose annotated tags
Fix RevWalk.getMergedInto() ignores annotated tags
Optimize RevWalk.getMergedInto()
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Matthias Sohn [Fri, 15 Oct 2021 20:48:01 +0000 (22:48 +0200)]
Merge branch 'stable-5.11' into stable-5.12
* stable-5.11:
Fix missing peel-part in lsRefsV2 for loose annotated tags
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Matthias Sohn [Fri, 15 Oct 2021 20:45:18 +0000 (22:45 +0200)]
Merge branch 'stable-5.10' into stable-5.11
* stable-5.10:
Fix missing peel-part in lsRefsV2 for loose annotated tags
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Matthias Sohn [Fri, 15 Oct 2021 20:32:00 +0000 (22:32 +0200)]
Merge branch 'stable-5.9' into stable-5.10
* stable-5.9:
Fix missing peel-part in lsRefsV2 for loose annotated tags
reftable: drop code for truncated reads
reftable: pass on invalid object ID in conversion
Update eclipse-jarsigner-plugin to 1.3.2
Fix running benchmarks from bazel
Update eclipse-jarsigner-plugin to 1.3.2
Unfortunately, the existing UploadPackTest didn't reveal this issue
although it provided the test case needed to do so: testV2LsRefsPeel.
This is because The UploadPackTest uses InMemoryRepository which
internally uses Dfs* implementations. The issue is only reproducible
when using the FileRepository.
It is a non-trivial task to refactor the UploadPackTest to work against
both InMemoryRepository and FileRepository and this change is not trying
to do that. This change creates a new test:
UploadPackLsRefsFileRepositoryTest and copies the necesssary code from
the UploadPackTest.
Find out which branches is 2 merged into:
After we calculated that master contains 2, we can mark 6 as TEMP_MARK
to avoid unwanted walks.
When we want to figure out if 2 is merge into the topic, the traversal
path would be [7, 6] instead of [7, 6, 5, 3, 2].
Test:
This change can significantly improve performance for tags.
On a copy of the Linux repository, the command 'git tag --contains
<commit>' had the following performance improvement:
The current version ignores tags, even though the tag is a type of
the ref.
Follow-up commits I'll fix it.
Change-Id: Ie6295ca4d16070499912af462239e679a97cce47 Signed-off-by: kylezhao <kylezhao@tencent.com> Reviewed-by: Christian Halstrick <christian.halstrick@sap.com> Reviewed-by: Martin Fick <mfick@codeaurora.org>
The reftable format is a block based format, but allows for variably
sized blocks. This obviously happens for reflog blocks (which are zlib
compressed), but is also accepted for index blocks: In the spec, this
is motivated as
To achieve constant O(1) disk seeks for lookups the index must be
a single level, which is permitted to exceed the file's
configured block size, but not the format's max block size of
15.99 MiB.
Hence, when parsing a block, one cannot be sure of its exact size:
after reading a default-size block (eg. 4kb), the block header may
state that the block is in fact larger.
Before, the code would mark the block as `truncated`, noting
// Its OK during sequential scan for an index block to have been
// partially read and be truncated in-memory. This happens when
// the index block is larger than the file's blockSize. Caller
// will break out of its scan loop once it sees the blockType.
This looks like either
* a remnant of never-implemented functionality. There is no reason to
ever sequentially scan an index block.
* alluding to sequential scan of the data blocks before the index
blocks (eg. scanning refs, which ends when we find the first ref index
block, and we can then ignore the index block).
This comment is followed by code that populates the
restartTbl/restartCnt fields relative to the (possibly truncated)
buffer. If the buffer is truncated, this essentially reads garbage,
leading to OOB array access when using the index block.
Fix this by dropping the truncated logic and issuing a second read if
the first read was short.
Add a test.
We have never observed this failure scenario at Google. We use 64kb
blocksize, which requires us to need fewer index entries. The reftable
spec mentions an Android repo of size 36M. With 64kb blocks, that's
just 562 index entries. Even with historical growth, we are long from
requiring an index whose size exceeds a single block.
When adding the analogous test for seeking refs, there was no failure.
This points to another possibility which is that the code tries to
avoid writing large index blocks for refs.