Matthias Sohn [Tue, 27 Aug 2024 13:13:37 +0000 (15:13 +0200)]
Merge branch 'master' into stable-7.0
* master:
DfsReaderIoStats: getters to object size index micros/bytes
Do not set headers if response is already committed
AmazonS3: Ensure SAXParserFactory sets valid/expected input params
Signing: refactor interfaces
Add a missing license header
LockFile: Retry lock creation if parent dirs were removed
GpgConfig: Add missing @since
DfsReaderIoStats: Order fields and methods consistently
Change Ie8a9d411fc19e8b7bf86c0b4df0b02153a0e9444 broke setting
valid/expected input parameters for the XML parser. This can be fixed
by calling SAXParserFactory#setNamespaceAware, see [1]. Also see earlier
fix in [2].
Thomas Wolf [Tue, 20 Aug 2024 20:41:45 +0000 (22:41 +0200)]
Signing: refactor interfaces
This is a big API-breaking change cleaning up the signing interfaces.
Initially, these interfaces were GPG/OpenPGP-specific. When EGit added
new signers and signature verifiers that called an external GPG
executable, they were found inadequate and were extended to be able to
pass in the GpgConfig to get access to the "gpg.program" setting.
With the introduction of X.509 S/MIME signing, it was discovered that
the interfaces were still not quite adequate, and the "Gpg" prefix on
the class names was confusing.
Since 7.0 is a major version bump, I'm taking this chance to overhaul
these interfaces from the ground up.
For signing, there is a new Signer interface. With it goes a
SignerFactory SPI interface, and a final Signers class managing the
currently set signers. By default, signers for the different signature
types are created from the signer factories, which are discovered via
the ServiceLoader. External code can install its own signers, overriding
the default factories.
For signature verification, exactly the same mechanism is used.
This simplifies the setup of signers and signature verifiers, and makes
it all more regular. Signer instances just get a byte[] to sign and
don't have to worry about ObjectBuilders at all. SignatureVerifier
instances also just get the data and signature as byte[] and don't have
to worry about extracting the signature from a commit or tag, or about
what kind of signature it is.
Both Signers and SignatureVerifiers always get passed the Repository
and the GpgConfig. The repository will be needed in an implementation
for SSH signatures because gpg.ssh.* configs may need to be loaded
explicitly, and some of those values need the current workspace
location.
For signature verification, there is exactly one place in core JGit in
SignatureVerifiers that extracts signatures, determines the signature
type, and then calls the right signature verifier.
Change RevTag to recognize all signature types known in git (GPG, X509,
and SSH).
Change-Id: I26d2731e7baebb38976c87b7f328b63a239760d5
Signed-off-by: Thomas Wolf <twolf@apache.org>
LockFile: Retry lock creation if parent dirs were removed
In the small window between creation of the lock file's parent dirs and
the lock file itself, the parent dirs may be cleaned by an external
process packing refs in the repository. When this scenario occurs, retry
creating the lock file (along with its parent dirs).
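The retry described above can be sketched with plain java.nio.file calls. This is a minimal illustration, not JGit's LockFile code: the class name, the method name and the retry budget of 5 are assumptions, and it relies on Files.createFile reporting a missing parent directory as NoSuchFileException.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical helper for illustration; not JGit's LockFile implementation.
final class LockCreationSketch {
    // Assumed retry budget; the commit message does not state one.
    static final int MAX_ATTEMPTS = 5;

    /**
     * Creates the lock file exclusively, recreating parent directories on
     * each attempt. Returns true if this call created the lock and false if
     * the lock already exists. The path is assumed to have a parent.
     */
    static boolean tryCreateLock(Path lock) throws IOException {
        NoSuchFileException last = null;
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            // A concurrent cleanup may have removed the parent dirs, so
            // (re)create them before every attempt.
            Files.createDirectories(lock.getParent());
            try {
                Files.createFile(lock);
                return true;
            } catch (FileAlreadyExistsException e) {
                return false; // another writer holds the lock
            } catch (NoSuchFileException e) {
                last = e; // parent vanished in the small window; retry
            }
        }
        throw last;
    }
}
```

Bounding the number of attempts keeps a persistently failing directory from turning the retry into an endless loop.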
Matthias Sohn [Tue, 20 Aug 2024 13:51:15 +0000 (15:51 +0200)]
Merge branch 'master' into stable-7.0
* master:
Update tycho to 4.0.8
Update org.eclipse.dash:license-tool-plugin to 1.1.0
[ssh] Bump Apache MINA sshd 2.13.1 -> 2.13.2
ConfigConstants: Add missing @since 7.0
Fix "Comparison of narrow type with wide type in loop condition"
ObjectWalk: Remove duplicated word "the" in class documentation
RepoProject: read the 'dest-branch' attribute of a project
Make RepoProject#setUpstream public
RepoCommand: Add error to ManifestErrorException
RepoCommand: Copy manifest upstream into .gitmodules ref field
RepoProject: read the "upstream" attribute of a project
JGit v5.13.3.202401111512-r
Matthias Sohn [Tue, 20 Aug 2024 13:26:41 +0000 (15:26 +0200)]
Merge branch 'stable-6.10'
* stable-6.10:
Update tycho to 4.0.8
Update org.eclipse.dash:license-tool-plugin to 1.1.0
Fix "Comparison of narrow type with wide type in loop condition"
RepoProject: read the 'dest-branch' attribute of a project
Make RepoProject#setUpstream public
RepoCommand: Add error to ManifestErrorException
RepoCommand: Copy manifest upstream into .gitmodules ref field
RepoProject: read the "upstream" attribute of a project
JGit v5.13.3.202401111512-r
Matthias Sohn [Tue, 20 Aug 2024 13:21:43 +0000 (15:21 +0200)]
Merge branch 'stable-6.9' into stable-6.10
* stable-6.9:
Update tycho to 4.0.8
Update org.eclipse.dash:license-tool-plugin to 1.1.0
Fix "Comparison of narrow type with wide type in loop condition"
JGit v5.13.3.202401111512-r
Matthias Sohn [Tue, 20 Aug 2024 13:20:37 +0000 (15:20 +0200)]
Merge branch 'stable-6.8' into stable-6.9
* stable-6.8:
Update tycho to 4.0.8
Update org.eclipse.dash:license-tool-plugin to 1.1.0
Fix "Comparison of narrow type with wide type in loop condition"
JGit v5.13.3.202401111512-r
Matthias Sohn [Tue, 20 Aug 2024 12:56:04 +0000 (14:56 +0200)]
Merge branch 'stable-6.7' into stable-6.8
* stable-6.7:
Update tycho to 4.0.8
Update org.eclipse.dash:license-tool-plugin to 1.1.0
Fix "Comparison of narrow type with wide type in loop condition"
JGit v5.13.3.202401111512-r
Matthias Sohn [Tue, 20 Aug 2024 12:54:08 +0000 (14:54 +0200)]
Merge branch 'stable-6.6' into stable-6.7
* stable-6.6:
Update tycho to 4.0.8
Update org.eclipse.dash:license-tool-plugin to 1.1.0
Fix "Comparison of narrow type with wide type in loop condition"
JGit v5.13.3.202401111512-r
Matthias Sohn [Tue, 20 Aug 2024 12:28:33 +0000 (14:28 +0200)]
Merge branch 'stable-6.5' into stable-6.6
* stable-6.5:
Update org.eclipse.dash:license-tool-plugin to 1.1.0
Fix "Comparison of narrow type with wide type in loop condition"
JGit v5.13.3.202401111512-r
Matthias Sohn [Sun, 18 Aug 2024 16:35:29 +0000 (18:35 +0200)]
Merge branch 'stable-6.4' into stable-6.5
* stable-6.4:
Update org.eclipse.dash:license-tool-plugin to 1.1.0
Fix "Comparison of narrow type with wide type in loop condition"
JGit v5.13.3.202401111512-r
Matthias Sohn [Fri, 9 Aug 2024 09:53:01 +0000 (11:53 +0200)]
Fix "Comparison of narrow type with wide type in loop condition"
This issue was detected by a GitHub CodeQL security scan run on JGit
source code.
Description of the error raised by the security scan:
"In a loop condition, comparison of a value of a narrow type with a
value of a wide type may always evaluate to true if the wider value is
sufficiently large (or small). This is because the narrower value may
overflow. This can lead to an infinite loop."
Fix this by using type `long` for the local variable `done`.
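The effect of the fix can be shown with a small standalone sketch. Only the variable name `done` comes from the commit message; the class, the method names and the chunked loop are assumptions made for illustration and do not reproduce the affected JGit code.

```java
// Illustrative only; not the JGit code that was changed.
final class LoopCounterSketch {
    /**
     * Counts how many chunk-sized steps are needed to cover `total`.
     * `chunk` is assumed to be positive.
     */
    static long stepsToCover(long total, int chunk) {
        long done = 0; // same width as `total`, so `done < total` can become false
        long steps = 0;
        while (done < total) {
            done += chunk;
            steps++;
        }
        return steps;
    }

    /** Shows what an int counter does at its upper bound: it wraps around. */
    static int wrapAround() {
        int done = Integer.MAX_VALUE;
        done += 1; // overflows silently to Integer.MIN_VALUE
        return done;
    }
}
```

With an `int` counter, a `total` above Integer.MAX_VALUE could never be reached, because the counter wraps to a negative value before the comparison turns false.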
Matthias Sohn [Wed, 31 Jul 2024 13:02:55 +0000 (15:02 +0200)]
Merge branch 'master' into stable-7.0
* master:
Lib: Fix ssh value for gpg.format throwing an IllegalArgumentException
DfsPackFile: Abstract the loading of pack indexes
PackExtBlockCacheTable: spread extensions over multiple dfs tables
PackObjectSizeIndex: Read all bytes and use the byte[] directly
DfsPackFile: Do not set local reverse index ref from cache callback
Add 4.33 target platform for Eclipse 2024-09
DfsBlockCacheTable: extract stats get* methods to interface
Add worktrees read support
DfsBlockCacheConfig: support configurations for dfs cache tables per extensions
ssh: Remove .orig file
DfsPackFile: Enable/disable object size index via DfsReaderOptions
Ivan Frade [Wed, 10 Apr 2024 20:52:56 +0000 (13:52 -0700)]
DfsPackFile: Abstract the loading of pack indexes
DfsPackFile assumes that the indexes are stored in file streams and
their references need to be cached in DFS. This doesn't allow us to
experiment with other storage options, like key-value databases. In these
experiments not all indexes are together in the same storage.
Define an interface per index to load it, so implementors can focus on
the specifics of their index. Put them together in the IndexFactory
interface. The implementation of the IndexFactory chooses the right
combination of storages.
At the moment we do this only for primary and reverse
indexes. Following changes can do the same for other indexes.
Laura Hamelin [Fri, 7 Jun 2024 23:11:21 +0000 (16:11 -0700)]
PackExtBlockCacheTable: spread extensions over multiple dfs tables
The existing DfsBlockCache uses a single table for all
extensions (idx, ridx, ...).
This change introduces an implementation of the table
interface that can keep extensions in different cache
tables.
This selects the appropriate cache to use for a specific
PackExt or DfsStreamKey's PackExt type, allowing the
separation of entries from different pack types to help
limit churn in cache caused by entries of differing sizes.
This is especially useful in fine-tuning caches and
influencing interactions by extension type.
For example, a table holding INDEX types only will
not influence evictions of other PackExt types and
vice versa.
The PackExtBlockCacheTable allows setting the
underlying DfsBlockCacheTables and mapping directly,
letting users implement and use custom DfsBlockCacheTables.
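The routing idea can be sketched as a lookup from extension to table with a default fallback. The types below are hypothetical stand-ins, not the JGit PackExt, DfsBlockCacheTable or PackExtBlockCacheTable classes; the extension names are taken from the commit messages in this log.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for the pack extension type.
enum Ext { PACK, INDEX, BITMAP_INDEX, REFTABLE }

// Hypothetical stand-in for a cache table; only carries a name here.
final class NamedTable {
    final String name;

    NamedTable(String name) {
        this.name = name;
    }
}

// Selects the table for an extension, falling back to a default table.
final class ExtRoutingSketch {
    private final NamedTable defaultTable;
    private final Map<Ext, NamedTable> byExt = new EnumMap<>(Ext.class);

    ExtRoutingSketch(NamedTable defaultTable, Map<Ext, NamedTable> mapping) {
        this.defaultTable = defaultTable;
        this.byExt.putAll(mapping);
    }

    NamedTable tableFor(Ext ext) {
        // Extensions without an explicit mapping share the default table.
        return byExt.getOrDefault(ext, defaultTable);
    }
}
```

Because each extension resolves to exactly one table, evictions in a table that holds only INDEX entries cannot push out entries of other extension types.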
Ivan Frade [Fri, 19 Jul 2024 22:44:15 +0000 (15:44 -0700)]
PackObjectSizeIndex: Read all bytes and use the byte[] directly
The parser reads N integers one by one from the stream, assuming the
InputStream does some ahead reading from storage. We see some very
slow loading of indexes and suspect that this preemptive reading is
not happening. The slow loading can be reproduced in clones, and it
produces higher latencies and locks many threads waiting for the
loading.
Read the whole array from storage in one shot to avoid many small IO
reads. Work directly on the resulting byte[], so there is no need for a
second copy to cast to int/long.
This is how other indexes, like primary or commit graph, work.
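A minimal sketch of the "read once, decode in place" approach follows. It is not the PackObjectSizeIndex parser: the class and method names are hypothetical, and the big-endian 32-bit layout is an assumption made for the example.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration; not the JGit parser.
final class BulkReadSketch {
    private final byte[] raw;

    private BulkReadSketch(byte[] raw) {
        this.raw = raw;
    }

    /** Reads the whole stream in one call instead of many small reads. */
    static BulkReadSketch load(InputStream in) throws IOException {
        return new BulkReadSketch(in.readAllBytes());
    }

    /** Number of complete 32-bit values in the buffer. */
    int size() {
        return raw.length / 4;
    }

    /** Decodes the value at `index` directly from the byte[] (assumed big-endian). */
    int intAt(int index) {
        int off = index * 4;
        return ((raw[off] & 0xff) << 24)
                | ((raw[off + 1] & 0xff) << 16)
                | ((raw[off + 2] & 0xff) << 8)
                | (raw[off + 3] & 0xff);
    }
}
```

Values are decoded on access, so no second int[] or long[] copy of the data is kept.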
Ivan Frade [Wed, 10 Apr 2024 17:52:44 +0000 (10:52 -0700)]
DfsPackFile: Do not set local reverse index ref from cache callback
The DfsBlockCache loading callback sets the local reference to the
index in the DfsPackFile. This prevents abstracting the loading to
implement it over multiple backends.
Reorganize the code so that loadReverseIndex does only the loading, the
caller sets it into DfsBlockCache and the external code sets the local
reference in DfsPackFile.
This is the same pattern we did with the PackIndex in the parent
commit.
Laura Hamelin [Mon, 15 Jul 2024 18:35:56 +0000 (11:35 -0700)]
DfsBlockCacheTable: extract stats get* methods to interface
Having the DfsBlockCacheTable methods extracted to an interface will
allow alternative implementations of BlockCacheStats not tied to the
current implementation.
Based on Andre's work in [1].
This change focuses on adding support for reading the repository
state when branches are checked out using git's worktrees.
I've refactored the original work by removing all irrelevant
changes, which were mostly refactorings (e.g. extracting
constants) that created noise for review.
I've tried to address original review comments:
- Not adding non-behavioral changes
- "HEAD" should get resolved from gitDir
- Reftable recently landed in cgit 2.45,
see https://github.com/git/git/blob/master/Documentation/RelNotes/2.45.0.txt#L8
We can add worktree support for reftable in a later change.
- Some new tests to read from a linked worktree which
is created manually as there's no write support.
Laura Hamelin [Fri, 7 Jun 2024 23:12:18 +0000 (16:12 -0700)]
DfsBlockCacheConfig: support configurations for dfs cache tables per extensions
Parse configurations for tables containing a set of extensions,
defined in [core "dfs.*"] sections.
Parse configurations for cache tables according to configurations
defined in [core "dfs.*"] git config sections for sets of
extensions. The current [core "dfs"] is the default to any
extension not listed in any other table.
Configuration falls back to the defaults defined in the
DfsBlockCacheConfig.java file when not set on each cache
table configuration.
Sample format for individual cache tables:
In this example:
1. PACK types would go to the "default" table
2. INDEX and BITMAP_INDEX types would go to the
"multipleExtensionCache" table
3. REFTABLE types would go to the "reftableCache" table
[core "dfs"] // Configuration for the "default" cache table.
blockSize = 512
blockLimit = 100
concurrencyLevel = 5
(...)
Ivan Frade [Mon, 1 Jul 2024 19:24:36 +0000 (12:24 -0700)]
DfsPackFile: Enable/disable object size index via DfsReaderOptions
DfsPackFile always uses the object size index if available. That is
the desired final state, but for a safe rollout, we should be able to
disable using the object size index.
Add an option (dfs.useObjectSizeIndex) to enable/disable the usage of
the object size index. False by default.
This changes the default from true to false. It only makes a difference
for the DFS stack when writing of the index was explicitly
enabled. This is an optimization, so it shouldn't cause any
regression. Operators can restore the previous behaviour by setting
"dfs.useObjectSizeIndex" to true.
RepoProject: read the 'dest-branch' attribute of a project
The manifest spec [1] defines a "dest-branch" attribute. Parse its
value and store it in the RepoProject. Also, create a getter/setter
for dest-branch.
Applications using JGit such as Gerrit plugins may have their own
manifest parsers. They can start using RepoProject to some extent
with this change. Eventually, they can be migrated to use the
ManifestParser in JGit, however until then, this change can help
make the migration incremental.
Ivan Frade [Thu, 6 Jun 2024 19:01:04 +0000 (12:01 -0700)]
RepoCommand: Add error to ManifestErrorException
RepoCommand wraps errors in the manifest in a ManifestErrorException
with a fixed message ("Invalid manifest"). Callers like supermanifest
plugin cannot return a meaningful error to the client without digging
into the cause chain.
Add the actual error message to the ManifestErrorException, so callers
can rely on #getMessage() to see what happens.
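The difference for callers can be sketched with a stand-in exception type. This is not JGit's ManifestErrorException; the class name and the exact message format ("Invalid manifest: " followed by the cause's message) are assumptions.

```java
// Hypothetical exception modelled on the description above.
final class ManifestErrorSketch extends Exception {
    private static final long serialVersionUID = 1L;

    ManifestErrorSketch(Throwable cause) {
        // Include the underlying error text so callers do not have to walk
        // the cause chain to find out what went wrong.
        super("Invalid manifest: " + cause.getMessage(), cause);
    }
}
```

A caller can then surface `e.getMessage()` to the client directly, while the original exception remains available through `getCause()`.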
Ivan Frade [Thu, 30 May 2024 21:04:56 +0000 (14:04 -0700)]
RepoCommand: Copy manifest upstream into .gitmodules ref field
Project entries in the manifest with a specific sha1 as revision can
use the "upstream" field to report the ref pointing to that sha1. This
information is very valuable for downstream tools, as they can limit
their search for a blob to the relevant ref, but it gets lost in the
translation to .gitmodules.
Save the value of the upstream field when available/relevant in the
ref field of the .gitmodules entry.
Ivan Frade [Thu, 30 May 2024 17:56:20 +0000 (10:56 -0700)]
RepoProject: read the "upstream" attribute of a project
The manifest spec [1] defines the "upstream" attribute: "name of the
git ref in which a sha1 can be found", when the revision is a
sha1. The parser is ignoring it, but RepoCommand could use it to
populate the "ref=" field of pinned submodules.
Parse the value and store it in the RepoProject.
RepoProject is public API and the current constructors are not
telescopic, so we cannot just add a new constructor with an extra
argument. Use plain getters/setters.
RepoProject: read the 'dest-branch' attribute of a project
The manifest spec [1] defines a "dest-branch" attribute. Parse its
value and store it in the RepoProject. Also, create a getter/setter
for dest-branch.
Applications using JGit such as Gerrit plugins may have their own
manifest parsers. They can start using RepoProject to some extent
with this change. Eventually, they can be migrated to use the
ManifestParser in JGit, however until then, this change can help
make the migration incremental.
Ivan Frade [Tue, 9 Apr 2024 20:25:43 +0000 (13:25 -0700)]
DfsPackFile: Do not set primary index local ref from cache callback
DfsPackFile assumes the indices are in pack streams, but we would like
to consider other formats and storage. Currently, the local ref in the
DfsPackFile to the index is set in the cache loading callback, which
prevents abstracting the loading.
Reorganize the code so: the loadPackIndex function just parses the bytes
returning a reference and the caller sets the loaded index in the local
ref and DfsBlockCache.
We will follow this pattern with other indices in follow-up
changes. Note that although DfsPackFile is used only in one thread,
the loading in DfsBlockCache can happen from multiple threads
concurrently and we want to keep only one ref around.
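The separation of concerns described above can be sketched as follows. None of this is the DfsPackFile or DfsBlockCache API: a ConcurrentHashMap stands in for the shared cache, and all names are hypothetical. The loader only parses and returns, the shared cache keeps a single instance per key, and the caller assigns the local reference.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of the pattern; not JGit code.
final class IndexLoadingSketch {
    // Stand-in for a shared cache that may be hit from several threads.
    static final ConcurrentHashMap<String, int[]> CACHE = new ConcurrentHashMap<>();

    // Local reference held by the pack object.
    private int[] localIndex;

    /** Loader: only parses and returns; it touches no cache or field. */
    static int[] parse(String key) {
        return new int[] { key.length() };
    }

    /** Caller: obtains the index via the cache and sets the local reference. */
    int[] index(String key) {
        if (localIndex == null) {
            // computeIfAbsent runs the loader at most once per key, so
            // concurrent callers end up sharing one instance.
            localIndex = CACHE.computeIfAbsent(key, IndexLoadingSketch::parse);
        }
        return localIndex;
    }
}
```

Keeping the loader free of side effects is what makes it possible to swap in a different loading backend later.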
Ivan Frade [Thu, 6 Jun 2024 19:01:04 +0000 (12:01 -0700)]
RepoCommand: Add error to ManifestErrorException
RepoCommand wraps errors in the manifest in a ManifestErrorException
with a fixed message ("Invalid manifest"). Callers like supermanifest
plugin cannot return a meaningful error to the client without digging
into the cause chain.
Add the actual error message to the ManifestErrorException, so callers
can rely on #getMessage() to see what happens.
Matthias Sohn [Tue, 4 Jun 2024 15:18:14 +0000 (17:18 +0200)]
Merge branch 'next'
* next:
Bump jetty version to 12.0.9 and servlet-api to 6.0
Bump jetty version to 11.0.20
Update minimum Java version to 17
Prepare 7.0.0-SNAPSHOT builds
Ivan Frade [Fri, 31 May 2024 19:12:57 +0000 (12:12 -0700)]
CommitGraphWriter: Move path diff calculation to its own class
To verify that we have the right paths between commits we are writing
the bloom filters, reading them and querying. The path diff
calculation is tricky enough for correctness and performance that
it should be tested on its own.
Move the path diff calculation to its own class, so we can test it on
its own.
This is a noop refactor so we can verify later the steps taken in the
walk.
Ivan Frade [Thu, 30 May 2024 21:04:56 +0000 (14:04 -0700)]
RepoCommand: Copy manifest upstream into .gitmodules ref field
Project entries in the manifest with a specific sha1 as revision can
use the "upstream" field to report the ref pointing to that sha1. This
information is very valuable for downstream tools, as they can limit
their search for a blob to the relevant ref, but it gets lost in the
translation to .gitmodules.
Save the value of the upstream field when available/relevant in the
ref field of the .gitmodules entry.