Matthias Sohn [Tue, 26 Nov 2024 13:46:00 +0000 (14:46 +0100)]
Merge branch 'master' into stable-7.1
* master:
PackDirectory: Filter out tmp GC pack files
Add pack-refs command to the CLI
Test advertised capabilities with protocol V0 and allow*Sha1InWant
Align request policies with CGit
GitTimeParser: Fix multiple errorprone and style comments
PersonIdent: Preserve the timezone when copying with new time
PersonIdent: Revert @since of #getZoneId
tests/BasicTest: Use java.time constructors for PersonIdent
RawParseUtils test: Use java.time to create PersonIdents
Change default similarity score to 50(%) to match git's default
Pack.java: Recover more often in Pack.copyAsIs2()
Matthias Sohn [Tue, 26 Nov 2024 13:37:00 +0000 (14:37 +0100)]
Merge branch 'stable-7.0'
* stable-7.0:
PackDirectory: Filter out tmp GC pack files
Test advertised capabilities with protocol V0 and allow*Sha1InWant
Align request policies with CGit
Pack.java: Recover more often in Pack.copyAsIs2()
Matthias Sohn [Tue, 26 Nov 2024 13:17:25 +0000 (14:17 +0100)]
Merge branch 'stable-6.10' into stable-7.0
* stable-6.10:
PackDirectory: Filter out tmp GC pack files
Test advertised capabilities with protocol V0 and allow*Sha1InWant
Align request policies with CGit
Pack.java: Recover more often in Pack.copyAsIs2()
Martin Fick [Sat, 23 Nov 2024 01:08:57 +0000 (17:08 -0800)]
PackDirectory: Filter out tmp GC pack files
git repack passes a ".tmp-XXXX-" prefix to git pack-objects when
repacking. git pack-objects then adds a "pack-XXXXX.pack" to this to
create the name of new packfiles it writes to. PackDirectory was
previously very lenient and would allow these files to be added to its
list of known packfiles. Fix PackDirectory to filter these out since
they are not meant to be consumed yet, and doing so can cause user
facing errors.
Change-Id: I072e57d9522e02049db17d3f4977df7eda14bba7
Signed-off-by: Martin Fick <mfick@nvidia.com>
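A minimal sketch of the filtering idea, not the actual PackDirectory code. It assumes that a finished pack file is named "pack-" followed by 40 hex characters and a known extension, so a name carrying a ".tmp-XXXX-" prefix does not match. The class name PackNameFilter and the pattern are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of JGit.
final class PackNameFilter {
    // Assumed naming convention: "pack-<40 hex chars>.<ext>".
    private static final Pattern PACK_NAME =
            Pattern.compile("pack-[0-9a-f]{40}\\.(pack|idx|bitmap)");

    // matches() requires the whole name to fit, so any prefix such as
    // ".tmp-1234-" makes the name fail the check.
    static boolean isKnownPackFile(String fileName) {
        return PACK_NAME.matcher(fileName).matches();
    }

    // Keep only names that look like finished pack files.
    static List<String> filter(List<String> fileNames) {
        return fileNames.stream()
                .filter(PackNameFilter::isKnownPackFile)
                .collect(Collectors.toList());
    }
}
```

Matching the whole name, rather than searching for "pack-" anywhere in it, is what keeps the temporary files out of the list.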
Yash Chaturvedi [Thu, 14 Nov 2024 14:33:08 +0000 (20:03 +0530)]
Add pack-refs command to the CLI
This command can be used to optimize storage of references.
For a RefDirectory database, it packs non-symbolic, loose refs into
packed-refs. By default, only the references under '$GIT_DIR/refs/tags'
are packed. The '--all' option can be used to pack all the references
under '$GIT_DIR/refs'.
For Reftable, all refs are compacted into a single table.
pszlazak [Sun, 17 Nov 2024 22:24:09 +0000 (23:24 +0100)]
Test advertised capabilities with protocol V0 and allow*Sha1InWant
The advertised capabilities with protocol V0 were untested
leading to potential regressions when advertising what
SHA1 should or should not be on the list of capabilities.
Verify that allow-tip-sha1-in-want and allow-reachable-sha1-in-want
are properly advertised when allow*Sha1InWant is set in
jgit.config.
pszlazak [Sun, 17 Nov 2024 22:11:02 +0000 (23:11 +0100)]
Align request policies with CGit
CGit defines the SHA request policies using a bitmask
that represents which policy is implied by another policy.
For example, in CGit the ALLOW_TIP_SHA1 is 0x01 and ALLOW_REACHABLE_SHA1
is 0x02, which are associated with two different bits in a 3-bit value.
The ALLOW_ANY_SHA1 value is 0x07, which denotes a different policy that
implies the previous two, because it is represented by a 3-bit
bitmask having all ones.
Associate the JGit RequestPolicy enum to the same CGit bitmask values
and use the same logic for the purpose of advertising the server
capabilities.
The JGit code becomes easier to read and associate with its counterpart
in CGit, especially during the capabilities advertising phase.
Also add a new utility method RequestPolicy.implies() which is more
readable than a direct bitmask AND operation.
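The bitmask relation described above can be sketched as follows. This is an illustration only: the enum SketchPolicy and its constant names are hypothetical and are not JGit's RequestPolicy. Only the three mask values (0x01, 0x02, 0x07) come from the commit message.

```java
// Hypothetical enum for illustration; not JGit's RequestPolicy.
enum SketchPolicy {
    TIP(0x01),        // mirrors ALLOW_TIP_SHA1
    REACHABLE(0x02),  // mirrors ALLOW_REACHABLE_SHA1
    ANY(0x07);        // mirrors ALLOW_ANY_SHA1, all three bits set

    private final int mask;

    SketchPolicy(int mask) {
        this.mask = mask;
    }

    // A policy implies another when every bit of the other
    // policy's mask is also set in this policy's mask.
    boolean implies(SketchPolicy other) {
        return (mask & other.mask) == other.mask;
    }
}
```

With these values, ANY implies both TIP and REACHABLE, while TIP and REACHABLE do not imply each other.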
Matthias Sohn [Thu, 21 Nov 2024 08:30:30 +0000 (09:30 +0100)]
Merge branch 'stable-7.0' into stable-7.1
* stable-7.0:
DiffDriver: fix doc for rust built-in
DiffDriver: fix formatting of javadoc
Add numberOfObjectsSinceBitmap to RepoStatistics
Support built-in diff drivers for hunk header function names
Don't fail when trying to prune pack which is already gone
Rename numberOfPackFilesAfterBitmap to numberOfPackFilesSinceBitmap
Matthias Sohn [Wed, 20 Nov 2024 23:31:00 +0000 (00:31 +0100)]
Merge branch 'stable-7.0'
* stable-7.0:
DiffDriver: fix doc for rust built-in
DiffDriver: fix formatting of javadoc
Add numberOfObjectsSinceBitmap to RepoStatistics
Support built-in diff drivers for hunk header function names
Don't fail when trying to prune pack which is already gone
Rename numberOfPackFilesAfterBitmap to numberOfPackFilesSinceBitmap
Matthias Sohn [Wed, 20 Nov 2024 23:24:53 +0000 (00:24 +0100)]
Merge branch 'stable-6.10' into stable-7.0
* stable-6.10:
DiffDriver: fix doc for rust built-in
DiffDriver: fix formatting of javadoc
Add numberOfObjectsSinceBitmap to RepoStatistics
Support built-in diff drivers for hunk header function names
Don't fail when trying to prune pack which is already gone
Rename numberOfPackFilesAfterBitmap to numberOfPackFilesSinceBitmap
Ivan Frade [Wed, 20 Nov 2024 21:09:24 +0000 (13:09 -0800)]
PersonIdent: Preserve the timezone when copying with new time
The PersonIdent(PersonIdent,Date) constructor must create a copy with
the same author/email/timezone but different time. When we changed
the implementation to the new Instant/ZoneId version, we forgot to
pass the timezone. This made some tests fail downstream.
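A minimal sketch of a copy constructor that changes only the time, assuming a simplified identity type. The class Ident and its fields are hypothetical stand-ins and do not reproduce PersonIdent.

```java
import java.time.Instant;
import java.time.ZoneId;

// Hypothetical stand-in for illustration; not JGit's PersonIdent.
final class Ident {
    final String name;
    final String email;
    final Instant when;
    final ZoneId zone;

    Ident(String name, String email, Instant when, ZoneId zone) {
        this.name = name;
        this.email = email;
        this.when = when;
        this.zone = zone;
    }

    // Copy with a new time. The zone is taken from the original;
    // dropping this argument is the kind of bug the commit fixes.
    Ident(Ident original, Instant newWhen) {
        this(original.name, original.email, newWhen, original.zone);
    }
}
```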
Jacek Centkowski [Thu, 31 Oct 2024 17:30:02 +0000 (18:30 +0100)]
Add numberOfObjectsSinceBitmap to RepoStatistics
Introduce a numberOfObjectsSinceBitmap that contains the number of
objects stored in pack files and as loose objects created since the
latest bitmap generation.
Note that the existing
GcNumberOfPackFilesAfterBitmapStatisticsTest.java was renamed to
GcSinceBitmapStatisticsTest.java and extended to cover also this
statistic.
Support built-in diff drivers for hunk header function names
The regexes defined for each built-in driver will be used to determine
the function name for a hunk header. Each driver can specify a list of
regexes to negate and match. They can also define pattern compilation
flags if needed. These drivers only apply to text files with unified
patch type.
The following built-in drivers have been added:
- cpp
- dts
- java
- python
- rust
Support for more languages can be added as needed to match the cgit
implementation.
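How a driver with "negate" and "match" regexes could pick a hunk header function name can be sketched as below. This is an assumption-laden illustration: the class SketchDriver, its method names, and the backward scan are hypothetical and do not reproduce JGit's DiffDriver or its built-in patterns.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical driver for illustration; not JGit's DiffDriver.
final class SketchDriver {
    private final List<Pattern> negate;
    private final List<Pattern> match;

    // flags are java.util.regex.Pattern compilation flags, or 0 for none.
    SketchDriver(List<String> negate, List<String> match, int flags) {
        this.negate = compile(negate, flags);
        this.match = compile(match, flags);
    }

    private static List<Pattern> compile(List<String> regexes, int flags) {
        return regexes.stream()
                .map(r -> Pattern.compile(r, flags))
                .collect(Collectors.toList());
    }

    // Walk backwards from the line before the hunk and return the first
    // line that matches a "match" regex and no "negate" regex.
    String functionNameBefore(List<String> lines, int hunkStart) {
        for (int i = hunkStart - 1; i >= 0; i--) {
            String line = lines.get(i);
            if (matchesAny(negate, line)) {
                continue;
            }
            if (matchesAny(match, line)) {
                return line.trim();
            }
        }
        return null; // no candidate found
    }

    private static boolean matchesAny(List<Pattern> patterns, String line) {
        for (Pattern p : patterns) {
            if (p.matcher(line).find()) {
                return true;
            }
        }
        return false;
    }
}
```

The negate list exists because a loose match pattern for declarations also fits statements such as "return sum(a, b);", which should not be reported as a function name.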
Jacek Centkowski [Mon, 11 Nov 2024 11:48:20 +0000 (12:48 +0100)]
Don't fail when trying to prune pack which is already gone
Update the TestRepository.prunePacked so that it doesn't fail if a pack
to be pruned is already gone.
It is especially handy when the prunePacked function is called in
`TestRepository.packAndPrune` function after the repo moves on after
GC was performed.
Ivan Frade [Fri, 15 Nov 2024 21:14:47 +0000 (13:14 -0800)]
RawParseUtils test: Use java.time to create PersonIdents
The constructor with long/int for time/tz is deprecated in favor of
Instant/ZoneId.
Update first the expectations in the tests to the new constructors, so
we know we are creating the same PersonIdents as before. We update
the code later, knowing there is no regression in e.g. format or
precision.
Matthias Sohn [Tue, 19 Nov 2024 13:58:31 +0000 (14:58 +0100)]
Merge branch 'master' into stable-7.1
* master:
RecursiveMerger: fix boxing warning
UploadPackTest: fix unclosed resource warning
Suppress non-externalized string warnings
Remove unused API problem filters
PersonIdent: Use java.time instead of older Date and milliseconds
GitTimeParser: A date parser using the java.time API
Configure JDT to not raise error on deprecated class linked in javadoc
Update Jetty to 12.0.15
PullCommandTest: assert git status in some simple tests
SystemReader: add method to get LocalDateTime
SystemReader#now: make it a concrete method
[errorprone] RawText: Add parenthesis for explicit op precedence
MockSystemReader: create the right time zone
RawText: improve performance of isCrLfText and isBinary
SystemReader: Give a default implementation to #getTimezoneAt()
[errorprone] ssh: suppress warning for arrays in records
Don't fail when trying to prune pack which is already gone
Ivan Frade [Fri, 15 Nov 2024 18:26:21 +0000 (10:26 -0800)]
PersonIdent: Use java.time instead of older Date and milliseconds
From errorprone: Date has a bad API that leads to bugs; prefer
java.time.Instant or LocalDate.
Replace the long with milliseconds and int with minutes offset with an
Instant and a ZoneOffset. Create new constructors and deprecate
variants with Date, milliseconds and minute offsets.
When comparing instances of PersonIdent truncate the timestamp precision
to 1 second since git commit timestamps are persisted with 1 second
precision [1].
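Comparing timestamps at one-second precision can be sketched with java.time as follows. The helper class SecondPrecision is hypothetical; it only illustrates the truncation step and is not the PersonIdent comparison code.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration; not part of JGit.
final class SecondPrecision {
    // Two instants count as equal when they fall within the same
    // whole second, mirroring the 1 second precision of git timestamps.
    static boolean sameSecond(Instant a, Instant b) {
        return a.truncatedTo(ChronoUnit.SECONDS)
                .equals(b.truncatedTo(ChronoUnit.SECONDS));
    }
}
```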
Ivan Frade [Mon, 11 Nov 2024 21:13:59 +0000 (13:13 -0800)]
GitTimeParser: A date parser using the java.time API
Replacement of GitDateParser that uses java.time classes instead of
the obsolete Date. Updating GitDateParser would have been a mess of
deprecation and methods with confusing names, so I opted for writing a
parallel class with the new types.
Some differences:
* The new DateTimeFormatter is thread-safe, so we don't need the
ThreadLocal cache
* No code seems to use a locale other than the default, so we don't
need to cache per locale either
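Because DateTimeFormatter is immutable and thread-safe, a single shared instance is enough. A minimal sketch, assuming one fixed pattern; the class SharedFormatter and the pattern are illustrative and are not the formats GitTimeParser supports.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not JGit's GitTimeParser.
final class SharedFormatter {
    // One instance shared by all threads; no per-thread cache needed.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMAT);
    }
}
```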
Ivan Frade [Thu, 14 Nov 2024 00:25:42 +0000 (16:25 -0800)]
SystemReader#now: make it a concrete method
Abstract methods break subclasses (e.g. DelegateSystemReader in
gerrit). Updating jgit and gerrit is simpler if we do not add them. I
am not sure why some methods are abstract and others are not, but now()
can be a concrete method.
Make now() concrete. Implement it by default based on
getCurrentTime(), so subclasses overriding that method get the same
value.
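The approach can be sketched as a concrete now() that delegates to an abstract getCurrentTime(). The classes SketchClock and FixedClock are hypothetical; the sketch assumes getCurrentTime() returns milliseconds since the epoch and does not reproduce SystemReader.

```java
import java.time.Instant;

// Hypothetical base class for illustration; not JGit's SystemReader.
abstract class SketchClock {
    // Assumed to return milliseconds since the epoch.
    abstract long getCurrentTime();

    // Concrete default: existing subclasses keep compiling, and any
    // subclass overriding getCurrentTime() gets a consistent now().
    Instant now() {
        return Instant.ofEpochMilli(getCurrentTime());
    }
}

// A subclass written before now() existed; it only overrides the old method.
final class FixedClock extends SketchClock {
    @Override
    long getCurrentTime() {
        return 1_700_000_000_000L;
    }
}
```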
Ivan Frade [Wed, 13 Nov 2024 19:58:06 +0000 (11:58 -0800)]
SystemReader: Give a default implementation to #getTimezoneAt()
This abstract method forces subclasses (e.g. DelegateSystemReader in
gerrit) to update their code, but there is no strong reason to make it
abstract (subclasses can override it if needed).
Make the method concrete using the current default implementation
(which is the same in the mock).
Jacek Centkowski [Mon, 11 Nov 2024 11:48:20 +0000 (12:48 +0100)]
Don't fail when trying to prune pack which is already gone
Update the TestRepository.prunePacked so that it doesn't fail if a pack
to be pruned is already gone. It is especially handy when the
prunePacked function is called in the `TestRepository.packAndPrune`
function after the repo moves on after the GC was performed.
Matthias Sohn [Mon, 11 Nov 2024 23:06:18 +0000 (00:06 +0100)]
Merge branch 'master' into stable-7.1
* master:
errorprone: Disable javadoc checks in tests
Rename numberOfPackFilesAfterBitmap to numberOfPackFilesSinceBitmap
Replace custom encoder Constants#encodeASCII by JDK implementation
Replace custom encoder `Constants#encode` by JDK implementation
DfsGarbageCollector: #setReflogExpire with Instant instead of Date
ssh: Minor simplification in SerialRangeSet
DfsBlockCacheConfig: propagate hotmap configs to pack ext cache configs
SystemReader: Offer methods with java.time API
Add `numberOfPackFilesAfterBitmap` to RepoStatistics
Enhance CommitBuilder#parent to tolerate null parent
GPG: use BC PGP secret key parsing out of the box
[errorprone] bc: Remove unused SExprParser#parseSecretKey
Update bouncycastle to 1.79
Update bytebuddy to 1.15.10
DfsPackCompactor: write object size index
[errorprone] BaseRepositoryBuilder: Use #split(sep, limit)
[errorprone] Remove deprecated security manager
[errorprone] RefDatabase: #getConflictingNames immutable return
DfsGarbageCollector: Add setter for reflog expiration time.
[errorprone] SeparateClassloadertTestRunner: use #split(String,int)
[errorprone] HttpConnection: Add missing summary in java
[errorprone] PackWriter: Fix javadoc tag in new #writeIndex method
[errorprone] ByteArraySet: Add summary fragment to javadoc
[errorprone] util.Stats: Add summary fragment to javadoc
DfsInserter: Read minBytesForObjectSizeIndex only from repo config
PackWriter: make PackWriter.writeIndex() take a PackIndexWriter
dfs: update getBlockCacheStats to return a List of BlockCacheStats
Update mockito to 5.14.2
Update bytebuddy to 1.15.7
Remove unnecessary argument handler in MergeBase.java
Replace custom encoder Constants#encodeASCII by JDK implementation
Ivan Frade [Mon, 11 Nov 2024 19:07:21 +0000 (11:07 -0800)]
errorprone: Disable javadoc checks in tests
Errorprone finds many problems in the tests javadocs. This is noisy in
the logs, but fixing them also disturbs the project history and can
complicate merges.
Disable the javadoc checks in the tests packages. We can fix those
javadocs if some other modification happens in the file (as we fix
older coding style).
Martin Fick [Mon, 23 Sep 2024 19:10:17 +0000 (12:10 -0700)]
Pack.java: Recover more often in Pack.copyAsIs2()
The Pack class is designed to throw
StoredObjectRepresentationNotAvailableException at times when it cannot
find an object which previously was believed to be in its packfile and
it is still possible for the caller, PackWriter.writeObjectImpl(), to
retry copying the object from another file and potentially avoid
causing a user facing error for this fairly common expected situation.
This retry helps handle when repacking causes a packfile to be replaced
by new files with the same objects. Improve copyAsIs2() to make
recovery possible in many more situations.
Once any data for a specific object has been sent, it is very difficult
to recover from that object being relocated to another pack. But if a
read error is detected in copyAsIs2() before sending the object header
(and thus any data), then it should still be recoverable. Fix three
places where we could have recovered because we hadn't sent the header
yet, and adjust another place to send the header a bit later, after
having read some data from the object successfully. Basically, if the
header has not been written yet, throw
StoredObjectRepresentationNotAvailableException to signal that this is
still recoverable.
These fixes should drastically improve the chances of recovery since due
to unix file semantics, if the partial read succeeds, then the full read
will very likely succeed. This is because while the file may no longer
be open when the first read is done (the WindowCache may have evicted
it), once the first read completes it will likely still be open and even
if the file is deleted the WindowCache will continue to be able to read
from it until it closes it.
Change-Id: Ib87e294e0dbacf71b10db55be511e91963b4a84a
Signed-off-by: Martin Fick <mfick@nvidia.com>
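The rule "if the header has not been written yet, signal a recoverable error" can be sketched as below. All names here are hypothetical: RepresentationGone stands in for StoredObjectRepresentationNotAvailableException, and CopySketch.copy is a simplified illustration, not Pack.copyAsIs2().

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for StoredObjectRepresentationNotAvailableException.
class RepresentationGone extends Exception {
    RepresentationGone(Throwable cause) {
        super(cause);
    }
}

// Hypothetical source of object bytes, e.g. a region of a pack file.
interface ObjectSource {
    byte[] read() throws IOException;
}

final class CopySketch {
    // Read some data first and only then send the header. A read failure
    // before the header is written leaves the output untouched, so the
    // caller can retry from another pack.
    static void copy(ObjectSource src, byte[] header, OutputStream out)
            throws IOException, RepresentationGone {
        byte[] data;
        try {
            data = src.read();
        } catch (IOException e) {
            // Nothing has been sent yet: still recoverable.
            throw new RepresentationGone(e);
        }
        out.write(header);
        out.write(data);
    }
}
```

A caller that catches the recoverable exception can look the object up in another pack and call copy again, since no bytes for this object have reached the output.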
Ivan Frade [Fri, 8 Nov 2024 16:49:09 +0000 (08:49 -0800)]
DfsGarbageCollector: #setReflogExpire with Instant instead of Date
The Date API is full of major design flaws and pitfalls and should be
avoided at all costs. Prefer the java.time APIs, specifically,
java.time.Instant (for physical time) and java.time.LocalDate[Time]
(for civil time). [1]
Replace the Date with Instant in the
DfsGarbageCollector#setReflogExpire method.
Laura Hamelin [Wed, 6 Nov 2024 21:41:30 +0000 (13:41 -0800)]
DfsBlockCacheConfig: propagate hotmap configs to pack ext cache configs
CacheHotMap is currently only set on the base DfsBlockCacheConfig and is
not propagated down to PackExt specific caches.
Because CacheHotMap is set from a method call rather than from Configs,
this change sets per-PackExt CacheHotMap configs on PackExt cache
configs both when DfsBlockCacheConfig#setCacheHotMap(...) is called, and
when DfsBlockCacheConfig#configure(...) is called after setCacheHotMap.
The outer DfsBlockCacheConfig keeps the full CacheHotMap for the same
reason that the CacheHotMap config is propagated in both setCacheHotMap
and configure: the order of operations setting the configuration from
Configs and calling setCacheHotMap is not guaranteed.
Ivan Frade [Mon, 4 Nov 2024 22:21:24 +0000 (14:21 -0800)]
SystemReader: Offer methods with java.time API
Error prone explains: The Date API is full of major design flaws and
pitfalls and should be avoided at all costs. Prefer the java.time
APIs, specifically, java.time.Instant (for physical time) and
java.time.LocalDate[Time] (for civil time).
Add to SystemReader methods to get the time and timezone in the new
java.time classes (Instant/ZoneId) and mark as deprecated their old
counterparts.
The mapping of methods is:
* #getCurrentTime -> #now (returns Instant instead of long)
* #getTimezone -> #getTimeZoneAt (returns ZoneOffset instead of int)
* #getTimeZone -> #getTimeZoneId (returns ZoneId instead of TimeZone)
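The difference between a zone and an offset at a given instant can be sketched with plain java.time calls. The helper class TimeSketch is hypothetical and only illustrates the types named in the mapping above; it does not call SystemReader.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper for illustration; not part of JGit.
final class TimeSketch {
    // A ZoneId names a set of rules; the concrete offset depends on
    // the instant, for example because of daylight saving time.
    static ZoneOffset offsetAt(ZoneId zone, Instant when) {
        return zone.getRules().getOffset(when);
    }
}
```

For a zone with daylight saving time, the same ZoneId yields different offsets in winter and in summer, which is why the offset lookup takes an instant.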
Jacek Centkowski [Fri, 20 Sep 2024 06:47:13 +0000 (08:47 +0200)]
Add `numberOfPackFilesAfterBitmap` to RepoStatistics
Introduce a `numberOfPackFilesAfterBitmap` that contains the number of
packfiles created since the latest bitmap generation.
Notes:
* the `repo.getObjectDatabase().getPacks()` that obtains the list of
packs (in the existing `getStatistics` function) uses
`PackDirectory.scanPacks` that boils down to call
`PackDirectory.scanPacksImpl`, which sorts packs prior to returning
them; therefore the `numberOfPackFilesAfterBitmap` is just all packs
before the one that has a bitmap attached
* the improved version of `packAndPrune` function (one that skips
non-existent packfiles) was introduced for testing
Thomas Wolf [Wed, 6 Nov 2024 18:14:47 +0000 (19:14 +0100)]
GPG: use BC PGP secret key parsing out of the box
Remove the custom S-expression parsing; BC has gotten many
improvements in 1.79 regarding PGP ed25519 keys, AES/OCB
encryption, and generally parsing key files. It now can do
all we need.
Change-Id: I392443e040cce150a9575d18795a7cb8195a3515
Signed-off-by: Thomas Wolf <twolf@apache.org>
errorprone complains about using Date in the SExprParser class. All
the usages are in a variant of the parseSecretKey method that doesn't
have any callers.
Matthias Sohn [Tue, 5 Nov 2024 00:29:08 +0000 (01:29 +0100)]
Merge branch 'stable-7.1'
* stable-7.1:
Add missing @since 7.1 to UploadPack#implies
ResolveMerger: Allow setting the TreeWalk AttributesNodeProvider
Add Union merge strategy support
Nasser Grainawi [Tue, 29 Oct 2024 23:22:15 +0000 (17:22 -0600)]
ResolveMerger: Allow setting the TreeWalk AttributesNodeProvider
When a merger is created without a Repository, no
AttributesNodeProvider is created in the TreeWalk. Since mergers are
often created with a custom ObjectInserter and no repo, they skip any
lookups of attributes from any of the gitattributes files (within a
tree, in the repo info/ dir, or user/global). Since there are
potentially merge-affecting attributes in those files, callers might
want to use both a custom ObjectInserter and an AttributesNodeProvider.
Ivan Frade [Fri, 1 Nov 2024 15:58:27 +0000 (08:58 -0700)]
DfsPackCompactor: write object size index
Currently the compactor is not writing the object size index for
packs. As it is using PackWriter to generate the packs, it needs to
explicitly call the writes of each extension.
Invoke writeObjectSizeIndex in the compactor. The pack writer will
write one if the configuration says so.
Ivan Frade [Thu, 31 Oct 2024 19:17:09 +0000 (12:17 -0700)]
[errorprone] Remove deprecated security manager
Errorprone warns about these deprecated classes. The recommendation is
to stop using SecurityManager altogether.
The Security Manager is deprecated and subject to removal in a future
release. There is no replacement for the Security Manager. See JEP 411
[1] for discussion and alternatives.
Errorprone reports that: This method returns both mutable and
immutable collections or maps from different paths. This may be
confusing for users of the method.
Saril Sudhakaran [Tue, 29 Oct 2024 05:17:01 +0000 (00:17 -0500)]
DfsGarbageCollector: Add setter for reflog expiration time.
JGit reftable writer/compactor knows how to prune the history, but the
DfsGarbageCollector doesn't expose the time limit.
Add a method to DfsGarbageCollector to set the reflog time limit.
This value is then passed to the reftable compactor. Callers usually
pass here the value from gc.reflogExpire.
The reflog block length is stored in 24 bits [1], limiting the size to
16MB. I have observed that in repositories with frequent commits,
reflogs hit that size in 6-12 months.