Jonathan Tan [Tue, 4 Apr 2023 23:49:30 +0000 (16:49 -0700)]
PatchApplierTest: specify charset to avoid warning
Running
bazel test //org.eclipse.jgit.test:org_eclipse_jgit_patch_PatchApplierTest
results in a DefaultCharset error on my machine, which can be avoided
by explicitly specifying a charset when calling getBytes on a string. In
these tests, the charset used doesn't really matter, so go with UTF_8.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Change-Id: Id7721cc5b7ea650e77c2db47042715487983cae6
Jonathan Tan [Wed, 5 Apr 2023 20:44:59 +0000 (13:44 -0700)]
GcConcurrentTest: @Ignore flaky testInterruptGc
During my development of Id7721cc5b7ea650e77c2db47042715487983cae6, I
have found this test to be flaky when run by CI. As a speculative fix,
mark this test as @Ignore so it won't be run.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Change-Id: Idfe04d7f1fb72a772d4c8d249ca86a9c2eec0b1a
Reason for revert: This change was based on the false claim that the
packedrefs file lock is held while the CAS is being done, but it is
actually released before the CAS (the in memory lock is still held,
however that does not prevent external actors from updating the
packedrefs files and then another thread from subsequently re-reading it
and updating the in memory packedRefList). Although reverting this
change can cause the CAS to fail, it should not actually matter since
the failure would indicate that another thread has already updated the
in memory packedRefList to either the same version this thread was
trying to update it too, or to a more recent version. Either way,
failing the CAS is then appropriate and should not be problematic.
Although this change reverts the code in the RefDirectory class, it
keeps the "improvements" to the test so that it continues to pass
reliably. The reason for the quotes around the word "improvements" is
because I believe the test alteration actually dramatically changes the
intent of the test, and that the original intent of the test is
untestable with the GC and RefDirectory classes as is.
Change-Id: I3acee7527bb542996dcdfaddfb2bdb45ec444db5 Signed-off-by: Martin Fick <quic_mfick@quicinc.com>
RefDirectory.delete: Prevent failures when packed-refs is outdated
The in-memory copy of packed refs might be outdated by the time the
packed-refs lock is acquired, so ensure the one read from disk is
used after acquiring the lock to prevent commit packed-refs from
throwing an exception. As a side-effect, since this updates the
in-memory copy of packed-refs when it is re-read from disk, it can
prevent other callers needing to re-read if it had changed.
RefDirectory.pack: Only rely on packed refs from disk
Since packed-refs is read from disk anyway, don't rely on the
in-memory copy as that is racy and if outdated, could result in
commit of pack-refs throwing an exception. This change also avoids
a possible unnecessary double read of packed-refs from disk.
RefDirectory: Make pack() and commitPackRefs() void
There are no more callers (since Iae71cb3) of these methods that need
the returned value. These methods should not have been returning
anything in the first place as that can introduce bugs such as the
one described in Iae71cb3.
Implement a snapshotting RefDirectory for use in request scope
Introduce a SnapshottingRefDirectory class which allows users to get
a snapshot of the ref database and use it in a request scope (for
example a Gerrit query) instead of having to re-read packed-refs
several times in a request.
This can potentially be further improved to avoid scanning/reading a
loose ref several times in a request. This would especially help
repeated lookups of a packed ref, where we check for the existence of
a loose ref each time.
Since the first attempt to read a ref is not expected to trigger
a RefsChangedEvent, update the test to ensure 'lastNotifiedModCnt'
is not 0 before we start the actual work. The test has been passing
luckily because createBareRepository in setUp() happens to bump
'lastNotifiedModCnt'.
PackedBatchRefUpdate: Ensure updates are applied on latest packed refs
In the window between refs being packed (via refDb.pack) and obtaining
updates (via applyUpdates), packed-refs may have been updated by another
actor and relying on the previously read contents may lead to losing the
updates done by the other actor. To help avoid this, read packed-refs
from disk to ensure we have the latest copy after it is locked and
before committing updates to it.
Ronald Bhuleskar [Wed, 22 Mar 2023 22:07:19 +0000 (15:07 -0700)]
BasePackFetchConnection: support negotiationTip feature
By default, Git will report, to the server, commits reachable from all local refs to find common commits in an attempt to reduce the size of the to-be-received packfile. If specified with negotiation tip, Git will only report commits reachable from the given tips. This is useful to speed up fetches when the user knows which local ref is likely to have commits in common with the upstream ref being fetched.
When negotation-tip is on, use the wanted refs instead of all refs as source of the "have" list to send.
This is controlled by the `fetch.usenegotationtip` flag, false by default. This works only for programmatic fetches and there is no support for it yet in the CLI.
1. For general errors, throw IOException instead of wrapping them with
PatchApplyException. The wrapping was moved (back) to ApplyCommand.
2. For file specific errors, log the errors as part of
PatchApplier::Result.
3. Change applyPatch() to receive the parsed Patch object, so the caller
can decide how to handle parsing errors.
Background: this utility class was extracted from ApplyCommand on V6.4.0.
During the extraction, we left the exception wrapping by
PatchApplyException intact. This attitude made it harder for the callers to
distinguish between the actual error causes.
kylezhao [Mon, 27 Mar 2023 06:48:31 +0000 (14:48 +0800)]
Ensure FileCommitGraph scans commit-graph file if it already exists
When commit-graph file already exists in the repository, a newly
created FileCommitGraph didn't scan CommitGraph until the file was
modified, resulting in wrong result.
Xing Huang [Tue, 21 Mar 2023 22:27:49 +0000 (17:27 -0500)]
GC: Close File.lines stream
From File#lines javadoc: The returned stream from File Lines
encapsulates a Reader. If timely disposal of file system resources is
required, the try-with-resources construct should be used to ensure
that the stream's close method is
invoked after the stream operations are completed.
Xing Huang [Tue, 21 Mar 2023 22:27:49 +0000 (17:27 -0500)]
GC: Close File.lines stream
From File#lines javadoc: The returned stream from File Lines
encapsulates a Reader. If timely disposal of file system resources is
required, the try-with-resources construct should be used to ensure
that the stream's close method is
invoked after the stream operations are completed.
Matthias Sohn [Fri, 3 Mar 2023 15:04:00 +0000 (16:04 +0100)]
Merge branch 'stable-6.5'
* stable-6.5:
[errorprone] Suppress [Finally] warnings
Update Orbit to R20230302014618 for 2023-03
Improve test coverage when core.trustPackedRefsStat set to after_open
Prepare 6.5.0-SNAPSHOT builds
JGit v6.5.0.202302281825-rc1
Prepare 6.5.0-SNAPSHOT builds
JGit v6.5.0.202302221508-m3
Matthias Sohn [Thu, 2 Mar 2023 09:43:10 +0000 (10:43 +0100)]
[errorprone] Suppress [Finally] warnings
In these cases we use Throwable#addSuppressed to ensure the exception
thrown in the catch block preceding the finally block throwing another
exception isn't lost.
Improve test coverage when core.trustPackedRefsStat set to after_open
As of today, we don't have test coverage for RefDirectory when
core.trustPackedRefsStat config is set to after_open. Thus create new
test classes which set core.trustPackedRefsStat config to after_open in
setup and extend RefDirectoryTest and FileRepositoryBuilderTest
respectively.
Matthias Sohn [Tue, 28 Feb 2023 23:11:41 +0000 (00:11 +0100)]
Merge branch 'master' into stable-6.5
* master:
Change config pull.rebase=preserve to pull.rebase=merges
BatchingProgressMonitor: expose time spent per task
PackWriter: offer to write an object-size index for the pack
Fix formatting in GC#doGc
PackExt: Define new extension for the object size index
Matthias Sohn [Thu, 19 Jan 2023 00:46:50 +0000 (01:46 +0100)]
BatchingProgressMonitor: expose time spent per task
Display elapsed time per task if enabled via
ProgressMonitor#showDuration or if system property or environment
variable GIT_TRACE_PERFORMANCE is set to "true". If both the system
property and the environment variable are set the system property takes
precedence.
Ivan Frade [Tue, 28 Dec 2021 22:23:40 +0000 (14:23 -0800)]
PackWriter: offer to write an object-size index for the pack
PackWriter callers tell the writer what do the want to include in the
pack and invoke #writePack(). Afterwards, they can invoke #writeIndex()
to write the corresponding pack index.
Mirror this for the object-size index, adding a #writeObjectSizeIndex()
method.
Matthias Sohn [Wed, 22 Feb 2023 20:07:34 +0000 (21:07 +0100)]
Merge branch 'master' into stable-6.5
* master:
If tryLock fails to get the lock another gc has it
Fix GcConcurrentTest#testInterruptGc
Don't swallow IOException in GC.PidLock#lock
Check if FileLock is valid before using or releasing it
Matthias Sohn [Wed, 22 Feb 2023 20:06:41 +0000 (21:06 +0100)]
Merge branch 'stable-6.4'
* stable-6.4:
If tryLock fails to get the lock another gc has it
Fix GcConcurrentTest#testInterruptGc
Don't swallow IOException in GC.PidLock#lock
Check if FileLock is valid before using or releasing it
Matthias Sohn [Wed, 22 Feb 2023 20:04:31 +0000 (21:04 +0100)]
Merge branch 'stable-6.3' into stable-6.4
* stable-6.3:
If tryLock fails to get the lock another gc has it
Fix GcConcurrentTest#testInterruptGc
Don't swallow IOException in GC.PidLock#lock
Check if FileLock is valid before using or releasing it
Matthias Sohn [Wed, 22 Feb 2023 20:03:52 +0000 (21:03 +0100)]
Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
If tryLock fails to get the lock another gc has it
Fix GcConcurrentTest#testInterruptGc
Don't swallow IOException in GC.PidLock#lock
Check if FileLock is valid before using or releasing it
Matthias Sohn [Wed, 22 Feb 2023 20:03:22 +0000 (21:03 +0100)]
Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
If tryLock fails to get the lock another gc has it
Fix GcConcurrentTest#testInterruptGc
Don't swallow IOException in GC.PidLock#lock
Check if FileLock is valid before using or releasing it
Matthias Sohn [Wed, 22 Feb 2023 20:02:47 +0000 (21:02 +0100)]
Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
If tryLock fails to get the lock another gc has it
Fix GcConcurrentTest#testInterruptGc
Don't swallow IOException in GC.PidLock#lock
Check if FileLock is valid before using or releasing it
Matthias Sohn [Wed, 22 Feb 2023 20:02:09 +0000 (21:02 +0100)]
Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
If tryLock fails to get the lock another gc has it
Fix GcConcurrentTest#testInterruptGc
Don't swallow IOException in GC.PidLock#lock
Check if FileLock is valid before using or releasing it
Matthias Sohn [Wed, 22 Feb 2023 01:18:10 +0000 (02:18 +0100)]
Merge branch 'master' into stable-6.5
* master:
Use Java 11 ProcessHandle to get pid of the current process
UploadPack: use allow-any-sha1-in-want configuration
Acquire file lock "gc.pid" before running gc
Silence API errors introduced by 9424052f
Matthias Sohn [Wed, 22 Feb 2023 00:29:32 +0000 (01:29 +0100)]
Merge branch 'stable-6.4'
* stable-6.4:
Use Java 11 ProcessHandle to get pid of the current process
Acquire file lock "gc.pid" before running gc
Silence API errors introduced by 9424052f
Matthias Sohn [Wed, 22 Feb 2023 00:28:27 +0000 (01:28 +0100)]
Merge branch 'stable-6.3' into stable-6.4
* stable-6.3:
Use Java 11 ProcessHandle to get pid of the current process
Acquire file lock "gc.pid" before running gc
Silence API errors introduced by 9424052f
Matthias Sohn [Wed, 22 Feb 2023 00:27:50 +0000 (01:27 +0100)]
Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
Use Java 11 ProcessHandle to get pid of the current process
Acquire file lock "gc.pid" before running gc
Silence API errors introduced by 9424052f
Matthias Sohn [Wed, 22 Feb 2023 00:27:16 +0000 (01:27 +0100)]
Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
Use Java 11 ProcessHandle to get pid of the current process
Acquire file lock "gc.pid" before running gc
Silence API errors introduced by 9424052f
Matthias Sohn [Wed, 22 Feb 2023 00:26:36 +0000 (01:26 +0100)]
Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
Use Java 11 ProcessHandle to get pid of the current process
Acquire file lock "gc.pid" before running gc
Silence API errors introduced by 9424052f
kylezhao [Thu, 20 May 2021 03:29:59 +0000 (11:29 +0800)]
UploadPack: use allow-any-sha1-in-want configuration
C git 2.11 supports setting the equivalent of RequestPolicy.ANY with
uploadpack.allowAnySHA1InWant[1]. Parse this into TransportConfig and
use it from UploadPack.
Add additional tests for [2] and this change.
We can execute "git clone --filter=blob:none --no-checkout" successfully
with config uploadPack.allowFilter is true. But when we checkout, the
git will fetch other missing objects required by the checkout(this is
why we need this config).
When both uploadPack.allowFilter and uploadPack.allowAnySHA1InWant are
true, jgit will support partial clone. If you are using an extremely
large monorepo, this feature can help. It allows users to work on an
incomplete repo which reduces disk usage.
Matthias Sohn [Fri, 10 Feb 2023 22:39:20 +0000 (23:39 +0100)]
Acquire file lock "gc.pid" before running gc
Git guards gc by locking a lock file "gc.pid" before starting execution.
The lock file contains the pid and hostname of the process holding the
lock. Git tries to kill the process holding that lock if the lock file
wasn't modified in the last 12 hours and was started from the same host.
Teach JGit to acquire this lock before running gc but skip execution if
another process already holds the lock. Killing the other process could
be undesired if it's a long running application.
If the lock file wasn't modified in the last 12 hours try to lock it and
run gc if locking succeeds.
Register a shutdown hook for the lock file to ensure it is cleaned up if
the process is gracefully killed.
Matthias Sohn [Mon, 20 Feb 2023 21:18:22 +0000 (22:18 +0100)]
Merge branch 'master' into stable-6.5
* master:
Externalize strings introduced in c9552aba
Silence API error introduced by 596c445a
PackConfig: add entry for minimum size to index
Fix getPackedRefs to not throw NoSuchFileException
PackObjectSizeIndex: interface and impl for the object-size index
UInt24Array: Array of unsigned ints encoded in 3 bytes.
PackIndex: expose the position of an object-id in the index
Add pack options to preserve and prune old pack files
DfsPackFile/DfsGC: Write commit graphs and expose in pack
ObjectReader: Allow getCommitGraph to throw IOException
Allow to perform PackedBatchRefUpdate without locking loose refs
Document option "core.sha1Implementation" introduced in 59029aec
UploadPack: consume delimiter in object-info command
PatchApplier fix - init cache with provided tree
Avoid error-prone warning
Fix unused exception error-prone warning
UploadPack: advertise object-info command if enabled
Move MemRefDatabase creation in a separate method.
DfsReaderIoStats: Add Commit Graph fields into DfsReaderIoStats
Matthias Sohn [Mon, 20 Feb 2023 20:26:08 +0000 (21:26 +0100)]
Merge branch 'stable-6.4'
* stable-6.4:
Fix getPackedRefs to not throw NoSuchFileException
Add pack options to preserve and prune old pack files
Allow to perform PackedBatchRefUpdate without locking loose refs
Document option "core.sha1Implementation" introduced in 59029aec
Matthias Sohn [Mon, 20 Feb 2023 20:01:38 +0000 (21:01 +0100)]
Merge branch 'stable-6.3' into stable-6.4
* stable-6.3:
Fix getPackedRefs to not throw NoSuchFileException
Add pack options to preserve and prune old pack files
Allow to perform PackedBatchRefUpdate without locking loose refs
Document option "core.sha1Implementation" introduced in 59029aec
Matthias Sohn [Mon, 20 Feb 2023 19:59:14 +0000 (20:59 +0100)]
Merge branch 'stable-6.2' into stable-6.3
* stable-6.2:
Fix getPackedRefs to not throw NoSuchFileException
Add pack options to preserve and prune old pack files
Allow to perform PackedBatchRefUpdate without locking loose refs
Document option "core.sha1Implementation" introduced in 59029aec
Ivan Frade [Tue, 4 Jan 2022 19:10:04 +0000 (11:10 -0800)]
PackConfig: add entry for minimum size to index
The object size index can have up to #(blobs-in-repo) entries, taking
a relevant amount of memory. Let operators configure the threshold size
to include objects in the size index.
The index will include objects with size *at or above* this
value (with -1 for none). This is more effective for the
filter-by-size case.
Lowering the threshold adds more objects to the index. This improves
performance at the cost of memory/storage space. For the object-size
case, more calls will use the index instead of reading IO. For the
filter-by-size case, lower threshold means better granularity (if
ObjectReader#isSmallerThan is implemented based only on the index).
Matthias Sohn [Thu, 16 Feb 2023 15:59:56 +0000 (16:59 +0100)]
Merge branch 'stable-6.1' into stable-6.2
* stable-6.1:
Fix getPackedRefs to not throw NoSuchFileException
Add pack options to preserve and prune old pack files
Allow to perform PackedBatchRefUpdate without locking loose refs
Document option "core.sha1Implementation" introduced in 59029aec
Matthias Sohn [Thu, 16 Feb 2023 15:56:07 +0000 (16:56 +0100)]
Merge branch 'stable-6.0' into stable-6.1
* stable-6.0:
Add pack options to preserve and prune old pack files
Allow to perform PackedBatchRefUpdate without locking loose refs
Document option "core.sha1Implementation" introduced in 59029aec
Matthias Sohn [Thu, 16 Feb 2023 15:42:58 +0000 (16:42 +0100)]
Merge branch 'stable-5.13' into stable-6.0
* stable-5.13:
Add pack options to preserve and prune old pack files
Allow to perform PackedBatchRefUpdate without locking loose refs
Document option "core.sha1Implementation" introduced in 59029aec
Fix getPackedRefs to not throw NoSuchFileException
Since Files.newInputStream is from java.nio package, it throws
java.nio.file.NoSuchFileException. This was missed in the change
I00da88e. Without this change, getPackedRefs fails with
NoSuchFileException when there is no packed-refs file in a project.
Ivan Frade [Wed, 15 Dec 2021 00:36:54 +0000 (16:36 -0800)]
PackObjectSizeIndex: interface and impl for the object-size index
Operations like "clone --filter=blob:limit=N" or the "object-info"
command need to read the size of the objects from the storage. An
index would provide those sizes at once rather than having to seek in
the packfile.
Introduce an interface for the Object-size index. This index returns
the inflated size of an object. Not all objects could be indexed (to
limit memory usage).
This implementation indexes only blobs (no trees, nor
commits) *above* certain size threshold (configurable). Lower
threshold adds more objects to the index, consumes more memory and
provides better performance. 0 means "all blobs" and -1 "disabled".
If we don't index everything, for the filter use case is more
efficient to index the biggest objects first: the set is small and
most objects are filtered by NOT being in the index. For the
object-size, the more objects in the index the better, regardless
their size. All together, it is more helpful to index above threshold.
Ivan Frade [Fri, 10 Feb 2023 20:21:01 +0000 (12:21 -0800)]
UInt24Array: Array of unsigned ints encoded in 3 bytes.
The object size index stores positions of objects in the main
index (when ordered by sha1). These positions are per-pack and usually
a pack has <16 million objects (there are exceptions but rather
rare). It could save some memory storing these positions in three bytes
instead of four. Note that these positions are sorted and always positive.
Implement a wrapper around a byte[] to access and search "ints" while
they are stored as unsigned 3 bytes.
Ivan Frade [Tue, 17 Jan 2023 18:01:29 +0000 (10:01 -0800)]
PackIndex: expose the position of an object-id in the index
The primary index returns the offset in the pack for an
objectId. Internally it keeps the object-ids in lexicographical order,
but doesn't expose an API to find the position of an object-id in that
list. This is needed for the object-size index, that we want to store
as "position-in-idx, size".
Add a #findPosition(object-id) method to the PackIndex interface to
know where an object-id sits in the ordered list of ids in the pack.
Note that this index position is over the list of ordered object-ids,
while reverse-index position is over the list of objects in packed
order.
Xing Huang [Mon, 6 Feb 2023 20:18:59 +0000 (14:18 -0600)]
DfsPackFile/DfsGC: Write commit graphs and expose in pack
JGit knows how to read/write commit graphs but the DFS stack is not
using it yet.
The DFS garbage collector generates a commit-graph with commits
reachable from any ref. The pack is stored as extra stream in the GC
pack. DfsPackFile mimicks how other indices are loaded storing the
reference in DFS cache.
Xing Huang [Mon, 6 Feb 2023 20:18:16 +0000 (14:18 -0600)]
ObjectReader: Allow getCommitGraph to throw IOException
ObjectReader#getCommitGraph doesn't report errors loading the
commit graph. The caller should be aware of the situation and
ultimately decide what to do.
Add IOException to ObjectReader#getCommitGraph signature. RevWalk
defaults to an empty commit-graph on IO errors.
Saša Živkov [Fri, 21 Oct 2022 14:32:03 +0000 (16:32 +0200)]
Allow to perform PackedBatchRefUpdate without locking loose refs
Add another newBatchUpdate method in the RefDirectory where we can
control if the created PackedBatchRefUpdate will lock the loose refs or
not.
This can be useful in cases when we run programs which have exclusive
access to a Git repository and we know that locking loose refs is
unnecessary and just a performance loss.