Thomas Wolf [Wed, 21 Jun 2017 09:40:04 +0000 (11:40 +0200)]
Fix handling of pushInsteadOf
According to [1], pushInsteadOf is
1. applied to the uris, not to the pushUris
2. ignored if a remote has an explicit pushUri
JGit applied it only to the pushUris. As a result, pushInsteadOf was
ignored for remotes having only a uri, but no pushUri.
This commit implements (1) if there are no pushUris. I did not dare
implement (2) because:
* there are explicit tests for it that expect that pushInsteadOf gets
applied to existing pushUrls, and
* people may actually use and rely on this JGit behavior.
[1] https://git-scm.com/docs/git-config
Bug: 393170
Change-Id: I6dacbf1768a105190c2a8c5272e7880c1c9c943a Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Dave Borowitz [Wed, 2 Aug 2017 20:50:57 +0000 (16:50 -0400)]
Eliminate SectionParser construction boilerplate
Happily, most anonymous SectionParser implementations can be replaced
with FooConfig::new, as long as the constructor takes a single Config
arg. Many of these, the non-public ones, can in turn be inlined. A few
remaining SectionParsers can be lambdas.
Very short abbreviations that are under 8 hex digits do not
have values in w2. Use w1 as the Java hashCode() instead, so
that the prefix of the abbreviation is always included in the
hashing function used by any java.util.Collection type.
Matthias Sohn [Sun, 16 Jul 2017 22:31:13 +0000 (00:31 +0200)]
Replace findbugs by spotbugs
SpotBugs [1] is the spiritual successor of FindBugs, carrying on from
the point where it left off with support of its community.
[1] http://spotbugs.readthedocs.io/
Change-Id: I127f2c54b04265b6565e780116617ffa8a4d7eaf Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Dave Borowitz [Fri, 28 Jul 2017 14:48:25 +0000 (10:48 -0400)]
RefDirectory: Add in-process fair lock for atomic updates
In a server scenario such as Gerrit Code Review, there may be many
atomic BatchRefUpdates contending for locks on both the packed-refs file
and some subset of loose refs. We already retry lock acquisition to
improve this situation slightly, but we can do better by using an
in-process lock. This way, instead of retrying and potentially exceeding
their timeout, different threads sharing the same Repository instance
can wait on a fair lock without having to touch the disk lock. Since a
server is probably already using RepositoryCache anyway, there is a high
likelihood of reusing the Repository instance.
Dave Borowitz [Wed, 26 Jul 2017 20:53:21 +0000 (16:53 -0400)]
RefDirectory: Retry acquiring ref locks with backoff
If a repo frequently uses PackedBatchRefUpdates, there is likely to be
contention on the packed-refs file, so it's not appropriate to fail
immediately the first time we fail to acquire a lock. Add some logic to
RefDirectory to support general retrying of lock acquisition.
Currently, there is a hard-coded wait starting at 100ms and backing off
exponentially to 1600ms, for about 3s of total wait. This is no worse
than the hard-coded backoff that JGit does elsewhere, e.g. in
FileUtils#delete. One can imagine a scheme that uses per-repository
configuration of backoff, and the current interface would support this
without changing any callers.
Zhen Chen [Tue, 20 Jun 2017 22:22:09 +0000 (15:22 -0700)]
Add dfs fsck implementation
JGit already had some fsck-like classes like ObjectChecker which can
check for an individual object.
The read-only FsckPackParser which will parse all objects within a pack
file and check it with ObjectChecker. It will also check the pack index
file against the object information from the pack parser.
Dave Borowitz [Mon, 17 Jul 2017 14:01:10 +0000 (10:01 -0400)]
PackedBatchRefUpdate: Write reflogs
On-disk reflogs are not stored in the packed-refs file, so we cannot
ensure atomic updates. We choose the lesser evil of dropping failed
reflog updates on the floor, rather than throwing an exception even
though the underlying ref updates succeeded.
Dave Borowitz [Mon, 17 Jul 2017 15:49:36 +0000 (11:49 -0400)]
Improve BatchRefUpdateTest readability
* Factor out helpers for setting up and executing updates.
* Use common assert methods, with a special enum type that papers over
the fact that there is no ReceiveCommand.Result for transaction
aborted.
* Static import ReceiveCommand.Type constants.
* Add blank lines to separate repo setup, update execution, and asserts.
Dave Borowitz [Mon, 17 Jul 2017 15:03:38 +0000 (11:03 -0400)]
Move BatchRefUpdate tests to a new file
Run with @Parameterized, so we don't have to duplicate test setup for
each atomic/non-atomic test. We still have to have two different sets of
asserts for the cases where the behavior is different. In fact, this is
a readability win: it emphasizes that performing the exact same setup
except for the atomic setting will have different behavior.
Dave Borowitz [Wed, 5 Jul 2017 17:45:39 +0000 (13:45 -0400)]
Implement atomic BatchRefUpdates for RefDirectory
The existing packed-refs file provides a mechanism for implementing
atomic multi-ref updates without any changes to the on-disk format or
lockfile protocol. We just need to make sure that there are no loose
refs involved in the transaction, which we can achieve by packing the
refs while holding locks on all loose refs. Full details of the
algorithm are in the PackedBatchRefUpdate javadoc.
This change does not implement reflog support, which will come in a
later change.
Dave Borowitz [Fri, 7 Jul 2017 15:38:29 +0000 (11:38 -0400)]
Separate RefUpdate.Result.REJECTED_{MISSING_OBJECT,OTHER_REASON}
ReceiveCommand.Result has a slightly richer set of possibilities, so it
makes sense for RefUpdate.Result to have more values in order to match.
In particular, this allows us to return REJECTED_MISSING_OBJECT from
RefUpdate when an object is missing.
The comment in RefUpdate#safeParse about expecting some old objects to be
missing is only applicable to the old ID, not the new ID. A missing new
ID is a bug or programmer error, and we should not update a ref to point
to one.
Fix various tests that started failing because they depended for no good
reason on setting refs to point to nonexistent objects; it's always easy
to create a real object when necessary.
It is possible that some downstream users of RefUpdate.Result might
choose to handle one of the new statuses differently, for example by
providing a more user-readable error message; that is not done in this
change.
David Pursehouse [Mon, 24 Jul 2017 07:38:42 +0000 (03:38 -0400)]
Merge changes from topic 'packed-batch-ref-update'
* changes:
Add tests for updating single refs to missing objects
Fix deleting symrefs
RefDirectory: Throw exception if CAS of packed ref list fails
ReceiveCommand: Explicitly check constructor preconditions
BatchRefUpdate: Document when getPushOptions is null
Make 'inCoreLimit' of LocalFile used in ResolveMerger configurable
This change makes it possible to configure the 'inCoreLimit' of LocalFile
used in ResolveMerger#insertMergeResult. Since LocalFile itself has some
risks, e.g. it may be left behind as garbage in case of failure. It should
be good to be able to control the size limit for using LocalFile.
dfs: optionally store blockSize in DfsPackDescription
Allow a DFS implementation to report blockSize to DfsPackFile,
bypassing alignment errors and corrections in the DfsBlockCache when
the blockSize of a specific file differs from the cache's configured
blockSize.
When a file uses a different block size (e.g. 500) than the cache
(e.g. 512), and the DfsPackFile's blockSize field has not been
initialized, the cache misaligns block loads. The cache uses its
default of 512 to compute the block alignment instead of the file's
500.
This causes DfsReader try to set an empty range into an Inflater,
resulting in an object being unable to load.
Holding the current DfsBlock in a local variable 'b' may prevent the
Java GC from reclaiming it while loading the next block. Remove the
local variable and rely only on the field.
dfs: test for repositories sharing blocks in DfsBlockCache
Simple test to verify two DfsRepository instances will reuse the same
DfsBlocks in the DfsBlockCache, even though the DfsStreamKey instance
is now different between their DfsPackFile instances.
dfs: only create DfsPackFile if description has PACK
In the future with reftable a DFS implementation may choose to create
a PackDescription that contains only a REFTABLE extension. Filter
these out by only creating a DfsPackFile if the PackDescription as the
expected PackExt.PACK.
dfs: Fix default DfsStreamKey to include DfsRepositoryDescription
Not all DFS implementations use globally unique pack names in the
DfsPackDescription. Most require the DfsRepositoryDescription to
qualify the pack. Include DfsRepositoryDescription in the default
DfsStreamKey implementation, to prevent cache collisions.
Using a HashMap is overkill for this storage. PackExt is a
constrained type that permits no more than 32 unique values in the JVM.
Each is assigned a unique index (getPosition), which can be used as
indexes in a simple long[].
dfs: Fix caching of index, bitmap index, reverse index
When 07f98a8b71 ("Derive DfsStreamKey from DfsPackDescription")
stopped caching DfsPackFile in the DfsBlockCache, the DfsPackFile began
to always load the idx, bitmap, or compute reverse index, as the cache
handles were no longer populated by prior requests.
Rework caching to lookup the objects from the DfsBlockCache if the
local DfsPackFile handle is invalid. This allows the DfsPackFile to
be more of a flyweight instance across requests.
dfs: Use special ForReverseIndex DfsStreamKey wrapper instead of derive
While implementing a custom subclass of DfsStreamKey it became obvious
the required derive(String) was making it impossible to construct an
efficient key in all cases.
Instead, use a special wrapper type ForReverseIndex around the INDEX's
own DfsStreamKey to denote the reverse index stream in the
DfsBlockCache. This adds a smaller layer of boxing, but eliminates
weird issues for DFS implementors using specialized DfsStreamKey
implementations for space efficiency reasons.
Now that DfsStreamKey is reasonably light-weight, avoid allocating the
index and reverse index keys until necessary. DfsPackFile mostly
holds the DfsBlockCache.Ref handle to the object, and only needs the
DfsStreamKey when its looking up the handle.
By making this a deterministic function, DfsBlockCache can stop
retaining a map of every DfsPackDescription it has ever seen. This
fixes a long standing memory leak in DfsBlockCache.
This refactoring also simplifies the idea of setting up more
lightweight objects around streams.
Dave Borowitz [Fri, 7 Jul 2017 14:31:01 +0000 (10:31 -0400)]
Add tests for updating single refs to missing objects
The reader may find it surprising that this succeeds without incident
unless there is peeling or a fast-forward check involved. This behavior
may be changed in the future, but for now, just document the current
behavior.
Dave Borowitz [Fri, 7 Jul 2017 15:24:51 +0000 (11:24 -0400)]
Fix deleting symrefs
The RefDirectory implementation of doDelete never considered whether to
delete a symref or its leaf, because the detachingSymbolicRef bit was
never exposed from RefUpdate. The behavior was thus incorrectly to
always delete the symref, never the leaf.
There was no test for this behavior. The only thing that attempted to be
a test was testDeleteHeadInBareRepo, but this test was broken for
reasons unrelated to this bug. Specifically, it set the leaf to point to
a completely nonexistent object, and then asserted that deleting HEAD
resulted in NO_CHANGE. The only reason this test ever passed is because
of a quirk of updateImpl, which treats a missing object as the same as
null. This quirk aside, the test wasn't really testing the right thing.
Turn this into a real test by writing out a real object and pointing the
leaf at that.
Also, add a test for the detachingSymbolicRef case, i.e. deleting the
symref and leaving the leaf alone.
Dave Borowitz [Wed, 5 Jul 2017 18:19:36 +0000 (14:19 -0400)]
RefDirectory: Throw exception if CAS of packed ref list fails
The contents of the packedRefList AtomicReference should never differ
from what we expect prior to writing, because this segment of the code
is protected by the packed-refs lock file on disk. If it does happen,
whether due to programmer error or a rogue process not respecting the
locking protocol, it's better to let the caller know than to silently
drop the whole commit operation on the floor.
The existing concurrentOnlyOneWritesPackedRefs test is inherently
nondeterministic as written, and was already about 6% flaky as measured
by bazel:
$ bazel test --runs_per_test=200 //org.eclipse.jgit.test:org_eclipse_jgit_internal_storage_file_GcPackRefsTest
...
INFO: Elapsed time: 42.608s, Critical Path: 10.35s
//org.eclipse.jgit.test:org_eclipse_jgit_internal_storage_file_GcPackRefsTest FAILED in 12 out of 200 in 1.6s
Stats over 200 runs: max = 1.6s, min = 1.1s, avg = 1.3s, dev = 0.1s
This flakiness was caused by the assumption that exactly one of the 2
threads would fail, when both might actually succeed in practice due to
racing on the compare-and-swap.
For whatever reason, this change affected the interleaving behavior in
such a way that the flakiness jumped to around 50%. Making the
interleaving of the test fully deterministic is beyond the scope of this
change, but a simple tweak to the assertion is enough to make it pass
consistently 200+ times both before and after this change.
Some downstream code checks whether a ReceiveCommand is a create or a
delete based on the type field. Other downstream code (in particular a
good chunk of Gerrit code I wrote) checks the same thing by comparing
oldId/newId to zeroId. Unfortunately, there were no strict checks in the
constructor that ensures that zeroId is only set for oldId/newId if the
type argument corresponds, so a caller that passed mismatched IDs and
types would observe completely undefined behavior as a result. This is
and always has been a misuse of the API; throw IllegalArgumentException
so the caller knows that it is a misuse.
Similarly, throw from the constructor if oldId/newId are null. The
non-nullness requirement was already documented. Fix RefDirectoryTest to
not do the wrong thing.
This new base class has the minimum set of properties and methods
necessary for DfsBlockCache to manage blocks of a file in the cache.
Subclasses can use DfsBlockCache for any content.
This refactoring opens the door for additional PackExt types other
than PACK to be stored on a block-by-block basis by the DfsBlockCache.
Instead of overloading the pack's DfsStreamKey with negative positions
for the idx, reverse idx and bitmap, assign a unique DfsStreamKey for
each of these related streams.
Dave Borowitz [Fri, 7 Jul 2017 17:51:25 +0000 (13:51 -0400)]
Merge changes from topic 'packed-batch-ref-update'
* changes:
RefList: Support capacity <= 0 on new builders
Short-circuit writing packed-refs if no refs were packed
BatchRefUpdate: Clarify some ref prefix calls
Zhen Chen [Tue, 20 Jun 2017 21:28:25 +0000 (14:28 -0700)]
Make possible to overwrite the object count
Right now, PackParser relies on the object count from the pack header.
However, when creating Dfs INSERT packs, the object count is not known
at the beginning of the operation. And when we append the base to a
RECEIVE pack, we can't modify the pack header for object count in most
Dfs implementations.
Make it possible to tell PackParser the expected object count by adding
a setter for expectedObjectCount, implementation can overwrite the
object count in onPackHeader function.
Dave Borowitz [Wed, 5 Jul 2017 13:57:01 +0000 (09:57 -0400)]
BatchRefUpdate: Clarify some ref prefix calls
Inline the old addRefToPrefixes, since it was just a glorified addAll.
Split getPrefixes into a variant, addPrefixesTo, that doesn't allocate a
small Collection on every invocation. Use this in the tight loop of
getTakenPrefixes.
Shawn Pearce [Tue, 27 Jun 2017 16:51:39 +0000 (09:51 -0700)]
Use read ahead during copyPackThroughCache
If a block is missing from the block cache, open the pack stream,
retain the ReadableChannel, and turn on read-ahead. This should help
to load a medium sized pack into a cold cache more quickly from a
slower IO stream, as the pack is scanned sequentially and missing
blocks are more likely to be available through the read-ahead.
Mathieu Cartaud [Mon, 22 May 2017 08:33:52 +0000 (10:33 +0200)]
Support -merge attribute in binary macro
The merger is now able to react to the use of the merge attribute.
The value unset and the custom value 'binary' are handled (-merge
and merge=binary)
Since the specification of the merge attribute states that when the
attribute is unset, ours version must be kept in case of a conflict, we
don't overwrite the file but keep the local version.
David Turner [Wed, 21 Jun 2017 19:23:03 +0000 (15:23 -0400)]
Add a test for parsing fsck config options and expose FsckMode enum
These config options allow overriding the message type (error, warn or
ignore) of a specific message ID such as missingEmail.
The supported fsck message IDs are defined in ObjectChecker.ErrorType.
Since TransferConfig.FsckMode wasn't public parsing fsck configuration
options like e.g. fsck.missingEmail=ignore failed with an
IllegalAccessException. Fix this by declaring this enum public.
Change-Id: I3f41ff7a76a846250a63ce92a9fd111eb347269f Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Oliver Lockwood [Wed, 21 Jun 2017 16:25:19 +0000 (17:25 +0100)]
Fix bug in multiple tag handling on DescribeCommand
In the case of multiple tags on the same commit, jgit previously
only ever looked at the last of those tags; git behaviour is to
return the first tag (or first matching one if --match is
specified).
Bug: 518377
Change-Id: I3b6b58ad9f8aa3879ae35b84542b7bddc74a27d6 Signed-off-by: Oliver Lockwood <oliver.lockwood@cantab.net>
David Pursehouse [Mon, 12 Jun 2017 04:41:20 +0000 (13:41 +0900)]
Add tests for SubmoduleConfig
Change-Id: Idcc93c2ca95938995d489cffda649c7d7b26c50e Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I32024b36093eb095539e02b1788d074bc5237d9f Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Terry Parker [Wed, 14 Jun 2017 21:02:45 +0000 (14:02 -0700)]
Add a new singlePack option to PackConfig
If set, "singlePack" will create a single GC pack file for all
objects reachable from refs/*. If not set, the GC pack will contain
object reachable from refs/heads/* and refs/tags/*, and the GC_REST
pack will contain all other reachable objects.
Change-Id: I56bcb6a9da2c10a0909c2f940c025db6f3acebcb Signed-off-by: Terry Parker <tparker@google.com>
David Pursehouse [Sun, 11 Jun 2017 11:24:12 +0000 (20:24 +0900)]
Merge branch 'stable-4.8'
* stable-4.8:
Use a dedicated executor to run auto-gc in command line interface
Allow to use an external ExecutorService for background auto-gc
Fetch: Add --recurse-submodules and --no-recurse-submodules options
Fix capitalization of command help summaries
Change-Id: I7c85f11daa34c11c7f6389de885a2183a686197e Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Sun, 11 Jun 2017 09:51:59 +0000 (11:51 +0200)]
Use a dedicated executor to run auto-gc in command line interface
WorkQueue uses daemon threads so auto-gc would not be executed after
short-lived commands run in command line. Hence use a dedicated executor
which we shutdown when the command finishes.
Change-Id: I0c2429ecfa04387389d159168ba78a020a696228 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Sat, 10 Jun 2017 23:20:34 +0000 (01:20 +0200)]
Allow to use an external ExecutorService for background auto-gc
If set use the external executor, otherwise use JGit's own simple
WorkQueue. Move WorkQueue to an internal package so we can reuse it
without exposing it in the public API.
Change-Id: I060d62ffd6692362a88b4bf13ee07b0dc857abe9 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Pursehouse [Fri, 24 Mar 2017 01:26:44 +0000 (10:26 +0900)]
Fetch: Add --recurse-submodules and --no-recurse-submodules options
Add options to control recursion into submodules on fetch.
Add a callback interface on FetchCommand, to allow Fetch to display
an update "Fetching submodule XYZ" for each submodule.
Change-Id: Id805044b57289ee0f384b434aba1dbd2fd317e5b Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Pursehouse [Sat, 10 Jun 2017 05:14:12 +0000 (14:14 +0900)]
Merge branch 'stable-4.8'
* stable-4.8:
SubmoduleUpdateCommand#setCallback should return 'this'
CloneCommand#setCallback should return 'this'
Prepare 4.7.2-SNAPSHOT builds
JGit v4.7.1.201706071930-r
ArchiveCommand: Create prefix entry with commit time
Run auto GC in the background
Update Orbit to the Oxygen version R20170516192513
Change-Id: Ibf90b4899d097474e7836e6baab8829e66fca524 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Fri, 9 Jun 2017 22:58:23 +0000 (00:58 +0200)]
SubmoduleUpdateCommand#setCallback should return 'this'
The other methods in this class follow the builder pattern, and
return 'this', allowing multiple method calls to be chained in a
single statement.
Update the setCallback method to do the same.
Change-Id: I4ddaacd6d50601f47f61eb6be8b62c8d59cce062 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>