In cases where we need to determine if a given commit is merged
into many refs, using isMergedInto(base, tip) for each ref would
cause multiple unwanted walks.
getMergedInto() marks the unreachable commits as uninteresting
which would then avoid walking that same path again.
Using the same api, also introduce isMergedIntoAny() and
isMergedIntoAll()
Change-Id: I65de9873dce67af9c415d1d236bf52d31b67e8fe
Signed-off-by: Adithya Chakilam <quic_achakila@quicinc.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix DateRevQueue tie breaks with more than 2 elements
DateRevQueue is expected to give out the commits that have higher
commit time. But in case of tie(same commit time), it should give
the commit that is inserted first. This is inferred from the
testInsertTie test case written for DateRevQueue. Also that test
case, right now uses just two commits which caused it not to fail
with the current implementation, so added another commit to make
the test more robust.
By fixing the DateRevQueue, we would also match the behaviour of
LogCommand.addRange(c1,c2) with git log c1..c2. A test case for
the same is added to show that current behaviour is not the
expected one.
By fixing addRange(), the order in which commits are applied during
a rebase is altered. Rebase logic should have never depended upon
LogCommand.addRange() since the intended order of addRange() is not
the order a rebase should use. So, modify the RebaseCommand to use
RevWalk directly with TopoNonIntermixSortGenerator.
Add a new LogCommandTest.addRangeWithMerge() test case which creates
commits in the following order:
A - B - C - M
\ /
-D-
Using git 2.30.0, git log B..M outputs: M C D
LogCommand.addRange(B, M) without this fix outputs: M D C
LogCommand.addRange(B, M) with this fix outputs: M C D
Change-Id: I30cc3ba6c97f0960f64e9e021df96ff276f63db7
Signed-off-by: Adithya Chakilam <achakila@codeaurora.org>
Move reachability checker generation into the ObjectReader object
Reachability checkers are retrieved from RevWalk and ObjectWalk objects:
* RevWalk.createReachabilityChecker()
* ObjectWalk.createObjectReachabilityChecker()
Since RevWalks and ObjectWalks are themselves directly instantiated
in hundreds of places (e.g. UploadPack...) overriding them in a
consistent way requires overloading 100s of methods, which isn't
feasible. Moving reachability checker generation to a more central
place solves that problem.
The ObjectReader object seems a good place from which to get
reachability checkers, because reachability checkers return
information about relationships between objects. ObjectDatabases
delegate many operations to ObjectReaders, and reachability bitmaps
are attached to ObjectReaders.
The Bitmapped and Pedestrian reachability checker objects were
package private in the org.eclipse.jgit.revwalk package. This change
makes them public and moves them to the
org.eclipse.jgit.internal.revwalk package. Corresponding tests are
also moved.
Motivation:
1) Reachability checking algorithms need to scale. One of the
internal Android repositories has ~2.4 million refs/changes/*
references, causing bad long tail performance in reachability
checks.
2) Reachability check performance is impacted by repository
topography: number of refs, number of objects, amounts of
related vs. unrelated history.
3) Reachability check performance is also affected by per-branch
access (Gerrit branch permissions) since different users can
see different branches.
4) Reachability check performance isn't affected by any state in a
RevWalk or ObjectWalk.
I don't yet know if a single algorithm will work for all cases in #2
and #3. We may need to evolve the ReachabilityChecker interfaces
over time to solve the Gerrit branch permissions case, or use
Gerrit-specific identity information to solve that in an efficient
way.
This change takes the existing public API and moves it to the
ObjectReader/whole repository level, which is where we can do
consistent customizations for #2 and #3. We intend to upstream the
best of whatever works, but anticipate the need for multiple rounds
of experimentation.
Change-Id: I9185feff43551fb387957c436112d5250486833d
Signed-off-by: Terry Parker <tparker@google.com>
Factor out a common ObjectBuilder as super class of CommitBuilder
and TagBuilder, and make the GpgSigner work on ObjectBuilder.
In order not to break API, add the new method for signing an
ObjectBuilder in a new interface GpgObjectSigner.
The signature for a tag is just tacked onto the end of the tag
message. The message of a signed tag must end in LF.
Bug: 386908
Change-Id: I5e021e3c927f4051825cd7355b129113b949455e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Extract ObjectReachabilityChecker interface from the walk-based
implementation, to add a bitmapped based implementation later.
Refactor the test case to use it for both implementations.
Change-Id: Iaac7c6b037723811956ac22625f27d3b4d742139
Signed-off-by: Ivan Frade <ifrade@google.com>
Preparing the code to optimize the bitmap-based object reachability
checker. We are mirroring first the commit reachability checker
structure (interface + 2 implementations).
Move the walk-base reachability checker to its own class.
This class is public at the moment. Later ObjectWalk will return an
interface and this implementation will be package-private.
Change-Id: Ifac70094e1af137291c3607d95e689992f814b26
Signed-off-by: Ivan Frade <ifrade@google.com>
The error message for an Exception thrown by StartGenerator when given
both the TOPO flag and the TOPO_KEEP_BRANCH_TOGETHER flag mentions a
non-existent flag, TOPO_NON_INTERMIX. The error message was introduced
in commit e498d43.
Replace TOPO_NON_INTERMIX with TOPO_KEEP_BRANCH_TOGETHER in the error
message of an Exception thrown by the StartGenerator when the TOPO flag
is provided together with the TOPO_KEEP_BRANCH_TOGETHER flag.
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Change-Id: Id24640dc08e96a196508fe38ce144aa7e035082f
RevWalk: new topo sort to not mix lines of history
The topological sort algorithm in TopoSortGenerator for RevWalk may mix
multiple lines of history, producing results that differ from C git's
git-log whose man page states: "Show no parents before all of its
children are shown, and avoid showing commits on multiple lines of
history intermixed." Lines of history are mixed because
TopoSortGenerator merely delays producing a commit until all of its
children have been produced; it does not immediately produce a commit
after its last child has been produced.
Therefore, add a new RevSort option called TOPO_KEEP_BRANCH_TOGETHER
with a new topo sort algorithm in TopoNonIntermixGenerator. In the
Generator, when the last child of a commit has been produced, unpop
that commit so that it will be returned upon the subsequent call to
next(). To avoid producing duplicates, mark commits that have not yet
been produced as TOPO_QUEUED so that when a commit is popped, it is
produced if and only if TOPO_QUEUED is set.
To support nesting with other generators that may produce the same
commit multiple times like DepthGenerator (for example, StartGenerator
does this), do not increment parent inDegree for the same child commit
more than once.
Commit b5e764abd2 modified the existing
TopoSortGenerator to avoid mixing lines of history, but it was reverted
in e40c38ab08 because the new behavior
caused problems for EGit users. This motivated adding a new Generator
for the new behavior.
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Change-Id: Icbb24eac98c00e45c175b01e1c8122554f617933
currVisit could be null if a blob is marked as start point in
ObjectWalk. Add null check before skipping current tree.
Change-Id: Ic5d876fe2800f3373d136979be6c27d1bbd38dc1
Signed-off-by: Yunjie Li <yunjieli@google.com>
Revert "RevWalk: stop mixing lines of history in topo sort"
This reverts commit b5e764abd2.
PlotWalk uses the TopoSortGenerator, which is causing problems for EGit users
who rely on the emission of commits being somewhat based on date as in the
previous topo-sort algorithm.
Bug: 560529
Change-Id: I3dbd3598a7aeb960de3fc39352699b4f11a8c226
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
RevWalk: stop mixing lines of history in topo sort
The topological sort algorithm in TopoSortGenerator for RevWalk may mix
multiple lines of history, producing results that differ from C git's
git log whose man page states: "Show no parents before all of its
children are shown, and avoid showing commits on multiple lines of
history intermixed." Lines of history are mixed because
TopoSortGenerator merely delays a commit until all of its children have
been produced; it does not immediately produce a commit after its last
child has been produced.
Therefore, when the last child of a commit has been produced, unpop the
commit so that it will be returned upon the subsequent call to next() in
TopoSortGenerator. To avoid producing duplicates, mark commits that
have not yet been produced as TOPO_QUEUED so that when a commit is
popped, it is produced if and only if TOPO_QUEUED is set.
To support nesting with other generators that may produce the same
commit multiple times like DepthGenerator (for example, StartGenerator
does this), do not increment parent inDegree for the same child commit
more than once.
Modify tests that assert that TopoSortGenerator mixes lines of commit
history.
Change-Id: I4ee03c7a8e5265d61230b2a01ae3858745b2432b
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
BitmappedReachabilityChecker: Use only one bitmap for the whole check
The checker is creating a new bitmap per branch leading to excessive
memory consumption. For the reachability check one bitmap with the
reachability of all branches aggregated is enough.
Build the reachability bitmap with a filter. The filter itself uses it
to emit only commits not reached before and the caller to check what
targets have been reached already.
BitmapCalculator is not required anymore.
Change-Id: Ic5c62f77fe0f188913215b7eaa51d849a9aae6a5
Signed-off-by: Ivan Frade <ifrade@google.com>
ReachabilityChecker: Receive a Stream instead of a Collection
Preparatory change. Converting ObjectIds to RevCommits is potentially
expensive and in the incremental reachability check, it is probably not
required for all elements in the collection.
Pass a Stream to the reachability checker. In the follow up we make
the conversion from ObjectId to RevCommit in the stream (i.e. on
demand). This should reduce the latency of reachability checks over big
sets of references.
Change-Id: I9f310e331de5b0bf8de34143bd7dcd34316d2fba
Signed-off-by: Ivan Frade <ifrade@google.com>
Prevent adding a null parent to a commit's parent array. Doing so
can cause NPEs later on.
Bug: 552160
Change-Id: Ib24b7b9b7b08e0b6f246006b4a4cade7eeb830b9
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
TreeRevFilterTest uses an unncessarily complicated TreeFilter - an
AndTreeFilter - when it should be as simple as possible because this
class tests TreeRevFilter, not AndTreeFilter. Replace the filter with a
simpler one.
Change-Id: I3256a65f6e0042d32fd76a9224b79a835674ff3a
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
RevWalk: Traverse all parents of UNINTERESTING commits
When firstParent is set, RevWalk traverses only the first parent of a
commit, even though that commit is UNINTERESTING. Since we want the
maximal UNINTERESTING set, we shouldn't prune any parents here. This
issue is apparent only when some of the commits being traversed are
unparsed, since walker.carryFlagsImpl() propagates the UNINTERESTING
flag to all parsed ancestors, masking the issue.
Therefore teach RevWalk to traverse all parents when a commit is
UNINTERESTING and not only the first parent. Since this issue is
masked by commit parsing, also test situations when the commits
involved are unparsed.
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Change-Id: I95e2ad9ae8f1f50fbecae674367ee7e0855519b1
[error prone] suppress AmbiguousMethodReference in AnyObjectId
Move the implementation of the static equals() method to a new method
and suppress the error. Deprecate the old method to signal that we
intend to remove it in the next major release.
See https://errorprone.info/bugpattern/AmbiguousMethodReference
Change-Id: I5e29c97f4db3e11770be589a6ccd785e2c9ac7f2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
RevWalk: Add a setFirstParent that mimics C git's --first-parent
RevWalk does not currently provide a --first-parent equivalent and the
feature has been requested.
Add a field to the RevWalk class to specify whether walks should
traverse first parents only. Modify Generator implementations to support
the feature.
Change-Id: I4a9a0d5767f82141dcf6d08659d7cb77c585fae4
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
BitmappedReachabilityChecker: Reachability check using bitmaps
The "basic" reachability check walks the graph starting from the tips
marking things as "uninteresting". If the target commit is marked as
"uninteresting" it was reached; it is reachable from those tips.
This requires a lot of walking and can be solved directly with bitmaps.
Most of the time the bitmaps are already calculated or a short walk
away.
This should improve the performance of reachability checks, for example
in Gitiles.
Change-Id: I83d33271f58d95d2dc9ed151967b3eda513c99f7
Signed-off-by: Ivan Frade <ifrade@google.com>
BitmapCalculator: Get the reachability bitmap of a commit
To make reachability checks with bitmaps, we need to get the
reachability bitmap of a commit, which is not always precalculated.
There is already a class returning such bitmap (BitmapWalker) but it
does too much unnecessary work: it calculates ALL reachable objects from
a commit (i.e. including trees and blobs), when for reachability the
commits are just enough.
Introduce BitmapCalculator to get the bitmap of a commit: either because
it is precalculated or generating it with a walk only over commits.
Change-Id: Ibb6c78affe9eeaf1fa362a06daf4fd2d91c1caea
Signed-off-by: Ivan Frade <ifrade@google.com>
ReachabilityChecker: Default implementation with a RevWalk
It is common to check if a certain commit is reachable from some
starting points. For example gitiles does it to check if a commit
is visible to a user based on its permissions.
Offer this functionality in JGit.
Split the interface as the next commit will introduce an implementation
using bitmap indices.
Change-Id: I0933b305c8d734f7a64502910ff4d9ef4fc92ae1
Signed-off-by: Ivan Frade <ifrade@google.com>
In order to support GPG-signed commits, add some methods which will
allow GPG signatures to be parsed out of RevCommit objects.
Later, we can add code to verify the signatures.
Change-Id: Ifcf6b3ac79115c15d3ec4b4eaed07315534d09ac
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Return parsed objects from TestRepository.commit/tree/blob()
It is convenient for TestRepository to return fully parsed
objects from its commit()/tree()/blob() methods, so that test
code doesn't have to remember to parse them before making
assertions about them.
Update TestRepostiory to return fully parsed objects.
Adjust the tests that are affected by this change in behavior.
Change-Id: I09d03d0c80ad22cb7092f4a2eaed99d40a10af63
Signed-off-by: Terry Parker <tparker@google.com>
Correctly handle initialization of shallow commits
In a new RevWalk, if the first object parsed is one of the
shallow commits, the following happens:
1) RevCommit.parseCanonical() is called on a new "r1" RevCommit.
2) RevCommit.parseCanonical() immediately calls
RevWalk.initializeShallowCommits().
3) RevWalk.initializeShallowCommits() calls lookupCommit(id),
creating and adding a new "r2" version of this same object and
marking its parents empty.
4) RevCommit.parseCanonical() initializes the "r1" RevCommit's
fields, including the parents.
5) RevCommit.parseCanonical()'s caller uses the "r1" commit that
has parents, losing the fact that it is a shallow commit.
This change passes the current RevCommit as an argument to
RevWalk.initializeShallowCommits() so that method can set its
parents empty rather than creating the duplicate "r2" commit.
Change-Id: I67b79aa2927dd71ac7b0d8f8917f423dcaf08c8a
Signed-off-by: Terry Parker <tparker@google.com>
Fix trivial usages of deprecated Repository#getAllRefs
Callers of getAllRefs that only iterate over the `values()` of the
returned map can be trivially fixed to call getRefDatabase().getRefs()
instead.
Only fix those where the calling method is already declared to throw
IOException, to avoid potential API changes.
Change-Id: I2b05f785077a1713953cfd42df7bf915f889f90b
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Remove it from
* package private functions.
* try blocks
* for loops
this was done with the following python script:
$ cat f.py
import sys
import re
import os
def replaceFinal(m):
return m.group(1) + "(" + m.group(2).replace('final ', '') + ")"
methodDecl = re.compile(r"^([\t ]*[a-zA-Z_ ]+)\(([^)]*)\)")
def subst(fn):
input = open(fn)
os.rename(fn, fn + "~")
dest = open(fn, 'w')
for l in input:
l = methodDecl.sub(replaceFinal, l)
dest.write(l)
dest.close()
for root, dirs, files in os.walk(".", topdown=False):
for f in files:
if not f.endswith('.java'):
continue
full = os.path.join(root, f)
print full
subst(full)
Change-Id: If533a75a417594fc893e7c669d2c1f0f6caeb7ca
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Use constants from StandardCharsets instead of hard-coded strings
Instead of hard-coding the charset strings "US-ASCII", "UTF-8", and
"ISO-8859-1", use the corresponding constants from StandardCharsets.
UnsupportedEncodingException is not thrown when the StandardCharset
constants are used, so remove the now redundant handling.
Because the encoding names are no longer hard-coded strings, also
remove redundant $NON-NLS warning suppressions.
Also replace existing usages of the constants with static imports.
Change-Id: I0a4510d3d992db5e277f009a41434276f95bda4e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Suppress "Unlikely argument type for equals()" warnings in tests
This new warning was introduced in Eclipse 4.7 Oxygen [1].
The only instances of the warning are in test code that is asserting
that some class does not compare equal to Strings. As in the Gerrit
project [2] these asserts are arguably overkill, but arguably also
a reasonable test of an equals implementation. Ignore the warning in
these cases.
Note that if the project is opened in an earlier version of Eclipse,
a warning "Unsupported @SuppressWarnings" will be emitted.
[1] https://www.eclipse.org/eclipse/news/4.7/M6/
[2] https://gerrit-review.googlesource.com/#/c/gerrit/+/110339/
Change-Id: I08ea33d71e6009cf0f37e6492a475931f447256b
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
RevFlagSetTest: Fix compilation error flagged by error prone
This fixes error flagged by error prone:
Java compilation in rule '//org.eclipse.jgit.test:jgit' failed: Worker
process sent response with exit code: 1.
org.eclipse.jgit.test/tst/org/eclipse/jgit/revwalk/RevFlagSetTest.java:149:
error: [CollectionIncompatibleType] Argument '"bob"' should not be
passed to this method; its type String is not compatible with its
collection's type argument RevFlag
assertFalse(set.contains("bob"));
Change-Id: I4a971ce92fee55e28b2ab0c7b716ac20fa9c6709
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Enable and fix warnings about redundant specification of type arguments
Since the introduction of generic type parameter inference in Java 7,
it's not necessary to explicitly specify the type of generic parameters.
Enable the warning in Eclipse, and fix all occurrences.
Change-Id: I9158caf1beca5e4980b6240ac401f3868520aad0
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Enable and fix 'Should be tagged with @Override' warning
Set missingOverrideAnnotation=warning in Eclipse compiler preferences
which enables the warning:
The method <method> of type <type> should be tagged with @Override
since it actually overrides a superclass method
Justification for this warning is described in:
http://stackoverflow.com/a/94411/381622
Enabling this causes in excess of 1000 warnings across the entire
code-base. They are very easy to fix automatically with Eclipse's
"Quick Fix" tool.
Fix all of them except 2 which cause compilation failure when the
project is built with mvn; add TODO comments on those for further
investigation.
Change-Id: I5772061041fd361fe93137fd8b0ad356e748a29c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Fix JGits merge-base calculation in case of inconsistent commit times.
JGit was potentially failing to compute correct merge-bases when the
commit times where inconsistent (a parent commit was younger than a
child commit). The code in MergeBaseGenerator was aware of the fact that
sometimes the discovery of a merge base x can occur after the parents of
x have been seen (see comment in #carryOntoOne()). But in the light of
inconsistent commit times it was possible that these parents of a
merge-base have already been returned as a merge-base.
This commit fixes the bug by buffering all commits generated by
MergeBaseGenerator. It is expected that this buffer will be small
because the number of merge-bases will be small. Additionally a new
flag is used to mark the ancestors of merge-bases. This allows to filter
out the unwanted commits.
Bug: 507584
Change-Id: I9cc140b784c3231b972bd2c3de61a789365237ab
There was a bug when carrying over flags from a merge commit to its
non-first parents. The first parent of a merge commit was handled
differently and correct but the non-first parents are handled by a
recursive algorithm. Flags should be copied from the root merge commit
to parent-2, to grandparent-2, ... up to the limit of STACK_DEPTH==500
parents-levels. But the recursive algorithm was always copying only to
the direct parents of the merge commit and not the grand*-parents.
This seems to be no problem when commits are handled in a strict date
order because then copying only one level is no problem if children are
handled before parents. But when commits are not seperated anymore by
distinctive correct dates (e.g. because all commits have the same date)
then it may happen that a merge-parent is handled before the merge
commit and when dealing later with the merge commit one has to copy
flags down to more than one level
Bug: 501211
Change-Id: I2d79a7cf1e3bce21a490905ccd9d5e502d7b8421