Terry Parker [Tue, 10 Jul 2018 00:42:21 +0000 (17:42 -0700)]
GC: Trim EWAHCompressedBitmaps to free unused memory
The "Building bitmaps" GC phase fails for large repositories (repos with
10M objects use 1.25MB per uncompressed bitmap, and those with long
histories may build >25k bitmaps). Since these bitmaps xor well against
each other, the actual space needed for each compressed bitmap is
usually no more than a few KB. Calling trim() will ensure we aren't
holding on to excess memory.
Change-Id: I40bf78c730b9f6051da6025f9777ce27220a5b0a Signed-off-by: Terry Parker <tparker@google.com>
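The memory figures quoted above can be checked with a short calculation. The following is only a back-of-the-envelope sketch: the class and method names are hypothetical, and it assumes one bit per object in an uncompressed bitmap.

```java
// Rough estimate of uncompressed bitmap memory during the
// "Building bitmaps" GC phase. The object and bitmap counts are the
// figures quoted in the commit message; the names are illustrative.
final class BitmapMemoryEstimate {
    /** Bytes needed for one uncompressed bitmap with one bit per object. */
    static long uncompressedBytes(long objectCount) {
        return (objectCount + 7) / 8; // round up to whole bytes
    }

    public static void main(String[] args) {
        long perBitmap = uncompressedBytes(10_000_000L); // one bitmap for 10M objects
        long worstCase = perBitmap * 25_000L;            // 25k bitmaps, none compressed
        System.out.println(perBitmap + " bytes per bitmap");
        System.out.println(worstCase + " bytes if no bitmap were compressed");
    }
}
```

The gap between this worst case and a few KB per compressed bitmap is the memory that trimming the backing buffers is meant to give back.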
Dave Borowitz [Wed, 20 Jun 2018 12:48:37 +0000 (08:48 -0400)]
Make DfsConfig public
This may be convenient for downstream implementers who require a dummy
StoredConfig implementation, rather than making them reimplement the two
abstract StoredConfig methods.
Thomas Wolf [Tue, 3 Jul 2018 06:46:12 +0000 (08:46 +0200)]
Add response message, if any, on HTTP status 404
Try to give as much information as possible. The connection's
response message might contain additional hints as to why the
connection could not be established.
Bug: 536541
Change-Id: I7230e4e0be9417be8cedeb8aaab35186fcbf00a5 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Terry Parker [Wed, 27 Jun 2018 00:22:07 +0000 (17:22 -0700)]
Return parsed objects from TestRepository.commit/tree/blob()
It is convenient for TestRepository to return fully parsed
objects from its commit()/tree()/blob() methods, so that test
code doesn't have to remember to parse them before making
assertions about them.
Update TestRepository to return fully parsed objects.
Adjust the tests that are affected by this change in behavior.
Change-Id: I09d03d0c80ad22cb7092f4a2eaed99d40a10af63 Signed-off-by: Terry Parker <tparker@google.com>
Terry Parker [Tue, 26 Jun 2018 23:44:01 +0000 (16:44 -0700)]
Correctly handle initialization of shallow commits
In a new RevWalk, if the first object parsed is one of the
shallow commits, the following happens:
1) RevCommit.parseCanonical() is called on a new "r1" RevCommit.
2) RevCommit.parseCanonical() immediately calls
RevWalk.initializeShallowCommits().
3) RevWalk.initializeShallowCommits() calls lookupCommit(id),
creating and adding a new "r2" version of this same object and
marking its parents empty.
4) RevCommit.parseCanonical() initializes the "r1" RevCommit's
fields, including the parents.
5) RevCommit.parseCanonical()'s caller uses the "r1" commit that
has parents, losing the fact that it is a shallow commit.
This change passes the current RevCommit as an argument to
RevWalk.initializeShallowCommits() so that method can set its
parents empty rather than creating the duplicate "r2" commit.
Change-Id: I67b79aa2927dd71ac7b0d8f8917f423dcaf08c8a Signed-off-by: Terry Parker <tparker@google.com>
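The five steps above can be reproduced in a heavily simplified model. This is a sketch only: the Commit class, the string ids and the "parents == null" guard are assumptions made for illustration and are not the actual RevWalk or RevCommit code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical model of the duplicate-object problem.
final class ShallowInitSketch {
    static final class Commit {
        final String id;
        List<String> parents; // null until parsed or marked shallow
        Commit(String id) { this.id = id; }
    }

    final Map<String, Commit> objects = new HashMap<>();
    final Set<String> shallowIds = new HashSet<>();
    private boolean shallowInitialized;

    Commit lookupCommit(String id) {
        return objects.computeIfAbsent(id, Commit::new);
    }

    /** Old behaviour: shallow commits are always resolved through lookupCommit. */
    void initializeShallowCommitsOld() {
        if (shallowInitialized) return;
        shallowInitialized = true;
        for (String id : shallowIds) {
            lookupCommit(id).parents = new ArrayList<>(); // creates "r2"
        }
    }

    /** New behaviour: the commit currently being parsed is reused. */
    void initializeShallowCommitsNew(Commit current) {
        if (shallowInitialized) return;
        shallowInitialized = true;
        for (String id : shallowIds) {
            Commit c = id.equals(current.id) ? current : lookupCommit(id);
            c.parents = new ArrayList<>();
        }
    }

    Commit parseOld(String id, List<String> storedParents) {
        Commit r1 = new Commit(id);
        initializeShallowCommitsOld();
        r1.parents = new ArrayList<>(storedParents); // shallow marking is lost
        return r1;
    }

    Commit parseNew(String id, List<String> storedParents) {
        Commit r1 = new Commit(id);
        initializeShallowCommitsNew(r1);
        if (r1.parents == null) { // assumed guard: keep the empty shallow parents
            r1.parents = new ArrayList<>(storedParents);
        }
        return r1;
    }
}
```

In this model parseOld returns a commit that still has its stored parents, while parseNew returns the same commit with an empty parent list.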
tparker [Mon, 25 Jun 2018 03:00:55 +0000 (20:00 -0700)]
Fix a GC scalability issue when selecting commit bitmaps
The previous algorithm selected commits by creating bitmaps at
each branch tip, doing a revwalk to populate each bitmap, and
looping in this way:
1) Select the remaining branch with the most commits (the branch
whose bitmap has the highest cardinality)
2) Select well-spaced bitmaps in that branch
3) Remove commits in the selected branch from the remaining
branch-tip bitmaps
4) Repeat at #1
This algorithm gave good commit selection on all branches but
a more uniform selection on "important" branches, where branch
length is the proxy for "important". However the algorithm
required N bitmaps of size M solely for the purpose of commit
selection, where N is the number of branch tips in the primary
GC pack, and M is the number of objects in the pack.
This new algorithm uses branch modification date as the proxy for
"important" branches, replacing the N*M memory allocation with a
single M-sized bitmap and N revwalks from new branch tips to
shared history (which will be short when there is a lot of shared
history).
GcCommitSelectionTest.testDistributionOnMultipleBranches verifies
that this algorithm still yields good coverage on all branches.
Change-Id: Ib6019b102b67eabb379e6b85623e4b5549590e6e Signed-off-by: Terry Parker <tparker@google.com>
Marco Miller [Thu, 21 Jun 2018 18:18:48 +0000 (14:18 -0400)]
ResolveMerger: Fix encoding with string; use bytes
This change fixes the issue [1]. Before this fix, a merge involving
the caching of consecutive yet similar filenames with Norwegian
characters [2] used to throw an IllegalStateException: Duplicate
stages not allowed. This was caused by inaccurate decoding of the
filenames, using string values assuming default encoding. In the
toString method of DirCacheEntry, used before through getPathString,
UTF-8 encoding is used, but the end result becomes default encoding,
through Object's default toString usage. The special characters in
those two particular consecutive filenames [2] were decoded to the
very same single character, yielding consecutive but identical
filenames, hence the perceived duplicate stage-0 entries for the
file(s).
Replace getPathString usage with getRawPath for this specific case,
or use byte array representations of cached entries instead of string.
Adding a test for this change is not possible, as there is no known
way to change the default encoding for filenames such as [2] (e.g.).
JGitTestUtil does write file contents through UTF-8, but that encoding
does not apply to the actual file name. Hence there is no way to
create files with names properly made of special characters such as
[2]'s. And the test that is necessary for this case assumes such
Norwegian (or similar characters) filenames. Changing the default
locale programmatically in a test has no effect either. And changing
the LANG value passed to the JVM is only possible upon starting it.
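The class of problem described here, where distinct raw paths collapse into one string after decoding with the wrong charset, can be illustrated in isolation. This is a minimal sketch, not the ResolveMerger code: US-ASCII stands in for an unsuitable default encoding, and the two file names are made-up examples rather than the names from [2].

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: shows why comparing raw bytes is safer than
// comparing strings decoded with an unsuitable charset.
final class PathDecodingDemo {
    /** Decodes raw path bytes with the given charset, which may be the wrong one. */
    static String decode(byte[] rawPath, Charset charset) {
        return new String(rawPath, charset);
    }

    public static void main(String[] args) {
        // Two different Norwegian letters, each followed by ".txt".
        byte[] first = "\u00e6.txt".getBytes(StandardCharsets.UTF_8);
        byte[] second = "\u00f8.txt".getBytes(StandardCharsets.UTF_8);

        // The raw bytes differ, so a byte-wise comparison keeps the entries apart.
        System.out.println("raw bytes equal: " + Arrays.equals(first, second));

        // Decoded as US-ASCII, each non-ASCII byte is replaced by U+FFFD,
        // so both names turn into the same string.
        String a = decode(first, StandardCharsets.US_ASCII);
        String b = decode(second, StandardCharsets.US_ASCII);
        System.out.println("decoded names equal: " + a.equals(b));
    }
}
```

Working on the byte arrays, as getRawPath allows, avoids the decoding step altogether.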
then I expect to have a full history, just as though I had fetched
without --depth in the first place. Instead, it reports success
but does not fetch enough objects:
The false success indicates problems in the client and the server.
Git 2.18-rc2 (the client) ought to have been more defensive, noticing
the incomplete history. The greater error is in JGit (the server),
which neglects to send the objects requested.
When serving protocol v0 requests, JGit sends the correct objects by
taking unshallowCommits into account when generating the pack to send
to the client. Do the same in the protocol v2 code path. I forgot to
do this in v5.0.0.201806050710-rc3~6 (Teach UploadPack shallow fetch
in protocol v2, 2018-03-15).
Reported-by: Russ Cox <rsc@golang.org>
Change-Id: I282b45f47616a641b9e8d6210b4a070d3efdbb9b Signed-off-by: Jonathan Nieder <jrn@google.com>
Thomas Wolf [Fri, 15 Jun 2018 14:01:58 +0000 (16:01 +0200)]
Avoid expensive getAllRefsByPeeledObjectId() in PlotWalk constructor
Instead, do it when we return the first PlotCommit from next().
On a repository with many refs, getAllRefsByPeeledObjectId() can
take a while. Doing a late initialization simplifies the handling
of a PlotWalk.
EGit, for instance, creates and configures an instance, and then
does the real walk in a background job. With late initialization,
the potentially expensive getAllRefsByPeeledObjectId() also occurs
in that background job.
Bug: 485743
Change-Id: I84c020cf8f7afda6f181778786612b8e6ddd7ed8 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
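The late initialization described above follows a common pattern: defer the expensive lookup until the first result is actually requested. The following is a generic sketch of that pattern with hypothetical names. It is not the PlotWalk implementation, and it is not thread-safe.

```java
import java.util.function.Supplier;

// Generic lazy holder: the loader runs on the first get(), not in the constructor.
final class LazyRefLookup<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;
    int loadCount; // exposed only so the sketch can show how often the loader ran

    LazyRefLookup(Supplier<T> loader) {
        this.loader = loader; // cheap: nothing is computed here
    }

    T get() {
        if (!loaded) {
            value = loader.get(); // the expensive call happens here, once
            loaded = true;
            loadCount++;
        }
        return value;
    }

    public static void main(String[] args) {
        LazyRefLookup<String> refs = new LazyRefLookup<>(() -> "expensive result");
        System.out.println("loads after construction: " + refs.loadCount);
        refs.get();
        refs.get();
        System.out.println("loads after two calls: " + refs.loadCount);
    }
}
```

A caller that constructs the object on one thread and consumes it in a background job pays the cost only in that job.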
Michael Keppler [Tue, 12 Jun 2018 16:02:22 +0000 (18:02 +0200)]
Upgrade Tycho to 1.2.0
Change-Id: I2f1c81839d2d78ddfd10b3992d1145546d10fa8c Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Wed, 13 Jun 2018 22:00:30 +0000 (00:00 +0200)]
Merge branch 'stable-5.0'
* stable-5.0:
Prepare 5.0.1-SNAPSHOT builds
JGit v5.0.0.201806131550-r
JGit v5.0.0.201806131210-r
Downgrade Apache httpclient to 4.5.2.v20170210-0925
RefUpdateTest: Refactor to not use deprecated Repository#getAllRefs
Propagate failure of ssh command to caller of SshSupport
Make JGit describe behaves same as c-git for lightweight tags
Fix issues with LFS on GitHub (SSH)
Change-Id: I0471440919adfdbfc72996711d9e0bbd1f3cf477 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Wed, 13 Jun 2018 07:03:20 +0000 (09:03 +0200)]
Downgrade Apache httpclient to 4.5.2.v20170210-0925
Eclipse platform uses this version from the Oxygen Orbit release for
Photon. To avoid ending up with two slightly different versions in the
same p2 repository of the simultaneous release, temporarily downgrade
from version 4.5.2.v20180410-1551 in the Photon Orbit release.
See
https://dev.eclipse.org/mhonarc/lists/cross-project-issues-dev/msg15659.html
Change-Id: Id46a840aa4b1010af7fe311498f17f1f2e5b81e0 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Tue, 12 Jun 2018 12:09:39 +0000 (14:09 +0200)]
Propagate failure of ssh command to caller of SshSupport
When SshSupport.runSshCommand fails because the executed external ssh
command failed, throw a CommandFailedException.
If discovery of the LFS server fails because the git-lfs-authenticate
command failed, chain the CommandFailedException to the
LfsConfigInvalidException so that the application using it can analyze
the root cause.
Change-Id: I2f9ea2be11274549f6d845937164c248b3d840b2 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
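Exception chaining of this kind can be sketched as follows. The two exception classes below are local stand-ins that only borrow the names from the message above. Their constructors, the simulated exit code and the helper methods are assumptions and do not reflect the real JGit signatures.

```java
// Stand-alone illustration of chaining a low-level failure to a
// higher-level exception so that callers can inspect the root cause.
final class ChainingSketch {
    /** Stand-in for the exception thrown when the external ssh command fails. */
    static class CommandFailedException extends Exception {
        final int exitCode;
        CommandFailedException(int exitCode, String message) {
            super(message);
            this.exitCode = exitCode;
        }
    }

    /** Stand-in for the exception reported when LFS server discovery fails. */
    static class LfsConfigInvalidException extends Exception {
        LfsConfigInvalidException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical command runner; the exit code is passed in to simulate failure. */
    static String runSshCommand(String command, int simulatedExitCode)
            throws CommandFailedException {
        if (simulatedExitCode != 0) {
            throw new CommandFailedException(simulatedExitCode,
                    command + " exited with " + simulatedExitCode);
        }
        return "ok";
    }

    static String discoverLfsServer(int simulatedExitCode)
            throws LfsConfigInvalidException {
        try {
            return runSshCommand("git-lfs-authenticate", simulatedExitCode);
        } catch (CommandFailedException e) {
            // Keep the original failure as the cause instead of dropping it.
            throw new LfsConfigInvalidException("LFS server discovery failed", e);
        }
    }

    public static void main(String[] args) {
        try {
            discoverLfsServer(255);
        } catch (LfsConfigInvalidException e) {
            System.out.println(e.getMessage() + ": " + e.getCause().getMessage());
        }
    }
}
```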
Marcel Trautwein [Fri, 23 Feb 2018 06:27:52 +0000 (07:27 +0100)]
Make JGit describe behaves same as c-git for lightweight tags
JGit now considers lightweight tags only if the --tags option is set,
i.e. `git.describe().setAllTags(true)` has to be called; otherwise the
default is now the same as in c-git:
Only annotated tags are evaluated unless you pass true
equivalent to --tags (or --all) by the option setAllTags.
Hint: This (still) doesn't address any difference between c-git
`--all` and `!--all --tags` behavior;
perhaps this might be a follow up request
Markus Duft [Mon, 11 Jun 2018 15:12:00 +0000 (17:12 +0200)]
Fix issues with LFS on GitHub (SSH)
* URIish seems to have a tiny feature (bug?). The path of the URI
starts with a '/' only if the URI has a port set (it seems).
* GitHub does not return SSH authorization on a single line as Gerrit
does - need to account for that.
* Increase the SSH git-lfs-authenticate timeout, as GitHub sometimes
responds slower than expected.
* Guard against NPE in case the download action does not contain any
additional headers.
Change-Id: Icd1ead3d015479fd4b8bbd42ed42129b0abfb95c Signed-off-by: Markus Duft <markus.duft@ssi-schaefer.com>
David Pursehouse [Tue, 12 Jun 2018 03:50:32 +0000 (12:50 +0900)]
Merge branch 'stable-5.0'
* stable-5.0:
Empty merge. The stable-4.9 branch was merged directly into stable-5.0
by [1], and then there were separate merges up through stable-4.10 to
stable-4.11 by [2] and [3].
When stable-4.11 was merged in to stable-5.0 in [4] it was an empty
merge, since the change had already been brought in by [1].
Michael Keppler [Sun, 10 Jun 2018 12:15:45 +0000 (14:15 +0200)]
Fix Javadoc typo
Change-Id: Ib4ebc57236bdea663f27295764886413e2550580 Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Thomas Wolf [Fri, 8 Jun 2018 13:47:05 +0000 (15:47 +0200)]
Ensure Jsch checks all configured algorithms
Jsch checks only for the availability of the algorithms given by
Jsch-internal config keys "CheckCiphers", "CheckKexes", and
"CheckSignatures". If the ssh config defines any algorithms
unknown to Jsch not listed in those keys, it'll still propose them
during the negotiation phase, and run into an NPE later on if the
server happens to propose such an algorithm and it gets chosen.
Jsch reads those "CheckCiphers" and the other values from either a
session-local config, or the global static Jsch config. It bypasses
~/.ssh/config for these values.
Therefore, copy these values from the config as read from
~/.ssh/config into the session-specific config. That makes Jsch
check _all_ configured algorithms up front, discarding any for
which it has no implementation. Thus it proposes only algorithms
it actually can handle.
Bug: 535672
Change-Id: I6a68e54f4d9a3267e895c536bcf3c58099826ad5 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Dave Borowitz [Fri, 5 Jan 2018 18:02:47 +0000 (13:02 -0500)]
Ensure DirectoryStream is closed promptly
From the javadoc for Files.list:
"The returned stream encapsulates a DirectoryStream. If timely disposal
of file system resources is required, the try-with-resources construct
should be used to ensure that the stream's close method is invoked
after the stream operations are completed."
This is the only call to Files#newDirectoryStream that is not already in
a try-with-resources.
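The pattern recommended by the quoted javadoc looks roughly like this. It is a minimal sketch using only the JDK; the class name, helper method and temporary directory are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class DirectoryListing {
    /** Counts directory entries and closes the underlying DirectoryStream promptly. */
    static long countEntries(Path dir) throws IOException {
        // try-with-resources closes the stream even if the operation throws.
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("listing-demo");
        Files.createFile(dir.resolve("a.txt"));
        Files.createFile(dir.resolve("b.txt"));
        System.out.println(countEntries(dir) + " entries");
    }
}
```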
Matthias Sohn [Fri, 8 Jun 2018 15:45:00 +0000 (17:45 +0200)]
Validate branch names on branch creation
Since v2.16.0-rc0~89^2~1 (branch: correctly reject
refs/heads/{-dash,HEAD}, 2017-11-14),
native git does not allow branch names
- refs/heads/HEAD
- starting with '-'
Bug: 535655
Change-Id: Ib1c4ec9ea844073901a4ebe6a29ff6cc8ae58e93 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
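The two rules listed above can be expressed as a small check. This is a sketch with a hypothetical helper name; it covers only these two rules and not the general ref name validation, and it is not the code added by this change.

```java
// Illustrative check for the two rejected kinds of branch names.
final class BranchNameCheck {
    /**
     * @param shortName branch name without the "refs/heads/" prefix
     * @return false for "HEAD" and for names starting with '-'
     */
    static boolean isAllowedBranchName(String shortName) {
        if (shortName == null || shortName.isEmpty()) {
            return false;
        }
        if (shortName.equals("HEAD")) {
            return false; // would create refs/heads/HEAD
        }
        return !shortName.startsWith("-"); // could be mistaken for an option
    }

    public static void main(String[] args) {
        for (String name : new String[] { "master", "HEAD", "-dash", "feature/x" }) {
            System.out.println(name + " allowed: " + isAllowedBranchName(name));
        }
    }
}
```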
Update maven plugins to fix Zip Slip vulnerability
Zip Slip [1] is a generic arbitrary file write vulnerability that can
be exploited using a specially crafted zip (or bzip2, gzip, tar, xz, war)
archive that contains path traversal filenames.
According to Maven's announcement [2] several plugins use plexus-archiver to
unpack dependencies to disk and have been identified as potential triggers
for exposing the vulnerability.
Of those, JGit uses the maven-dependency-plugin and the maven-javadoc-plugin.
Update them to the fixed versions reported in [2].
See the corresponding issues for the maven-dependency-plugin [3] and the
maven-javadoc-plugin [4] for details.
Jonathan Nieder [Tue, 5 Jun 2018 05:22:24 +0000 (22:22 -0700)]
Merge branch 'stable-5.0'
* stable-5.0:
Teach UploadPack "filter" in protocol v2 fetch
Refactor test of capabilities output
Refactor v2 advertisement into own function
Refactor parsing of "filter" into its own method
Disallow unknown args to "fetch" in protocol v2
Teach UploadPack shallow fetch in protocol v2
Refactor unshallowCommits to local variable
Add protocol v2 support in http
Give info/refs services more control over response
Change-Id: I1683902222e076e1091795e94790a264550afb7b Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Mon, 30 Apr 2018 20:21:43 +0000 (13:21 -0700)]
Teach UploadPack "filter" in protocol v2 fetch
If the configuration variable uploadpack.allowfilter is true, advertise
that "filter" is supported, and support it if the client sends such an
argument.
Change-Id: I7de66c0a0ada46ff71c5ba124d4ffa7c47254c3b Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Jonathan Tan [Wed, 2 May 2018 23:35:48 +0000 (16:35 -0700)]
Refactor test of capabilities output
A subsequent patch will dynamically generate the capability
advertisement, so the capability advertisements produced are not always
the same. Separate the checking of the advertisements into its own test
method.
Change-Id: I768d14b9d1a244d5d886c42ffd62ef3957b518fb Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Jonathan Tan [Wed, 2 May 2018 22:43:50 +0000 (15:43 -0700)]
Refactor v2 advertisement into own function
A subsequent patch needs dynamic generation of this advertisement
depending on a configuration variable in the underlying repository, so
refactor it into a function instead of using a constant list.
Change-Id: Ie00584add1fb56c9e88c7b57f75703981ea5bb85 Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Jonathan Tan [Thu, 3 May 2018 00:17:04 +0000 (17:17 -0700)]
Disallow unknown args to "fetch" in protocol v2
JGit's implementation of the fetch command of protocol v2, unlike its
implementation of ls-refs, currently tolerates unknown arguments.
Tighten fetch to not allow unrecognized arguments and add tests to
verify this behavior for both ls-refs and fetch.
Change-Id: I321161d568bd638252fab1a47b06b924d472a669 Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Jonathan Tan [Thu, 15 Mar 2018 22:56:50 +0000 (15:56 -0700)]
Teach UploadPack shallow fetch in protocol v2
Add support for the "shallow" and "deepen" parameters in the "fetch"
command in the fetch-pack/upload-pack protocol v2. Advertise support for
this in the capability advertisement.
Change-Id: I7ffd80d6c38872f9d713ac7d6e0412106b3766d7 Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Tue, 22 May 2018 22:19:04 +0000 (15:19 -0700)]
Refactor unshallowCommits to local variable
This reduces the amount of state held as instance variables in
UploadPack, and makes it easier for a future patch to contain a clearer
version of UploadPack#processShallow.
Change-Id: I6df80b42f9e5118fda1420692e02e417670cced3 Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Wed, 28 Feb 2018 22:36:44 +0000 (14:36 -0800)]
Add protocol v2 support in http
Teach UploadPack to support protocol v2 with non-bidirectional pipes,
and add support to the HTTP protocol for v2. This is only activated if
the repository's config has "protocol.version" equal to 2.
Change-Id: I093a14acd2c3850b8b98e14936a716958f35a848 Helped-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Wed, 25 Apr 2018 20:59:52 +0000 (13:59 -0700)]
Give info/refs services more control over response
Currently, SmartServiceInfoRefs always prints "# service=serviceName"
followed by a flush packet in response to an info/refs request, and then
hands it off to the specific service class. Printing of "#
service=serviceName" is mandated for protocol v0, but not v2.
Therefore, the existing code works for protocol v0, but whenever a
service that supports protocol v2 receives an info/refs request, it must
first determine which protocol version is to be used (depending on, for
example, the request and any relevant configuration variables), and then
decide if "# service=serviceName" needs to be printed.
Create a new method that v2-supporting service classes can override,
covering the printing of both "# service=serviceName" and everything
that the #advertise method prints. This will be used in a subsequent
commit in which UploadPackServlet (and the other classes it uses) is
updated to support protocol v2.
Change-Id: Ia026b06e96a6b15937514096babd024ef77df1ea Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
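For protocol v0, the response prefix described above is a pkt-line followed by a flush packet. A minimal sketch of that framing, assuming the usual pkt-line layout of four lowercase hex digits giving the total length including the four length bytes. The class and method names are hypothetical, and the maximum pkt-line length is not checked.

```java
import java.nio.charset.StandardCharsets;

// Illustration of the "# service=serviceName" header framing used for v0.
final class PktLineSketch {
    static final String FLUSH = "0000";

    /** Prefixes the payload with its pkt-line length (payload bytes plus 4). */
    static String pktLine(String payload) {
        int length = payload.getBytes(StandardCharsets.UTF_8).length + 4;
        return String.format("%04x", length) + payload;
    }

    /** Builds the v0 service announcement followed by a flush packet. */
    static String v0ServiceHeader(String serviceName) {
        return pktLine("# service=" + serviceName + "\n") + FLUSH;
    }

    public static void main(String[] args) {
        System.out.print(v0ServiceHeader("git-upload-pack"));
    }
}
```

A service that supports protocol v2 would decide whether to emit this prefix only after determining which protocol version applies to the request.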
Change-Id: Ie149c2fee6c552b7a595f029c267292840734192 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Dave Borowitz [Wed, 30 May 2018 00:35:14 +0000 (17:35 -0700)]
Move DfsPackDescription comparators to common location
There are several ways of comparing DfsPackDescriptions for different
purposes, such as object lookup search order and reftable ordering. Some
of these are later compounded into comparators on other objects, so they
appear in the code as Comparator<DfsReftable>, for example.
Put all the DfsPackDescription comparators in static methods on
DfsPackDescription itself. Stop implementing Comparable, to avoid giving
the impression that there is always one true and correct way of sorting
packs.
Dave Borowitz [Wed, 30 May 2018 19:02:35 +0000 (12:02 -0700)]
Use Comparators for PackSource
Rather than requiring callers to do their own computations based on the
package-private "category" number, provide an actual
Comparator<PackSource> instance, and explicitly discourage usage of
default Enum comparison.
Construct the default comparator using a builder pattern based on
defining equivalence classes. This gives us the same behavior as the old
category field in PackSource, with an abstraction that does not leak the
implementation detail of comparing rank numbers.
Dave Borowitz [Wed, 30 May 2018 00:53:26 +0000 (17:53 -0700)]
DfsPackDescription: Disallow null PackSource
In normal operation, the source of a pack should never be null; the DFS
implementation should always know where a pack came from. Existing
implementations in InMemoryRepository and at Google always have the
source available at construction time.
The problem with null PackSources in the previous implementation was it
made the DfsPackDescription#compareTo method intransitive. Specifically,
it skips comparing the sources at all if *either* operand is null.
Suppose we have three descriptions A, B, and C, where all fields are
equal except the PackSource, and:
* A's source is INSERT
* B's source is null
* C's source is RECEIVE
In this case, A.compareTo(B) == 0, and B.compareTo(C) == 0, since all
fields are equal except the source, which is skipped. But
A.compareTo(C) != 0, since A and C have different sources.
Avoid this problem in compareTo by enforcing that the source is never
null. We could of course assign an arbitrary category number to a null
source in order to make comparison transitive[1], but it's simpler to
implement and reason about if the field is non-nullable, and there is no
real-world use case to make it null.
Although a non-null source is required at construction time, the field
is currently still mutable: DfsPackDescription#setPackSource is used by
DfsInserterTest to mark packs as garbage. This could probably be
avoided as well, allowing us to convert packSource to a final field, but
doing so is beyond the scope of this change.
[1] The astute reader will notice this is already done by
DfsObjDatabase#reftableComparator(). In fact, the reason that
different comparator implementations non-obviously have different
semantics for this nullable field is another reason why it's clearer
to avoid null entirely.
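The intransitivity described above can be reproduced with a small model. This is a sketch only: the Desc class, the name field and the numeric rank are assumptions for illustration, and the comparison imitates the null-skipping behaviour rather than copying the old compareTo.

```java
// Model of a comparison that skips the source when either side is null.
final class IntransitiveCompareDemo {
    enum Source {
        INSERT(0), RECEIVE(1);

        final int rank; // hypothetical category number
        Source(int rank) { this.rank = rank; }
    }

    static final class Desc {
        final String name;
        final Source source; // may be null in this model
        Desc(String name, Source source) {
            this.name = name;
            this.source = source;
        }
    }

    /** Compares by name, then by source rank unless either source is null. */
    static int compareSkippingNull(Desc a, Desc b) {
        int byName = a.name.compareTo(b.name);
        if (byName != 0) {
            return byName;
        }
        if (a.source != null && b.source != null) {
            return Integer.compare(a.source.rank, b.source.rank);
        }
        return 0; // source ignored: this is what breaks transitivity
    }

    public static void main(String[] args) {
        Desc a = new Desc("pack", Source.INSERT);
        Desc b = new Desc("pack", null);
        Desc c = new Desc("pack", Source.RECEIVE);
        System.out.println("A vs B: " + compareSkippingNull(a, b));
        System.out.println("B vs C: " + compareSkippingNull(b, c));
        System.out.println("A vs C: " + compareSkippingNull(a, c));
    }
}
```

A equals B and B equals C under this comparison, yet A and C differ, which violates the Comparable contract and can give sort results that depend on input order.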
Thomas Wolf [Thu, 24 May 2018 14:26:36 +0000 (16:26 +0200)]
Don't prune symbolic refs when fetch.prune = true
The canonical implementation also doesn't. Compare current
code in remote.c, function get_stale_heads_cb.[1] Not handling
symrefs in this case was introduced in canonical git in [2]
in 2008.