Michael Keppler [Mon, 3 Dec 2018 16:25:51 +0000 (17:25 +0100)]
Do not include log4j implementation in jgit
As discussed in the bug, jgit should not include a logging
implementation, and instead rely on the product containing jgit to
configure the logging.
We recently ran into a situation where installing egit in a (non
eclipse.org) RCP application breaks all logging because of incompatible
logging implementations. Removing the jgit logging implementation
should fix this.
The following further changes were made for the jgit command line:
* added log4j.properties to binary build of jgit.pgm. That file existed
in the git repository, but was not included in the eclipse binary build.
(maybe it is in the bazel build)
* removed apache.commons.logging package import from jgit.pgm. That
import is not used, and makes the logging even more confusing.
Bug: 514326
Change-Id: I6dc7d1462f0acfca9e2b1ac87e705617179ffdda Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Fri, 24 Apr 2020 20:41:39 +0000 (22:41 +0200)]
Decouple JSch from JGit Core
Motivation: JSch serves as the 'default' implementation of the SSH
transport. If a client application does not use it then there is no need
to pull in this dependency.
Move the classes depending on JSch to an OSGi fragment extending the
org.eclipse.jgit bundle and keep them in the same package as before
since moving them to another package would break API. Defer moving them
to a separate package to the next major release.
Add a new feature org.eclipse.jgit.ssh.jsch to enable
installation. With that users can now decide which of the ssh client
integrations (JCraft JSch or Apache Mina SSHD) they want to install.
We will remove the JCraft JSch integration in a later step due to the
reasons discussed in bug 520927.
Bug: 553625
Change-Id: I5979c8a9dbbe878a2e8ac0fbfde7230059d74dc2 Also-by: Michael Dardis <git@md-5.net> Signed-off-by: Michael Dardis <git@md-5.net> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Matthias Sohn [Sun, 26 Apr 2020 22:58:28 +0000 (00:58 +0200)]
Decouple BouncyCastle from JGit Core
Motivation: BouncyCastle serves as 'default' implementation of
the GPG Signer. If a client application does not use it there is no need
to pull in this dependency, especially since BouncyCastle is a large
library.
Move the classes depending on BouncyCastle to an OSGi fragment extending
the org.eclipse.jgit bundle. They are moved to a distinct internal
package in order to avoid split packages. This doesn't break public API
since these classes were already in an internal package before this
change.
Add a new feature org.eclipse.jgit.gpg.bc to enable installation. With
that users can now decide if they want to install it.
Attempts to sign a commit if org.eclipse.jgit.gpg.bc isn't available
will result in ServiceUnavailableException being thrown.
Bug: 559106
Change-Id: I42fd6c00002e17aa9a7be96ae434b538ea86ccf8 Also-by: Michael Dardis <git@md-5.net> Signed-off-by: Michael Dardis <git@md-5.net> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Thomas Wolf [Fri, 29 May 2020 19:57:37 +0000 (21:57 +0200)]
Verify that the user home directory is valid
If the determination of the user home directory produces a Java File
object with an invalid path, spurious exceptions may occur at the
most inopportune moments anytime later. In the case in the linked bug
report, start-up of EGit failed, leading to numerous user-visible
problems in Eclipse.
So validate the return value of FS.userHomeImpl(). If converting that
File to a Path throws an exception, log the problem and fall back to
Java system property user.home. If that also is not valid, use null.
(A null user home directory is allowed by FS, and calling in Java
new File(null, "some_string") is fine and produces a File relative
to the current working directory.)
Bug: 563739
Change-Id: If9eec0f9a31a45bd815231706285c71b09f8cf56 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
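The validation and fallback described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual FS code: the class UserHomeSketch and its methods are hypothetical names, and the detected home directory and the user.home value are passed in as plain parameters.

```java
import java.io.File;
import java.nio.file.InvalidPathException;

// Hypothetical sketch; JGit's FS class is not used here.
final class UserHomeSketch {

    /** Returns the candidate if it can be converted to a Path, else null. */
    static File validOrNull(File candidate) {
        if (candidate == null) {
            return null;
        }
        try {
            candidate.toPath(); // throws InvalidPathException for invalid paths
            return candidate;
        } catch (InvalidPathException e) {
            System.err.println("Invalid home directory: " + e.getMessage());
            return null;
        }
    }

    /**
     * Prefers the detected directory, falls back to the value of the
     * user.home property, and returns null if neither is valid.
     */
    static File resolveUserHome(File detected, String userHomeProperty) {
        File home = validOrNull(detected);
        if (home == null && userHomeProperty != null) {
            home = validOrNull(new File(userHomeProperty));
        }
        return home;
    }
}
```

A caller would pass something like System.getProperty("user.home") as the second argument. A null result is then handled the same way as any other null parent directory.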
Thomas Wolf [Fri, 29 May 2020 21:04:44 +0000 (23:04 +0200)]
WindowCache: conditional JMX setup
Make it possible to programmatically suppress the JMX bean
registration. In EGit it is not needed but can be rather costly
because it occurs during plug-in activation and accesses the
git user config.
Bug: 563740
Change-Id: I07ef7ae2f0208d177d2a03862846a8efe0191956 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Thu, 23 Apr 2020 16:30:19 +0000 (18:30 +0200)]
Builder API to configure SshdSessionFactories
A builder API provides a more convenient way to define a customized
SshdSessionFactory by hiding the subclassing.
Also provide a new interface SshConfigStore to abstract away the
specifics of reading a ssh config file, and provide a way to customize
the concrete ssh config implementation to be used. This facilitates
using an alternate ssh config implementation that may or may not be
based on files.
Change-Id: Ib9038e8ff2a4eb3a9ce7b3554d1450befec8e1e1 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Wed, 13 May 2020 18:06:23 +0000 (20:06 +0200)]
TransportHttp: abort on time-out or on SocketException
Avoid trying other authentication methods on SocketException or on
InterruptedIOException. SocketException is rather fatal, such as
nothing listening on the peer's port, connection reset, or it could
be a connection time-out.
Time-outs enforced by Timeout{Input,Output}Stream may result in
InterruptedIOException being thrown.
In both cases, it makes no sense to try other authentication methods,
and doing so may wrongly report "authentication not supported" or
"cannot open git-upload-pack" or some such instead of reporting a
time-out.
Bug: 563138
Change-Id: I0191b1e784c2471035e550205abd06ec9934fd00 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
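The control flow described in this commit might look like the sketch below. It is an illustration under assumptions, not the TransportHttp code: AuthRetrySketch, the Attempt interface and the string-based authentication method names are all hypothetical.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.net.SocketException;
import java.util.List;

// Hypothetical sketch of "try the next authentication method unless the
// failure is a socket error or a time-out".
final class AuthRetrySketch {

    @FunctionalInterface
    interface Attempt {
        String connect(String authMethod) throws IOException;
    }

    static String connectWithFallback(List<String> authMethods, Attempt attempt)
            throws IOException {
        IOException last = null;
        for (String method : authMethods) {
            try {
                return attempt.connect(method);
            } catch (SocketException | InterruptedIOException e) {
                // Connection-level failure or time-out: trying another
                // authentication method cannot help, so abort at once.
                throw e;
            } catch (IOException e) {
                // Any other failure: remember it and try the next method.
                last = e;
            }
        }
        throw last != null ? last
                : new IOException("no authentication method available");
    }
}
```

Rethrowing the original exception keeps the time-out or socket error visible to the caller instead of replacing it with a misleading authentication error.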
Thomas Wolf [Mon, 23 Mar 2020 15:55:35 +0000 (16:55 +0100)]
Attributes: fix handling of text=auto in combination with eol
In Git 2.10.0 the interpretation of gitattributes changed or was fixed
such that "* text=auto eol=crlf" would indeed still do auto-detection
of text vs. binary content.[1] Previously this was identical to
"* text eol=crlf", i.e., treating all files as text.
JGit still did the latter, which caused surprises because it changed
binary files.
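The difference between the two interpretations can be sketched as below. This is a simplified illustration, not JGit's attribute handling: EolSketch and its methods are hypothetical, the text attribute is modelled as a plain string ("set", "auto" or anything else), and the binary check is a bare NUL-byte heuristic over the first 8000 bytes, which is an assumption and much cruder than real text detection.

```java
// Hypothetical sketch of "text=auto eol=crlf" versus "text eol=crlf".
final class EolSketch {

    /** Crude heuristic (assumption): a NUL byte near the start means binary. */
    static boolean looksBinary(byte[] content) {
        int limit = Math.min(content.length, 8000);
        for (int i = 0; i < limit; i++) {
            if (content[i] == 0) {
                return true;
            }
        }
        return false;
    }

    /**
     * Decides whether line endings should be converted to CR-LF.
     * The text argument must be non-null.
     */
    static boolean shouldConvertToCrlf(String text, String eol, byte[] content) {
        if (!"crlf".equals(eol)) {
            return false;
        }
        switch (text) {
        case "set":  // "* text eol=crlf": every file is treated as text
            return true;
        case "auto": // "* text=auto eol=crlf": still auto-detect binary content
            return !looksBinary(content);
        default:
            return false;
        }
    }
}
```

With the old interpretation the "auto" case behaved like "set", which is how binary files ended up being modified.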
David Ostrovsky [Sun, 6 Oct 2019 13:28:44 +0000 (15:28 +0200)]
Bazel: Remove superfluous dependencies flagged by unused_deps
In addition to buildifier, the Bazel buildtools project includes the
unused_deps and buildozer utilities, which detect unused dependencies and
fix them by applying the removal to the build files. This change was created
by installing unused_deps from buildtools@HEAD and running:
$ unused_deps //...
and applying the suggested modifications.
Change-Id: Iad74ec2fa719475b29391586f40b13ae30477004 Signed-off-by: David Ostrovsky <david@ostrovsky.org>
* changes:
PackBitmapIndex: Set distance threshold
PackBitmapIndex: Not buffer inflated bitmap in BasePackBitmapIndex
PackBitmapIndex: Remove convertedBitmaps in the Remapper
PackBitmapIndex: Reduce memory usage in GC
PackBitmapIndex: Add AddToBitmapWithCacheFilter class
PackBitmapIndex: Add util methods and builder to BitmapCommit
PackBitmapIndex: Move BitmapCommit to a top-level class
Refactor: Make retriveCompressed an method of the Bitmap class
Yunjie Li [Wed, 29 Apr 2020 22:27:47 +0000 (15:27 -0700)]
PackBitmapIndex: Set distance threshold
Setting the distance threshold to 2000 in PackWriterBitmapPreparer to
reduce memory usage in garbage collection. When the threshold is 0, GC
for the msm repository would use about 37 GB memory to complete. After
setting it to 2000, GC can finish in 75 min with about 10 GB memory.
Change-Id: I39783eeecbae58261c883735499e61ee1cac75fe Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Thu, 23 Apr 2020 22:12:15 +0000 (15:12 -0700)]
PackBitmapIndex: Not buffer inflated bitmap in BasePackBitmapIndex
Currently we're buffering the inflated bitmap entry in BasePackBitmapIndex
to optimize running time. However, this will use lots of memory during
the construction of the pack bitmap index file which may cause failure of
garbage collection.
After removing the buffering here, the running time did not increase
significantly, if at all. The report on time and memory usage will come
in the next commit.
Change-Id: I874503ecc85714acab7ca62a6a7968c2dc0b56b3 Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Thu, 23 Apr 2020 22:11:14 +0000 (15:11 -0700)]
PackBitmapIndex: Remove convertedBitmaps in the Remapper
The convertedBitmaps field is meant as a time optimization, but it saves
little time while using a lot of memory. So remove the field here to save
memory.
Currently the remapper class is only used in the construction of the
bitmap index file. And during the preparation of the file, we're only
getting bitmaps from the remapper when finding objects accessible from
a commit, so bitmap associated with each commit will only be fetched once
and thus the convertedBitmaps would hardly be read, which means that it's
not saving time.
Change-Id: Ic942a8e485135fb177ec21d09282d08ca6646fdb Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Tue, 11 Feb 2020 19:06:33 +0000 (11:06 -0800)]
PackBitmapIndex: Reduce memory usage in GC
Currently, the garbage collection is consistently failing for some large
repositories in the bitmap-building phase, e.g. the Linux-MSM project:
https://source.codeaurora.org/quic/la/kernel/msm-3.18
Historically, bitmap index creation happened in 3 phases:
1. Select the commits to which bitmaps should be attached.
2. Create all bitmaps for these commits, stored in uncompressed format
in the PackBitmapIndexBuilder.
3. Deltify the bitmaps and write them to disk.
We investigated the process. For phase 2 it's most efficient to create
bitmaps starting with oldest commit and moving to the newest commit,
because the newer commits are able to reuse the work for the old ones.
But for bitmap deltification in phase 3, it's better when a newer
commit's bitmap is the base, and the current disk format writes bitmaps
out for the newest commits first.
This change introduces a new collection to hold the deltified and
compressed representations of the bitmaps, keeping a smaller subset of
commits in the PackBitmapIndexBuilder to help make the bitmap index
creation more memory efficient.
In this commit, DISTANCE_THRESHOLD is set to 0 in the
PackWriterBitmapPreparer, so garbage collection behavior is largely
unchanged and it will still use as much memory as before.
Change-Id: I6ec2c3e8dde11805af47874d67d33cf1ef83660e Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Tue, 11 Feb 2020 01:02:13 +0000 (17:02 -0800)]
PackBitmapIndex: Add AddToBitmapWithCacheFilter class
Add a new revwalk filter, AddToBitmapWithCacheFilter. This filter updates
a client-provided {@code BitmapBuilder} as a side effect of a revwalk.
Similar to {@code AddToBitmapFilter}, it short circuits the walk when it
encounters a commit which is included in the provided bitmap's BitmapIndex.
It also short circuits the walk if it encounters the client-provided
cached commit.
Change-Id: I62cb503016f4d3995d648d92b82baab7f93549a9 Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Wed, 22 Apr 2020 20:12:05 +0000 (13:12 -0700)]
PackBitmapIndex: Move BitmapCommit to a top-level class
Move BitmapCommit from inside the PackWriterBitmapPreparer to a new
top-level class in preparation for improving the memory footprint of GC's
bitmap generation phase.
Change-Id: I4d404a5b3a34998b441d23105197f33d32d39670 Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Mon, 10 Feb 2020 23:22:31 +0000 (15:22 -0800)]
Refactor: Make retriveCompressed an method of the Bitmap class
Make retrieveCompressed() a method of Bitmap interface to avoid type
casting and later reuse in improving the memory footprint of GC's bitmap
generation phase.
Change-Id: I098d85105cf17af845d43b8c71b4ca48b02fd7da Signed-off-by: Yunjie Li <yunjieli@google.com>
Pat Long [Thu, 23 Apr 2020 17:52:22 +0000 (13:52 -0400)]
Allow for using custom s3 host with lfs server
By default, the hostname is generated from the AWS region passed to the
constructor.
This will allow for easier testing, since you can just spin up a local
minio (or other s3-compatible storage service) instance and point the
application at that for the storage mechanism.
It will also allow for storing lfs objects on-prem.
Change-Id: I2566b1fcce58f3d306ddd23a8da702ef5a451c7b Signed-off-by: Pat Long <pllong@arista.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
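The host selection described here amounts to something like the following sketch. The class S3HostSketch and the method resolveHost are hypothetical, and the region-based hostname pattern shown is an assumption for illustration only, not necessarily the pattern the lfs server uses.

```java
// Hypothetical sketch: prefer a custom S3 host, else derive one from the region.
final class S3HostSketch {

    static String resolveHost(String region, String customHost) {
        if (customHost != null && !customHost.isEmpty()) {
            // e.g. a local minio instance or an on-prem S3-compatible service
            return customHost;
        }
        // Assumed default pattern; the real pattern may differ.
        return "s3-" + region + ".amazonaws.com";
    }
}
```

For local testing, a caller could pass a value such as "localhost:9000" as the custom host and leave the region unchanged.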
Demetr Starshov [Thu, 7 May 2020 00:53:36 +0000 (17:53 -0700)]
ReceivePack: adding IterativeConnectivityChecker
Introduce an IterativeConnectivityChecker which runs a connectivity
check with a filtered set of references, and falls back to using the
full set of advertised references.
The first check attempt uses these references:
- References that are ancestors of incoming commits (e.g., pushing
a commit onto an existing branch or pushing a new branch based on
another branch)
- An additional list of references we know the client may be interested in
(e.g. list of open changes for Gerrit)
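The two-step strategy above can be sketched as follows. This is a simplified illustration, not the IterativeConnectivityChecker code: the class and method names are hypothetical, refs are plain strings, and the underlying connectivity check is modelled as a predicate that returns true on success.

```java
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: cheap check against a filtered ref set first,
// full set of advertised refs only as a fallback.
final class IterativeCheckSketch {

    static boolean check(Set<String> filteredRefs,
            Set<String> allAdvertisedRefs,
            Predicate<Set<String>> connectivityCheck) {
        // First attempt: ancestors of incoming commits plus any additional
        // refs the client is known to be interested in.
        if (!filteredRefs.isEmpty() && connectivityCheck.test(filteredRefs)) {
            return true;
        }
        // Fallback: the expensive check against everything advertised.
        return connectivityCheck.test(allAdvertisedRefs);
    }
}
```

The gain comes from the first attempt usually succeeding, so the expensive check over all advertised references is skipped in the common case.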
We tested it inside Google and it improves connectivity for certain
topologies. For example connectivity counts for
chromium.googlesource.com/chromium/src:
Nail Samatov [Thu, 7 May 2020 09:03:56 +0000 (12:03 +0300)]
Fix error occurring during checkout
Fix a NullPointerException that occurs when calling
CheckoutCommand with the forced == true option when
the branch isn't changed and there is a deleted
uncommitted file.
Change-Id: I99bf1fc25e6889f07092320d7bc2772ec5d341b5 Signed-off-by: Nail Samatov <sanail@yandex.ru>
Jack Wickham [Fri, 17 Apr 2020 17:33:39 +0000 (18:33 +0100)]
Create parent directories when renaming a file in ApplyCommand
Before this change, applying a patch will fail if the destination directory
doesn't exist; after, the necessary parent directories are created.
If renaming the file fails, the directories won't be deleted, so this change
isn't atomic. However, ApplyCommand is already not atomic - if one hunk fails
to apply, other hunks still get applied - so I don't think that is a blocker.
Change-Id: Iea36138b806d4e7012176615bcc673756a82f365 Signed-off-by: Jack Wickham <jwickham@palantir.com>
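The behavior after this change corresponds roughly to the sketch below, written with java.nio.file rather than the ApplyCommand internals. RenameSketch and renameCreatingParents are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: create missing parent directories, then rename.
final class RenameSketch {

    static void renameCreatingParents(Path source, Path target)
            throws IOException {
        Path parent = target.getParent();
        if (parent != null) {
            // No-op if the directories already exist.
            Files.createDirectories(parent);
        }
        // If this move fails, the directories created above are left behind,
        // which is the non-atomic behavior the commit message describes.
        Files.move(source, target);
    }
}
```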
ObjectReachabilityChecker interface is the only public API. The
implementation is instantiated by ObjectWalk and doesn't need to be
visible outside the package.
Change-Id: I5b97bb98990cded637686bdc15c9655330b7780f Signed-off-by: Ivan Frade <ifrade@google.com>
Ivan Frade [Mon, 6 Apr 2020 21:35:52 +0000 (14:35 -0700)]
UploadPack: Use more relevant refs first in object reachability check
The bitmap-based object reachability checker tries to find the objects
in the first starter, then adds the second starter, and so on. This
rewards passing the most popular refs first.
Order the refs with heads first, then tags, then others (e.g. changes)
for the object reachability checker. Also use streams to delay the
resolution of each ref to a RevObject until necessary.
Change-Id: I9414b76754d7c0ffee1e2eeed6939895c8e92cbe Signed-off-by: Ivan Frade <ifrade@google.com>
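The ordering described in this commit can be sketched with a stream and a rank function. This is an illustration on plain ref names, not the UploadPack code: RefOrderSketch, rank and order are hypothetical, and the lazy resolution to RevObject is not shown.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: heads first, then tags, then everything else.
final class RefOrderSketch {

    static int rank(String refName) {
        if (refName.startsWith("refs/heads/")) {
            return 0;
        }
        if (refName.startsWith("refs/tags/")) {
            return 1;
        }
        return 2; // e.g. refs/changes/...
    }

    static List<String> order(List<String> refNames) {
        // sorted() is stable on an ordered stream, so refs with the same
        // rank keep their original relative order.
        return refNames.stream()
                .sorted(Comparator.comparingInt(RefOrderSketch::rank))
                .collect(Collectors.toList());
    }
}
```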
Ivan Frade [Sat, 4 Apr 2020 06:27:33 +0000 (23:27 -0700)]
UploadPack: Refactor to generalize the object reachability checks
ObjectWalk#createObjectReachabilityChecker() returns the best
implementation for the repo. UploadPack can use the interface and fold
the with/without commits cases in one code path.
Change-Id: I857c11735d1d8e36c3ed8185ff11de8a62e86540 Signed-off-by: Ivan Frade <ifrade@google.com>
Ivan Frade [Thu, 2 Apr 2020 05:10:09 +0000 (22:10 -0700)]
UploadPack: Extract walk-based reachability check
Preparing the code to optimize the bitmap-based object reachability
checker. We are mirroring first the commit reachability checker
structure (interface + 2 implementations).
Move the walk-based reachability checker to its own class.
This class is public at the moment. Later ObjectWalk will return an
interface and this implementation will be package-private.
Change-Id: Ifac70094e1af137291c3607d95e689992f814b26 Signed-off-by: Ivan Frade <ifrade@google.com>
UploadPack: Clear advertised ref map after negotiation
After negotiation phase of a fetch, the advertised ref map is no longer used and
can be safely cleared. For repos larger than 1 GiB, object selection and packfile
writing may take tens of minutes. For the chromium.googlesource.com/chromium/src repo, this
advertised ref map is >400MiB. Returning this memory to the Java heap is a major
scalability win.
David Ostrovsky [Fri, 17 Apr 2020 22:00:32 +0000 (00:00 +0200)]
Bazel: Disable SecurityManagerMissingPermissionsTest test
In Id5376f09f0d a test with a dependency on the log4j library was added,
but the library was not added to the Bazel build tool chain.
Given that the Bazel test runner doesn't support a custom security
manager, the test wouldn't pass even if the missing dependency were
added. The only solution we have for now is to exclude that test from
the Bazel tool chain.
Filed a feature request for bazel to support such tests at
https://github.com/bazelbuild/bazel/issues/11146
Bug: 562274
Change-Id: I873a0e09addc583455b68122f66cd3952e485f0e Signed-off-by: David Ostrovsky <david@ostrovsky.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Michael Keppler [Tue, 14 Apr 2020 07:31:50 +0000 (09:31 +0200)]
Remove double blank from sentence start
Runs of whitespace are not normalized when reading properties files,
which leads to unwanted space/indentation in console or UI output.
Change-Id: I1f5224fe359e0cac493e0237872afc75dc8b9fbe Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
(cherry picked from commit ebbc3efce73278d6e0dbb1acd099db2446b1bed9)
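The underlying behavior of java.util.Properties can be checked with a small sketch like the one below. The class PropertiesBlankSketch and the key "msg" are hypothetical and only serve the illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: blanks inside a property value are kept verbatim.
final class PropertiesBlankSketch {

    static String load(String propertiesText, String key) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        return props.getProperty(key);
    }
}
```

Loading the line "msg=Done.  Next sentence" returns the value with both blanks intact, whereas whitespace directly after the "=" separator is skipped. A double blank inside a message text therefore shows up unchanged in the output.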
Thomas Wolf [Thu, 2 Apr 2020 19:15:18 +0000 (21:15 +0200)]
FS.runInShell(): handle quoted filters and hooksPath containing blanks
Revert commit 2323d7a. Using $0 in the shell command call results in
the command string being taken literally. That was introduced to fix
a problem with backslashes, but is actually not correct.
First, the problem with backslashes occurred only on Win32/Cygwin,
and has been properly fixed in commit 6f268f8.
Second, this is used only for hooks (which don't have backslashes in
their names) and filter commands from the git config, where the user
is responsible for properly quoting or escaping such that the commands
work.
Third, using $0 actually breaks correctly quoted filter commands
like in the bug report. The shell really takes the command literally,
and then doesn't find the command because of quotes.
So revert this change.
At the same time there's a related problem with hooks. If the path to
the hook contains blanks, runInShell() would also fail to find the
hook. In this case, the command doesn't come from user input but is
just a Java File object with an absolute path containing blanks. (Can
occur if core.hooksPath points to such a path with blanks, or if the
repository has such a path.)
The path to the hook as obtained from the file system must be quoted.
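One common way to quote such a path for a POSIX shell is sketched below. This is an assumption for illustration and not necessarily how runInShell() quotes the hook path; ShellQuoteSketch and quote are hypothetical names.

```java
// Hypothetical sketch: single-quote a path so that blanks survive sh parsing.
final class ShellQuoteSketch {

    static String quote(String path) {
        // Inside single quotes nothing is special except the single quote
        // itself, which is written as: close quote, escaped quote, reopen.
        return "'" + path.replace("'", "'\\''") + "'";
    }
}
```

A path such as /home/me/my hooks/pre-commit then reaches the shell as one word instead of being split at the blank.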
Matthias Sohn [Sun, 29 Mar 2020 11:08:26 +0000 (13:08 +0200)]
Define constants for pack config option keys
Change-Id: Ifb8227cb62370029d6774f2a22b15d6478c713ca Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Masaya Suzuki [Tue, 24 Mar 2020 18:50:34 +0000 (11:50 -0700)]
ReceivePack: Use error message if set
ReceiveCommand can have an error message, but it is shown only in some
cases even when it is set. This change uses the error message if it is
set and falls back to the default message if it is unset.
Thomas Wolf [Wed, 25 Mar 2020 08:13:20 +0000 (09:13 +0100)]
Handle non-normalized index also for executable files
Commit 60cf85a4 corrected the handling of check-in for files where
the index version is non-normalized, i.e., contains CR-LF line endings.
However, it did so only for regular files, not executable files.
Bug: 561438
Change-Id: I372cc990c5efeb00315460f36459c0652d5d1e77 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Ivan Frade [Wed, 18 Mar 2020 05:29:59 +0000 (22:29 -0700)]
ResolveMerger: Ignore merge conflicts if asked so
The recursive merge strategy builds a virtual ancestor merging
recursively the common bases (when more than one) between the
want-to-merge commits. While building this virtual ancestor, content
conflicts are ignored, but current code doesn't do so when a file is
removed.
This was spotted in [1], for example. Merging two commits to build the
virtual ancestor bumped into a conflict (modified in one side, deleted
in the other) that stopped the process.
Follow the "spec" and in case of conflict leave the unmerged content in
the index and working trees.
Alex Spradlin [Thu, 12 Mar 2020 16:04:36 +0000 (09:04 -0700)]
RevWalk: fix bad topo flags error message
The error message for an Exception thrown by StartGenerator when given
both the TOPO flag and the TOPO_KEEP_BRANCH_TOGETHER flag mentions a
non-existent flag, TOPO_NON_INTERMIX. The error message was introduced
in commit e498d43.
Replace TOPO_NON_INTERMIX with TOPO_KEEP_BRANCH_TOGETHER in the error
message of an Exception thrown by the StartGenerator when the TOPO flag
is provided together with the TOPO_KEEP_BRANCH_TOGETHER flag.
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Change-Id: Id24640dc08e96a196508fe38ce144aa7e035082f