David Ostrovsky [Sat, 25 Jul 2020 08:00:11 +0000 (10:00 +0200)]
Migrate to Apache MINA sshd 2.6.0 and Orbit I20210203173513
Re-enable DSA, DSA_CERT, and RSA_CERT public key authentication.
DSA is discouraged for a long time already, but it might still be
way too disruptive to completely drop it. RSA is discouraged for
far less long, and dropping that would be really disruptive.
Adapt to the changed property handling. Remove work-arounds for
shortcomings of earlier sshd versions.
Use Orbit I20210203173513, which includes sshd 2.6.0. This also bumps
apache.httpclient to 4.5.13 and apache.httpcore to 4.4.14.
Change-Id: I2d24a1ce4cc9f616a94bb5c4bdaedbf20dc6638e Signed-off-by: David Ostrovsky <david@ostrovsky.org> Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Thomas Wolf [Fri, 29 Jan 2021 11:12:28 +0000 (12:12 +0100)]
LFS: make pointer parsing more robust
Parsing an LFS pointer must check the input more to not run into
exceptions. LfsPoint.parseLfsPointer() is used in various places to
determine whether a blob is a LFS pointer; it is not only called with
valid LFS pointers. Tighten the validations and return null if they
fail. All callers already do check for a null return value.
Also, LfsPointer implemented Comparable but did not override equals().
This is rather unusual and actually warned against in the javadoc of
Comparable. Implement equals() and hashCode().
Add more tests.
Bug: 570744
Change-Id: I90ca264d0a250275cf1907e9dcfcee5eab80df0f Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Terry Parker [Mon, 25 Jan 2021 06:18:21 +0000 (22:18 -0800)]
Move reachability checker generation into the ObjectReader object
Reachability checkers are retrieved from RevWalk and ObjectWalk objects:
* RevWalk.createReachabilityChecker()
* ObjectWalk.createObjectReachabilityChecker()
Since RevWalks and ObjectWalks are themselves directly instantiated
in hundreds of places (e.g. UploadPack...) overriding them in a
consistent way requires overloading 100s of methods, which isn't
feasible. Moving reachability checker generation to a more central
place solves that problem.
The ObjectReader object seems a good place from which to get
reachability checkers, because reachability checkers return
information about relationships between objects. ObjectDatabases
delegate many operations to ObjectReaders, and reachability bitmaps
are attached to ObjectReaders.
The Bitmapped and Pedestrian reachability checker objects were
package private in the org.eclipse.jgit.revwalk package. This change
makes them public and moves them to the
org.eclipse.jgit.internal.revwalk package. Corresponding tests are
also moved.
Motivation:
1) Reachability checking algorithms need to scale. One of the
internal Android repositories has ~2.4 million refs/changes/*
references, causing bad long tail performance in reachability
checks.
2) Reachability check performance is impacted by repository
topography: number of refs, number of objects, amounts of
related vs. unrelated history.
3) Reachability check performance is also affected by per-branch
access (Gerrit branch permissions) since different users can
see different branches.
4) Reachability check performance isn't affected by any state in a
RevWalk or ObjectWalk.
I don't yet know if a single algorithm will work for all cases in #2
and #3. We may need to evolve the ReachabilityChecker interfaces
over time to solve the Gerrit branch permissions case, or use
Gerrit-specific identity information to solve that in an efficient
way.
This change takes the existing public API and moves it to the
ObjectReader/whole repository level, which is where we can do
consistent customizations for #2 and #3. We intend to upstream the
best of whatever works, but anticipate the need for multiple rounds
of experimentation.
Change-Id: I9185feff43551fb387957c436112d5250486833d Signed-off-by: Terry Parker <tparker@google.com>
Add the "compression-level" option to all ArchiveCommand formats
Different archive formats support a compression level in the range
[0-9]. The value 0 is for lowest compressions and 9 for highest. Highest
levels produce output files of smaller sizes but require more memory to
do the compression.
This change allows passing a "compression-level" option to the git
archive command and implements using it for different file formats.
Change-Id: I5758f691c37ba630dbac24db67bb7da827bbc8e1 Signed-off-by: Youssef Elghareeb <ghareeb@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Jonathan Tan [Wed, 27 Jan 2021 18:36:43 +0000 (13:36 -0500)]
Merge changes I36d9b63e,I8c5db581,I2c02e89c
* changes:
Compare getting all refs except specific refs with seek and with filter
Add getsRefsByPrefixWithSkips (excluding prefixes) to ReftableDatabase
Add seekPastPrefix method to RefCursor
Gal Paikin [Mon, 7 Dec 2020 14:18:34 +0000 (15:18 +0100)]
Compare getting all refs except specific refs with seek and with filter
There are currently two ways to get all refs except a specific ref, we
add two methods that perform both and compare the two different approaches.
This change adds two methods that compares the two different approaches
of such query:
1. Get all the refs, and then filter by refs that don't start with the
prefix (current approach).
2. Get all refs until encountering a ref that is part of the prefix we
should exclude, skip using seekPastPrefix, and continue (new approach).
This works since the refs are sorted.
Specifically in Gerrit, we often have thousands of refs that are not
refs/changes, and millions of refs/changes, hence the second approach
should be much faster. In Jgit in general it's still expected to provide
a better result even if we're skipping a smaller chunk of the refs
since the complexity here is O(logn) with a binary search, rather than
O(number of skipped refs).
We ran this benchmark on a big chunk of chromium/src's reftable. To run
it, we first create the reftable:
Result:
total time the action took using seek: 36925 usec
total time the action took using filter: 874382 usec
number of refs that start with prefix: 4266.
number of refs that don't start with prefix: 1962695.
Similarly for Android's biggest repository, platform/frameworks/base
(still only partial result):
total time the action took using seek: 9020 usec
total time the action took using filter: 143166 usec
number of refs that start with prefix: 296.
number of refs that don't start with prefix: 60400.
In conclusion, it's easy to see an improvement of a factor of 15-20x for
large Gerrit repositories!
Signed-off-by: Gal Paikin <paiking@google.com>
Change-Id: I36d9b63eb259804c774864429cf2c761cd099cc3
Gal Paikin [Mon, 30 Nov 2020 14:57:06 +0000 (15:57 +0100)]
Add getsRefsByPrefixWithSkips (excluding prefixes) to ReftableDatabase
We sometimes want to get all the refs except specific prefixes,
similarly to getRefsByPrefix that gets all the refs of a specific
prefix.
We now create a new method that gets all refs matching a prefix except a
set of specific prefixes.
One use-case is for Gerrit to be able to get all the refs except
refs/changes; in Gerrit we often have lots of refs/changes, but very
little other refs. Currently, to get all the refs except refs/changes we
need to get all the refs and then filter the refs/changes, which is very
inefficient. With this method, we can simply skip the unneeded prefix so
that we don't have to go over all the elements.
RefDirectory still uses the inefficient implementation, since there
isn't a simple way to use Refcursor to achieve the efficient
implementation (as done in ReftableDatabase).
Signed-off-by: Gal Paikin <paiking@google.com>
Change-Id: I8c5db581acdeb6698e3d3a2abde8da32f70c854c
Gal Paikin [Thu, 19 Nov 2020 17:05:04 +0000 (18:05 +0100)]
Add seekPastPrefix method to RefCursor
This method will be used by the follow-up change. This useful if we want
to go over all the changes after a specific ref.
For example, the new method allows us to create a follow-up that would
go over all the refs until we reach a specific ref (e.g refs/changes/),
and then we use seekPastPrefix(refs/changes/) to read the rest of the refs,
thus basically we return all refs except a specific prefix.
When seeking past a prefix, the previous condition that created the
RefCursor still applies. E.g, if the cursor was created by
seekRefsWithPrefix, we can skip some refs but we will not return refs
that are not starting with this prefix.
Signed-off-by: Gal Paikin <paiking@google.com>
Change-Id: I2c02e89c877fe90da8619cb8a4a9a0c865f238ef
Han-Wen Nienhuys [Mon, 25 Jan 2021 15:54:52 +0000 (16:54 +0100)]
reftable: add random suffix to table names
In some circumstances (eg. compacting a stack that has deletions), the
result may have a {min, max} range that already exists. In these
cases, we would rename onto an already existing file, which does not
work on Windows. By adding a random suffix, we disambiguate the files,
and avoid this failure scenario.
Thomas Wolf [Mon, 18 Jan 2021 15:05:58 +0000 (16:05 +0100)]
Correct the minimum required version of Apache httpclient
org.eclipse.jgit.http.apache uses several features that exist only
since httpclient 4.4, but its MANIFEST.MF still had a lower bound of
4.3.0. Bump this to 4.4.0 for all packages from httpclient. 4.3.0 for
the packages from httpcore is fine.
Do a similar clean-up in the other bundles using packages from Apache
httpclient (http.test, lfs, lfs.server, lfs.server.test)
Bug: 570451
Change-Id: Iffdde2a9bd0d65db2e5201a08cffbf03597e2866 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Wed, 2 Dec 2020 22:07:01 +0000 (23:07 +0100)]
TransportHttp: support preemptive Basic authentication
If the caller knows already HTTP Basic authentication will be needed
and if it also already has the username and password, preemptive
authentication is a little bit more efficient since it avoids the
initial 401 response.
Add a setPreemptiveBasicAuthentication(username, password) method
to TransportHttp. Client code could call this for instance in a
TransportConfigCallback. The method throws an IllegalStateException
if it is called after an HTTP request has already been made.
Additionally, a URI can include userinfo. Although it is not
recommended to put passwords in URIs, JGit's URIish and also the
Java URL and URI classes still allow it. The underlying HTTP
connection may omit these fields though. If present, take these
fields as additional source for preemptive Basic authentication if
setPreemptiveBasicAuthentication() has not been called.
No preemptive authentication will be done if the connection is
redirected to a different host.
Add tests.
Bug: 541327
Change-Id: Id00b975e56a15b532de96f7bbce48106d992a22b Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Mon, 30 Nov 2020 11:14:22 +0000 (12:14 +0100)]
TransportHttp: shared SSLContext during fetch or push
TransportHttp makes several HTTP requests. The SSLContext and socket
factory must be shared over these requests, otherwise authentication
information may not be propagated correctly from one request to the
next. This is important for authentication mechanisms that rely on
client-side state, like NEGOTIATE (either NTLM, if the underlying HTTP
library supports it, or Kerberos). In particular, SPNEGO cannot
authenticate on a POST request; the authentication must come from the
initial GET request, which implies that the POST request must use the
same SSLContext and socket factory that was used for the GET.
Change the way HTTPS connections are configured. Introduce the concept
of a GitSession, which is a client-side HTTP session over several HTTPS
requests. TransportHttp creates such a session and uses it to configure
all HTTP requests during that session (fetch or push). This gives a way
to abstract away the differences between JDK and Apache HTTP connections
and to configure SSL setup outside.
A GitSession can maintain state and thus give all HTTP requests in a
session the same socket factory.
Introduce an extension interface HttpConnectionFactory2 that adds a
method to obtain a new GitSession. Implement this for both existing
HTTP connection factories. Change TransportHttp to use the new
GitSession to configure HTTP connections.
The old methods for disabling SSL verification still exist to support
possibly external connection and connection factory implementations
that do not make use of the new GitSession yet.
Bug: 535850
Change-Id: Iedf67464e4e353c1883447c13c86b5a838e678f1 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sun, 29 Nov 2020 15:22:39 +0000 (16:22 +0100)]
TransportHttp: make the connection factory configurable
Previously, TransportHttp always used the globally set connection
factory. This is problematic if that global factory is changed in
the middle of a fetch or push operation. Initialize the factory to
use in the constructor, then use that factory for all HTTP requests
made through this transport. Provide a setter and a getter for it
so that client code can customize the factory, if needed, in a
TransportConfigCallback.
Once a factory has been used on a TransportHttp instance it cannot
be changed anymore.
Make the global static factory reference volatile.
Change-Id: I7c6ee16680407d3724e901c426db174a3125ba1c Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Thu, 7 Jan 2021 16:10:45 +0000 (17:10 +0100)]
Tag message must not include the signature
Signatures on tags are just tacked onto the end of the message.
Getting the message must not return the signature. Compare [1]
and [2] in C git, which both drop a signature at the end of an
object body.
Thomas Wolf [Wed, 6 Jan 2021 11:13:48 +0000 (12:13 +0100)]
Protocol V2: don't log spurious ACKs in UploadPack
UploadPack may log ACKs in protocol V2 that it doesn't send (if it
got a "done" from the client), or may log ACKs twice. That makes
packet log analysis difficult.
Add a new constructor to PacketLineOut to omit all logging from an
instance, and use it in UploadPack.
Change-Id: Ic29ef5f9a05cbcf5f4858a4e1b206ef0e6421c65 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Sun, 3 Jan 2021 22:07:18 +0000 (23:07 +0100)]
Protocol V2: respect MAX_HAVES only once we got at least one ACK
The negotiation in the git protocol contains a cutoff: if the client
has sent more than MAX_HAVES "have" lines without getting an ACK, it
gives up and sends a "done". MAX_HAVES is 256.
However, this cutoff must kick in only if at least one ACK has been
received. Otherwise the client may give up way too early, which makes
the server send all its history. See [1].
This was missed when protocol V2 was implemented for fetching in JGit
in commit 0853a241.
Compare also C git commit 0b07eecf6ed.[2] C git had the same bug.[3][4]
Matthias Sohn [Mon, 28 Dec 2020 01:21:17 +0000 (02:21 +0100)]
RepositoryCache: declare schedulerLock final
This fixes errorprone error [SynchronizeOnNonFinalField]: Synchronizing
on non-final fields is not safe: if the field is ever updated, different
threads may end up locking on different objects.
Matthias Sohn [Sun, 3 Jan 2021 15:08:24 +0000 (16:08 +0100)]
[spotbugs]: Fix potential NPE in FileSnapshot constructor
File#getParent can return null which caused this spotbugs warning.
FS.FileStoreAttributes#get already gets the parent directory if the
passed File is not a directory and checks for null. Hence there is no
need to get the parent directory in the FileSnapshot constructor.
Change-Id: I77f71503cffb05970ab8d9ba55b69c96c53098b9 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Sun, 3 Jan 2021 14:33:36 +0000 (15:33 +0100)]
FileSnapshot: don't try to read file attributes twice
If file doesn't exist set state to MISSING_FILE immediately. Doing that
by calling File#lastModified and File#length effectively does the same
since they set the value to 0 if the file doesn't exist.
Log an error if a different exception than NoSuchFileException is
caught.
Thomas Wolf [Sun, 2 Aug 2020 17:22:05 +0000 (19:22 +0200)]
Client-side protocol V2 support for fetching
Make all transports request protocol V2 when fetching. Depending on
the transport, set the GIT_PROTOCOL environment variable (file and
ssh), pass the Git-Protocol header (http), or set the hidden
"\0version=2\0" (git anon). We'll fall back to V0 if the server
doesn't reply with a version 2 answer.
A user can control which protocol the client requests via the git
config protocol.version; if not set, JGit requests protocol V2 for
fetching. Pushing always uses protocol V0 still.
In the API, there is only a new Transport.openFetch() version that
takes a collection of RefSpecs plus additional patterns to construct
the Ref prefixes for the "ls-refs" command in protocol V2. If none
are given, the server will still advertise all refs, even in protocol
V2.
BasePackConnection.readAdvertisedRefs() handles falling back to
protocol V0. It newly returns true if V0 was used and the advertised
refs were read, and false if V2 is used and an explicit "ls-refs" is
needed. (This can't be done transparently inside readAdvertisedRefs()
because a "stateless RPC" transport like TransportHttp may need to
open a new connection for writing.)
BasePackFetchConnection implements the changes needed for the protocol
V2 "fetch" command (stateless protocol, simplified ACK handling,
delimiters, section headers).
In TransportHttp, change readSmartHeaders() to also recognize the
"version 2" packet line as a valid smart server indication.
Adapt tests, and run all the HTTP tests not only with both HTTP
connection factories (JDK and Apache HttpClient) but also with both
protocol V0 and V2. The SSH tests are much slower and much more
focused on the SSH protocol and SSH key handling. Factor out two
very simple cloning and pulling tests and make those run with
protocol V2.
Bug: 553083
Change-Id: I357c7f5daa7efb2872f1c64ee6f6d54229031ae1 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Wed, 30 Dec 2020 16:17:44 +0000 (17:17 +0100)]
Use Map interface instead of ConcurrentHashMap class
On Android, the co-variant override of ConcurrentHashMap.keySet()
introduced in Java 8 was undone. [1] If compiled Java code calls that
co-variant override directly, one gets a NoSuchMethodError exception
at run-time on Android.
Making the code call that method via Map.keySet() side-steps this
problem.
This is similar to bug 496262, where the same problem cropped up when
compiling with Java 8 against a Java 7 target, but here we cannot use
bootclasspath. We build against Java 8, not against the Android version
of it.
Recent Android versions should have some bytecode "magic" that adds the
co-variant override in bytecode (see the commit referenced in [1]), but
on older Android version this problem may still occur. (Or perhaps the
"magic" is ineffective...) There are two pull requests on Github for
this problem, both from 2020, [2][3] while the Android commit [1] is
from March 2018. Apparently people still occasionally run into this
problem in the wild.
Thomas Wolf [Wed, 30 Dec 2020 09:12:34 +0000 (10:12 +0100)]
Fix NPE in DirCacheCheckout
If a file exists in head, merge, and the working tree, but not in
the index, and we're doing a force checkout, the checkout must be
an "update", not a "keep".
This is a follow-up on If3a9b9e60064459d187c7db04eb4471a72c6cece.
Bug: 569962
Change-Id: I59a7ac41898ddc1dd90e86b09b621a41fdf45667 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Tue, 29 Dec 2020 09:15:20 +0000 (10:15 +0100)]
GPG user ID matching: use case-insensitive matching
Although not mentioned in the GPG documentation at [1], GPG uses
case-insensitive matching also for the '<' (exact e-mail) and '@'
(partial e-mail) operators. Matching for '=' (full exact match) is
case-sensitive. Compare [2].
Thomas Wolf [Mon, 28 Dec 2020 21:53:50 +0000 (22:53 +0100)]
Don't export package from test bundle
Do not export the test-only package org.eclipse.jgit.transport from
bundle org.eclipse.jgit.ssh.jsch.test. Doing so can confuse the build
in Eclipse: other bundles that import this package may then also pick
up this test package, leading to non-test sources depending on test
sources and to build cycles.
Change-Id: I9f73b7a8d13bc4a2fe58bd2f1d33068164a13991 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Fri, 4 Dec 2020 23:23:15 +0000 (00:23 +0100)]
[spotbugs] Fix potential NPE in OpenSshServerKeyDatabase
If oldLine is null #updateModifiedServerKey shouldn't be called since it
would derefence it. Spotbugs raised this as problem
RCN_REDUNDANT_NULLCHECK_WOULD_HAVE_BEEN_A_NPE. Fix it by checking if
oldLine is null before calling #updateModifiedServerKey.
Change-Id: I8a2000492986e52ce7dbe25f48b321c05fd371e4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Fri, 4 Dec 2020 10:21:08 +0000 (11:21 +0100)]
[spotbugs] Fix FileReftableStack#equals to check for null
This fixes spotbugs warning NP_EQUALS_SHOULD_HANDLE_NULL_ARGUMENT.
This implementation violated the contract defined by
java.lang.Object.equals() because it did not check for null being passed
as the argument. All equals() methods should return false if passed a
null value.
Change-Id: I607f6979613d390aae2f3546b587f63133d6d73c Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Thu, 3 Dec 2020 01:04:36 +0000 (02:04 +0100)]
[spotbugs] Fix potential NPE in WorkingTreeIterator#isModified
File#list can return null. Fix the potential NPE by using Files#list
which is also faster since it retrieves directory entries lazily while
File#list retrieves them eagerly.
Change-Id: Idf4bda398861c647587e357326b8bc8b587a2584 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Thu, 3 Dec 2020 00:44:58 +0000 (01:44 +0100)]
[spotbugs] Fix potential NPE in FileRepository#convertToReftable
File#listFiles can return null. Use Files#list which does not return
null and should be faster since it's returning directory entries lazily
while File#listFiles fetches them eagerly.
Change-Id: I3bfe2a52278244fc469143692c06b05d9af0d0d4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If TransportException is thrown in the finally block of
downloadPackedObject() ensure that the exception handled in the previous
catch block isn't suppressed.
Change-Id: I23982a2b215e38f681cc1719788985e60232699a Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Martin Fick [Thu, 26 Apr 2018 16:53:57 +0000 (10:53 -0600)]
Split out loose object handling from ObjectDirectory
The ObjectDirectory class manages the interactions for the entire object
database, this includes loose objects, packfiles, alternates, and
shallow commits. To help reduce the complexity of this class, abstract
some of the loose object specific details into a class which understands
just this, leaving the ObjectDirectory to focus more on the interactions
between the different mechanisms.
Change-Id: I39f3a74d6308f042a2a2baa57769f4acde5ba5e0 Signed-off-by: Martin Fick <mfick@codeaurora.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Martin Fick [Wed, 25 Apr 2018 17:59:21 +0000 (11:59 -0600)]
Split out packfile handling from ObjectDirectory
The ObjectDirectory class manages the interactions for the entire object
database, this includes loose objects, packfiles, alternates, and
shallow commits. To help reduce the complexity of this class, abstract
some of the packfile specific details into a class which understands
just this, leaving the ObjectDirectory to focus more on the interactions
between the different mechanisms.
Change-Id: I5cc87b964434b0afa860b3fe23867a77b3c3a4f2 Signed-off-by: Martin Fick <mfick@codeaurora.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Thomas Wolf [Tue, 8 Dec 2020 14:45:35 +0000 (15:45 +0100)]
TagCommand: propagate NO_CHANGE information
Some clients may wish to allow NO_CHANGE lightweight tag updates
without setting the force flag. (For instance EGit does so.)
Command-line git does not allow this.
Propagate the RefUpdate result via the RefAlreadyExistsException.
That way a client has the possibility to catch it and check the
failure reason without having to parse the exception message, and
take appropriate action, like ignoring the exception on NO_CHANGE.
Change-Id: I60e7a15a3c309db4106cab87847a19b6d24866f6 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>