Thomas Wolf [Thu, 24 May 2018 14:26:36 +0000 (16:26 +0200)]
Don't prune symbolic refs when fetch.prune = true
The canonical implementation also doesn't. Compare current
code in remote.c, function get_stale_heads_cb.[1] Not handling
symrefs in this case was introduced in canonical git in [2]
in 2008.
Matthias Sohn [Tue, 22 May 2018 21:00:37 +0000 (23:00 +0200)]
Merge branch 'master' into stable-5.0
* master:
DescribeCommand: Refactor to not use deprecated Repository#peel
Repository: Deprecate #peel method
Repository: Make #exactRef and #findRef final
Skip ignored directories in FileTreeIterator
Repository: Deprecate getTags method
InfoRefsServlet: Refactor to not use deprecated methods
RefAdvertiser: Add send(Collection<Ref>) and deprecate send(Map<String, Ref>)
Remove deprecated Repository#notifyIndexChanged
Implementors should override Repository#notifyIndexChanged(boolean)
Revive Repository#notifyIndexChanged()
Remove further unnecessary 'final' keywords
Execute AdvertiseRefsHook only for protocol v0 and v1
Add protocol v2 support in "jgit daemon"
Teach UploadPack "ofs-delta" in "fetch"
Teach UploadPack "include-tag" in "fetch"
Avoid using #refs in UploadPack#sendPack
FileRepository: Don't use deprecated RefDatabase#getRefs(String)
BatchRefUpdate: Don't use deprecated RefDatabase#getRefs(String)
Change-Id: I16c5da62d09262d3f4070aa0f466dd6c8352b5ea Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Pursehouse [Tue, 22 May 2018 02:27:30 +0000 (11:27 +0900)]
Repository: Deprecate #peel method
Callers should use getRefDatabase().peel(ref) instead since it
doesn't swallow the IOException.
Adapt all trivial callers to user the alternative.
DescribeCommand still uses the deprecated method and is not adapted in
this change since it will require more refactoring to add handling of
the IOException.
Change-Id: I14d4a95a5e0570548753b9fc5c03d024dc3ff832 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
David Pursehouse [Tue, 22 May 2018 00:47:16 +0000 (09:47 +0900)]
Repository: Make #exactRef and #findRef final
This means less cognitive overhead for both implementors and callers,
since this way we can guarantee that they are always synonyms for
RefDatabase#exactRef and RefDatabase#findRef, respectively.
Change-Id: Ic8aeb52fc7ed65672f3f6cd1da39a66908d88baa Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Thomas Wolf [Tue, 27 Mar 2018 20:22:09 +0000 (22:22 +0200)]
Skip ignored directories in FileTreeIterator
Make FileTreeIterator not enter ignored directories by default. We
only need to enter ignored directories if we do some operation against
git, and there is at least one tracked file underneath an ignored
directory.
Walking ignored directories should be avoided as much as possible as
it is a potential performance bottleneck. Some projects have a lot of
files or very deep hierarchies in ignored directories; walking those
may be costly (especially so on Windows). See for instance also bug
500106.
Provide a FileTreeIterator.setWalkIgnoredDirectories() operation to
force the iterator to iterate also through otherwise ignored
directories. Useful for tests (IgnoreNodeTest, CGitIgnoreTest), or
to implement things like "git ls-files --ignored".
Add tests in DirCacheCheckoutTest, and amend IndexDiffTest to test a
little bit more.
Bug: 388582
Change-Id: I6ff584a42c55a07120a4369fd308409431bdb94a Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Jonathan Nieder [Fri, 18 May 2018 16:16:16 +0000 (09:16 -0700)]
Implementors should override Repository#notifyIndexChanged(boolean)
Declare Repository#notifyIndexChanged() final and
Repository#notifyIndexChanged(boolean) abstract to force implementors
to switch to overriding the latter method. This makes Repository less
error-prone to extend since implementors no longer need to remember to
override one of those two methods.
Change-Id: I721db0f4a4865db3b35212ee0a2045d5b31c96af Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Nieder [Fri, 18 May 2018 16:15:30 +0000 (09:15 -0700)]
Revive Repository#notifyIndexChanged()
e9e150fdd24d (Store in IndexChangedEvent if it was caused by JGit
itself, 2018-05-13) modified Repository#notifyIndexChanged to take a
boolean argument to indicate whether the index change happened under
the current process's control or externally, for use by EGit. In
other words, the function signature changed from
public abstract void notifyIndexChanged();
to
public abstract void notifyIndexChanged(boolean internal);
Callers outside JGit itself notifying a Repository about index changes
are expected to be rare, so this is not very disruptive to them. In
most cases they would be notifying about changes that they made
themselves, so treating their notifyIndexChanged() calls as
notifyIndexChanged(true) should be relatively safe.
Implementors have the opposite problem: adding the new "abstract void
notifyIndexChanged(boolean)" method means they are obligated to
override it. Add a default implementation that calls their existing
override of notifyIndexChanged() to make their migration easier.
The main downside is that authors of new Repository subclasses that
do not realize they need to override notifyIndexChanged would end up
with a default implementation which calls notifyIndexChanged(true),
in turn calling notifyIndexChanged() again and so on, resulting in
StackOverflowException. Add an implementors' note to the class
Javadoc to avoid this issue. A followup commit will force
implementors to adapt to the new API by changing the methods to
@Deprecated
public final void notifyIndexChanged() {
notifyIndexChanged(true);
}
public abstract void notifyIndexChanged(boolean internal);
Change-Id: I7d014890ee19abf283ea824d9baa9044bfdde130 Signed-off-by: Jonathan Nieder <jrn@google.com>
Masaya Suzuki [Mon, 14 May 2018 21:13:11 +0000 (14:13 -0700)]
Execute AdvertiseRefsHook only for protocol v0 and v1
Refs are not advertised as part of the protocol v2 capability
advertisement. Don't call AdvertiseRefsHook.
Noticed because many implementations of AdvertiseRefsHook read all
refs in order to call UploadPack#setAdvertisedRefs, causing the
capability advertisement to be as slow as a v0 ref advertisement with
some RefDatabase implementations.
Such an AdvertiseRefsHook is of dubious utility (a better place to
determine which refs are advertised is in the RefDatabase
implementation itself, as in Gerrit), but at any rate since it's not
bringing about any benefit here, we can skip the hook call.
Jonathan Tan [Sat, 24 Feb 2018 00:55:03 +0000 (16:55 -0800)]
Add protocol v2 support in "jgit daemon"
With this patch, a server spawned by "jgit daemon" can be accessed using
protocol v2 from a Git client that supports it (for example, "git" with
the appropriate patches). This is only activated if the repository's
config has "protocol.version" be 2.
This required a change to the package-private #execute methods in
DaemonService to allow passing of extra parameters.
This has been tested with a patched Git.
Change-Id: Icf043efec7ce956d72b075fc6dc7a87d5a2da82a Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Jonathan Tan [Tue, 13 Mar 2018 18:07:36 +0000 (11:07 -0700)]
Teach UploadPack "include-tag" in "fetch"
Add support for the "include-tag" parameter in the "fetch" command in
the fetch-pack/upload-pack protocol v2.
In order to determine which tags to include, only objects pointed to by
refs starting with "refs/tags/" are checked. This restriction is for
performance reasons and to match the behavior of Git (see add_ref_tag()
in builtin/pack-objects.c).
Change-Id: I7d70aa09bcc8a525218ff1559e286c2a610258ca Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Mon, 23 Apr 2018 20:03:04 +0000 (13:03 -0700)]
Avoid using #refs in UploadPack#sendPack
When OPTION_INCLUDE_TAG is set, UploadPack#sendPack uses the #refs
instance variable as a source of information of tags. A subsequent patch
will need to supply this information to #sendPack without
modifying #refs, so refactor #sendPack to take in this information
through a parameter instead.
Note that prior to this patch, #refs was used twice in #sendPack: once
to generate the argument to PackWriter#setTagTargets, and once to
determine if any tags need to be included in the packfile. This patch
only updates the latter use, since the former is meant not only for
"true" tag targets but any object that should be hoisted earlier during
packing (see the documentation of PackWriter#setTagTargets).
This patch does not introduce any functionality change.
Change-Id: I70ed65a1041334abeda8d4bac98cce7cae7efcdf Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Introduce new RawText constructor and RawParseUtils.lineMapOrBinary
This makes binary detection exact in ResolveMerger and DiffFormatter
This has the same intention as
Id4342a199628d9406bfa04af1b023c27a47d4014 but preserves backward
compatibility of the signature of RawParseUtils.lineMap.
Change-Id: Ia24a4e716592bab3363ae24e3a46315a7511154f Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Fri, 11 May 2018 12:06:53 +0000 (14:06 +0200)]
Use a secure random generator to seed nonce for digest authentication
https://tools.ietf.org/html/rfc7616 says:
5.12. Parameter Randomness
The security of this protocol is critically dependent on the
randomness of the randomly chosen parameters, such as client and
server nonces. These should be generated by a strong random or
properly seeded pseudorandom source (see [RFC4086]).
Change-Id: I4da5316cb1eb3f59ae06c070ce1c3335e9ee87d6 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
On a local non-NFS filesystem the .git/config file will be orphaned if
it is replaced by a new process while the current process is reading the
old file. The current process successfully continues to read the
orphaned file until it closes the file handle.
Since NFS servers do not keep track of open files, instead of orphaning
the old .git/config file, such a replacement on an NFS filesystem will
instead cause the old file to be garbage collected (deleted). A stale
file handle exception will be raised on NFS clients if the file is
garbage collected (deleted) on the server while it is being read. Since
we no longer have access to the old file in these cases, the previous
code would just fail. However, in these cases, reopening the file and
rereading it will succeed (since it will open the new replacement file).
Since retrying the read is a viable strategy to deal with stale file
handles on the .git/config file, implement such a strategy.
Since it is possible that the .git/config file could be replaced again
while rereading it, loop on stale file handle exceptions, up to 5 extra
times, trying to read the .git/config file again, until we either read
the new file, or find that the file no longer exists. The limit of 5 is
arbitrary, and provides a safe upper bounds to prevent infinite loops
consuming resources in a potential unforeseen persistent error
condition.
Change-Id: I6901157b9dfdbd3013360ebe3eb40af147a8c626 Signed-off-by: Nasser Grainawi <nasser@codeaurora.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Jonathan Nieder [Fri, 4 May 2018 22:09:22 +0000 (15:09 -0700)]
Replace http://errorprone.info with https://errorprone.info
That site serves from https now.
Reported-by: Nicholas Glorioso <glorioso@google.com>
Change-Id: I2150a18425a1fe3ab5a022882ffe06ccbde17f16 Signed-off-by: Jonathan Nieder <jrn@google.com>
- suppress "Failed to load class "org.slf4j.impl.StaticLoggerBinder"
output
- add org.eclipse.jgit.pgm.test tests to bazel build
- supply package information so that "jgit version" can work
Jonathan Nieder [Mon, 30 Apr 2018 00:15:18 +0000 (17:15 -0700)]
Rename RefDatabase#getAllRefs to getRefs
This is easier to type and makes it clearer that it only returns refs
and not the pseudo-refs returned by getAdditionalRefs. It also puts us
in a better position to add a method to the Repository class later
that delegates to this one without colliding with the existing
Repository#getAllRefs method that returns a Map<String, Ref>.
While at it, clarify the javadoc of getRefs and hasRefs to make the
same point.
Suggested-by: David Pursehouse <david.pursehouse@gmail.com>
Change-Id: I23497c66ac7b5e0c987b91efbc9e9cc29924ca66 Signed-off-by: Jonathan Nieder <jrn@google.com>
David Pursehouse [Fri, 27 Apr 2018 12:49:08 +0000 (21:49 +0900)]
RefDatabase: add hasRefs convenience method
Callers can now say:
db.getRefDatabase().hasRefs()
rather than the more verbose:
!db.getRefDatabase().getAllRefs().isEmpty()
The default implementation simply uses getAllRefs().isEmpty(), but a
derived class could possibly override the method with a more efficient
implementation.
Change-Id: I5244520708a1a7d9adb351f10e43fc39d98e22a1 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
David Turner [Fri, 27 Apr 2018 19:04:18 +0000 (15:04 -0400)]
Fix comparison order in AnyObjectId
The previous version suggested testing w2 first because w1 was used
for hashing, but in fact, hashCode returns w2. The order (w3, w4, w5,
w1, w2) might be better on 64-bit processors too, since it allows
comparing 64 bits at a time, although perhaps on a modern SIMD
processor, the entire 160 bytes would be compared at once anyway.
Change-Id: Ieb69606d3c1456aeff36bffe99a71587ea76e977 Signed-off-by: David Turner <dturner@twosigma.com>
Eclipse warns that DfsReader should be managed by try-with-resource.
As described in 1484d6e (LargePackedWholeObject: Do not reuse released
inflater, 2018-04-26), the DfsReader is owned and closed by the
PackInputStream or explicitly closed in the try block's finally.
Suppress the warning with a brief explanatory comment.
Change-Id: I4187c935742072f3ee7f2d3551a6a98d40fc2702 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Jonathan Nieder [Thu, 26 Apr 2018 22:14:50 +0000 (15:14 -0700)]
LargePackedWholeObject: Do not reuse released inflater
LargePackedWholeObject.openStream produces a stream that allows
reading a large object. This stream holds a DfsReader that takes care
of caching delta bases etc and in particular holds zlib Inflater for
use while reading the each delta in the packfile.
At DfsReader creation time, the Inflater is acquired from a global
InflaterCache to avoid initialization overhead in case there is an
existing Inflater available for reuse. When done with the Inflater,
the DfsReader is responsible for returning it to the cache for reuse.
The DfsReader is AutoClosable to remind the caller to close it and
release the Inflater when finished with it.
b0ac5f9c8907a4034612543a92eb465e88a9c6f2 (LargePackedWholeObject:
Refactor to open DfsReader in try-with-resource, 2018-04-11) tried to
clarify the lifetime of the DfsReader but was too aggressive: when
this function returns, PackInputStream owns the DfsReader and is
already going to release it. Worse, the returned InflaterInputStream
holds a reference to the DfsReader's inflater, making releasing the
DfsReader not only unnecessary but unsafe.
The Inflater gets released into the InflaterCache's pool, to be
acquired by another caller that uses it concurrently with the
InflaterInputStream. This results in errors, such as
java.util.zip.ZipException: incorrect header check
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.util.zip.InflaterInputStream.skip(InflaterInputStream.java:208)
at java.io.BufferedInputStream.skip(BufferedInputStream.java:377)
and
java.util.zip.DataFormatException: incorrect header check
at java.util.zip.Inflater.inflateBytes(Native Method)
at java.util.zip.Inflater.inflate(Inflater.java:259)
at org.eclipse.jgit.internal.storage.dfs.DfsReader.inflate(DfsReader.java:783)
at org.eclipse.jgit.internal.storage.dfs.DfsPackFile.decompress(DfsPackFile.java:420)
at org.eclipse.jgit.internal.storage.dfs.DfsPackFile.load(DfsPackFile.java:767)
and
Caused by: java.util.zip.ZipException: incorrect header check
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at org.eclipse.jgit.lib.ObjectStream$Filter.read(ObjectStream.java:219)
at org.eclipse.jgit.util.IO.readFully(IO.java:233)
at org.eclipse.jgit.transport.PackParser.checkObjectCollision(PackParser.java:1173)
Verified in production. It should be possible to make a
straightforward unit test for this using the InflaterCache state but
that can wait for a followup commit.
Matthias Sohn [Wed, 25 Apr 2018 01:09:05 +0000 (03:09 +0200)]
Use eclipse compiler in Maven build
Found instructions for configuring maven-compiler-plugin with ecj in
[1]. Verified that ecj run in this way raises build errors when executed
on commit d3ef5213.
Define profiles "ecj" for using Eclipse compiler and "javac" for using
javac including errorprone. By default ecj will be used.
use ecj:
$ mvn -Pecj clean install
use javac:
$ mvn -Pjavac clean install
TODO: find out how to run ecj with errorprone from Maven.
Michael Keppler [Mon, 23 Apr 2018 20:00:27 +0000 (22:00 +0200)]
File compile and API errors in JGit
* Photon throws null analysis errors on the repeated invocation of those
previously null checked methods. Extract them to a local variable to
avoid this. (the null analysis is configured in project properties)
* setUseProtocolV2() misses @since tag. Problem was introduced with 332bc611249d21f9b604f2c0207bf0bdfbfc3a78. Might be caused by the long
delay of 2 months from creation to merging.
Change-Id: Ibbb1a1580b604b8e7cd4bf7edc4643e292b6b4a8 Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de>
Jonathan Tan [Fri, 23 Feb 2018 22:07:02 +0000 (14:07 -0800)]
Teach UploadPack basic "fetch" command
Add basic support for the "fetch" command in the fetch-pack/upload-pack
protocol v2. This patch teaches "have" and "done".
The protocol specification (Documentation/technical/protocol-v2.txt in
the Git project) states:
want <oid>
Indicates to the server an object which the client wants to
retrieve. Wants can be anything and are not limited to
advertised objects.
It is unspecified whether the server should respect the
uploadpack.allowtipsha1inwant option etc. when serving packfiles. This
patch is conservative in that the server respects them.
Change-Id: I3dbec172239712ef9286a15b8407e86b87ea7863 Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Jonathan Nieder [Fri, 20 Apr 2018 23:34:15 +0000 (16:34 -0700)]
UploadPack: Don't buffer ls-refs output
Once we have read the user's entire command, there is no more need to
buffer our response --- even the strictest servlet engine allows
writing output once the input has been consumed. Noticed when the
analogous code in the "fetch" command (introduced in a later patch)
overflowed its buffer:
java.lang.OutOfMemoryError
at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
[...]
at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:1905)
at org.eclipse.jgit.transport.UploadPack.sendPack(UploadPack.java:1741)
at org.eclipse.jgit.transport.UploadPack.fetchV2(UploadPack.java:1001)
at org.eclipse.jgit.transport.UploadPack.serviceV2(UploadPack.java:1030)
at org.eclipse.jgit.transport.UploadPack.upload(UploadPack.java:726)
at org.eclipse.jgit.http.server.UploadPackServlet.doPost(UploadPackServlet.java:195)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
Jonathan Tan [Thu, 22 Feb 2018 18:24:19 +0000 (10:24 -0800)]
Implement ls-refs in UploadPack
Implement support for Git protocol v2's "ls-refs" command and its
"symrefs" and "peel" parameters.
This adds support for this command to UploadPack but the git://,
ssh://, and git:// transports do not make use of it yet. That will
have to wait for later patches.
Change-Id: I8abc6bcc6ed4a88c165677ff1245625aca01267b Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Thu, 22 Feb 2018 18:24:19 +0000 (10:24 -0800)]
Implement protocol v2 with no capabilities in UploadPack
Add initial support for protocol v2 of the fetch-pack/upload-pack
protocol. This protocol is described in the Git project in
"Documentation/technical/protocol-v2.txt".
This patch adds support for protocol v2 (without any capabilities) to
UploadPack. Adaptations of callers to make use of this support will
come in subsequent patches.
[jn: split from a larger patch; tweaked the API to make UploadPack
handle parsing the extra parameters and config instead of requiring
each caller to do such parsing]
Change-Id: I79399fa0dce533fdc8c1dbb6756748818cee45b0 Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Thu, 22 Feb 2018 18:24:19 +0000 (10:24 -0800)]
PacketLineIn, PacketLineOut: Add support for delim-pkt
Most pkt-lines (data-pkts) have the form
pkt-len pkt-payload
where pkt-len is a string of 4 hexadecimal digits representing the
size in bytes of the pkt-line. Since this size includes the size of
the pkt-len, no data-pkt has a length less than 4.
A pkt-line with a length field less than 4 can thus be used for
other purposes. In Git protocol v1, the only such pkt-line was
flush-pkt = "0000"
which was used to mark the end of a stream. Protocol v2 (see
Documentation/technical/protocol-v2.txt in git.git) introduces a
second special pkt-line type:
delim-pkt = "0001"
used to mark the end of a section within a stream, for example to
separate capabilities from the content of a command.
[jn: split out from a larger patch that made use of this support]
Change-Id: I10e7824fa24ed74c4f45624bd490bba978cf5c34 Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Tan [Fri, 6 Apr 2018 22:50:01 +0000 (15:50 -0700)]
Add RefDatabase#getRefsByPrefix method
The existing RefDatabase#getRefs abstract method (to be implemented by
ref database backends) has the following issues:
- It returns a map with a key (the name of the ref with the prefix
removed) which is potentially superfluous (it can be derived by the
caller if need be) and confusing (in that the prefix is removed).
- The prefix is required to end with a '/', but some backends (e.g.
reftable) have fast search by prefix regardless of what the last
character of the prefix is.
Add a new method #getRefsByPrefix that does not have these issues. This
is non-abstract with a default implementation that uses #getRefs (for
backwards compatibility), but ref database backends can reimplement it.
This also prepares for supporting "ref-prefix" in the "ls-refs" command
in the fetch-pack/upload-pack protocol v2, which does not require that
the prefix end with a '/'.
Change-Id: I4c92f852e8c1558095dd460b5fd7b602c1d82df1 Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
Dave Borowitz [Thu, 12 Apr 2018 15:43:50 +0000 (11:43 -0400)]
Push: Ensure ref updates are processed in input order
Various places on the client side of the push were creating unordered
maps and sets of ref names, resulting in ReceivePack processing commands
in an order other than what the client provided. This is normally not
problematic for clients, who don't typically care about the order in
which ref updates are applied to the storage layer.
However, it does make it difficult to write deterministic tests of
ReceivePack or hooks whose output depends on the order in which commands
are processed, for example if informational per-ref messages are written
to a sideband.[1]
Add a test that ensures the ordering of commands both internally in
ReceivePack and in the output PushResult.
David Pursehouse [Wed, 11 Apr 2018 12:31:07 +0000 (21:31 +0900)]
DirCache: Use constant from StandardCharsets
Instead of hard-coding the encoding name, use the constant from
StandardCharsets. As a result it is no longer necessary to catch
the UnsupportedEncodingException.
Change-Id: I3cb6de921a78e05e2a894c220e0d5a5c85e172cc Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Thomas Wolf [Sun, 18 Mar 2018 22:29:59 +0000 (23:29 +0100)]
Significantly speed up FileTreeIterator on Windows
Getting attributes of files on Windows is an expensive operation.
Windows stores file attributes in the directory, so they are
basically available "for free" when a directory is listed. The
implementation of Java's Files.walkFileTree() takes advantage of
that (at least in the OpenJDK implementation for Windows) and
provides the attributes from the directory to a FileVisitor.
Using Files.walkFileTree() with a maximum depth of 1 is thus a
good approach on Windows to get both the file names and the
attributes in one go.
In my tests, this gives a significant speed-up of FileTreeIterator
over the "normal" way: using File.listFiles() and then reading the
attributes of each file individually. The speed-up is hard to
quantify exactly, but in my tests I've observed consistently 30-40%
for staging 500 files one after another, each individually, and up
to 50% for individual TreeWalks with a FileTreeIterator.
On Unix, this technique is detrimental. Unix stores file attributes
differently, and getting attributes of individual files is not costly.
On Unix, the old way of doing a listFiles() and getting individual
attributes (both native operations) is about three times faster than
using walkFileTree, which is implemented in Java.
Therefore, move the operation to FS/FS_Win32 and call it from
FileTreeIterator, so that we can have different implementations
depending on the file system.
A little performance test program is included as a JUnit test (to be
run manually).
While this does speed up things on Windows, it doesn't solve the basic
problem of bug 532300: the iterator always gets the full directory
listing and the attributes of all files, and the more files there are
the longer that takes.
Bug: 532300
Change-Id: Ic5facb871c725256c2324b0d97b95e6efc33282a Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Sat, 24 Mar 2018 21:05:53 +0000 (22:05 +0100)]
Add missing @since tags for new API
These methods were added after 4.11 so strictly speaking they violate
semantic versioning since new API requires increasing the minor version
number. Hence pretend these methods were introduced in 5.0
Change-Id: I7793ead16577dc1f2ddea09ba6b055103c783555 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Sat, 24 Mar 2018 21:14:15 +0000 (22:14 +0100)]
Fix API problem filter warnings
Silence warnings for bundles which haven't broken API since 4.11 but
we increased major version to 5 since we always use the same version
for all jgit bundles
Change-Id: If4f9a6aa4ef21f9b511946c5fc4bd7c0af6094c4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Pursehouse [Tue, 20 Mar 2018 02:43:15 +0000 (11:43 +0900)]
DfsInserter#openStream: Suppress resource warning about DfsReader
DfsReader is not opened in a try-with-resource because in the case where
the method returns an ObjectStream.Filter, the DfsReader should only be
closed from within the Filter's close() method.
Suppress the resource warning.
Change-Id: Ifcaf5e4a326bd1d03c6331b476c645ca43943b34 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
David Pursehouse [Tue, 20 Mar 2018 02:28:18 +0000 (11:28 +0900)]
CloneCommand: Suppress resource warning about Repository
Repository is not opened in try-with-resource because it is wrapped
in a Git instance which should be closed by the caller. On exeptions
during fetch, it is explicitly closed in the catch blocks.
Suppress the warning with an explanatory comment.
Change-Id: Ib32c74ce39bb810077ab84db33002bdde806f3b6 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>