This resolves a regression introduced in fef78212.
Change-Id: Ibb4521635a87012520566efc70870c59d11be874
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Include filekey file attribute when comparing FileSnapshots
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Some filesystems expose unique file identifiers, e.g. inodes in Posix
filesystems which are named filekeys in Java's BasicFileAttributes. Use
them as another means to detect file modifications based on stat
information.
Running git gc on a repository yields a new packfile with the same id as
a packfile which existed before the gc if these packfiles contain the
same set of objects. The content of the old and the new packfile might
differ if a different PackConfig was used when writing the packfile.
Considering filekeys in FileSnapshot may help to detect such packfile
modifications.
Bug: 546891
Change-Id: I711a80328c55e1a31171d540880b8e80ec1fe095
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Measure file timestamp resolution used in FileSnapshot
FileSnapshot.notRacyClean() assumed a worst case filesystem timestamp
resolution of 2.5 sec (FAT has a resolution of 2 sec). Instead measure
timestamp resolution to avoid unnecessary IO caused by false positives
in detecting the racy git problem caused by finite filesystem timestamp
resolution [1].
Cache the measured resolution per FileStore since timestamp resolution
depends on the respective filesystem type. If timestamp resolution
cannot be measured or fails due to an exception fallback to the worst
case FAT timestamp resolution and avoid caching this value.
Add a 10% safety margin in FileSnapshot.notRacyClean(), though running
FsTest.testFsTimestampResolution() 1000 times which is not using a
safety margin didn't fail on Mac using APFS and Java 8, 11, 12.
Measured Java file timestamp resolution: [2]
[1] https://github.com/git/git/blob/master/Documentation/technical/racy-git.txt
[2] https://docs.google.com/spreadsheets/d/1imy0y6WmRqBf0kjCxzxj2X7M50eIVfa7oaUIzEOHmjo
Bug: 546891
Change-Id: I493f3b57b6b306285ffa7d392339d253e5966ab8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* fix equals() and hashCode() methods to consider size
* fix toString() to show size
Change-Id: I5485db55eda5110121efd65d86c7166b3b2e93d0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Revert 4678f4b and provide another solution for bug 467631
Making gitlinks and folders match in a tree walk was the wrong
approach to fix bug 467631. The problem is that in such a conflict
the tree walk may then not descend into the folder.
Revert the changes to Paths.java and PathsTest.java from commit
4678f4b. Instead test for the problem case from bug 467631 explicitly
in IndexDiff. Add Daniel's test case from bug 545162, and add yet
another test case for DiffEntry.scan() that covers the problem
originally reported in bug 545162.
Bug: 545162
Change-Id: Ie2214c5d5ee32ac6596b621f0f1c7b86d38fa9b7
Also-by: Daniel Veihelmann <daniel.veihelmann@gmail.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Keep track of the original cause for a packfile invalidation.
It is needed for the sysadmin to understand if there is a real
underlying filesystem problem and repository corruption or if it is
simply a consequence of a concurrency of Git operations (e.g. repack
or GC).
Change-Id: I06ddda9ec847844ec31616ab6d17f153a5a34e33
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix pack files scan when filesnapshot isn't modified
Do not reload packfiles when their associated filesnapshot is not
modified on disk compared to the one currently stored in memory.
Fix the regression introduced by fef78212 which, in conjunction with
core.trustfolderstats = false, caused any lookup of objects inside
the packlist to loop forever when the object was not found in the pack
list.
Bug: 546190
Change-Id: I38d752ebe47cefc3299740aeba319a2641f19391
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix GC to delete empty fanout directories after repacking
The prune method did not delete empty fanout directories when loose
objects moved to a new pack file but only when loose unreferenced
objects were pruned.
Change-Id: Ia068f4914c54d9cf9f40b75e8ea50759402b5000
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When reading from a packfile, make sure that is valid
and has a non-null file-descriptor.
Because of concurrency between a thread invalidating a packfile
and another trying to read it, the read() may result into a NPE
that won't be able to be automatically recovered.
Throwing a PackInvalidException would instead cause the packlist
to be refreshed and the read to eventually succeed.
Bug: 544199
Change-Id: I27788b3db759d93ec3212de35c0094ecaafc2434
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Move throw of PackInvalidException outside the catch
When a packfile is invalid, throw an exception explicitly
outside any catch scope, so that is not accidentally caught
by the generic catch-all cause, which would set the packfile
as valid again.
Flagging an invalid packfile as valid again would have
dangerous consequences such as the corruption of the in-memory
packlist.
Bug: 544199
Change-Id: If7a3188a68d7985776b509d636d5ddf432bec798
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Do not redundantly call File.lastModified() for extracting the
timestamp of the PackFile but rather use consistently the FileSnapshot
which reads all file attributes in a single bulk call.
Change-Id: I932675ae4fe56dcd3833dac249816f097303bb09
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Read and consider file size also, so that differing file size can help
to more accurately detect file changes without reading the file content.
Use bulk read to avoid multiple stat calls to retrieve file attributes.
Change-Id: I974288fff78ac78c52245d9218b5639603f67a46
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The pack reload mechanism from the filesystem works only by name
and does not check the actual last modified date of the packfile.
This lead to concurrency issues where multiple threads were loading
and removing from each other list of packfiles when one of those
was failing the checksum.
Rely on FileSnapshot rather than directly checking lastModified
timestamp so that more checks can be performed.
Bug: 544199
Change-Id: I173328f29d9914007fd5eae3b4c07296ab292390
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
In case of concurrent pack file access, threads may wait on the idx()
function even for already open files. This happens especially with a
slow file system.
Performance numbers are listed in the bug report.
Bug: 543739
Change-Id: Iff328d347fa65ae07ecce3267d44184161248978
Signed-off-by: Juergen Denner <j.denner@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
PackFile: report correct message for checksum mismatch
When the packfile checksum does not match the expected one
report the correct checksum error instead of reporting that
the number of objects is incorrect.
Change-Id: I040f36dacc4152ae05453e7acbf8dfccceb46e0d
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
(cherry picked from commit 436c99ce59)
Externalize the message and log the pack file with absolute path.
Change-Id: I019052dfae8fd96ab67da08b3287d699287004cb
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
(cherry picked from commit 9665d86ba1)
ObjectDirectory: extra logging on packfile exceptions
Display extra logging, including the exception with the associated
stacktrace, whenever a packFile can't be read and thus removed
from the packlist.
Change-Id: I97a4e31dc427bfcc0baae438dcbe2dcd4704b824
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
(cherry picked from commit 962babc4b2)
PackFile: report correct message for checksum mismatch
When the packfile checksum does not match the expected one
report the correct checksum error instead of reporting that
the number of objects is incorrect.
Change-Id: I040f36dacc4152ae05453e7acbf8dfccceb46e0d
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Externalize the message and log the pack file with absolute path.
Change-Id: I019052dfae8fd96ab67da08b3287d699287004cb
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
ObjectDirectory: extra logging on packfile exceptions
Display extra logging, including the exception with the associated
stacktrace, whenever a packFile can't be read and thus removed
from the packlist.
Change-Id: I97a4e31dc427bfcc0baae438dcbe2dcd4704b824
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
UploadPack: Filter refs used for deepen-not resolution
Clients can use --shallow-exclude to obtain information about what
commits are reachable from refs they are not supposed to be able to
see. Plug the hole by allowing the AdvertiseRefsHook and RefFilter to
take effect here, too.
Change-Id: If2b8e95344fa49e10a6a202144318b60d002490e
Signed-off-by: Jonathan Nieder <jrn@google.com>
The AdvertiseRefsHook can be called twice if the following conditions
hold:
1. This AdvertiseRefsHook doesn't set this.refs.
2. getAdvertisedOrDefaultRefs is called after getFilteredRefs.
For example, this can happen when fetchV2 is called after lsRefsV2
when using a stateful bidirectional transport.
The second call does not accomplish anything useful. Guard it with
'if (!advertiseRefsHookCalled)' to avoid wasted work.
Reported-by: Jonathan Tan <jonathantanmy@google.com>
Change-Id: Ib746582e4ef645b767a5b3fb969596df99ac2ab5
Signed-off-by: Jonathan Nieder <jrn@google.com>
UploadPack: Filter refs used for want-ref resolution
In the longer term, we can add support for this to the
RequestValidator interface. In the short term, this is a minimal
band-aid to ensure any refs the client requests are visible to the
client.
Change-Id: I0683c7a00e707cf97eef6c6bb782671d0a550ffe
Reported-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
UploadPack: Defer want-ref resolution to after parsing
ProtocolV2Parser explains:
// TODO(ifrade): This validation should be done after the
// protocol parsing. It is not a protocol problem asking for an
// unexisting ref and we wouldn't need the ref database here.
Do so. This way all ref database accesses are in one place, in the
UploadPack class.
No user-visible change intended --- this is just to make the code
easier to manipulate.
Change-Id: I68e87dff7b9a63ccc169bd0836e8e8baaf5d1048
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
AdvertiseRefsHook is used to limit the visibility of the refs in Gerrit.
If this hook is not called, then all refs are treated as visible.
In protocol v2, the hook is not called, causing the server to advertise
all refs. This bug was introduced in v5.0.0.201805221745-rc1~1^2~9
(Execute AdvertiseRefsHook only for protocol v0 and v1, 2018-05-14).
Even before then, the hook was not called in requests after the
capability advertisement, so in transports like HTTP that do not retain
state between round-trips, the server would advertise all refs in
response to an ls-refs (ls-remote) request.
Fix both cases by using getAdvertisedOrDefaultRefs to retrieve the
advertised refs in lsRefs, ensuring the hook is called in all cases that
use its result.
[jn: backported to stable-5.0; split out from a larger patch that also
fixes protocol v0; avoided filtering this.refs by ref prefix]
Change-Id: I64bce0e72d15b90baccc235c067e57b6af21b55f
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
AdvertiseRefsHook is used to limit the visibility of the refs in Gerrit.
If this hook is not called, then all refs are treated as visible,
causing the server to serve commits reachable from branches the client
should not be able to access, if asked to via a request naming a guessed
object id.
This bug was introduced in v2.0.0.201206130900-r~123 (Modify refs in
UploadPack/ReceivePack using a hook interface, 2012-02-08). Stateful
bidirectional transports are not affected.
Fix it by moving the AdvertiseRefsHook call to
getAdvertisedOrDefaultRefs, ensuring the hook is called in all cases.
[jn: backported to stable-4.5 by splitting out tests and the protocol v2
specific parts]
Change-Id: I159f396216354f2eda3968d17802e166d8c8ec2d
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
BasePackConnection: Check for expected length of ref advertisement
When a server sends a ref advertisement using protocol v2 it contains
lines other than ref names and sha1s. Attempting to get the sha1 out
of such a line using the substring method can result in a SIOOB error
when it doesn't actually contain the sha1 and ref name.
Add a check that the line is of the expected length, and subsequently
that the extracted object id is valid, and if not throw an exception.
Change-Id: Id92fe66ff8b6deb2cf987d81929f8d0602c399f4
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
UploadPack has a setTransferConfig method which allows to set the
transfer config, however since the constructors of TransferConfig
have the default package visibility it is not possible for any
application using UploadPack, for example Gerrit, to actually set
a transfer config.
Make the constructors public. This is consistent with the public
constructors for example on PackConfig.
Change-Id: I07080255838421871403b2b2bcc294aa8f621c57
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Set GIT_DIR and GIT_WORK_TREE when calling hooks.
Bug: 541622
Change-Id: I6153d8a6a934ec37a3a5e7319c2d0e516f539ab7
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
After cloning a repo with a submodule, non-recursively, JGit would
encounter in its TreeWalk in IndexDiff:
* first, a missing gitlink (in index & HEAD, not in working tree)
* second, the untracked folder (not in index and head, in working tree)
As a result, it would report the submodule as missing. Canonical git
reports a clean workspace.
The root cause of this is that the path of a gitlink "x" did
not compare equal to the path of a tree "x" in JGit.
Correct Paths.compare() to account for that. If two paths are otherwise
equal, then let gitlinks match both trees and files. Matching trees
solves the bug. Matching files is necessary to handle the case where
the gitlink directory was replaced by a file; see the new test case
IndexDiffSubmoduleTest.testSubmoduleReplacedByFile(). Comparisons of
unequal paths are left untouched, so the sort order is unchanged.
After the fix, another bug(?) in WorkingTreeIterator became apparent:
with core.dirNoGitLinks = true, it was no longer possible to overwrite
a gitlink in the index. This is now fixed in WorkingTreeIterator.
Add new test cases for the bug itself and for some related cases
(submodule directory deleted or replaced by a file) in
IndexDiffSubmoduleTest. Add a test for missing files in IndexDiffTest,
and adapt the PathsTest to test matching gitlinks.
Bug: 467631
Change-Id: I0549d10d46b1858e5eec3cc15095aa9f1d5f5280
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
when multiple match options are given in git describe the result must
not depend on the order of the match options. JGit wrongly picked the
first match using the match options in the order they were defined. Fix
this by concatenating the streams of matching tags for all match options
and then choosing the first match on the concatenated stream sorted in
tie break order.
See https://git-scm.com/docs/git-describe#git-describe---matchltpatterngt
Change-Id: Id01433d35fa16fb4c30526605bee041ac1d954b2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add a method to get all values of HTTP header defined as list
According to RFC 2616 [1] header field names are case insensitive.
Header fields defined as a comma separated list can have multiple header
fields with the same field name. Add a method to HttpConnection which
retrieves all values with a given header field name with the field name
compared case insensitive.
[1] https://tools.ietf.org/html/rfc2616#section-4.2"
Change-Id: I7f601b21cda99e84f43f866c7c7cb4cb0e3cf5c3
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This partially reverts commit a551b646: revert the changes in
RawParseUtils.lineMap(). Forcing all blobs containing a NUL byte
as a single line causes blame to produce useless results as soon
as it hits any version containing a NUL byte.
Doing binary detection at this level also has the problem that the
user cannot control it. Not by setting the text attribute nor in any
other way.
This came up in bug 541036, where a Java source inadvertently
contained NUL bytes in strings. Even fixing this by using escapes
"\000" will not fix JGit's blame for this file because the past
versions will still contain the NUL byte.
Native git can blame that file from bug 541036 fine.
Added new tests verifying that blaming a text file containing a NUL
byte produces sensible results.
Bug: 541036
Change-Id: I8991bec88e9827cc096868c6026ea1890b6d0d32
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
This continues what commit d9ac7ddf10
(Remove unnecessary modifiers from interfaces, 2018-11-15) started.
Change-Id: I89720985a5a986722a0dcb9b5e9bbc25996bd5b3
This reverts the workaround introduced by
1c6c73c5a9b8dd700be45d658f165a464265dba7, which is a patch for dealing
with a buggy C Git client v1.7.5 in 2012. We'll stop supporting very old
C Git clients.
Change-Id: I94999a39101c96f210b5eca3c2f620c15eb1ac1b
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
From Oracle's "Defining an interface":
"All abstract, default, and static methods in an interface are
implicitly public, so you can omit the public modifier."
(Without any modifier, the interface methods are also abstract, so we
omit also the "abstract")
"In addition, an interface can contain constant declarations. All
constant values defined in an interface are implicitly public, static,
and final. Once again, you can omit these modifiers."
This makes the code more consistent. Now all interfaces under
org.eclipse.jgit follow the guidelines.
Change-Id: I4fe6deb111899ec1b4318ab5a6050f3851fa1fd3
Signed-off-by: Ivan Frade <ifrade@google.com>
Add more ssh tests: pushing, known_host file handling, etc.
Add support for git-receive-pack to the ssh git server and add two
new tests for pushing.
This actually uncovered an undocumented requirement in TransportSftp:
the FTP rename operation assumes POSIX semantics, i.e., that the
target is removed. This works as written only for servers that
support and advertise the "posix-rename@openssh.com" FTP extension.
Our little Apache MINA server does not advertise this extension.
Fix the FtpChannel implementation for Jsch to handle this case in a
meaningful way so that it can pass the new "push over sftp" test.
Add more tests to test the behavior of server host key checking.
Also refactor the tests generally to separate better the test
framework from the actual tests.
Bug: 520927
Change-Id: Ia4bb85e17ddacde7b36ee8c2d5d454bbfa66dfc3
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Introduce an FtpChannel abstraction, which can be obtained from a
RemoteSession. In JSchSession, wrap a JSch ChannelSftp as such an
FtpChannel. The JSch-specific SftpException is also mapped to a
generic FtpException. Rewrite TransportSftp to use only the new
abstraction layer.
This makes it possible to provide alternate ssh/sftp implementations.
Bug: 520927
Change-Id: I379026f7d4122f34931df909a28e73c02cd8a1da
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
The lock is obtained in receivePackAndCheckConnectivity. It seems to me
the structure that requres the caller to unlock the lock is wrong, but
at least by calling in finally ensures it is called even if an exception
is thrown.
Change-Id: I123841b017baf5acffe0064d1004ef11a0a5e6c2
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Correct behaviour as git 1.7.1.1 is to resolve tie-breakers to choose
the most recent tag.
https://github.com/git/git/blob/master/Documentation/RelNotes/1.7.1.1.txt:
* "git describe" did not tie-break tags that point at the same commit
correctly; newer ones are preferred by paying attention to the
tagger date now.
Bug: 538610
Change-Id: Ib0b2a301997bb7f75935baf7005473f4de952a64
Signed-off-by: Håvard Wall <haavardw@gmail.com>
Simplify RevWalk#iterator by factoring out common code
Factor out a helper that calls next() and tunnels IOException in a
RuntimeException, similar to TunnelException.tunnel(RevWalk::next) in
Guava terms[1].
This should make the code a little more readable. No functional
change intended.
[1] https://github.com/google/guava/issues/2828#issuecomment-304187823
Change-Id: I97c062d03a17663d5c40895fd3d2c6a7306d4f39
Signed-off-by: Jonathan Nieder <jrn@google.com>