Matthias Sohn [Tue, 12 Mar 2019 21:39:53 +0000 (22:39 +0100)]
Merge branch 'stable-4.6' into stable-4.7
* stable-4.6:
Prepare 4.5.7-SNAPSHOT builds
JGit v4.5.6.201903121547-r
Check for packfile validity and fd before reading
Move throw of PackInvalidException outside the catch
Use FileSnapshot to get lastModified on PackFile
Include size when comparing FileSnapshot
Do not reuse packfiles when changed on filesystem
Silence API warnings for new API introduced for fixes
Change-Id: I3d1544d034783fe0fa1385dfe9b03ad8e9247c63 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Tue, 12 Mar 2019 20:01:55 +0000 (21:01 +0100)]
Merge branch 'stable-4.5' into stable-4.6
* stable-4.5:
Prepare 4.5.7-SNAPSHOT builds
JGit v4.5.6.201903121547-r
Check for packfile validity and fd before reading
Move throw of PackInvalidException outside the catch
Use FileSnapshot to get lastModified on PackFile
Include size when comparing FileSnapshot
Do not reuse packfiles when changed on filesystem
Silence API warnings for new API introduced for fixes
Change-Id: I029e1797447e6729de68bd89d4d69b324dbb3f5f Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Luca Milanesio [Sun, 10 Mar 2019 22:03:40 +0000 (22:03 +0000)]
Check for packfile validity and fd before reading
When reading from a packfile, make sure that is valid
and has a non-null file-descriptor.
Because of concurrency between a thread invalidating a packfile
and another trying to read it, the read() may result into a NPE
that won't be able to be automatically recovered.
Throwing a PackInvalidException would instead cause the packlist
to be refreshed and the read to eventually succeed.
Luca Milanesio [Wed, 6 Mar 2019 11:30:07 +0000 (11:30 +0000)]
Move throw of PackInvalidException outside the catch
When a packfile is invalid, throw an exception explicitly
outside any catch scope, so that is not accidentally caught
by the generic catch-all cause, which would set the packfile
as valid again.
Flagging an invalid packfile as valid again would have
dangerous consequences such as the corruption of the in-memory
packlist.
Luca Milanesio [Tue, 12 Mar 2019 07:00:01 +0000 (07:00 +0000)]
Use FileSnapshot to get lastModified on PackFile
Do not redundantly call File.lastModified() for extracting the
timestamp of the PackFile but rather use consistently the FileSnapshot
which reads all file attributes in a single bulk call.
Change-Id: I932675ae4fe56dcd3833dac249816f097303bb09 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Luca Milanesio [Wed, 6 Mar 2019 17:51:59 +0000 (17:51 +0000)]
Include size when comparing FileSnapshot
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Read and consider file size also, so that differing file size can help
to more accurately detect file changes without reading the file content.
Use bulk read to avoid multiple stat calls to retrieve file attributes.
Change-Id: I974288fff78ac78c52245d9218b5639603f67a46 Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Luca Milanesio [Wed, 6 Mar 2019 00:31:45 +0000 (00:31 +0000)]
Do not reuse packfiles when changed on filesystem
The pack reload mechanism from the filesystem works only by name
and does not check the actual last modified date of the packfile.
This lead to concurrency issues where multiple threads were loading
and removing from each other list of packfiles when one of those
was failing the checksum.
Rely on FileSnapshot rather than directly checking lastModified
timestamp so that more checks can be performed.
Matthias Sohn [Mon, 24 Dec 2018 12:25:31 +0000 (13:25 +0100)]
Merge branch 'stable-4.6' into stable-4.7
* stable-4.6:
Fix feature versions imported by feature org.eclipse.jgit.pgm
Prepare 4.5.6-SNAPSHOT builds
JGit v4.5.5.201812240535-r
Call AdvertiseRefsHook before validating wants
Change-Id: If637694f80dbd1e774d60c672fe78a6500650bb8 Signed-off-by: Jonathan Nieder <jrn@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Change-Id: I0fd67ddd9c4966c20d82cdfe78b2f9d4898b4665 Signed-off-by: Jonathan Nieder <jrn@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Masaya Suzuki [Tue, 18 Dec 2018 17:20:54 +0000 (09:20 -0800)]
Call AdvertiseRefsHook before validating wants
AdvertiseRefsHook is used to limit the visibility of the refs in Gerrit.
If this hook is not called, then all refs are treated as visible,
causing the server to serve commits reachable from branches the client
should not be able to access, if asked to via a request naming a guessed
object id.
This bug was introduced in v2.0.0.201206130900-r~123 (Modify refs in
UploadPack/ReceivePack using a hook interface, 2012-02-08). Stateful
bidirectional transports are not affected.
Fix it by moving the AdvertiseRefsHook call to
getAdvertisedOrDefaultRefs, ensuring the hook is called in all cases.
[jn: backported to stable-4.5 by splitting out tests and the protocol v2
specific parts]
Change-Id: I159f396216354f2eda3968d17802e166d8c8ec2d Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
SpotBugs [1] is the spiritual successor of FindBugs, carrying on from
the point where it left off with support of its community.
This is a backport of [1] which originally did the replacement on the
master branch. This change updates to the current latest version, so
that we can get the benefit of its checks when pushing changes to the
stable branches.
Jonathan Nieder [Sun, 7 Oct 2018 21:55:52 +0000 (21:55 +0000)]
SubmoduleValidator: Permit missing path or url
A .gitmodules file can include a submodule without a path to configure
the URL for a submodule that is only present on other branches.
A .gitmodules file can include a submodule with no URL and no path to
reserve the name for a submodule that existed in earlier history but
is not available from any URL any more.
"git fsck" permits both of these cases. Permit them in JGit as well
(instead of throwing NullPointerException).
Change-Id: I3b442639ad79ea7a59227f96406a12e62d3573ae Reported-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Jonathan Nieder <jrn@google.com>
The text "<tree, blob>" with angle brackets should not be used in javadoc
since it is interpreted as an HTML tag and then rejected since it's not a
valid HTML tag. Wrap the text in a @literal tag.
Also add a missing space.
Change-Id: Ide045e8c04a39a916f5b2e964e58c151e4555830 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
The main concern are submodule urls starting with '-' that could pass as
options to an unguarded tool.
Pass through the parser the ids of blobs identified as .gitmodules
files in the ObjectChecker. Load the blobs and parse/validate them
in SubmoduleValidator.
Change-Id: Ia0cc32ce020d288f995bf7bc68041fda36be1963 Signed-off-by: Ivan Frade <ifrade@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Ivan Frade [Thu, 27 Sep 2018 20:05:13 +0000 (13:05 -0700)]
ObjectChecker: Report .gitmodules files found in the pack
In order to validate .gitmodules files, we first need to find them
in the incoming pack.
Do it in the ObjectChecker stage. Check in the tree objects if they
point to a .gitmodules file and report the tree id and the .gitmodules
blob id.
This can be used later to check if the file is in the root of the
project and if the contents are good.
While we're here, make isMacHFSGit more accurate by detecting variants
of filenames that vary in case.
[jn: tweaked NTFS and HFS+ checking; added more tests]
Change-Id: I70802e7d2c1374116149de4f89836b9498f39582 Signed-off-by: Ivan Frade <ifrade@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Ivan Frade [Mon, 24 Sep 2018 23:03:35 +0000 (16:03 -0700)]
SubmoduleAddCommand: Reject submodule URIs that look like cli options
In C git versions before 2.19.1, the submodule is fetched by running
"git clone <uri> <path>". A URI starting with "-" would be interpreted
as an option, causing security problems. See CVE-2018-17456.
Refuse to add submodules with URIs, names or paths starting with "-",
that could be confused with command line arguments.
[jn: backported to JGit 4.7.y, bringing portions of Masaya Suzuki's
dotdot check code in v5.1.0.201808281540-m3~57 (Add API to specify
the submodule name, 2018-07-12) along for the ride]
Change-Id: I2607c3acc480b75ab2b13386fe2cac435839f017 Signed-off-by: Ivan Frade <ifrade@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Ostrovsky [Tue, 25 Sep 2018 06:20:57 +0000 (08:20 +0200)]
ObjectDownloadListener#onWritePossible: Add comment on return statement
It is not obvious why this return statement is needed. Clarify with a
comment that otherwise endless loop may show up when recent versions
of Jetty are used.
Change-Id: I8e5d4de51869fb1179bf599bfb81bcd7d745874b Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Matthias Sohn [Sun, 16 Sep 2018 22:35:33 +0000 (00:35 +0200)]
Fix error handling in FileLfsServlet
Check in #sendError method if the response was committed already.
If yes we cannot set response status or send an error message, last
resort is to close the outputstream.
If the response wasn't yet committed first reset the response before
using writer to send the error message to the client since mixing STREAM
and WRITE mode (mixing asynchronous and blocking I/O) is illegal in
servlet 3.1.
see the following bugs in the gerrit and jetty issue trackers
https://bugs.chromium.org/p/gerrit/issues/detail?id=9667
https://bugs.chromium.org/p/gerrit/issues/detail?id=9721
https://github.com/eclipse/jetty.project/issues/2911
Change-Id: Ie35563c2e0ac1c5e918185a746622589a880dc7f Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Ostrovsky [Sat, 15 Sep 2018 19:09:51 +0000 (21:09 +0200)]
ObjectDownloadListener#onWritePossible: Make code spec compatible
Current code violates the ServletOutputStream contract. For every
out.isReady() == true either write or close of that ServletOutputStream
should be called.
See also this issue upstream for more context: [1].
Change-Id: Ied575f3603a6be0d2dafc6c3329d685fc212c7a3 Signed-off-by: David Ostrovsky <david@ostrovsky.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Ostrovsky [Sat, 15 Sep 2018 19:09:51 +0000 (21:09 +0200)]
ObjectDownloadListener: Return from onWritePossible when data is written
When buffer was written not only call AsyncContext#complete() but also
return from the ObjectDownloadListener#onWritePossible(). This avoids
endless loop after upgrading from Jetty 9.3.x to 9.4.x lines.
In Jetty example implementation:[1] the return statemnt is also used:
// If we are at EOF then complete
if (len < 0)
{
async.complete();
return;
}
Matthias Sohn [Thu, 23 Aug 2018 09:44:28 +0000 (11:44 +0200)]
Externalize warning message in RefDirectory.delete()
Change-Id: Icec16c01853a3f5ea016d454b3d48624498efcce Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
(cherry picked from commit 5e68fe245fb97f38620ea683dbeab724bf86da4c) Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Thomas Wolf [Sun, 19 Aug 2018 18:48:06 +0000 (20:48 +0200)]
Suppress warning for trying to delete non-empty directory
This is actually a fairly common occurrence; deleting the parent
directories can work only if the file deleted was the last one
in the directory.
Bug: 537872
Change-Id: I86d1d45e1e2631332025ff24af8dfd46c9725711 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit d9e767b431eae7978613cc8e0ade7467ec04376c) Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Sun, 26 Aug 2018 17:44:29 +0000 (19:44 +0200)]
Fix atomic lock file creation on NFS
FS_POSIX.createNewFile(File) failed to properly implement atomic file
creation on NFS using the algorithm [1]:
- name of the hard link must be unique to prevent that two processes
using different NFS clients try to create the same link. This would
render nlink useless to detect if there was a race.
- the hard link must be retained for the lifetime of the file since we
don't know when the state of the involved NFS clients will be
synchronized. This depends on NFS configuration options.
To fix these issues we need to change the signature of createNewFile
which would break API. Hence deprecate the old method
FS.createNewFile(File) and add a new method createNewFileAtomic(File).
The new method returns a LockToken which needs to be retained by the
caller (LockFile) until all involved NFS clients synchronized their
state. Since we don't know when the NFS caches are synchronized we need
to retain the token until the corresponding file is no longer needed.
The LockToken must be closed after the LockFile using it has been
committed or unlocked. On Posix, if core.supportsAtomicCreateNewFile =
false this will delete the hard link which guarded the atomic creation
of the file. When acquiring the lock fails ensure that the hard link is
removed.
[1] https://www.time-travellers.org/shane/papers/NFS_considered_harmful.html
also see file creation flag O_EXCL in
http://man7.org/linux/man-pages/man2/open.2.html
Change-Id: I84fcb16143a5f877e9b08c6ee0ff8fa4ea68a90d Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix handling of option core.supportsAtomicCreateNewFile
When core.supportsAtomicCreateNewFile was set to false and the
repository was located on a filesystem which doesn't support the file
attribute "unix:nlink" then FS_POSIX#createNewFile may report an error
even if everything was ok. Modify FS_POSIX#createNewFile to silently
ignore this situation. An example of such a filesystem is sshfs where
reading "unix:nlink" always returns 1 (instead of throwing a exception).
Bug: 537969
Change-Id: I6deda7672fa7945efa8706ea1cd652272604ff19 Also-by: Thomas Wolf <thomas.wolf@paranor.ch>
GC: Avoid logging errors when deleting non-empty folders
I88304d34c and Ia555bce00 modified the way errors are handled when
trying to delete non-empty reference folders. Before, this error was
silently ignored as it was considered an expected output. Now, every
failed folder delete is logged which can be noisy.
Ignore the DirectoryNotEmptyException but log any other error avoiding
deletion of an eligible folder.
David Pursehouse [Wed, 29 Aug 2018 23:48:41 +0000 (08:48 +0900)]
Bazel: Use hyphen instead of underscore in external repository names
Recent Bazel versions support the hyphen character in external
repository names. On the Gerrit project, the repository names
were harmonized to consistently use hyphen.
As a side effect, it is no longer possible to build jgit from source
in the gerrit tree, due to the different repository names.
Rename the dependencies to use hyphens, consistent with gerrit.
Change-Id: Ideebd858ddd3f0e6f765643001642dfb6c12441f Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
David Pursehouse [Sat, 30 Sep 2017 09:56:42 +0000 (10:56 +0100)]
ChangeIdUtilTest: Remove unused notestCommitDashV
This test was never being run. Since it was introduced it was
named "notest.." which meant it didn't run with JUnit3, and
since it is not annotated @Test it also doesn't run with JUnit4.
When compiling with Bazel 0.6.0, error-prone raises an error
that the public method is not annotated with @Ignore or @Test.
Given that the test has never been run anyway, we can just
remove it.
Bug: 525415
Change-Id: Ie9a54f89fe42e0c201f547ff54ff1d419ce37864 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Hugo Arès [Wed, 15 Aug 2018 13:54:29 +0000 (09:54 -0400)]
Fix GC run in foreground to not use executor
Since I3870cadb4, GC task was always delegated to an executor even when
background option was set to false. This was an issue because if more
than one GC object was instantiated and executed in parallel, only one GC
was actually running because of the single thread executor.
Change-Id: I8c587d22d63c1601b7d75914692644a385cd86d6 Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
After packaging references, the folders containing these references are
not deleted. In a busy repository, this causes operations to slow down
as traversing the references tree becomes longer.
Delete empty reference folders after the loose references have been
packed.
To avoid deleting a folder that was just created by another concurrent
operation, only delete folders that were not modified in the last 30
seconds.
Marco Miller [Thu, 21 Jun 2018 18:18:48 +0000 (14:18 -0400)]
ResolveMerger: Fix encoding with string; use bytes
This change fixes the issue [1]. Before this fix, a merge involving
the caching of consecutive yet similar filenames with Norwegian
characters [2] used to throw an IllegalStateException: Duplicate
stages not allowed. This was caused by inaccurate decoding of the
filenames, using string values assuming default encoding. In the
toString method of DirCacheEntry, used before through getPathString,
UTF-8 encoding is used, but the end result becomes default encoding,
through Object's default toString usage. The special characters in
those two consecutive (particular) filenames [2] were becoming the
very same decoded /single character, lending consecutive -but then
identical- filenames. Thus the perceived duplicate 0-staging of the
file(s).
Replace getPathString usage with getRawPath for this specific case,
or use byte array representations of cached entries instead of string.
Adding a test for this change is not possible, as there is no known
way to change the default encoding for filenames such as [2] (e.g.).
JGitTestUtil does write file contents through UTF-8, but encoding like
so does not apply to the actual file name. Hence there is no way to
create files with names properly made of special characters such as
[2]'s. And the test that is necessary for this case assumes such
Norwegian (or similar characters) filenames. Changing the default
locale programmatically in a test has no effect either. And changing
the LANG value passed to the JVM is only possible upon starting it.
On a local non-NFS filesystem the .git/config file will be orphaned if
it is replaced by a new process while the current process is reading the
old file. The current process successfully continues to read the
orphaned file until it closes the file handle.
Since NFS servers do not keep track of open files, instead of orphaning
the old .git/config file, such a replacement on an NFS filesystem will
instead cause the old file to be garbage collected (deleted). A stale
file handle exception will be raised on NFS clients if the file is
garbage collected (deleted) on the server while it is being read. Since
we no longer have access to the old file in these cases, the previous
code would just fail. However, in these cases, reopening the file and
rereading it will succeed (since it will open the new replacement file).
Since retrying the read is a viable strategy to deal with stale file
handles on the .git/config file, implement such a strategy.
Since it is possible that the .git/config file could be replaced again
while rereading it, loop on stale file handle exceptions, up to 5 extra
times, trying to read the .git/config file again, until we either read
the new file, or find that the file no longer exists. The limit of 5 is
arbitrary, and provides a safe upper bounds to prevent infinite loops
consuming resources in a potential unforeseen persistent error
condition.
Change-Id: I6901157b9dfdbd3013360ebe3eb40af147a8c626 Signed-off-by: Nasser Grainawi <nasser@codeaurora.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When running on NFS there was a chance that JGits LockFile
semantic is broken because File#createNewFile() may allow
multiple clients to create the same file in parallel. This
change provides a fix which is only used when the new config
option core.supportsAtomicCreateNewFile is set to false. The
default for this option is true. This option can only be set in the
global or the system config file. The repository config file is not
taken into account in this case.
If the config option core.supportsAtomicCreateNewFile is true
then File#createNewFile() is trusted and the behaviour doesn't
change.
But if core.supportsAtomicCreateNewFile is set to false then after
successful creation of the lock file a hardlink to that lock file is
created and the attribute nlink of the lock file is checked to be 2. If
multiple clients manage to create the same lock file nlink would be
greater than 2 showing the error.
This expensive workaround is described in
https://www.time-travellers.org/shane/papers/NFS_considered_harmful.html
section III.d) "Exclusive File Creation"
Honor trustFolderStats also when reading packed-refs
Then list of packed refs was cached in RefDirectory based on mtime of
the packed-refs file. This may fail on NFS when attributes are cached.
A cached mtime of the packed-refs file could cause JGit to trust the
cached content of this file and to overlook that the file is modified.
Honor the config option trustFolderStats and always read the packed-refs
content if the option is false. By default this option is set to true
and this fix is not active.
Fix exception handling for opening bitmap index files
When creating a new PackFile instance it is specified whether this pack
has an associated bitmap index file or not. This information is cached
and the public method getBitmapIndex() will always assume a bitmap index
file must exist if the cached data tells so. But it may happen that the
packfiles are repacked during a gc in a different process causing the
packfile, bitmap-index and index file to be deleted. Since JGit still
has an open FileHandle on the packfile this file is not really deleted
and can still be accessed. But index and bitmap index file are deleted.
Fix getBitmapIndex() to invalidate the cached packfile instance if such
a situation occurs.
This problem showed up when a gerrit server was serving repositories
which where garbage collected with native git regularly. Fetch and
clone commands for certain repositories failed permanently after a
native git gc had deleted old bitmap index files.
Change-Id: I8e620bec74dd3f310ba42024f9a657062f868f0e Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Turner [Wed, 8 Feb 2017 20:07:18 +0000 (15:07 -0500)]
Run auto GC in the background
When running an automatic GC on a FileRepository, when the caller
passes a NullProgressMonitor, run the GC in a background thread. Use a
thread pool of size 1 to limit the number of background threads spawned
for background gc in the same application. In the next minor release we
can make the thread pool configurable.
In some cases, the auto GC limit is lower than the true number of
unreachable loose objects, so auto GC will run after every (e.g) fetch
operation. This leads to the appearance of poor fetch performance.
Since these GCs will never make progress (until either the objects
become referenced, or the two week timeout expires), blocking on them
simply reduces throughput.
In the event that an auto GC would make progress, it's still OK if it
runs in the background. The progress will still happen.
This matches the behavior of regular git.
Git (and now jgit) uses the lock file for gc.log to prevent simultaneous
runs of background gc. Further, it writes errors to gc.log, and won't
run background gc if that file is present and recent. If gc.log is too
old (according to the config gc.logexpiry), it will be ignored.
Change-Id: I3870cadb4a0a6763feff252e6eaef99f4aa8d0df Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>