The text "<tree, blob>" with angle brackets should not be used in javadoc
since it is interpreted as an HTML tag and then rejected since it's not a
valid HTML tag. Wrap the text in a @literal tag.
Also add a missing space.
Change-Id: Ide045e8c04a39a916f5b2e964e58c151e4555830 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Fri, 5 Oct 2018 23:03:20 +0000 (01:03 +0200)]
Merge branch 'stable-4.7' into stable-4.8
* stable-4.7:
Prepare 4.7.6-SNAPSHOT builds
JGit v4.7.5.201810051826-r
BaseReceivePack: Validate incoming .gitmodules files
ObjectChecker: Report .gitmodules files found in the pack
SubmoduleAddCommand: Reject submodule URIs that look like cli options
Change-Id: Id6fabec4d0b682a7e20a46e88cbc05432efca062 Signed-off-by: Jonathan Nieder <jrn@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The main concern are submodule urls starting with '-' that could pass as
options to an unguarded tool.
Pass through the parser the ids of blobs identified as .gitmodules
files in the ObjectChecker. Load the blobs and parse/validate them
in SubmoduleValidator.
Change-Id: Ia0cc32ce020d288f995bf7bc68041fda36be1963 Signed-off-by: Ivan Frade <ifrade@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Ivan Frade [Thu, 27 Sep 2018 20:05:13 +0000 (13:05 -0700)]
ObjectChecker: Report .gitmodules files found in the pack
In order to validate .gitmodules files, we first need to find them
in the incoming pack.
Do it in the ObjectChecker stage. Check in the tree objects if they
point to a .gitmodules file and report the tree id and the .gitmodules
blob id.
This can be used later to check if the file is in the root of the
project and if the contents are good.
While we're here, make isMacHFSGit more accurate by detecting variants
of filenames that vary in case.
[jn: tweaked NTFS and HFS+ checking; added more tests]
Change-Id: I70802e7d2c1374116149de4f89836b9498f39582 Signed-off-by: Ivan Frade <ifrade@google.com> Signed-off-by: Jonathan Nieder <jrn@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Ivan Frade [Mon, 24 Sep 2018 23:03:35 +0000 (16:03 -0700)]
SubmoduleAddCommand: Reject submodule URIs that look like cli options
In C git versions before 2.19.1, the submodule is fetched by running
"git clone <uri> <path>". A URI starting with "-" would be interpreted
as an option, causing security problems. See CVE-2018-17456.
Refuse to add submodules with URIs, names or paths starting with "-",
that could be confused with command line arguments.
[jn: backported to JGit 4.7.y, bringing portions of Masaya Suzuki's
dotdot check code in v5.1.0.201808281540-m3~57 (Add API to specify
the submodule name, 2018-07-12) along for the ride]
Change-Id: I2607c3acc480b75ab2b13386fe2cac435839f017 Signed-off-by: Ivan Frade <ifrade@google.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Ostrovsky [Tue, 25 Sep 2018 06:20:57 +0000 (08:20 +0200)]
ObjectDownloadListener#onWritePossible: Add comment on return statement
It is not obvious why this return statement is needed. Clarify with a
comment that otherwise endless loop may show up when recent versions
of Jetty are used.
Change-Id: I8e5d4de51869fb1179bf599bfb81bcd7d745874b Signed-off-by: David Ostrovsky <david@ostrovsky.org>
David Pursehouse [Tue, 18 Sep 2018 00:15:02 +0000 (09:15 +0900)]
Merge branch 'stable-4.7' into stable-4.8
* stable-4.7:
Fix ObjectUploadListener#close
Fix error handling in FileLfsServlet
ObjectDownloadListener#onWritePossible: Make code spec compatible
ObjectDownloadListener: Return from onWritePossible when data is written
Fix IOException when LockToken#close fails
Change-Id: Iad9836811be034cf992ea25dad4409addba75115 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Sun, 16 Sep 2018 22:35:33 +0000 (00:35 +0200)]
Fix error handling in FileLfsServlet
Check in #sendError method if the response was committed already.
If yes we cannot set response status or send an error message, last
resort is to close the outputstream.
If the response wasn't yet committed first reset the response before
using writer to send the error message to the client since mixing STREAM
and WRITE mode (mixing asynchronous and blocking I/O) is illegal in
servlet 3.1.
see the following bugs in the gerrit and jetty issue trackers
https://bugs.chromium.org/p/gerrit/issues/detail?id=9667
https://bugs.chromium.org/p/gerrit/issues/detail?id=9721
https://github.com/eclipse/jetty.project/issues/2911
Change-Id: Ie35563c2e0ac1c5e918185a746622589a880dc7f Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Ostrovsky [Sat, 15 Sep 2018 19:09:51 +0000 (21:09 +0200)]
ObjectDownloadListener#onWritePossible: Make code spec compatible
Current code violates the ServletOutputStream contract. For every
out.isReady() == true either write or close of that ServletOutputStream
should be called.
See also this issue upstream for more context: [1].
Change-Id: Ied575f3603a6be0d2dafc6c3329d685fc212c7a3 Signed-off-by: David Ostrovsky <david@ostrovsky.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Ostrovsky [Sat, 15 Sep 2018 19:09:51 +0000 (21:09 +0200)]
ObjectDownloadListener: Return from onWritePossible when data is written
When buffer was written not only call AsyncContext#complete() but also
return from the ObjectDownloadListener#onWritePossible(). This avoids
endless loop after upgrading from Jetty 9.3.x to 9.4.x lines.
In Jetty example implementation:[1] the return statemnt is also used:
// If we are at EOF then complete
if (len < 0)
{
async.complete();
return;
}
David Pursehouse [Wed, 12 Sep 2018 05:05:46 +0000 (14:05 +0900)]
Merge branch 'stable-4.7' into stable-4.8
* stable-4.7:
Fix NoSuchFileException during directory cleanup in RefDirectory
Externalize warning message in RefDirectory.delete()
Suppress warning for trying to delete non-empty directory
Change-Id: I9ec6352b5ff57aa1a3380079dc9165890cc76d49 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Thu, 23 Aug 2018 09:44:28 +0000 (11:44 +0200)]
Externalize warning message in RefDirectory.delete()
Change-Id: Icec16c01853a3f5ea016d454b3d48624498efcce Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
(cherry picked from commit 5e68fe245fb97f38620ea683dbeab724bf86da4c) Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Thomas Wolf [Sun, 19 Aug 2018 18:48:06 +0000 (20:48 +0200)]
Suppress warning for trying to delete non-empty directory
This is actually a fairly common occurrence; deleting the parent
directories can work only if the file deleted was the last one
in the directory.
Bug: 537872
Change-Id: I86d1d45e1e2631332025ff24af8dfd46c9725711 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit d9e767b431eae7978613cc8e0ade7467ec04376c) Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Sun, 26 Aug 2018 17:44:29 +0000 (19:44 +0200)]
Fix atomic lock file creation on NFS
FS_POSIX.createNewFile(File) failed to properly implement atomic file
creation on NFS using the algorithm [1]:
- name of the hard link must be unique to prevent that two processes
using different NFS clients try to create the same link. This would
render nlink useless to detect if there was a race.
- the hard link must be retained for the lifetime of the file since we
don't know when the state of the involved NFS clients will be
synchronized. This depends on NFS configuration options.
To fix these issues we need to change the signature of createNewFile
which would break API. Hence deprecate the old method
FS.createNewFile(File) and add a new method createNewFileAtomic(File).
The new method returns a LockToken which needs to be retained by the
caller (LockFile) until all involved NFS clients synchronized their
state. Since we don't know when the NFS caches are synchronized we need
to retain the token until the corresponding file is no longer needed.
The LockToken must be closed after the LockFile using it has been
committed or unlocked. On Posix, if core.supportsAtomicCreateNewFile =
false this will delete the hard link which guarded the atomic creation
of the file. When acquiring the lock fails ensure that the hard link is
removed.
[1] https://www.time-travellers.org/shane/papers/NFS_considered_harmful.html
also see file creation flag O_EXCL in
http://man7.org/linux/man-pages/man2/open.2.html
Change-Id: I84fcb16143a5f877e9b08c6ee0ff8fa4ea68a90d Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix handling of option core.supportsAtomicCreateNewFile
When core.supportsAtomicCreateNewFile was set to false and the
repository was located on a filesystem which doesn't support the file
attribute "unix:nlink" then FS_POSIX#createNewFile may report an error
even if everything was ok. Modify FS_POSIX#createNewFile to silently
ignore this situation. An example of such a filesystem is sshfs where
reading "unix:nlink" always returns 1 (instead of throwing a exception).
Bug: 537969
Change-Id: I6deda7672fa7945efa8706ea1cd652272604ff19 Also-by: Thomas Wolf <thomas.wolf@paranor.ch>
GC: Avoid logging errors when deleting non-empty folders
I88304d34c and Ia555bce00 modified the way errors are handled when
trying to delete non-empty reference folders. Before, this error was
silently ignored as it was considered an expected output. Now, every
failed folder delete is logged which can be noisy.
Ignore the DirectoryNotEmptyException but log any other error avoiding
deletion of an eligible folder.
David Pursehouse [Fri, 31 Aug 2018 00:24:57 +0000 (09:24 +0900)]
Merge branch 'stable-4.7' into stable-4.8
* stable-4.7:
Bazel: Use hyphen instead of underscore in external repository names
Bazel: Format all build files with buildifier 0.15.0
ChangeIdUtilTest: Remove unused notestCommitDashV
Change-Id: I414ade902dc38b696c566dd604000e3d289f3973 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
David Pursehouse [Wed, 29 Aug 2018 23:48:41 +0000 (08:48 +0900)]
Bazel: Use hyphen instead of underscore in external repository names
Recent Bazel versions support the hyphen character in external
repository names. On the Gerrit project, the repository names
were harmonized to consistently use hyphen.
As a side effect, it is no longer possible to build jgit from source
in the gerrit tree, due to the different repository names.
Rename the dependencies to use hyphens, consistent with gerrit.
Change-Id: Ideebd858ddd3f0e6f765643001642dfb6c12441f Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
David Pursehouse [Sat, 30 Sep 2017 09:56:42 +0000 (10:56 +0100)]
ChangeIdUtilTest: Remove unused notestCommitDashV
This test was never being run. Since it was introduced it was
named "notest.." which meant it didn't run with JUnit3, and
since it is not annotated @Test it also doesn't run with JUnit4.
When compiling with Bazel 0.6.0, error-prone raises an error
that the public method is not annotated with @Ignore or @Test.
Given that the test has never been run anyway, we can just
remove it.
Bug: 525415
Change-Id: Ie9a54f89fe42e0c201f547ff54ff1d419ce37864 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Hugo Arès [Wed, 15 Aug 2018 13:54:29 +0000 (09:54 -0400)]
Fix GC run in foreground to not use executor
Since I3870cadb4, GC task was always delegated to an executor even when
background option was set to false. This was an issue because if more
than one GC object was instantiated and executed in parallel, only one GC
was actually running because of the single thread executor.
Change-Id: I8c587d22d63c1601b7d75914692644a385cd86d6 Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
David Pursehouse [Fri, 27 Jul 2018 07:22:51 +0000 (08:22 +0100)]
Merge branch 'stable-4.7' into stable-4.8
* stable-4.7:
Prepare 4.7.3-SNAPSHOT builds
JGit v4.7.2.201807261330-r
Delete all loose refs empty directories
Use java.nio to delete path to get detailed errors
GC: Remove empty references folders
Do not ignore path deletion errors
Change-Id: Iadc8275fbaa3d6f7d08a96ab66d49f392f6aab78 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
After packaging references, the folders containing these references are
not deleted. In a busy repository, this causes operations to slow down
as traversing the references tree becomes longer.
Delete empty reference folders after the loose references have been
packed.
To avoid deleting a folder that was just created by another concurrent
operation, only delete folders that were not modified in the last 30
seconds.
Marco Miller [Thu, 21 Jun 2018 18:18:48 +0000 (14:18 -0400)]
ResolveMerger: Fix encoding with string; use bytes
This change fixes the issue [1]. Before this fix, a merge involving
the caching of consecutive yet similar filenames with Norwegian
characters [2] used to throw an IllegalStateException: Duplicate
stages not allowed. This was caused by inaccurate decoding of the
filenames, using string values assuming default encoding. In the
toString method of DirCacheEntry, used before through getPathString,
UTF-8 encoding is used, but the end result becomes default encoding,
through Object's default toString usage. The special characters in
those two consecutive (particular) filenames [2] were becoming the
very same decoded /single character, lending consecutive -but then
identical- filenames. Thus the perceived duplicate 0-staging of the
file(s).
Replace getPathString usage with getRawPath for this specific case,
or use byte array representations of cached entries instead of string.
Adding a test for this change is not possible, as there is no known
way to change the default encoding for filenames such as [2] (e.g.).
JGitTestUtil does write file contents through UTF-8, but encoding like
so does not apply to the actual file name. Hence there is no way to
create files with names properly made of special characters such as
[2]'s. And the test that is necessary for this case assumes such
Norwegian (or similar characters) filenames. Changing the default
locale programmatically in a test has no effect either. And changing
the LANG value passed to the JVM is only possible upon starting it.
On a local non-NFS filesystem the .git/config file will be orphaned if
it is replaced by a new process while the current process is reading the
old file. The current process successfully continues to read the
orphaned file until it closes the file handle.
Since NFS servers do not keep track of open files, instead of orphaning
the old .git/config file, such a replacement on an NFS filesystem will
instead cause the old file to be garbage collected (deleted). A stale
file handle exception will be raised on NFS clients if the file is
garbage collected (deleted) on the server while it is being read. Since
we no longer have access to the old file in these cases, the previous
code would just fail. However, in these cases, reopening the file and
rereading it will succeed (since it will open the new replacement file).
Since retrying the read is a viable strategy to deal with stale file
handles on the .git/config file, implement such a strategy.
Since it is possible that the .git/config file could be replaced again
while rereading it, loop on stale file handle exceptions, up to 5 extra
times, trying to read the .git/config file again, until we either read
the new file, or find that the file no longer exists. The limit of 5 is
arbitrary, and provides a safe upper bounds to prevent infinite loops
consuming resources in a potential unforeseen persistent error
condition.
Change-Id: I6901157b9dfdbd3013360ebe3eb40af147a8c626 Signed-off-by: Nasser Grainawi <nasser@codeaurora.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When running on NFS there was a chance that JGits LockFile
semantic is broken because File#createNewFile() may allow
multiple clients to create the same file in parallel. This
change provides a fix which is only used when the new config
option core.supportsAtomicCreateNewFile is set to false. The
default for this option is true. This option can only be set in the
global or the system config file. The repository config file is not
taken into account in this case.
If the config option core.supportsAtomicCreateNewFile is true
then File#createNewFile() is trusted and the behaviour doesn't
change.
But if core.supportsAtomicCreateNewFile is set to false then after
successful creation of the lock file a hardlink to that lock file is
created and the attribute nlink of the lock file is checked to be 2. If
multiple clients manage to create the same lock file nlink would be
greater than 2 showing the error.
This expensive workaround is described in
https://www.time-travellers.org/shane/papers/NFS_considered_harmful.html
section III.d) "Exclusive File Creation"
Honor trustFolderStats also when reading packed-refs
Then list of packed refs was cached in RefDirectory based on mtime of
the packed-refs file. This may fail on NFS when attributes are cached.
A cached mtime of the packed-refs file could cause JGit to trust the
cached content of this file and to overlook that the file is modified.
Honor the config option trustFolderStats and always read the packed-refs
content if the option is false. By default this option is set to true
and this fix is not active.
Fix exception handling for opening bitmap index files
When creating a new PackFile instance it is specified whether this pack
has an associated bitmap index file or not. This information is cached
and the public method getBitmapIndex() will always assume a bitmap index
file must exist if the cached data tells so. But it may happen that the
packfiles are repacked during a gc in a different process causing the
packfile, bitmap-index and index file to be deleted. Since JGit still
has an open FileHandle on the packfile this file is not really deleted
and can still be accessed. But index and bitmap index file are deleted.
Fix getBitmapIndex() to invalidate the cached packfile instance if such
a situation occurs.
This problem showed up when a gerrit server was serving repositories
which where garbage collected with native git regularly. Fetch and
clone commands for certain repositories failed permanently after a
native git gc had deleted old bitmap index files.
Change-Id: I8e620bec74dd3f310ba42024f9a657062f868f0e Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Sun, 11 Jun 2017 09:51:59 +0000 (11:51 +0200)]
Use a dedicated executor to run auto-gc in command line interface
WorkQueue uses daemon threads so auto-gc would not be executed after
short-lived commands run in command line. Hence use a dedicated executor
which we shutdown when the command finishes.
Change-Id: I0c2429ecfa04387389d159168ba78a020a696228 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Sat, 10 Jun 2017 23:20:34 +0000 (01:20 +0200)]
Allow to use an external ExecutorService for background auto-gc
If set use the external executor, otherwise use JGit's own simple
WorkQueue. Move WorkQueue to an internal package so we can reuse it
without exposing it in the public API.
Change-Id: I060d62ffd6692362a88b4bf13ee07b0dc857abe9 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Pursehouse [Fri, 24 Mar 2017 01:26:44 +0000 (10:26 +0900)]
Fetch: Add --recurse-submodules and --no-recurse-submodules options
Add options to control recursion into submodules on fetch.
Add a callback interface on FetchCommand, to allow Fetch to display
an update "Fetching submodule XYZ" for each submodule.
Change-Id: Id805044b57289ee0f384b434aba1dbd2fd317e5b Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Fri, 9 Jun 2017 22:58:23 +0000 (00:58 +0200)]
SubmoduleUpdateCommand#setCallback should return 'this'
The other methods in this class follow the builder pattern, and
return 'this', allowing multiple method calls to be chained in a
single statement.
Update the setCallback method to do the same.
Change-Id: I4ddaacd6d50601f47f61eb6be8b62c8d59cce062 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Turner [Wed, 8 Feb 2017 20:07:18 +0000 (15:07 -0500)]
Run auto GC in the background
When running an automatic GC on a FileRepository, when the caller
passes a NullProgressMonitor, run the GC in a background thread. Use a
thread pool of size 1 to limit the number of background threads spawned
for background gc in the same application. In the next minor release we
can make the thread pool configurable.
In some cases, the auto GC limit is lower than the true number of
unreachable loose objects, so auto GC will run after every (e.g) fetch
operation. This leads to the appearance of poor fetch performance.
Since these GCs will never make progress (until either the objects
become referenced, or the two week timeout expires), blocking on them
simply reduces throughput.
In the event that an auto GC would make progress, it's still OK if it
runs in the background. The progress will still happen.
This matches the behavior of regular git.
Git (and now jgit) uses the lock file for gc.log to prevent simultaneous
runs of background gc. Further, it writes errors to gc.log, and won't
run background gc if that file is present and recent. If gc.log is too
old (according to the config gc.logexpiry), it will be ignored.
Change-Id: I3870cadb4a0a6763feff252e6eaef99f4aa8d0df Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Tue, 30 May 2017 11:01:53 +0000 (13:01 +0200)]
Merge branch 'master' into stable-4.8
* master:
Fix out-of-bounds exception in RepoCommand#relative
Fix null return from FS.readPipe when command fails to launch
RenameDetector: Clarify rename limits <= 0
Remove unnecessary cast for DfsReader
Allow DfsReader to be subclassed
Track read IO for DfsReader
Fix javadoc of TooLargeObjectInPackException
Exclude refs/tags from bitmap commit selection
Bryan Donlan [Mon, 22 May 2017 18:37:14 +0000 (11:37 -0700)]
Fix null return from FS.readPipe when command fails to launch
When a command invoked from readPipe fails to launch (i.e. the exec call
fails due to a missing command executable), Process.start() throws,
which gets caught by the generic IOException handler, resulting in a
null return. This change detects this case and rethrows a
CommandFailedException instead.
Additionally, this change uses /bin/sh instead of bash for its posix
command failure test, to accomodate building in environments where bash
is unavailable.
Change-Id: Ifae51e457e5718be610c0a0914b18fe35ea7b008 Signed-off-by: Bryan Donlan <bdonlan@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn Pearce [Fri, 19 May 2017 19:23:02 +0000 (12:23 -0700)]
Track read IO for DfsReader
Compute how much disk IO a DfsReader is performing, and how long the
sum of those operations took on this reader instance. Implementations
of DFS and interested applications can get the stats by calling the
new DfsReader.getIoStats() method at or after close().
Terry Parker [Thu, 18 May 2017 08:30:14 +0000 (01:30 -0700)]
Exclude refs/tags from bitmap commit selection
Commit db77610 ensured that all refs/tags commits are added to the
primary GC pack. It did that by adding all of the refs/tags commits
to the primary GC pack PackWriter's "interesting" object set.
Unfortunately, all commit objects in the "interesting" set are
selected as commits for which bitmap indices will be built. In a
repository like chromium with lots of tags, this changed the number of
bitmaps created from <700 to >10000. That puts huge memory pressure on
the GC task.
This change restores the original behavior of ignoring tags when
selecting commits for bitmaps.
In the "uninteresting" set, commits for refs/heads and refs/tags for
unannotated tags can not be differentiated. We instead identify
refs/tags commits by passing their ObjectIds as a new "noBitmaps"
parameter to the PackWriter.preparePack() methods.
PackWriterBitmapPreparer.setupTipCommitBitmaps() can then use that
"noBitmaps" parameter to exclude those commits.
Change-Id: Icd287c6b04fc1e48de773033fe432a9b0e904ac5 Signed-off-by: Terry Parker <tparker@google.com>