IgnoreNode: include path to file for invalid .gitignore patterns
Include the full file path of the .gitignore file and the line number
of the invalid pattern. Also include the pattern itself.
.gitignore files inside the repository are reported with their
repository-relative path; files outside (from git config
core.excludesFile or .git/info/exclude) are reported with their
full absolute path.
Bug: 571143
Change-Id: Ibe5969679bc22cff923c62e3ab9801d90d6d06d1
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Wrap the Files.list returned Stream in a try-with-resources block
Adds a new FileUtils.hasFiles(Path) helper method to correctly handle
the Files.list returned Stream.
These errors were found by compiling the code using JDK11's
javac compiler.
Change-Id: Ie8017fa54eb56afc2e939a2988d8b2c5032cd00f
Signed-off-by: Terry Parker <tparker@google.com>
[spotbugs] Fix potential NPE in WorkingTreeIterator#isModified
File#list can return null. Fix the potential NPE by using Files#list
which is also faster since it retrieves directory entries lazily while
File#list retrieves them eagerly.
Change-Id: Idf4bda398861c647587e357326b8bc8b587a2584
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This enables EGit to override with a lenient variant that logs the
problem and continues with the default value. EGit needs this because
otherwise a corrupt core.excludesFile entry can render the whole UI
unusable (until the git config is fixed) because any use of the
WorkingTreeIterator will throw an InvalidPathException.
This is not a problem on OS X, where all characters are allowed in
file names. But on Windows some characters are forbidden... see bug
567296. The message of the InvalidPathException is not helpful since
it doesn't point to the origin of the problem. EGit can log a much
better message indicating the offending config file and entry in the
Eclipse error log, where the user can see it.
Bug: 567309
Change-Id: I4e57afa715ff3aaa52cd04b5733f69e53af5b1e0
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
When comparing git directory paths to check whether one is a prefix
of another, one must add a slash to avoid false prefix matches when
one directory name is a prefix of another. The path "audio" is not
a prefix of the path "audio-new", but would be a prefix of a path
"audio/new".
Bug: 566799
Change-Id: I6f671ca043c7c2c6044eb05a71dc8cca8d0ee040
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
DiffFormatter: correctly deal with tracked files in ignored folders
In JGit 5.0, the FileTreeIterator was changed to skip ignored folders
by default. To catch tracked files inside ignored folders, the tree
walk needs to have a DirCacheIterator, and the FileTreeIterator has
to know about that DirCacheIterator via setDirCacheIterator(). (Or
the optimization has to be switched off explicitly via
setWalkIgnoredDirectories(true).)
Skipping ignored directories is an important optimization in some
cases, for instance in node.js/npm projects, where we'd otherwise
traverse the whole huge and deep hierarchy of the typically ignored
node_modules folder.
While all uses of WorkingTreeIterator in JGit had been adapted,
DiffFormatter was forgotten. To make it work correctly (again) also
for such cases, make it set up a WorkingTreeeIterator automatically,
and make sure the WorkingTreeSource can find such files, too. Also
pass the repository to the TreeWalks used inside the DiffFormatter
to pick up the correct attributes, filters, and line-ending settings.
Bug: 565081
Change-Id: Ie88ac81166dc396ba28b83313964c1712b6ca199
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Handle non-normalized index also for executable files
Commit 60cf85a4 corrected the handling of check-in for files where
the index version is non-normalized, i.e., contains CR-LF line endings.
However, it did so only for regular files, not executable files.
Bug: 561438
Change-Id: I372cc990c5efeb00315460f36459c0652d5d1e77
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Reorder modifiers to follow Java Language Specification
The Java Language Specification recommends listing modifiers in
the following order:
1. Annotations
2. public
3. protected
4. private
5. abstract
6. static
7. final
8. transient
9. volatile
10. synchronized
11. native
12. strictfp
Not following this convention has no technical impact, but will reduce
the code's readability because most developers are used to the standard
order.
This was detected using SonarLint.
Change-Id: I9cddecb4f4234dae1021b677e915be23d349a380
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Enable UnusedException at ERROR level which causes the build to fail
in many places with:
[UnusedException] This catch block catches an symbol and re-throws
another, but swallows the caught symbol rather than setting it as a
cause. This can make debugging harder.
Fix it by setting the caught exception as cause on the subsequently
thrown exception.
Note: The grammatically incorrect error message is copy-pasted as-is
from the version of ErrorProne currently used in Bazel; it has been
fixed by [1] in the latest version.
[1] https://github.com/google/error-prone/commit/d57a39c
Change-Id: I11ed38243091fc12f64f1b2db404ba3f1d2e98b5
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Replace usage of ArrayIndexOutOfBoundsException in treewalk
Using exceptions during normal operations - for example with the
desire of expanding an array in the failure case - can have a
severe performance impact. When exceptions are instantiated,
a stack trace is collected. Generating stack trace can be expensive.
Compared to that, checking an array for length - even if done many
times - is cheap since this is a check that can run in just a
handful of CPU cycles.
Change-Id: Ifaf10623f6a876c9faecfa44654c9296315adfcb
Signed-off-by: Patrick Hiesel <hiesel@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Enable and fix "Statement unnecessarily nested within else clause" warnings
Since [1] the gerrit project includes jgit as a submodule, and has this
warning enabled, resulting in 100s of warnings in the console.
Also enable the warning here, and fix them.
At the same time, add missing braces around adjacent and nearby one-line
blocks.
[1] https://gerrit-review.googlesource.com/c/gerrit/+/227897
Change-Id: I81df3fc7ed6eedf6874ce1a3bedfa727a1897e4c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
WorkingTreeIterator: handle different timestamp resolutions
Older JGit stored only milliseconds timestamps in the index. Newer
JGit may get finer timestamps from the file system. This leads to
slow index diffs when a new JGit runs against an index produced
by older JGit because many timestamps will differ and JGit will
then do many content checks. See [1].
Handle this migration case by only comparing milliseconds if the
index entry has only millisecond precision.
The inverse may also occur; also compare only milliseconds if the
file timestamp has only millisecond precision.
Do the same also for microsecond resolution. On Windows, NTFS may
provide 100ns resolution and may be used by external programs writing
the index, but Java's WindowsFileAttributes may provide only
microseconds.
File timestamp precision in Java depends not only on the Java APIs
used by different JGit versions but may also change when running the
same Java code on different VMs. And of course the resolution may
vary among operating and file systems. Moreover, timestamp precision
in the index depends on the program that wrote the index. Canonical
git may use a different resolution, maybe even different between git
versions.
[1] https://www.eclipse.org/forums/index.php/t/1100344/
Change-Id: Idfd08606c883cb98787b2138f9baf0cc89a57b56
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Remove an old work-around for core.autocrlf = input
The removed code was trying to avoid mistakenly reporting differences
when core.autocrlf was set to "input" but a file had already been
committed with CR-LF. It did that by running the blob from the cache
through a CRLF-to-LF filter because older JGit would also run the file
from the working tree through such a filter.
The real fix for this case was done in commit 60cf85a. Since then files
are not normalized if they have already been committed with CR-LF and
this old fix attempt from bug 372834 is no longer needed.
Change-Id: Ib4facc153d81325cb48b4ee956a596b423f36241
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Fix WorkingTreeIterator.compareMetadata() for CheckStat.MINIMAL
If CheckStat is MINIMAL or timestamps have no nanosecond part
WorkingTreeIterator.compareMetaData only checks the second part of
timestamps and ignores nanoseconds which may have ended up in the index
by using native git.
If
fileLastModified.getEpochSecond() == cacheLastModified.getEpochSecond()
we currently proceed comparing fileLastModified and cacheLastModified
with full precision which is wrong since we determined that we detected
reduced timestamp resolution.
Fix this and also handle smudged index entries for CheckStat.MINIMAL.
Change-Id: I6149885903ac63d79b42d234cc02aa4e19578f3c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix NarrowingCompoundAssignment warnings from Error Prone
Error Prone reports:
[NarrowingCompoundAssignment] Compound assignments from long to int
hide lossy casts
and
[NarrowingCompoundAssignment] Compound assignments from int to byte
hide lossy casts
See https://errorprone.info/bugpattern/NarrowingCompoundAssignment
Fix the warnings by adding explicit casts or changing types as
necessary.
Now that all occurrences of the warning are fixed, increase its
severity to ERROR.
Change-Id: Idb3670e6047b146ae37daee07212ff9455512623
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Use Instant instead of milliseconds for filesystem timestamp handling
This enables higher file timestamp resolution on filesystems like ext4,
Mac APFS (1ns) or NTFS (100ns) providing high timestamp resolution on
filesystem level.
Note:
- on some OSes Java 8,9 truncate milliseconds, see
https://bugs.openjdk.java.net/browse/JDK-8177809, fixed in Java 10
- UnixFileAttributes truncates timestamp resolution to microseconds when
converting the internal representation to FileTime exposed in the API,
see https://bugs.openjdk.java.net/browse/JDK-8181493
- WindowsFileAttributes also provides only microsecond resolution
Change-Id: I25ffff31a3c6f725fc345d4ddc2f26da3b88f6f2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Handle missing "ours" stage in WorkingTreeIterator.hasCrLfInIndex()
In a delete-modify conflict with the deletion as "ours" there may be
no stage 2 in the index. Add appropriate null checks. Add a new test
for this case, and verify that the file gets added with a single LF
after conflict resolution with core.autocrlf=true. This matches the
behavior of canonical git for this case.
Bug: 547724
Change-Id: I1bafdb83d9b78bf85294c78325e818e72fae53bc
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Replace simple uses of Iterator with a corresponding for-loop.
Also add missing braces on loops as necessary.
Change-Id: I708d82acdf194787e3353699c07244c5ac3de189
Signed-off-by: Carsten Hammer <carsten.hammer@t-online.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
With text=auto or core.autocrlf=true, git does not normalize upon
check-in if the file in the index contains already CR/LFs. The
documentation says: "When text is set to "auto", the path is
marked for automatic end-of-line conversion. If Git decides that
the content is text, its line endings are converted to LF on
checkin. When the file has been committed with CRLF, no conversion
is done."[1]
Implement the last bit as in canonical git: check the blob in the
index for CR/LFs. For very large files, we check only the first 8000
bytes, like RawText.isBinary() and AutoLFInputStream do.
In Auto(CR)LFInputStream, ensure that the buffer is filled as much as
possible for the isBinary() check.
Regarding these content checks, there are a number of inconsistencies:
* Canonical git considers files containing lone CRs as binary.
* RawText checks the first 8000 bytes.
* Auto(CR)LFInputStream checks the first 8096 (not 8192!) bytes.
None of these are changed with this commit. It appears that canonical
git will check the whole blob, not just the first 8k bytes. Also
note: the check for CR/LF text won't work with LFS (neither in JGit
nor in git) since the blob data is not run through the smudge filter.
C.f. [2].
Two tests in AddCommandTest actually tested that normalization was
done even if the file was already committed with CR/LF.These tests
had to be adapted. I find the git documentation unclear about the
case where core.autocrlf=input, but from [3] it looks as if this
non-normalization also applies in this case.
Add new tests in CommitCommandTest testing this for the case where
the index entry is for a merge conflict. In this case, canonical git
uses the "ours" version.[4] Do the same.
[1] https://git-scm.com/docs/gitattributes
[2] https://github.com/git/git/blob/3434569fc/convert.c#L225
[3] https://github.com/git/git/blob/3434569fc/convert.c#L529
[4] https://github.com/git/git/blob/f2b6aa98b/read-cache.c#L3281
Bug: 470643
Change-Id: Ie7310539fbe6c737d78b1dcc29e34735d4616b88
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
After cloning a repo with a submodule, non-recursively, JGit would
encounter in its TreeWalk in IndexDiff:
* first, a missing gitlink (in index & HEAD, not in working tree)
* second, the untracked folder (not in index and head, in working tree)
As a result, it would report the submodule as missing. Canonical git
reports a clean workspace.
The root cause of this is that the path of a gitlink "x" did
not compare equal to the path of a tree "x" in JGit.
Correct Paths.compare() to account for that. If two paths are otherwise
equal, then let gitlinks match both trees and files. Matching trees
solves the bug. Matching files is necessary to handle the case where
the gitlink directory was replaced by a file; see the new test case
IndexDiffSubmoduleTest.testSubmoduleReplacedByFile(). Comparisons of
unequal paths are left untouched, so the sort order is unchanged.
After the fix, another bug(?) in WorkingTreeIterator became apparent:
with core.dirNoGitLinks = true, it was no longer possible to overwrite
a gitlink in the index. This is now fixed in WorkingTreeIterator.
Add new test cases for the bug itself and for some related cases
(submodule directory deleted or replaced by a file) in
IndexDiffSubmoduleTest. Add a test for missing files in IndexDiffTest,
and adapt the PathsTest to test matching gitlinks.
Bug: 467631
Change-Id: I0549d10d46b1858e5eec3cc15095aa9f1d5f5280
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Fix replacement quoting for replaceAll in filter command
According to String.replaceAll JavaDoc:
"Note that backslashes (\) and dollar signs ($) in the replacement
string may cause the results to be different than if it were being
treated as a literal replacement string; see Matcher.replaceAll. Use
java.util.regex.Matcher.quoteReplacement to suppress the special meaning
of these characters, if desired."
Bug: 536318
Change-Id: Ib70cfec41bf73e14d23d94d14aee05a25b1e87f6
Signed-off-by: Markus Duft <markus.duft@ssi-schaefer.com>
Make FileTreeIterator not enter ignored directories by default. We
only need to enter ignored directories if we do some operation against
git, and there is at least one tracked file underneath an ignored
directory.
Walking ignored directories should be avoided as much as possible as
it is a potential performance bottleneck. Some projects have a lot of
files or very deep hierarchies in ignored directories; walking those
may be costly (especially so on Windows). See for instance also bug
500106.
Provide a FileTreeIterator.setWalkIgnoredDirectories() operation to
force the iterator to iterate also through otherwise ignored
directories. Useful for tests (IgnoreNodeTest, CGitIgnoreTest), or
to implement things like "git ls-files --ignored".
Add tests in DirCacheCheckoutTest, and amend IndexDiffTest to test a
little bit more.
Bug: 388582
Change-Id: I6ff584a42c55a07120a4369fd308409431bdb94a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Remove it from
* package private functions.
* try blocks
* for loops
this was done with the following python script:
$ cat f.py
import sys
import re
import os
def replaceFinal(m):
return m.group(1) + "(" + m.group(2).replace('final ', '') + ")"
methodDecl = re.compile(r"^([\t ]*[a-zA-Z_ ]+)\(([^)]*)\)")
def subst(fn):
input = open(fn)
os.rename(fn, fn + "~")
dest = open(fn, 'w')
for l in input:
l = methodDecl.sub(replaceFinal, l)
dest.write(l)
dest.close()
for root, dirs, files in os.walk(".", topdown=False):
for f in files:
if not f.endswith('.java'):
continue
full = os.path.join(root, f)
print full
subst(full)
Change-Id: If533a75a417594fc893e7c669d2c1f0f6caeb7ca
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Significantly speed up FileTreeIterator on Windows
Getting attributes of files on Windows is an expensive operation.
Windows stores file attributes in the directory, so they are
basically available "for free" when a directory is listed. The
implementation of Java's Files.walkFileTree() takes advantage of
that (at least in the OpenJDK implementation for Windows) and
provides the attributes from the directory to a FileVisitor.
Using Files.walkFileTree() with a maximum depth of 1 is thus a
good approach on Windows to get both the file names and the
attributes in one go.
In my tests, this gives a significant speed-up of FileTreeIterator
over the "normal" way: using File.listFiles() and then reading the
attributes of each file individually. The speed-up is hard to
quantify exactly, but in my tests I've observed consistently 30-40%
for staging 500 files one after another, each individually, and up
to 50% for individual TreeWalks with a FileTreeIterator.
On Unix, this technique is detrimental. Unix stores file attributes
differently, and getting attributes of individual files is not costly.
On Unix, the old way of doing a listFiles() and getting individual
attributes (both native operations) is about three times faster than
using walkFileTree, which is implemented in Java.
Therefore, move the operation to FS/FS_Win32 and call it from
FileTreeIterator, so that we can have different implementations
depending on the file system.
A little performance test program is included as a JUnit test (to be
run manually).
While this does speed up things on Windows, it doesn't solve the basic
problem of bug 532300: the iterator always gets the full directory
listing and the attributes of all files, and the more files there are
the longer that takes.
Bug: 532300
Change-Id: Ic5facb871c725256c2324b0d97b95e6efc33282a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Open auto-closeable resources in try-with-resource
When an auto-closeable resources is not opened in try-with-resource,
the warning "should be managed by try-with-resource" is emitted by
Eclipse.
Fix the ones that can be silenced simply by moving the declaration of
the variable into a try-with-resource.
In cases where we explicitly call the close() method, for example in
tests where we are testing specific behavior caused by the close(),
suppress the warning.
Leave the ones that will require more significant refcactoring to fix.
They can be done in separate commits that can be reviewed and tested
in isolation.
Change-Id: I9682cd20fb15167d3c7f9027cecdc82bc50b83c4
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Use TreeWalk#getEolStreamType(OperationType) instead.
Change-Id: I0f102ddf36102ff55a71448e376ed08743da5d1f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
As it is right now some streams leak out of the filter construct. This
change clarifies responsibilities and fixes stream leaks
Change-Id: Ib9717d43a701a06a502434d64214d13a392de5ab
Signed-off-by: Markus Duft <markus.duft@ssi-schaefer.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Processing of negated rules, like !bin/ was not working correctly: they
were interpreted too broad, resulting in unexpected untracked files
which should actually be ignored
Bug: 409664
Change-Id: I0a422fd6607941461bf2175c9105a0311612efa0
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Transfer data in chunks of 8k Transferring data byte per byte is slow,
running checkout with CleanFilter on a 2.9MB file takes 20 seconds.
Using a buffer of 8k shrinks this time to 70ms.
Also register the filter commands in a way that the native GIT LFS can
be used alongside with JGit.
Implements auto-discovery of LFS server URL when cloning from a Gerrit
LFS server.
Change-Id: I452a5aa177dcb346d92af08b27c2e35200f246fd
Also-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Markus Duft <markus.duft@ssi-schaefer.com>
This doesn't handle the really hard thing, which is merging spurious
conflicts inside .gitmodules files. That's OK: git.git doesn't
either. Users can resolve the conflict themselves and then commit
the merge.
Previously, jgit would crash when attempting to merge conflicting
submodule changes. Even if there was no conflict, after a merge which
adds submodules, the repository would have been missing empty
directories for newly-added submodules.
This patch fixes the crash, and adds the empty directories where
necessary. It ensures that the index is in a conflicted state when
submodule changes conflict.
Reported-by: Alexey Korobkov
Bug: 494551
Change-Id: I79db6798c2bdcc1159b5b2589b02da198dc906a1
Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
TreeWalk: Make getEolStreamType(OperationType) public
and deprecate getEolStreamType().
This resolves a TODO that was apparently supposed to be done in
version 4.4.
Change-Id: I5c9861aedabdc3f99dcf47519b3959a979e6a591
Happily, most anonymous SectionParser implementations can be replaced
with FooConfig::new, as long as the constructor takes a single Config
arg. Many of these, the non-public ones, can in turn be inlined. A few
remaining SectionParsers can be lambdas.
Change-Id: I3f563e752dfd2007dd3a48d6d313d20e2685943a