Fix computation of id in WorkingTreeIterator with autocrlf and smudging
JGit failed to do checkouts when the index contained smudged entries and
autocrlf was on. In such cases the WorkingTreeIterator calculated the
SHA1 sometimes on content which was not correctly filtered. The SHA1 was
computed on content which two times went through a lf->crlf conversion.
We used to tell the treewalk whether it is a checkin or checkout
operation and always use the related filters when reading any content.
If on windows and autocrlf is true and we do a checkout operation then
we always used a lf->crlf conversion on any text content. That's not
correct. Even during a checkout we sometimes need the crlf->lf
conversion. E.g. when calculating the content-id for working-tree
content we need to use crlf->lf filtering although the overall operation
type is checkout.
Often this bug does not have effects because we seldom compute the
content-id of filesystem content during a checkout. But we do need to
know whether a file is dirty or not before we overwrite it during a
checkout. And if the index entries are smudged we don't trust the index
and compute filesystem-content-sha1's explicitly.
This caused EGit not to be able to switch branches anymore on Windows
when autocrlf was true. EGit denied the checkout because it thought
workingtree files are dirty because content-sha1 are computed on wrongly
filtered content.
Bug: 493360
Change-Id: I1072a57b4c529ba3aaa50b7b02d2b816bb64a9b8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Implement the DIR_NO_GITLINKS setting with the same functionality
it provides in cGit.
Bug: 436200
Change-Id: I8304e42df2d7e8d7925f515805e075a92ff6ce28
Signed-off-by: Preben Ingvaldsen <preben@puppetlabs.com>
TreeWalk provides the new method getEolStreamType. This new method can
be used with EolStreamTypeUtil in order to create a wrapped InputStream
or OutputStream when reading / writing files. The implementation
implements support for the git configuration options core.crlf, core.eol
and the .gitattributes "text", "eol" and "binary"
CQ: 10896
Bug: 486563
Change-Id: Ie4f6367afc2a6aec1de56faf95120fff0339a358
Signed-off-by: Ivan Motsch <ivan.motsch@bsiag.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Paths.pathCompare: Utility to sort paths from byte[]
Consolidate copies of this function into one location.
Add some unit tests to prevent bugs that were accidentally
introduced while trying to make this refactoring.
Change-Id: I82f64bbb8601ca2d8316ca57ae8119df32bb5c08
When filters are defined for certain paths in gitattributes make
sure that clean filters are processed when adding new content to the
object database.
Change-Id: Iffd72914cec5b434ba4d0de232e285b7492db868
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add an attribute accessor to CanonicalTreeParser and use it in Treewalk
When checking out a branch we need to access the attributes stored
in the tree to be checked out. E.g. directly after a clone we checkout
the remote HEAD. In this case index and workingtree are still empty.
So we have to search the tree to be checked out for attributes.
Change-Id: I6d96f5d095ed2e3c259d4b12124e404f5215bd9f
Adds the git attributes computation on the treewalk
Adds the getAttributes feature to the tree walk. The computation of
attributes needs to be done by the TreeWalk since it needs both a
WorkingTreeIterator and a DirCacheIterator.
Bug: 342372
CQ: 9120
Change-Id: I5e33257fd8c9895869a128bad3fd1e720409d361
Signed-off-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Consider original file mode while checking parent ignore rules
The WorkingTreeIterator.isEntryIgnored() should use originally requested
file mode while descending to the file tree root and checking ignore
rules. Original code asking isEntryIgnored() on a file was using
directory mode instead if the .gitignore was not located in the same
directory.
Bug: 473506
Change-Id: I9f16ba714c3ea9e6585e9c11623270dbdf4fb1df
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Core classes to parse and process .gitattributes files including
support for reading attributes in WorkingTreeIterator and the
dirCacheIterator.
The implementation follows the git ignore implementation. It supports
lazy reading attributes while walking the working tree.
Bug: 342372
CQ: 9078
Change-Id: I05f3ce1861fbf9896b1bcb7816ba78af35f3ad3d
Also-by: Marc Strapetz <marc.strapetz@syntevo.com>
Also-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Also-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
The change tries to make jgit behave more like native CLI git regarding
the negation rules. According to [1] "... prefix "!" which negates the
pattern; any matching file excluded by a previous pattern will become
included again." Negating the pattern should not automatically make the
file *not ignored* - other pattern rules have to be considered too.
The fix adds test cases for both bugs 448094 and 407475.
[1] https://www.kernel.org/pub/software/scm/git/docs/gitignore.html
Bug: 448094
Bug: 407475
Change-Id: I322954200dd3c683e3d8f4adc48506eb99e56ae1
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The current IgnoreRule/FileNameMatcher implementation scales not well
with huge repositories - it is both slow and memory expensive while
parsing glob expressions (bug 440732). Addtitionally, the "double star"
pattern (/**/) is not understood by the old parser (bug 416348).
The proposed implementation is a complete clean room rewrite of the
gitignore parser, aiming to add missing double star pattern support and
improve the performance and memory consumption.
The glob expressions from .gitignore rules are converted to Java regular
expressions (java.util.regex.Pattern). java.util.regex.Pattern code can
evaluate expression from gitignore rules considerable faster (and with
less memory consumption) as the old FileNameMatcher implementation.
CQ: 8828
Bug: 416348
Bug: 440732
Change-Id: Ibefb930381f2f16eddb9947e592752f8ae2b76e1
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Adds further tests where the working tree is dirty (differs from
index) and where we have staged but uncommitted changes.
Fixed the test case 9 for file/directory conflicts.
Bug: 428819
Change-Id: Ie44a288b052abe936ebb74272d0fefef3b218a7a
Signed-off-by: Axel Richard <axel.richard@obeo.fr>
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Add missing @Deprecated to deprecated fields and methods
Java spec requires the @Deprecated annotation on any deprecated
field or method. Add the missing annotation to fields and methods
already declared deprecated in the javadoc.
Change-Id: Ic0ef24b43cfd99ac947e771ef5a28e493c304274
While fixing an NPE, I introduced another one in a deprecated isModified
method. It cannot avoid NPE's entirely, which is the reason the method
is deprecated
Change-Id: I5147c1c94621586dd84bd11e6090a954523b6c1c
Remove obsolete getRepositoryMethod from WorkingTreeIterator
The method was added for symlink support, but isn't needed anymore.
Since it was added for this release it's like it never existed.
Change-Id: I422cd1dcdfa40b25ba3d6e08b112159dae9a4353
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix NPE when WorkingTreeIterator does not have a repository set
It's strange that we have that member since it is not so clear
when it it set or not.
Change-Id: I53903a264f46866d249901a3cd9f9295028aa6bd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Java normalizes paths to NFC, but other source may not, e.g Eclipse.
Bug: 413390
Change-Id: I08649ac58c9b3cb8bf12794703e4137b1b4e94d5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add special case to WorkingTreeIterator for matching unnormalized symlinks
If there is an unnormalized symbolic link in the index, lie that it
matches a normalized link in the working tree. This does not make the
case completely invisible everywhere though, but it helps to some
degree.
Change-Id: I599fb71648c41fa2310049d0e0040b3c9f09386b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The change includes comparing symbolic links between disk and index,
adding symbolic links to the index, creating/modifying links on
checkout. The behavior is controlled by the core.symlinks setting, just
as C Git does. When a new repository is created core.symlinks will be
set depending on the capabilities of the operating system and Java
runtime.
If core.symlinks is set to true, the assumption is that symlinks are
supported, which may result in runtime errors if this turns out not to
be the case.
Measuring the cost of jgit status on a repository with ~70000 files,
of which ~30000 are tracked reveals a penalty of about 10% for using
the Java7 (really NIO2) support module.
Bug: 354367
Change-Id: I12f0fdd9d26212324a586896ef7eb1f6ff89c39c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We should really pass the forceContentCheck parameter to
the real method.
Change-Id: I9ea439cf6340a18d0e931edde3b9e3486cafde93
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix for core.autocrlf=input resulting in modified file
This version does not attempt to unsmudge, unlike the first attempt
in Idafad150553df14827eccfde2e3b95760e16a8b6.
Bug: 372834
Change-Id: I9300e735cb16d6208e1df963abb1ff69f688155d
Also-by: Robin Stocker <robin@nibor.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Robin Stocker <robin@nibor.org>
Revert "Fix for core.autocrlf=input resulting in modified file..."
This reverts commit 1def0a1257.
We found this fix uncovers problems with unsmudged DirCacheEntry's. This
surfaced because egit's ui test CreatePatchActionTest failed since jgit
computes a wrong status. JGit doesn't detect modified content in now
unsmudged entries. Hence revert this change until these problems are
fixed.
Change-Id: Ia04277ce316d35fc5b0d82c93d2078b856af24bb
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
There is a huge performance issue when using both JGit (EGit) and Git
because JGit does not fill all dircache stat fields with the values Git
would expect. As a result thereof Git would typically revalidate a large
number of tracked files. This can take several minutes for large
repositories with many large files.
Since 1.8.2 Git will restrict stat checking to the size and whole second
part of the modification time stamp, if core.statinfo is set to
"minimal".
As JGit checks only size and modification time this is close to what
JGit already does. To make the match perfect ignore the sub-second part
of the modification time stamp if core.statinfo = minimal.
Change-Id: I8eaff1858a891571075a86db043f9d80da3d7503
Consider that some Java version on Linux only return integral timestamps
This logic is similar to what we do on Windows, but in this case it's
Java that truncates the timestamps, not Git.
Bug: 395410
Change-Id: Ie55dcb9fa583f5c3dd10d7a1b582e5b04b45858d
A few classes such as Constanrs are marked with @SuppressWarnings, as are
toString() methods with many liternal, but otherwise $NLS-n$ is used for
string containing text that should not be translated. A few literals may
fall into the gray zone, but mostly I've tried to only tag the obvious
ones.
Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590
Fix for Iff768422c, use offset 0 when going back to work tree iterator
In Iff768422c the offset used for the content id was fixed to use the
offset that applied to the dircache iterator. Unfortunately the index
for the dircache content id offset stuck for entries that were not in
the index. Few caller probably cared about that, unless it actually
caused an ArrayIndexOutOfBoundsException.
Change-Id: Ic9f0e77c8ea3a0770d88565e94392e76853e3006
Add isModeDifferent method to WorkingTreeIterator
that compares mode with consideration of the
core.filemode setting in the config.
Bug: 379004
Change-Id: I07335300d787a69c3d1608242238991d5b5214ac
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Remove throws IOException declaration on filterClean
This method only creates an EolCanonicalizingInputStream
which does not throw an IOException and so the throws
declaration on the method is unneeded.
Change-Id: Icae8b80006c5e3ffcf3b69790a1a45c505be0f05
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Allow adding files with size over 2 GB. The drawback is that the tests
for huge file support adds roughly 10 minutes of execution time.
For that reason we @Ignore the test in the standard test execution.
Change-Id: I5788e8009899203b346f353297166825b3744575
Content length is computed and cached (short term) in the working
tree iterator when core.autocrlf is set.
Hopefully this is a cleaner fix than my previous attempt to make
autocrlf work.
Change-Id: I1b6bbb643101a00db94e5514b5e2b069f338907a
EolCanonicalizingInputStream: binary detection should be optional
EolCanonicalizingInputStream may also be used in combination with
.gitattributes. If .gitattributes states that a file is of type text, line
endings have to be canonicalized even if the actual file content seems
to be binary.
Change-Id: Ie4ccdfc5cb91fbd55e06f51146cf5c7c84b8e18b
Support gitdir references in working tree .git file
A '.git' file in a repository's working tree root is now parsed
as a ref to a folder located elsewhere. This supports submodules
having their repository location outside of the parent repository's
working directory such as in the parent repository's '.git/modules'
directory.
This adds support to BaseRepositoryBuilder for repositories created
with the '--separate-git-dir' option specified to 'git init'.
Change-Id: I73c538f6d845bdbc0c4e2bce5a77f900cf36e1a9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This patch introduces CRLF handling to the DirCacheCheckout and
WorkingTreeIterator supporting the AutoCRLF for add, checkout
reset and status and hopefully some other places that depende
on the underlying logic of the affected API's.
The patch includes test cases for the Status command provided by
Tomasz Zarna for bug 353867.
The core.eol and core.safecrlf options are not yet supported.
Bug: 301775
Bug: 353867
Change-Id: I2280a2dc0698829475de6a662a6c6e80b1df7663
Retain executable mode of existing files on Windows
Currently files in a repository marked as executable will have
that mode unset when modified and committed on systems that
do not support detection of this mode since the working tree
iterator will never report this mode for any entries.
This change updates WorkingTreeIterator to be able
to determine the target file mode to be used for the index
through consideration of the configured WorkingTreeOptions.
Bug: 364956
Change-Id: Iae496baa011b8a59d9329ec73615482b03d34a5a
Open a repository for submodule entries that have a child .git
directory and use the resolved HEAD commit as the entry's id.
Change-Id: I68d6e127f018b24ee865865a2dd3011a0e21453c
Signed-off-by: Kevin Sawicki <kevin@github.com>
Respect core.excludesfile to enable global ignore rules
Also use FS.resolve() to properly resolve files from path strings.
Bug: 328428 (partial fix)
Change-Id: I41d94694f220dcb85605c9acadfffb1fa23beaeb
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Introduce metaData compare between working tree and index entries
Instead of offering only a high-level isModified() method a new
method compareMetadata() is introduced which compares a working tree entry
and a index entry by looking at metadata only. Some use-cases
(e.g. computing the content-id in idBuffer()) may use this new method
instead of isModified().
Change-Id: I4de7501d159889fbac5ae6951f4fef8340461b47
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Do not rely on filemode differences in case of symbolic links
When checking whether a file in the working tree has been modified -
WorkingTreeIterator.isModified() - we should not trust the filemode
in case of symbolic links, but check the timestamp and also the
content, if requested. Without this fix symlinks will always be shown
in EGit as modified files on Windows systems.
Change-Id: I367c807df5a7e85e828ddacff7fee7901441f187
Signed-off-by: Philipp Thun <philipp.thun@sap.com>