An attempt to re-implement not well documented Git CLI behavior for
patterns with backslashes.
It looks like Git silently ignores all \ characters in ignore rules, if
they are NOT covered by 3 cases described in [1]:
{quote}
1) ... Put a backslash ("\") in front of the first hash for patterns
that begin with a hash.
...
2) Trailing spaces are ignored unless they are quoted with backslash
("\").
...
3) Put a backslash ("\") in front of the first "!" for patterns that
begin with a literal "!", for example, "\!important!.txt".
{quote}
Undocumented also is the fact that backslash itself can be escaped by
backslash.
So \h\e\l\l\o\.t\x\t rule matches hello.txt and a\\\\b a\b in Git CLI.
Additionally, the glob parser [2] knows special meaning of backslash:
{quote}
One can remove the special meaning of '?', '*' and '[' by preceding
them by a backslash, or, in case this is part of a shell command
line, enclosing them in quotes. Between brackets these characters
stand for themselves. Thus, "[[?*\]" matches the four characters
'[', '?', '*' and '\'.
{quote}
[1] https://www.kernel.org/pub/software/scm/git/docs/gitignore.html
[2] http://man7.org/linux/man-pages/man7/glob.7.html
Bug: 478065
Change-Id: I3dc973475d1943c5622103701fa8cb3ea0684e3e
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Don't use java.util.regex for two simple wildcard cases
To improve ignore parser performance we can avoid using java.util.regex
code on simple wildcard patterns with leading or trailing asterisk. As
those patterns represent a majority of ignore rules, the index diff
performance can be drastically increased on huge repository with lot of
ignore rules.
Bug: 450466
Change-Id: I80428441cc8d5de5468813f841d89322413eed8b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The current IgnoreRule/FileNameMatcher implementation scales not well
with huge repositories - it is both slow and memory expensive while
parsing glob expressions (bug 440732). Addtitionally, the "double star"
pattern (/**/) is not understood by the old parser (bug 416348).
The proposed implementation is a complete clean room rewrite of the
gitignore parser, aiming to add missing double star pattern support and
improve the performance and memory consumption.
The glob expressions from .gitignore rules are converted to Java regular
expressions (java.util.regex.Pattern). java.util.regex.Pattern code can
evaluate expression from gitignore rules considerable faster (and with
less memory consumption) as the old FileNameMatcher implementation.
CQ: 8828
Bug: 416348
Bug: 440732
Change-Id: Ibefb930381f2f16eddb9947e592752f8ae2b76e1
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Most diff implementations really want to use cached hash codes for
elements, rather than element equality, as they need to perform many
compares and unique hash codes for elements can really speed that
process up.
To make it easier to define element hash functions, move the caching
of hash codes into a wrapper sequence type, so that individual
sequence types like RawText don't need to do this themselves. This
has a nice property of also allowing the sequence to no longer care
about the specific SequenceComparator that is going to be used, and
permits the caching to only examine the middle region that isn't
common to the two inputs.
Change-Id: If8623556da9419117b07c5073e8bce39de02570e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Use core.streamFileThreshold to set our streaming limit
We default this to 1 MiB for now, but we allow users to modify
it through the Repository's configuration file to be a different
value. A new repository listener is used to identify when the
setting has been updated and trigger a reconfiguration of any
active ObjectReaders.
To prevent a horrible explosion we cap core.streamFileThreshold
at no more than 1/4 of the maximum JVM heap size. We do this
because we need at least 2 byte arrays equal in size to the
stream threshold for the worst case delta inflation scenario,
and our host application probably also needs some amount of the
heap for their working set size.
Change-Id: I103b3a541dc970bbf1a6d92917a12c5a1ee34d6c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Replace the old crude event listener system with a much more generic
implementation, patterned after the event dispatch techniques used
in Google Web Toolkit 1.5 and later.
Each event delivers to an interface that defines a single method,
and the event itself is what performs the delivery in a type-safe
way through its own dispatch method.
Listeners are registered in a generic listener list, indexed by
the interface they implement and wish to receive an event for.
Delivery of events is performed by looping through all listeners
implementing the event's corresponding listener interface, and using
the event's own dispatch method to deliver the event. This is the
classical "double dispatch" pattern for event delivery.
Listeners can be unregistered by invoking remove() on their
registration handle. This change therefore requires application
code to track the handle if it wishes to remove the listener at a
later point in time.
Event delivery is now exposed as a generic public method on the
Repository class, making it easier for any type of message to
be sent out to any type of listener that has registered, without
needing to pre-arrange for type-safe fireFoo() methods.
New event types can be added in the future simply by defining a
new RepositoryEvent subclass and a corresponding RepositoryListener
interface that it dispatches to. By always adding new events through
a new interface, we never need to worry about defining an Adapter
to provide default no-op implementations of new event methods.
Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Per CQ 3448 this is the initial contribution of the JGit project
to eclipse.org. It is derived from the historical JGit repository
at commit 3a2dd9921c.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>