source.dussan.org Git - jgit.git/log

Delta compression: reuse DeltaTask.getAdjustedWeight()

Change-Id: I07ed5207b175735b4e2c46edf652cc35908dad02
Signed-off-by: Terry Parker <tparker@google.com>

[performance] Speed up delta packing

When packing is able to reuse lots of deltas from existing packs, those
objects are marked as "doNotAttemptDelta" and do not contribute to
DeltaTask's computeTopPaths() "totalWeight" calculation.

In the extreme case when all packs are reusable, "totalWeight" will be
zero. DeltaTask.partitionTasks() uses "totalWeight" to determine a
"weightPerThread" size it uses to set up DeltaTasks. When "totalWeight"
is small, partitionTasks() ends up creating a DeltaTask for every
unique path.

For a large repository, the small "weightPerThread" can result in the
creation of >100k tasks (for the MSM 3.10 Linux repository, the count
was ~150k). This makes the "task stealing" mechanism in DeltaTask very
inefficient, because every attempt to steal work does a linear walk
through all tasks, searching for the one with the most work remaining,
which is O(N^2) comparisons. For the MSM 3.10 repository when all
deltas were reusable, PackWriter.parallelDeltaSearch() took
(1615+1633+1458)/3 = 1568 seconds.

The error is that DeltaTask treats the weights of objects marked as
"doNotAttemptDelta" inconsistently. It ignores the weights when
calculating "totalWeight" but uses them when partitioning the tasks.
The fix is to also ignore them when partitioning the tasks.

With this patch applied, PackWriter.parallelDeltaSearch() on the
MSM 3.10 repository when all deltas are reused went from taking
1568 seconds to 62ms (>25k speedup).

This patch also fixes a totalWeight initialization error in
DeltaTask.computeTopPaths().

Change-Id: I2ae37efa83bca42b0e716266ae6aa9d182e76d9c
Signed-off-by: Terry Parker <tparker@google.com>

[style] Add braces to DeltaTask

Change-Id: I3306cc77544dec6d9273b2c35b8db0ed43799992
Signed-off-by: Terry Parker <tparker@google.com>

Building bitmaps: Simplify the logic to sort by chains

Change-Id: I3da98e85107154c159093c138893f54dfa7ef435
Signed-off-by: Terry Parker <tparker@google.com>

Merge "Pack bitmaps: improve readability"

Merge "Bitmap generation: Add a test of ordering commits by "chains""

Merge "reset command should support the -- <paths> parameters"

reset command should support the -- <paths> parameters

Bug: 480750
Change-Id: Ia85b1aead03dcf2fcb50ce0391b656f7c60a08d4
Signed-off-by: Kaloyan Raev <kaloyan.r@zend.com>

JGit CLI should check if calling itself when invoking native git exe

Bug: 480782
Change-Id: I0d7f7a648671e7ff678f2b19fe39b85f8835c061
Signed-off-by: Kaloyan Raev <kaloyan.r@zend.com>

Bitmap generation: Add a test of ordering commits by "chains"

When commits are selected for bitmap generation, they are reordered
so that related "chains" of commits are grouped together. Chains are
"subbranches" of commits that may branch off of and re-merge with the
main line. Grouping by chains means that the XOR difference between
consecutive selected commits will be smaller, resulting in better
run-length compression of the XORed bitmaps.

Add a new testSelectionOrderingWithChains() test in a new
GcCommitSelectionTest test class. Also move related GC commit selection
tests out of GcBasicPackingTest and into GcCommitSelectionTest.

Change-Id: I8e80cac29c4ca8193b41c9898e5436c22a659f11
Signed-off-by: Terry Parker <tparker@google.com>

Pack bitmaps: improve readability

Add comments and rename variables in PackWriterBitmapPreparer to improve
readability.

Change-Id: I49e7a1c35307298f7bc32ebfc46ab08e94290125
Signed-off-by: Terry Parker <tparker@google.com>

[performance] Remove synthetic access$ methods in dfs, diff and merge

Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.

While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.

By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.

To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.

NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.

Change-Id: I94fb481b68c84841c1db1a5ebe678b13e13c962b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

[performance] Remove synthetic access$ methods in lib, util and dircache

Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.

While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.

By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.

To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.

NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.

Change-Id: Ie7b65f251ec4452d5a5ed48aa0f272cf49a9aecd
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

[performance] Remove synthetic access$ methods in transport package

Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.

While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.

By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.

To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.

NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.

Change-Id: I0ebaeb2bc454cd8051b901addb102c1a6688688b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

[performance] Remove synthetic access$ methods in pack and file packages

Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.

While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.

By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.

To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.

NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.

Change-Id: If53ec94145bae47b74e2561305afe6098012715c
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Add missing dependency to org.slf4j for org.eclipse.jgit.test bundle

This OSGi dependency was missed in 6c424d32.

Change-Id: I5bad895896194ad468c260d6c0810132a358f152
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Merge "WalkEncryptionTest: get rid of Log4J dependency"

Restore TestRepository.getClock(), it is used by Gerrit/Gitiles

Change-Id: I88880f18998e33377c0c684cbd8007b5d27d76e7
Signed-off-by: Terry Parker <tparker@google.com>

Add best-effort variant of File.getCanonicalFile()

See https://git.eclipse.org/r/#/c/58405/.

Change-Id: I097c4b1369754f59137eb8848a680c61b06510ad
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>

Expose bitmap selection parameters via PackConfig

Expose the following bitmap selection parameters via PackConfig:
"bitmapContiguousCommitCount", "bitmapRecentCommitCount",
"bitmapRecentCommitSpan", "bitmapDistantCommitSpan",
"bitmapExcessiveBranchCount", and "bitmapInactiveBranchAge".

The value of bitmapContiguousCommitCount, whereby bitmaps are
created for the most recent N commits in a branch, has never
been verified. If experiments show that they are not valuable,
then we can simplify the implementation so that there is only
a concept of recent and distant commit history (defined by
"bitmapRecentCommitCount"), and the only controls we need are
"bitmapRecentCommitSpan" and "bitmapDistantCommitSpan".

Change-Id: I288bf3f97d6fbfdfcd5dde2699eff433a7307fb9
Signed-off-by: Terry Parker <tparker@google.com>

Update bitmap selection throttling to fully span active branches.

Replace the “bitmapCommitRange” parameter that was recently introduced
with two new parameters: “bitmapExcessiveBranchCount” and
“bitmapInactiveBranchAgeInDays”. If the count of branches does not
exceed “bitmapExcessiveBranchCount”, then the current algorithm is kept
for all branches.

If the branch count is excessive, then the commit time for the tip
commit for each branch is used to determine if a branch is “inactive”.
"Active" branches get full commit selection using the existing
algorithm. "Inactive" branches get fewer bitmaps near the branch tips.

Introduce a "contiguousCommitCount" parameter that always enforces that
the N most recent commits in a branch are selected for bitmaps. The
previous nextSelectionDistance() algorithm created anywhere from 1-100
contiguous bitmaps at branch tips.

For example, consider a branch with commits numbering 0-300, with 0
being the most recent commit. If the most recent 200 commits are not
merge commits and the 200th commit was the last one selected,
nextSelectionDistance() returned 100, causing commits 200-101 to be
ignored. Then a window of size 100 was evaluated, searching for merge
commits. Since no merge commits are found, the next commit (commit 0)
was selected, for a total of 1 commit in the topmost 100 commits.

If instead the 250th commit was selected, then by the same logic
commit 50 is selected. At that point nextSelectionDistance() switches to
selecting consecutive commits, so commits 0-50 in the topmost 100
commits are selected. The "contiguousCommitCount" parameter provides
more determinism by always selecting a constant number or topmost
commits.

Add an optimization to break out of the inner loop of selectCommits() if
all of the commits for the current branch have already been found.

When reusing bitmaps from an existing pack, remove unnecessary
populating and clearing of the writeBitmaps/PackBitmapIndexBuilder.

Add comments to PackWriterBitmapPreparer, rename methods and variables
for readability.

Add tests for bitmap selection with and without merge commits and with
excessive branch pruning triggered.

Note: I will follow up with an additional change that exposes the new
parameters through PackConfig.

Change-Id: I5ccbb96c8849f331c302d9f7840e05f9650c4608
Signed-off-by: Terry Parker <tparker@google.com>

WalkEncryptionTest: get rid of Log4J dependency

Instead use SLF4J as all other code does

Change-Id: I0fee9c0aaa1d98f918f25aa11c2160d86399467e
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>

Merge "Push control of time into MockSystemReader"

Push control of time into MockSystemReader

LocalDiskRepositoryTestCase and TestRepository have competing ideas
about time. Push them into MockSystemReader so they can
cooperate.

Rename getClock() methods that return Dates to getDate().

Change-Id: Ibbd9fe7f85d0064b0a19e3b675b9718a9e67c479
Signed-off-by: Terry Parker <tparker@google.com>

Silence Maven complaining about unset versions of reporting plugins

Since we use the reporting plugins only in the parent pom.xml there's no
point in using the new pluginManagement tag in the reporting section
which was introduced to fix
https://issues.apache.org/jira/browse/MSITE-443

Change-Id: I750ca3765e95afb06609a362fb3354afc3b66b90
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Adding JGitV1 and JGitV2 Walk Encryption

Building on top of https://git.eclipse.org/r/#/c/56391/

Here we preserve compatibility with JetS3t
and add 2 new native JGit encryption implementations.

For reference, see connection configuration files:
* Version 0: jgit-s3-connection-v-0.properties
* Version 1: jgit-s3-connection-v-1.properties
* Version 2: jgit-s3-connection-v-2.properties

Change-Id: I713290bcacbe92d88e5ef28ce137de73dd1abe2f
Signed-off-by: Andrei Pozolotin <andrei.pozolotin@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Adding AES Walk Encryption support in http://www.jets3t.org/ mode

See previous attempt: https://git.eclipse.org/r/#/c/16674/

Here we preserve as much of JetS3t mode as possible
while allowing to use new Java 8+ PBE algorithms
such as PBEWithHmacSHA512AndAES_256

Summary of changes:
* change pom.xml to control long tests
* add WalkEncryptionTest.launch to run long tests
* add AmazonS3.Keys to to normalize use of constants
* change WalkEncryption to support AES in JetS3t mode
* add WalkEncryptionTest to test remote encryption pipeline
* add support for CI configuration for live Amazon S3 testing
* add log4j based logging for tests in both Eclipse and Maven build

To test locally, check out the review branch, then:
* create amazon test configuration file
* located your home dir: ${user.home}
* named jgit-s3-config.properties
* file format follows AmazonS3 connection settings file:
accesskey = your-amazon-access-key
secretkey = your-amazon-secret-key
test.bucket = your-bucket-for-testing
* finally:
* run in Eclipse: WalkEncryptionTest.launch
* or
* run in Shell: mvn test --define test=WalkEncryptionTest

Change-Id: I6f455fd9fb4eac261ca73d0bec6a4e7dae9f2e91
Signed-off-by: Andrei Pozolotin <andrei.pozolotin@gmail.com>

Test stability: add fsTick() to avoid random testPruneNone() failures

At least on Windows the test failed each second time on the last assert.
Adding a small timeout before gc.prune() makes the test stable again.

Change-Id: I23d98dd565912c58dcf2f24f3ebc24824670cff3
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Fixed jgit test failures on Windows

RepoCommandTest was failing because of open file handle left.
IgnoreNodeTest was failing because of problems with creation of files
with trailing spaces on Windows.
HookTest was failing because of wrong line delimiter.

Change-Id: I34f074ac447eb4c3ada8b250309bb568b426189d
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Merge "Don't call reader.close() 2 times on dispose()"

Merge "Limit the range of commits for which bitmaps are created."

Don't call reader.close() 2 times on dispose()

Bug: 479406
Change-Id: I6645a8f36ea349a5f04fd14d2c1ef2ecac2bcc37
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Delete non empty directories before checkout a path

If the checkout path is currently a non-empty directory (and was a link
or a regular file before), this directory will be removed before
performing checkout, but only if the checkout path is specified.

Bug: 474973
Change-Id: Ifc6c61592d9b54d26c66367163acdebea369145c
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Limit the range of commits for which bitmaps are created.

A bitmap index contains bitmaps for a set of commits in a pack file.
Creating a bitmap for every commit is too expensive, so heuristics
select the most "important" commits. The most recent commits are the
most valuable. To clone a repository only those for the branch tips are
needed. When fetching, only commits since the last fetch are needed.

The commit selection heuristics generally work, but for some
repositories the number of selected commits is prohibitively high. One
example is the MSM 3.10 Linux kernel. With over 1 million commits on
2820 branches, the current heuristics resulted in +36k selected commits.
Each uncompressed bitmap for that repository is ~413k, making it
difficult to complete a GC operation in available memory.

The benefit of creating bitmaps over the entire history of a repository
like the MSM 3.10 Linux kernel isn't clear. For that repository, most
history for the last year appears to be in the last 100k commits.
Limiting bitmap commit selection to just those commits reduces the count
of selected commits from ~36k to ~10.5k. Dropping bitmaps for older
commits does not affect object counting times for clones or for fetches
on clients that are reasonably up-to-date.

This patch defines a new "bitmapCommitRange" PackConfig parameter to
limit the commit selection process when building bitmaps. The range
starts with the most recent commit and walks backwards. A range of 10k
considers only the 10000 most recent commits. A range of zero creates
bitmaps only for branch tips. A range of -1 (the default) does not limit
the range--all commits in the pack are used in the commit selection
process.

Change-Id: Ied92c70cfa0778facc670e0f14a0980bed5e3bfb
Signed-off-by: Terry Parker <tparker@google.com>

Add utility method allowing to check for empty folders in workdir

Previously the method DirCacheCheckoutTest#assertWorkDir() silently
skipped over empty folders. If tests would have left unexpected empty
folders in the worktree this would be overlooked. Now empty folders have
to be specified by something like mkmap("<foldername>", "/", ...]

Change-Id: Idb8b270e92daf02ecdc381d148a5958bd83ec057
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Align docstring for RepoCommand.setRecordRemoteBranch with argument

Change-Id: Ia3aa1130795d162e482b4088f190956d70857244
Signed-off-by: Stefan Beller <sbeller@google.com>

Merge "RepoCommand: Add setRecordRemoteBranch option to record upstream branch"

RepoCommand: Add setRecordRemoteBranch option to record upstream branch

On a server also running Gerrit that is using RepoCommand to
convert from an XML manifest to a git submodule superproject
periodically, it would be handy to be able to use Gerrit's
submodule subscription feature[1] to update the superproject
automatically between RepoCommand runs as changes are merged
in each subprojects.

This requires setting the 'branch' field for each submodule
so that Gerrit knows what branch to watch. Add an option to
do that.

Setting the branch field also is useful for plain Git users,
since it allows them to use "git submodule update --remote" to
manually update all submodules between RepoCommand runs.

[1] https://gerrit-review.googlesource.com/Documentation/user-submodules.html

Change-Id: I1a10861bcd0df3b3673fc2d481c8129b2bdac5f9
Signed-off-by: Stefan Beller <sbeller@google.com>

Compare API against 4.1.0.201509280440-r

Change-Id: Ia04d09ca337c357004567a54a9fc01ea1e55e508
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Merge "[ignore rules] fix for handling unmatched '[' brackets"

Merge branch 'stable-4.1'

* stable-4.1:
  pgm: Open RevWalk and TreeWalk in try-with-resource
  ant: Open Repository and Git in try-with-resource
  pgm: Create instances of Git in try-with-resource
  FanoutBucket: Create ObjectInserter.Formatter in try-with-resource
  Fix compiler warnings in DiffFormatter.writeGitLinkText

Change-Id: I448ecc9a1334977d9f304dd61ea20c7a8e692b10
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

pgm: Open RevWalk and TreeWalk in try-with-resource

To prevent potential resource leaks.

Change-Id: I2039af04d9fb75405f8e13abf508623b7d4ef324
Signed-off-by: David Pursehouse <david.pursehouse@sonymobile.com>

ant: Open Repository and Git in try-with-resource

To prevent potential resource leak.

Change-Id: I3f4af9037c9d26ec575b529ab66066365ab918a5
Signed-off-by: David Pursehouse <david.pursehouse@sonymobile.com>

pgm: Create instances of Git in try-with-resource

To prevent potential resource leak.

Change-Id: I8ac4ae61193324849bafb46501a55f93c5029a4e
Signed-off-by: David Pursehouse <david.pursehouse@sonymobile.com>

FanoutBucket: Create ObjectInserter.Formatter in try-with-resource

To prevent potential resource leak.

Change-Id: Ife09be2822bc476199f10da8d1eb7ccc8da05b79
Signed-off-by: David Pursehouse <david.pursehouse@sonymobile.com>

Use japicmp instead of clirr to detect API changes

Clirr doesn't support Java 8 hence use japicmp instead.
See https://github.com/siom79/japicmp

Change-Id: If4b30a6d6aa849b4d6b3b0c900558c609822840c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

[ignore rules] fix for handling unmatched '[' brackets

This patch makes JGit parsing of ignore rules containing unmatched '['
bracket compatible to the Git CLI.

Since '[' starts character group, Git tries to parse the ignore rule as
a shell glob pattern and if the character group is not closed, the glob
pattern is invalid and so the ignore rule never matches anything.

See also http://article.gmane.org/gmane.comp.version-control.git/278699.

Bug: 478490
Change-Id: I734a4d14fcdd721070e3f75d57e33c2c0700d503
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Add test case comparing CGit vs JGit ignore behavior for random patterns

This test case was developed in the scope of bug 478065.

Bug: 478065
Change-Id: Ibcce1ed375d4a6ba05461e6c6b287d16752fa681
Signed-off-by: Sébastien Arod <sebastien.arod@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Fix compiler warnings in DiffFormatter.writeGitLinkText

- Remove declaration of IOException that is no longer thrown

- Add missing //$NON-NLS-1$ to prevent "Non-externalized string literal"
warning.

These warnings seem to have been introduced by If13f7b406.

Change-Id: I30058eed31b92067a6ab22e787732b08e29f8d63
Signed-off-by: David Pursehouse <david.pursehouse@sonymobile.com>

Prepare 4.2.0-SNAPSHOT builds

Change-Id: If559d3565b1f84c93a533e1ce18d5293605d1950
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Merge branch 'stable-4.1'

* stable-4.1:
Prepare 4.1.1-SNAPSHOT builds
JGit v4.1.0.201509280440-r

Change-Id: Ie898f10c39f32419628f2b6dff7e3ec083ddfdfe
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Prepare 4.1.1-SNAPSHOT builds

Change-Id: I035f3a8d0f0de86e8b8f00e668be5ce008402e82
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

JGit v4.1.0.201509280440-r

Change-Id: I9a536870b9f5c1247c52d6c976a954115982fa1c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Use java.nio.file consistently in FS

Since 4.0 we require Java 7 so there is no longer a need to override the
following methods in FS_POSIX, FS_Win32, FS_Win32_Cygwin
- lastModified()
- setLastModified()
- length()
- isSymlink()
- exists()
- isDirectory()
- isFile()
- isHidden()
Hence implement these methods in FS and remove overrides in subclasses.

Change-Id: I5dbde6ec806c66c86ac542978918361461021294
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Deprecate FileUtil and move the code to FileUtils

As discussed on https://git.eclipse.org/r/53836 it does not make sense
to have two similar utility classes in same package with intersecting
functionality. To not break the API, all methods from FileUtil are
copied to FileUtils, all FileUtil API is made deprecated and redirecting
now to FileUtils. Moved simple methods which are available in Java 7 API
are made package private and can be removed at any point later entirely
(right now they are in use).

Bug: 475070
Change-Id: Idffcf9840496c448173af7c052d8898ada68e27b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

[performance] Cache platform name in SystemReader

SystemReader.isMacOs() and SystemReader.isWindows() return values are
unlikely to change during the JVM lifetime (except tests). Don't read
system properties each time the methods are called, just use previously
calculated value.

Change-Id: I495521f67a8b544e7b7247d99bbd05a42ea16d20
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

[ignore rules] fix for backslash handling

An attempt to re-implement not well documented Git CLI behavior for
patterns with backslashes.

It looks like Git silently ignores all \ characters in ignore rules, if
they are NOT covered by 3 cases described in [1]:

{quote}
1) ... Put a backslash ("\") in front of the first hash for patterns
that begin with a hash.
...
2) Trailing spaces are ignored unless they are quoted with backslash
("\").
...
3) Put a backslash ("\") in front of the first "!" for patterns that
begin with a literal "!", for example, "\!important!.txt".
{quote}

Undocumented also is the fact that backslash itself can be escaped by
backslash.

So \h\e\l\l\o\.t\x\t rule matches hello.txt and a\\\\b a\b in Git CLI.

Additionally, the glob parser [2] knows special meaning of backslash:

{quote}
One can remove the special meaning of '?', '*' and '[' by preceding
them by a backslash, or, in case this is part of a shell command
line, enclosing them in quotes. Between brackets these characters
stand for themselves. Thus, "[[?*\]" matches the four characters
'[', '?', '*' and '\'.
{quote}

[1] https://www.kernel.org/pub/software/scm/git/docs/gitignore.html
[2] http://man7.org/linux/man-pages/man7/glob.7.html

Bug: 478065
Change-Id: I3dc973475d1943c5622103701fa8cb3ea0684e3e
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

[ignore rules] Fix for character group matcher

Currently we fail to properly recognize character group if the pattern
before character group contains opening bracket.

See comment from Sebastien Arod on https://git.eclipse.org/r/56678/

Change-Id: I70d3657a2a328818ea2bdc1409d18ecb3a85825b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Merge "Show submodule difference as a hunk"

Show submodule difference as a hunk

Current DiffFormat behavior regarding submodules (aka git links) is
incorrect. The "Subproject commit <sha1>" appears as part of the diff
header, rather than as its own hunk.

--> From JGit implementation
diff --git a/plugins/cookbook-plugin b/plugins/cookbook-plugin
index b9d3ca8..ec6ed89 160000
--- a/plugins/cookbook-plugin
+++ b/plugins/cookbook-plugin
-Subproject commit b9d3ca8a65030071e28be19296ba867ab424fbbf
+Subproject commit ec6ed89c47ba7223f82d9cb512926a6c5081343e

--> From C Git 2.5.2
diff --git a/plugins/cookbook-plugin b/plugins/cookbook-plugin
index b9d3ca8..ec6ed89 160000
--- a/plugins/cookbook-plugin
+++ b/plugins/cookbook-plugin
@@ -1 +1 @@
-Subproject commit b9d3ca8a65030071e28be19296ba867ab424fbbf
+Subproject commit ec6ed89c47ba7223f82d9cb512926a6c5081343e

The current way of processing submodules results in no hunk header and
includes the contents of the hunk as part of the headers. To fix this, we
can't just have our writeGitLinkDiffText output the hunk header. We have
to change the flow so that the raw text gets parsed as a diff. The easiest
way to do this is to fake the RawText in the FormatResult when we have a
GITLINK.

It should be noted that it seems possible for there to be a difference
between a GITLINK and a non-GITLINK, but I don't think this can happen in
practice, so I don't think we need to worry too much about it.

This patch also fixes up the test for GitLink headers, as the test was
for the old behavior. My setup has 3 other failing tests which may or
may not be the result of environmental changes. However, the same tests
fail without this commit, so I do not believe they are related.

Bug: 477759
Change-Id: If13f7b406904fad814416c93ed09ea47ef183337
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>

Properly support special regex characters in ignore rules

Ignore rules should escape $^(){}+| chars if using regular expressions,
because they should be treated literally if they aren't part of a
character group.

Bug: 478055
Change-Id: Ic7276442d7f8f02594b85eae1ef697362e62d3bd
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Remove unneeded @SuppressWarnings("boxing")

Fix the unit tests to not do boxing by using assertEquals(int, int)
instead of assertThat with a matcher.

Change-Id: I5412fe2f72c8ea0227b9ff3a3352ccb555e22231
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>

Fix endless loop in ObjectChecker for MacOS

Bug: 477090
Change-Id: I0ba416f1cc172a835dd2723ff7fa904597ffd097
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Fix warnings about missing serialVersionUID

Change-Id: Ieff64896aebeab793ff08ab89f10d5ccaee66021
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>

Fix integer boxing eclipse warning

Change-Id: I89a8495a799254586016393e51697cfbceacac8b
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>

Fix integer boxing eclipse warning

There was this warning because private assertEquals(Object, Object)
method was shadowing JUnit assertEquals methods.

Change-Id: I889bfe1d8c48210d9a42147a523c4829c5b5d1e3
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>

Merge branch 'stable-4.0'

Change-Id: I1b448ce01f4cdfa62611da9e4d37321a4af9c12d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Prepare 4.0.3-SNAPSHOT builds

Change-Id: Ic5ab059bee460c76c6ff3e08141ce351a559691c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

UploadPack: Verify clients send only commits for shallow lines

If a client mistakenly tries to send a tag object as a shallow line
JGit blindly assumes this is a commit and tries to parse the tag
buffer using the commit parser. This can cause an obtuse error like:

InvalidObjectIdException: Invalid id: t c0ff331234...

The "t" comes from the "object c0ff331234..." line of the tag tring
to be parsed as though it where the "tree" line of a commit.

Run any client supplied shallow lines through the RevWalk to lookup
the object types. Fail fast with a protocol exception if any of them
are non-commit.

Skip objects not known to this repository. This matches behavior
with git-core's upload-pack, which sliently skips over any shallow
line object named by the client but not known by the server.

Change-Id: Ic6c57a90a42813164ce65c2244705fc42e84d700

JGit v4.0.2.201509141540-r

Change-Id: I766d95aa282c92dcbd2846145ee52e9cc62dd1f8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

S3 transport: Fix check for tmpdir

Properties.containsKey() is the correct call here; contains() was testing
if a value is present but the key is what was meant.

Change-Id: Ice72c9f4388583e18cf8aca6e837cc4299fd07fd

URIish: fall back to host as humanish name

When we have a URI that contains an empty path component (that is
it only contains a "/") we want to fall back to the host as
humanish name.

This change is according to the behavior of upstream git, which
falls back on the hostname when guessing directory names for
newly cloned repositories (see [1] for the discussion).

[1] http://article.gmane.org/gmane.comp.version-control.git/274669

Change-Id: I44400c6ab72a2722d2155d53d63671bd867d6c44
Signed-off-by: Patrick Steinhardt <ps@pks.im>

Update build to final R20150821153341 Orbit repository for Mars.1

Change-Id: I32d4c21f7cdd0c1a24f797012f98daa9a7f48acf
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Update org.apache.httpcomponents

- update org.apache.httpcomponents.httpcore to 4.3.3
- update org.apache.httpcomponents.httpclient to 4.3.6, 4.3.5 and later
are reported to fix vulnerability CVE-2014-3577

CQ: 9220
CQ: 9221
Bug: 470523
Change-Id: I024448c941e81f7c1dc1cc2394329df90e9b3048
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

PushCertificateStore: Don't add no-op command to batch

If no refs match the input list and we are writing to a batch,
the returned new commit from write() will match the current commit.
Adding a command to the batch for this case is harmless as it will
succeed, but it's more straightforward to just skip adding a command
in this case.

Add tests or the combination of saving matching refs and saving to a
batch.

Change-Id: I6837389b08e6c80bc2d4c9e9c506d07293ea5fb2

Restore lazy Bundle-ActivationPolicy removed in 3a4a5a4e

This header was removed unintentionally from some bundles in
3a4a5a4e57f41c595ba950ea6f6680260669bf34. Restore it to ensure lazy
activation of bundles.

Change-Id: I1f841f978fb93278e3ec0533a01f1363510dd976
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Update uses-clauses in OSGi manifests

In Bug 476164 it was reported that EGit doesn't start when the platform
comes with jsch 0.1.51 while this version of EGit/JGit brings jsch
0.1.53. This could be caused by outdated uses-clauses. Hence recompute
them using PDE tooling.

Bug: 476164
Change-Id: I185ba097884ead9cd034eba842bd3bf34181a99b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Use java.io.File to check existence of loose objects in ObjectDirectory

It was reported in [1] that 197e3393a51424fae45e51dce4a649ba26e5a368 led
to a performance regression in a BFG benchmark. Analysis showed that
this is caused by the exists() method in FS_POSIX, now overriding the
default implementation in FS. The default implementation of FS.exists()
uses java.io.File.exists(), while the new implementation in FS_POSIX
uses java.nio.file.Files.exists() - by simply removing the override in
FS_POSIX, performance was restored.

Profiling showed that java.nio.file.Files.exists() is substantially
slower than java.io.File.exists(), to the point where the exists() call
doubles the average cost of a call to
ObjectDirectory.insertUnpackedObject() - which the BFG uses a lot,
because it's rewriting history. Average times measured on Ubuntu were:

java.io.File.exists() - 4 microseconds
java.nio.file.Files.exists() - 60 microseconds

The loose object exists test should be using java.io.File and not FS.
ObjectDirectory uses FS.resolve() to traverse symlinks to objects but
then once inside objects all 256 sharded directories should be real
directories, and the object files should be real files, not dangling
symlinks. java.io.File.exists() is sufficient here, and faster.

Change ObjectDirectory to use File.exists() once its computed the File
handle.

This does mean JGit cannot run ObjectDirectory code on an abstract
virtual filesystem plugged into NIO2. If you really want to run JGit on
an esoteric non-standard filesystem like "in memory" you should look at
the DFS storage backend, which has fewer abstraction points to deal
with. Or write your own from scratch.

[1] https://dev.eclipse.org/mhonarc/lists/jgit-dev/msg02954.html

Change-Id: I74684dc3957ae1ca52a7097f83a6c420aa24310f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Fix warnings on FileUtils.isStaleFileHandle()

- add @since annotation for new API method
- silence non-externalized String warning

Change-Id: I864176ced64e9569e7f2cdf8f777720655bfc578
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Deprecate redundant FileUtil.delete(File), use FileUtils instead

Bug: 475070
Change-Id: I6dc651f4b47e1b2c8d7954ec982e21ae6bb5f7a6
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Handle stale file handles on packed-refs file

On a local filesystem the packed-refs file will be orphaned if it is
replaced by another client while the current client is reading the old
one. However, since NFS servers do not keep track of open files, instead
of orphaning the old packed-refs file, such a replacement will cause the
old file to be garbage collected instead.  A stale file handle exception
will be raised on NFS servers if the file is garbage collected (deleted)
on the server while it is being read.  Since we no longer have access to
the old file in these cases, the previous code would just fail. However,
in these cases, reopening the file and rereading it will succeed (since
it will reopen the new replacement file).  So retrying the read is a
viable strategy to deal with stale file handles on the packed-refs file,
implement such a strategy.

Since it is possible that the packed-refs file could be replaced again
while rereading it (multiple consecutive updates can easily occur with
ref deletions), loop on stale file handle exceptions, up to 5 extra
times, trying to read the packed-refs file again, until we either read
the new file, or find that the file no longer exists. The limit of 5 is
arbitrary, and provides a safe upper bounds to prevent infinite loops
consuming resources in a potential unforeseen persistent error
condition.

Change-Id: I085c472bafa6e2f32f610a33ddc8368bb4ab1814
Signed-off-by: Martin Fick<mfick@codeaurora.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Merge "Add public isStaleFileHandle() API, improve detection."

Add public isStaleFileHandle() API, improve detection.

Add a public API to the FileUtils to determine if an IOException is a
stale NFS file handle exception. This will make it easier to detect
such errors, and interpret them consistently throughout the codebase.
This new API is a bit more lenient in its detection than the previous
detection, and should be able to detect some errors which previously
were not identified as stale file handle exceptions because they had the
word NFS in the error message. Adjust the packfile handling code to use
this new API for detection.

Change-Id: I21f80014546ba1afec7335890e5ae79e7f521412
Signed-off-by: Martin Fick<mfick@codeaurora.org>

Set "potentialNullReference" to "error" level and fixed all issues

There should be no functional change, the logic updated only to make
code simple so that compiler can understand what is going for. Removed
all @SuppressWarnings("null") annotations since they cannot be used if
"org.eclipse.jdt.core.compiler.problem.potentialNullReference" option is
set to the "error" level.

Bug: 470647
Change-Id: Ie93c249fa46e792198d362e531d5cbabaf41fdc4
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Enable annotation based NPE analysis in jgit

Bug: 470647
Change-Id: I14d1983bb7c208faeffee0504e0567e38d8a89f3
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Update Jetty to 9.2.13.v20150730

Change-Id: I0c2a4cafcd1992431888c2a48592d9cb1ac04747
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Update com.jcraft.jsch to 0.1.53

Update target platform to Orbit M20150818205559 for Mars in order to
update com.jcraft.jsch to 0.1.53. Also update pom.xml to use Mars target
platform profile by default.

CQ: 10045
Bug: 463580
Change-Id: I1bf151fbee7b00c7bd38cf1236c9bad50e3c64bd
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Restored obsoleted createSymLink/readSymLink in FileUtil

Bug: 475070
Change-Id: I425ad842dc26b55f747f192348398a3912c0ca6b
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Use NIO2 to implement FileUtils.rename() and expose options

- use NIO2's Files.move() to reimplement rename()
- provide a second method accepting CopyOptions which can be used to
request atomic move.

Change-Id: Ibcf722978e65745218a1ccda45344ca295911659
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>

Move createSymLink/readSymLink to FileUtils

Bug: 475070
Change-Id: I258f4bf291e02ef8e6f867b5d71c04ec902b6bcb
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Merge "Change FS not to throw NPE when facing InMemory databases"

Expose the set of root commits in PackStatistics

Root commits are commits with zero parents.  If a commmit has no
parents it is the first commit in the repository.  In general the root
commits should be unique for any given project, as the first commit
will be created at a different time, by a different user with its own
message.  These root commits can be used as a "fingerprint" to
identify disjoint histories.

Change-Id: Id891dbc1f17c816cea404569578bb7635ff85cdb

Change FS not to throw NPE when facing InMemory databases

The FS class and the subclasses FS_POSIX assumed in the findHook()
method that every repository has a valid gitDir. But in tests when using
in-memory-repositories this is not true and this method was generating
NPEs. Change the method to return null if no repository directory can be
determined.

Change-Id: I38a4d36dc6452b5dacae3d0dbf562b569ca3c19b

Fix NPE in DfsGarbageCollector and further reduce memory

DfsGarbageCollector asks PackWriter for the set of objects packed
after the bitmap index is written out.  This is now null as it was
cleared to release memory. Instead use PackBitmapIndexBuilder to
build the set as it also has the objects.

Reduce memory in PackBitmapIndexBuilder by fully discarding the
ObjectToPack instances. This was the original intent of commit
4bb523475d44 ("PackWriter: shed memory while creating bitmaps")
but failed as the instances were still held live here.

Switch to BlockList instead of ObjectToPack[]. This allows the
JVM to allocate many smaller arrays instead of one contiguous
array with 5.2M reference pointers. In a tight heap the smaller
allocations are more feasible.

Reduce the initial EWAHCompressedBitmaps for the 4 type maps.  On
average a typical repository is 30% commits, 30% trees and 30% blobs.
These bitmaps are typically very dense.  PackWriter orders objects by
commit, tree, blob when writing the file so these should always be a
very dense run of 1s with some 0s before and after. So even the 1/3rd
allocation is likely too large, but the later trim() will reduce the
internal buffer anyway.

Change-Id: If0b80a31cb00894f1485ff8f53ef7ae5a759a046

Cleanup Attributes and remove obsoleted Java7BasicAttributes class

After jgit moved to Java 7 there is no need in an extra
Java7BasicAttributes class. Also all fields of Attributes can be made
final now.

Change-Id: I0be6daf7758189b0eecc4e26294bd278ed8bf7a0
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>

Merge "Remove "experimental" from the description of the pack bitmap index"

PackWriter: shed memory while creating bitmaps

Once bitmap creation begins the internal maps required for packing are
no longer necessary. On a repository with 5.2M objects this can save
more than 438 MiB of memory by allowing the ObjectToPack instances to
get garbage collected away.

Downside is the PackWriter cannot be used for any further opertions
except to write the bitmap index. This is an acceptable trade-off as
in practice nobody uses the PackWriter after the bitmaps are built.

Change-Id: Ibfaf84b22fa0590896a398ff659a91fcf03d7128

Merge "Do not retain commit body during bitmap generation"

Merge "Bitmap builder: actually compress EWAH bitmaps in memory"

Bitmap builder: actually compress EWAH bitmaps in memory

For construction performance each new EWAHBitmap is allocated at the
roughly worst-case size the bitmap would need if all of the words must
be literal and no run length compression is available. In practice
this is far larger than is required, wasting heap memory while the
bitmaps are computed.

Trim down each bitmap to its minimum required size. This copies the
internal array to a new smaller array, allowing the GC to reclaim the
prior larger array for reuse.

A single bitmap of 5.2M bits is only 79 KiB of memory without this
trim call but 15,000 such bitmaps is 1.1 GiB. Trimming can help fit
a larger number of bitmaps during processing.

Change-Id: I2bd19a786189db5b01c4c96f209b83de50e10c3b