Dave Borowitz [Fri, 31 Mar 2017 15:49:12 +0000 (11:49 -0400)]
Support creating Mergers without a Repository
All that's really required to run a merge operation is a single
ObjectInserter, from which we can construct a RevWalk, plus a Config
that declares a diff algorithm. Provide some factory methods that don't
take Repository.
Masaya Suzuki [Fri, 31 Mar 2017 20:54:02 +0000 (13:54 -0700)]
Buffer the response until request parsing has done
This is a continuation from https://git.eclipse.org/r/#/c/4716/. For a
non-bidirectional request, we need to consume the request before writing
any response. In UploadPack, we write "shallow"/"unshallow" responses
before parsing "have" lines. This has happened not to be a problem most
of the time in the smart HTTP protocol because the underlying
InputStream has a 32 KiB buffer in SmartOutputStream.
KB Sriram [Wed, 22 Mar 2017 18:42:38 +0000 (11:42 -0700)]
Make diff locations more consistent
DiffAlgorithms can return different edit locations for inserts or
deletes, if they can be "shifted" up or down repeating blocks of
lines. This causes the 3-way merge to apply both edits, resulting in
incorrectly removing or duplicating lines.
Augment an existing "tidy-up" stage in DiffAlgorithm to move all
shiftable edits (not just the last INSERT edit) to a consistent
location, and add test cases for previously incorrect merges.
Matthias Sohn [Mon, 27 Mar 2017 08:52:36 +0000 (10:52 +0200)]
Merge branch 'stable-4.6'
* stable-4.6:
Only mark packfile invalid if exception signals permanent problem
Don't flag a packfile invalid if opening existing file failed
Prepare 4.5.2-SNAPSHOT builds
Change-Id: Ife4efad1135d3870a5a0fb71e60b9524fb8777ab Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
David Pursehouse [Mon, 27 Mar 2017 01:14:50 +0000 (10:14 +0900)]
Merge branch 'stable-4.5' into stable-4.6
* stable-4.5:
Only mark packfile invalid if exception signals permanent problem
Don't flag a packfile invalid if opening existing file failed
Prepare 4.5.2-SNAPSHOT builds
Change-Id: I20b50981adc54c426666015ff04fe3bb1db9abd9 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Matthias Sohn [Sat, 25 Mar 2017 01:33:06 +0000 (02:33 +0100)]
Only mark packfile invalid if exception signals permanent problem
Add NoPackSignatureException and UnsupportedPackVersionException to
explicitly mark permanent unrecoverable problems with a pack
Assume problem with a pack is permanent only if we are sure the
exception signals a non-transient problem we can't recover from:
- AccessDeniedException: we lack permissions
- CorruptObjectException: we detected corruption
- EOFException: file ended unexpectedly
- NoPackSignatureException: pack has no pack signature
- NoSuchFileException: file has gone missing
- PackMismatchException: pack no longer matches its index
- UnpackException: unpacking failed
- UnsupportedPackIndexVersionException: unsupported pack index version
- UnsupportedPackVersionException: unsupported pack version
Do not attempt to handle Errors since they are thrown for serious
problems applications should not try to recover from.
Change-Id: I2c416ce2b0e23255c4fb03a3f9a0ee237f7a484a Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Luca Milanesio [Fri, 24 Mar 2017 00:18:12 +0000 (00:18 +0000)]
Don't flag a packfile invalid if opening existing file failed
A packfile random file open operation may fail with a
FileNotFoundException even if the file exists, possibly
for the temporary lack of resources.
Instead of managing the FileNotFoundException as any generic
IOException it is best to rethrow the exception but prevent
the packfile for being flagged as invalid until it is actually
opened and read successfully or unsuccessfully.
David Ostrovsky [Thu, 23 Mar 2017 05:44:51 +0000 (06:44 +0100)]
bazel: Consume hamcrest through transitive dependency
In I3ab958ce8 explicit dependency in lib/BUILD were defined and most
of the bazel build implementation was switched to using it. Switch
test.bzl test implementation to using explicit dependencies as well.
Change-Id: I4413d1a45addeeb2a980d07669fa034c2eebb3a4 Signed-off-by: David Ostrovsky <david@ostrovsky.org>
bazel test --test_tag_filters=api,dfs,revplot,treewalk //...
Change-Id: Ic41b05a79d855212e67b1b4707e9c6b4dc9ea70d Signed-off-by: David Ostrovsky <david@ostrovsky.org> Signed-off-by: Jonathan Nieder <jrn@google.com>
Jonathan Nieder [Mon, 20 Mar 2017 01:52:55 +0000 (18:52 -0700)]
bazel: Mark junit targets testonly
Only testonly targets (such as tests) need to use junit.
In particular this involves making the toplevel :all rule testonly.
It's not clear to me what that rule is for --- "bazel build //..."
already works to build all targets. In any case it appears to be for
testing, so marking it as testonly shouldn't be harmful.
Jonathan Nieder [Mon, 20 Mar 2017 00:41:26 +0000 (17:41 -0700)]
bazel: Add explicit targets for library dependencies
This provides a place to declare visibility restrictions and
transitive dependencies for each library.
Other targets should only declare dependencies on what they directly
use, making dependencies easier to maintain.
Trim the dependencies of org.eclipse.jgit:jgit to follow that rule.
It declares dependencies on Apache httpcomponents and the servlet
API but doesn't use them.
Tested:
* 'bazel build //...' succeeds
* applying the change https://gerrit-review.googlesource.com/90843
to a copy of Gerrit, following the instructions there, and running
'bazel test //...' in that copy of Gerrit still succeeds
David Pursehouse [Sun, 19 Mar 2017 01:23:29 +0000 (10:23 +0900)]
Fix test configuration to run RacyGitTests, and fix testRacyGitDetection
With the filename suffix "Tests", the module was not included in tests
when building with Maven, and without the @Test annotations the tests
didn't get executed under Eclipse or buck test.
testRacyGitDetection was failing because the index file did not exist.
Add the missing configuration, the missing annotations, and add a call
to reset() in testRacyGitDetection to force creation of the index file.
Change-Id: I29dd8f89c36fef4ab40bedce7f4a26bd9b2390e4 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
David Ostrovsky [Sat, 18 Mar 2017 10:29:26 +0000 (11:29 +0100)]
RevFlagSetTest: Fix compilation error flagged by error prone
This fixes error flagged by error prone:
Java compilation in rule '//org.eclipse.jgit.test:jgit' failed: Worker
process sent response with exit code: 1.
org.eclipse.jgit.test/tst/org/eclipse/jgit/revwalk/RevFlagSetTest.java:149:
error: [CollectionIncompatibleType] Argument '"bob"' should not be
passed to this method; its type String is not compatible with its
collection's type argument RevFlag
assertFalse(set.contains("bob"));
Change-Id: I4a971ce92fee55e28b2ab0c7b716ac20fa9c6709 Signed-off-by: David Ostrovsky <david@ostrovsky.org>
David Ostrovsky [Sat, 18 Mar 2017 09:41:29 +0000 (10:41 +0100)]
Move SHA1 compress/recompress files to resource folder
This fixes Bazel build:
in srcs attribute of java_library rule //org.eclipse.jgit:jgit:
file '//org.eclipse.jgit:src/org/eclipse/jgit/util/sha1/SHA1.recompress'
is misplaced here (expected .java, .srcjar or .properties).
Another option that was considered is to exclude the non source files.
Change-Id: I7083f27a4a49bf6681c85c7cf7b08a83c9a70c77 Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Luca Milanesio [Fri, 10 Mar 2017 00:20:23 +0000 (00:20 +0000)]
Don't remove pack when FileNotFoundException is transient
The FileNotFoundException is typically raised in three conditions:
1. file doesn't exist
2. incompatible read vs. read/write open modes
3. filesystem locking
4. temporary lack of resources (e.g. too many open files)
1. is already managed, 2. would never happen as packs are not
overwritten while with 3. and 4. it is worth logging the exception and
retrying to read the pack again.
Log transient errors using an exponential backoff strategy to avoid
flooding the logs with the same error if consecutive retries to access
the pack fail repeatedly.
FetchCommand: Fix detection of submodule recursion mode
The submodule.name.fetchRecurseSubmodules value was being read from the
configuration of the submodule, but it should be read from the config
of the parent repository.
Also, the fetch.recurseSubmodules value from the parent repository's
configuration was not being considered at all.
Fix both of these and add tests. Now the precedence of the recurse mode
is determined as follows:
1. Value passed to the API
2. Value configured in submodule.name.fetchRecurseSubmodules
3. Value configured in fetch.recurseSubmodules
4. Default to "on demand"
* stable-4.6:
Update Jetty to 9.4.1.v20170120 in buck build
Update Jetty to 9.4.1.v20170120
Update build to use Tycho 1.0.0
Update minimum JDK version in README
Change-Id: I735697c112094e883986ce13026d967291d88494 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Jonathan Nieder [Sun, 26 Feb 2017 23:09:04 +0000 (15:09 -0800)]
Update Jetty to 9.4.1.v20170120 in buck build
5e8e2179b218ede7d14b69dc5149b0691b5859cf (Update Jetty to
9.4.1.v201470120, 2017-01-26) updated Jetty in the maven build.
Update the buck build to match so buck builds work again.
The buck build will go away soon, but in the meantime (until the bazel
build gets the same level of support) it is convenient as a faster way
of running tests than using maven.
The bazel build doesn't need this change since it doesn't build or run
http tests yet.
David Pursehouse [Mon, 13 Feb 2017 12:37:30 +0000 (21:37 +0900)]
FetchCommand: Add basic support for recursing into submodules
Extend FetchCommand to expose a new method, setRecurseSubmodules(mode),
which allows to set the mode to ON, OFF or ON_DEMAND.
After fetching a repository, its submodules are recursively fetched:
- When the mode is YES, submodules are always fetched.
- When the mode is NO, submodules are not fetched.
- When the mode is ON_DEMAND, submodules are only fetched when the
parent repository receives an update of the submodule and the new
revision is not already in the submodule.
The mode is determined in the following order of precedence:
- Value specified in the API call using setRecurseSubmodules.
- Value specified in the repository's config under the key
submodule.name.fetchRecurseSubmodules
- Defaults to ON_DEMAND if neither of the previous is set.
Extend FetchResult to recursively include results for submodules, as
a map of the submodule path to an instance of FetchResult.
Test setup is based on testCloneRepositoryWithNestedSubmodules.
Change-Id: Ibc841683763307cb76e78e142e0da5b11b1add2a Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Thomas Wolf [Thu, 23 Feb 2017 21:49:43 +0000 (22:49 +0100)]
Make Repository.normalizeBranchName less strict
This operation was added recently with the goal to provide some
way to auto-correct invalid user input, or to provide a correction
suggestion to the user -- EGit uses it now that way. But the initial
implementation was very restrictive; it removed all non-ASCII
characters and even slashes.
Understandably end users were not happy with that. Git has no such
restriction to ASCII-only; nor does JGit. Branch names should be
meaningful to the end user, and if a user-supplied branch name is
invalid for technical reasons, a "normalized" name should still
be meaningful to the user.
Rewrite to attempt a minimal fix such that the result will pass
isValidRefName.
* Replace all Unicode whitespace by underscore.
* Replace troublesome special characters by dash.
* Collapse sequences of underscores, dots, and dashes.
* Remove underscores, dots, and dashes following slashes, and
collapse sequences of slashes.
* Strip leading and trailing sequences of slashes, dots, dashes,
and underscores.
* Avoid the ".lock" extension.
* Avoid the Windows reserved device names.
* If input name is null return an empty String so callers don't need to
check for null.
This still allows branch names with single slashes as separators
between components, avoids some pitfalls that isValidRefName() tests
for, and leaves other character untouched and thus allows non-ASCII
branch names.
Also move the function from the bottom of the file up to where
isValidRefName is implemented.
Bug: 512508
Change-Id: Ia0576d9b2489162208c05e51c6d54e9f0c88c3a7 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn Pearce [Sat, 25 Feb 2017 19:43:42 +0000 (11:43 -0800)]
SHA-1: collision detection support
Update SHA1 class to include a Java port of sha1dc[1]'s ubc_check,
which can detect the attack pattern used by the SHAttered[2] authors.
Given the shattered example files that have the same SHA-1, this
modified implementation can identify there is risk of collision given
only one file in the pair:
When JGit detects probability of a collision the SHA1 class now warns
on the logger, reporting the object's SHA-1 hash, and then throws a
Sha1CollisionException to the caller.
From the paper[3] by Marc Stevens, the probability of a false positive
identification of a collision is about 14 * 2^(-160), sufficiently low
enough for any detected collision to likely be a real collision.
git-core[4] may adopt sha1dc before the system migrates to an entirely
new hash function. This commit enables JGit to remain compatible with
that move to sha1dc, and help protect users by warning if similar
attacks as SHAttered are identified.
Performance declined about 8% (detection off), now:
This decline in throughput is attributed to the step loop unrolling in
compress(), which was necessary to easily fit the UbcCheck logic into
the hash function. Using helper functions s1-s4 reduces the code
explosion, providing acceptable throughput.
sha1dc (native C) ~206.28 MiB/s
sha1dc (native C) ~204.47 MiB/s
sha1dc (native C) ~203.74 MiB/s
Average time across 100,000 calls to hash 4100 bytes (such as a commit
or tree) for the various algorithms available to JGit also shows SHA1
is slower than MessageDigest, but by an acceptable margin:
Being implemented in Java with these additional safety checks is
clearly a penalty, but throughput is still acceptable given the
increased security against object name collisions.
Magnus Vigerlöf [Sat, 18 Feb 2017 18:28:39 +0000 (19:28 +0100)]
Correct the boolean logic for filtering paths
The TreeWalk filtering classes need to support the three different
meanings of the return value the path comparison generates.
A new path comparison method (isPathMatch) is created with
three distinct return values (isPathPrefix use value '0' to
encode two of these) which will makes it possible for the logical
operators (especially NOT) to aggregate a correct verdict.
A filter like: AND(Path("path"), NOT(Path("path/to/other")))
Should filter out 'path/to/other/file', but not 'path/to/my/file'.
The path-limiting feature when testing path/to/my/file, would
result to run test for the following paths:
path
path/to
path/to/my
path/to/my/file
isPathPrefix('path/to/other') will return '0' for the first two
and since there is no way for NOT to distinguish between an exact
match and a match indicating that the tested path is a 'parent',
it will incorrectly return false and thus remove everything below
'path' immediately.
isPathMatch has a distinguished value for 'parent' matches that
will be preserved through the logic operators and should not
cause an over-eager removal of paths.
The functionality of isPathPrefix is required by other parts
and is untouched.
Unit tests are included to ensure that the logical functionality
is correct and can be preserved.
Change-Id: Ice2ca9406f09f1b179569e99b86a0e5d77baa20d Signed-off-by: Magnus Vigerlöf <magnus.vigerlof@gmail.com>
Shawn Pearce [Sun, 26 Feb 2017 19:44:51 +0000 (11:44 -0800)]
SHA1: support reset() and reuse instances
Allow SHA1 instances to be reused to compute another hash value, and
resume caching them in ObjectInserter and PackParser. This shaves a
small amount of running time off parsing git.git's pack file:
before after
------ ------
25.25s 25.55s
25.48s 25.06s
25.26s 24.94s
Almost noise (small difference), but recycling the instances reduces
some stress on the memory allocator finding two 80 word message block
arrays needed for hashing and collision detection.