Shawn Pearce [Wed, 13 Jan 2016 00:11:36 +0000 (16:11 -0800)]
PackWriter: Declare preparePack object sets as @NonNull
Require callers to pass in valid sets for both want and have
collections. Offer PackWriter.NONE as a handy constant for an
empty collection for the have part of preparePack instead of null.
Shawn Pearce [Tue, 12 Jan 2016 18:50:36 +0000 (10:50 -0800)]
GC: Pack RefTrees in their own pack
The RefTree graph needs to be quickly accessed to read references.
It is also a distinct graph, disconnected from the rest of the
repository. Store the commit and tree objects in their own pack.
Shawn Pearce [Sun, 10 Jan 2016 01:27:25 +0000 (17:27 -0800)]
RefTree: Change peel suffix to " ^" (space caret)
Using ^{} as the peel suffix has caused problems when projects used
tags like v2.1 and then v2.1.1, v2.2.2, etc. The peeled value for
v2.1 was stored very far away in the tree relative to v2.1 itself as
^ sorts in the ASCII/UTF-8 encoding after all other common tag
characters like digits and dots.
Use " ^" instead as space is not valid in a reference name, sorts
before all other valid reference characters (thus forcing next entry
locality) and this looks like a peeled marker for the prior tag.
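The locality argument can be checked with a small sketch. Plain string sorting stands in for tree entry order here (an assumption that holds for ASCII-only names), and the class and method names are hypothetical, not part of JGit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: String order stands in for tree entry order (ASCII names assumed).
class PeelSuffixOrder {
    // Adds a peeled entry for `tag` using `suffix`, then returns all names sorted.
    static List<String> sortedWithPeel(List<String> tags, String tag, String suffix) {
        List<String> names = new ArrayList<>(tags);
        names.add(tag + suffix);
        Collections.sort(names); // for ASCII this matches byte order
        return names;
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("v2.1", "v2.1.1", "v2.2.2");
        // '^' (0x5E) sorts after '.' (0x2E), so the peeled entry lands after v2.1.1.
        System.out.println(sortedWithPeel(tags, "v2.1", "^{}"));
        // ' ' (0x20) sorts before '.', so the peeled entry directly follows v2.1.
        System.out.println(sortedWithPeel(tags, "v2.1", " ^"));
    }
}
```

With only a few tags the distance is small, but every additional v2.1.x tag pushes the "^{}" entry one position further from v2.1, while the " ^" entry stays adjacent.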
Shawn Pearce [Sat, 9 Jan 2016 20:51:14 +0000 (12:51 -0800)]
FileRepository: Support extensions.refsBackendType = RefTree
This experimental code can be enabled in $GIT_DIR/config:
    [core]
        repositoryformatversion = 1
    [extensions]
        refsBackendType = RefTree
When these are set the repository will read references from the
RefTree rooted by the $GIT_DIR/refs/txn/committed reference.
Update debug-rebuild-ref-tree to rebuild refs/txn/committed only from
the bootstrap layer. This avoids misuse by rebuilding using packed-refs
and $GIT_DIR/refs tree.
Shawn Pearce [Sat, 28 Nov 2015 07:21:43 +0000 (23:21 -0800)]
RefTreeDatabase: Ref database using refs/txn/committed
Instead of storing references in the local filesystem rely on the
RefTree rooted at refs/txn/committed. This avoids needing to store
references in the packed-refs file by keeping all data rooted under
a single refs/txn/committed ref.
Performance to scan all references from a well packed RefTree is very
close to reading the packed-refs file from local disk.
Storing a packed RefTree is smaller due to pack file compression,
about 49.39 bytes/ref (on average) compared to packed-refs using
~65.49 bytes/ref.
Shawn Pearce [Mon, 11 Jan 2016 20:30:35 +0000 (12:30 -0800)]
RevCommit: Better support invalid encoding headers
With this support we no longer need the 'utf-8' alias. UTF-8 will be
automatically tried when the encoding header is not recognized and used
if the character sequence cleanly decodes as UTF-8.
Modernize some of the references to use StandardCharsets.
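A minimal sketch of the fallback described above, using only java.nio.charset. The class name is hypothetical, and the final ISO-8859-1 fallback is an assumption made for illustration; the commit message does not say what happens when the bytes are not clean UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper, not JGit's implementation.
class EncodingFallback {
    static String decode(String encodingHeader, byte[] raw) {
        try {
            // Use the declared encoding when the JVM recognizes it.
            return new String(raw, Charset.forName(encodingHeader));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Header not recognized: fall through and try UTF-8.
        }
        try {
            // Strict decode: report errors instead of substituting characters.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
        } catch (CharacterCodingException e) {
            // Assumption: last-resort fallback that can never fail.
            return new String(raw, StandardCharsets.ISO_8859_1);
        }
    }
}
```

A header such as 'utf8' (with the quotes) is rejected by Charset.forName, so the strict UTF-8 attempt handles it without needing an alias.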
Shawn Pearce [Sat, 28 Nov 2015 03:34:36 +0000 (19:34 -0800)]
debug-rebuild-ref-tree: Simple program to build a RefTree
This tool scans all references in the repository and writes out a new
reference pointing to a single commit whose root tree is a RefTree
containing the current refs of this repository.
It always skips storing the reference it will write to, avoiding the
obvious cycle.
Shawn Pearce [Wed, 18 Nov 2015 00:22:18 +0000 (16:22 -0800)]
RefTree: Store references in a Git tree
A group of updates can be applied by updating the tree in one step,
writing out a new root tree, and storing its SHA-1. If references
are stored in RefTrees, comparing two repositories is a matter of
checking if two SHA-1s are identical. Without RefTrees comparing two
repositories requires listing all references and comparing the sets.
Track the "refs/" directory as a root tree by storing references
that point directly at an object as a GITLINK entry in the tree.
For example "refs/heads/master" is written as "heads/master".
Annotated tags also store their peeled value with ^{} suffix, using
"tags/v1.0" and "tags/v1.0^{}" GITLINK entries.
Symbolic references are written as SYMLINK entries with the blob of
the symlink carrying the name of the symbolic reference target.
HEAD is outside of "refs/" namespace so it is stored as a special
"..HEAD" entry. This name is chosen because ".." is not valid in
a reference name and it almost looks like "../HEAD" which names
HEAD if the reader was inside of the "refs/" directory.
A new Command type is required to handle symbolic references and
peeled references.
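The naming rules in this message can be sketched as a simple mapping from reference name to tree entry path. The class and method names below are hypothetical and the code is not taken from JGit; it only restates the rules as written here, including the ^{} peel suffix used by this commit.

```java
// Hypothetical mapping helper illustrating the entry names described above.
class RefTreePaths {
    // Maps a reference name to its path inside the RefTree root.
    static String toTreePath(String refName) {
        if ("HEAD".equals(refName)) {
            return "..HEAD"; // HEAD lives outside refs/, so it gets a special entry
        }
        if (refName.startsWith("refs/")) {
            return refName.substring("refs/".length()); // root tree tracks "refs/"
        }
        throw new IllegalArgumentException("not under refs/: " + refName);
    }

    // Path of the extra entry holding the peeled value of an annotated tag.
    static String peeledPath(String refName) {
        return toTreePath(refName) + "^{}";
    }
}
```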
Andrey Loskutov [Sun, 3 Jan 2016 11:29:55 +0000 (12:29 +0100)]
Make sure CLIGitCommand and Main produce (almost) same results
Currently execution of tests in pgm uses CLIGitCommand, which
re-implements a few things from Main. Unfortunately this can result in
test behavior that differs from the real CLI runtime.
The change lets CLIGitCommand extend Main and only slightly modifies the
runtime (stream redirection and undesired exit() termination).
Andrey Loskutov [Sun, 3 Jan 2016 14:27:01 +0000 (15:27 +0100)]
Added CLIText.fatalError(String) API for tests
In different places (Main, TextBuiltin, CLIGitCommand) we report fatal
errors and at the same time want to check for fatal errors in the tests.
Using a common API simplifies the error testing and helps to navigate to
the actual error check implementation.
The recent ObjectChecker changes to pass in AnyObjectId as part
of the checkCommit method signature meant the override here was no
longer throwing an exception as expected.
David Ostrovsky [Sun, 3 Jan 2016 23:44:07 +0000 (00:44 +0100)]
buck: Make :jgit_src target work in cross-cell environment
This artifact is used from unzip utility in Gerrit Code Review
build toolchain and thus the file must exist on the file system.
Moreover, trying to use java_binary() didn't work either, as the
zip layout was wrong: all files contained 'org.eclipse.jgit/src/'
prefix.
Change-Id: I00e3269a7a1a6c6d1fe7e60d1bf1c69b8e57d79d
Signed-off-by: David Ostrovsky <david@ostrovsky.org>
Matthias Sohn [Sat, 2 Jan 2016 01:09:16 +0000 (02:09 +0100)]
Ensure all http tests are run and fix broken tests
HttpClientTests were broken. This wasn't discovered since
maven-surefire-plugin by default only executes test classes
matching **/*Test.java. Fix this by also including **/*Tests.java
and fix the failing tests.
Change-Id: I487a30fb333de993a9f8d8fff491d3b0e7fb02cc
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Fri, 1 Jan 2016 22:54:15 +0000 (23:54 +0100)]
buck: run http tests
Running tests using buck reveals that HttpClientTests are broken
and weren't executed by Maven since these test classes don't match the
maven-surefire-plugin's default for test classes **/*Test.java.
Will be fixed in a follow-up change.
Change-Id: I82a01b5fd3f0a930bec2423a29a256601dadc248
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Shawn Pearce [Fri, 1 Jan 2016 19:04:11 +0000 (11:04 -0800)]
buck: set vm_args for tests
Maven pom files force the local encoding to UTF-8 to ensure there are
no differences between machines. They also set the JVM max heap to
256m. Match both in Buck so that results are consistent.
Shawn Pearce [Fri, 1 Jan 2016 04:07:03 +0000 (20:07 -0800)]
buck: pin to stable version
Like with Gerrit, pin JGit to a single version of Buck that is known
to work with current Buck files and JUnit tests. Notably a more recent
version of Buck used by Gerrit (01a0c54d827) fails WalkEncryptionTest.
Shawn Pearce [Fri, 1 Jan 2016 17:58:34 +0000 (12:58 -0500)]
Merge changes from topic 'add-df'
* changes:
DirCache: Do not create duplicate tree entries
DirCacheEditor: Replace file-with-tree and tree-with-file
AddCommand: Use NameConflictTreeWalk to identify file-dir changes
Shawn Pearce [Fri, 1 Jan 2016 00:12:51 +0000 (16:12 -0800)]
Fix "remote: Counting objects: ..." formatting
Trailing whitespace is usually removed in properties files so
JGitText did not supply a space between : and the remote message.
Ensure the space exists at runtime by reading the localized string
and appending a space if it is missing.
Messages should be dynamically fetched and not held in a static
class variable, as they can be changed using thread locals.
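The runtime fix amounts to a one-line guard. The sketch below uses a hypothetical helper name and is not the JGit code; it only shows the "append a space if it is missing" step applied to whatever localized string is fetched at the time of use.

```java
// Hypothetical helper: ensures the localized prefix ends with exactly the space it needs.
class RemoteMessagePrefix {
    static String withTrailingSpace(String localized) {
        // Properties files usually lose trailing whitespace, so restore it here.
        return localized.endsWith(" ") ? localized : localized + " ";
    }

    public static void main(String[] args) {
        System.out.println(withTrailingSpace("remote:") + "Counting objects: 42");
    }
}
```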
The attributes, that all these build tools have in common, are:
* reliable
* correct
* very fast
* reproducible
It does not always have to be the other build tool that this project is
currently using. Or, quoting Gerrit Code Review maintainer here:
"Friends, don't let friends use <the other build tool system>!"
This change is an incomplete implementation of the JGit build in Buck,
needed by Gerrit Code Review to replace its dependency with a standalone
JGit cell. This is very useful when a developer is working on both
projects and is trying to integrate changes made in JGit in Gerrit.
The supported workflow is:
$ cd jgit
$ emacs <hack>
$ cd ../gerrit
$ buck build --config repositories.jgit=../jgit gerrit
With --config repositories.jgit=../jgit, the jgit cell is routed through
the JGit development tree.
Eryk Szymanski [Wed, 17 Jun 2015 15:17:17 +0000 (17:17 +0200)]
Fix encoding problem from curl repository on github
Pushing curl repository to gerrit fails with a message:
remote: error: internal error while processing changes
java.nio.charset.IllegalCharsetNameException: 'utf8'
The zeroPaddedFilemode = ignore is a synonym for the JGit specific
allowLeadingZeroFileMode = true. Only accept the JGit key if git-core
key was not specified.
Shawn Pearce [Tue, 29 Dec 2015 23:52:16 +0000 (15:52 -0800)]
ObjectChecker: allow some objects to skip errors
Some ancient objects may be broken, but in a relatively harmless way.
Allow the ObjectChecker caller to whitelist specific objects that are
going to fail checks, but that a human has reviewed and judged
OK enough to permit continued use.
This avoids needing to rewrite history to scrub the broken objects out.
Honor the git-core fsck.skipList configuration setting when receiving a
push or fetching from a remote repository.
Shawn Pearce [Wed, 30 Dec 2015 20:23:06 +0000 (12:23 -0800)]
ObjectChecker: use java.text.Normalizer directly
Base Java version for JGit is now Java 7. The java.text.Normalizer
class was available in Java 6. Reflection is no longer required to
normalize strings for Mac OS X.
* changes:
Sort "eager" path-like options to the end of the help
reset command: provide convenient and meaningful options help
commit command: allow to specify path(s) argument(s)
status command: consume more than one argument after --
repo command: properly name the required 'path' argument
Un-ignored existing CLI tests which run just fine on Java 7+
Don't treat command termination due to '-h' option as a fatal error
Shawn Pearce [Wed, 30 Dec 2015 00:53:56 +0000 (16:53 -0800)]
Unify fetch and receive ObjectChecker setup
This avoids duplication of code between receive-pack and fetch-pack paths.
Separate methods are still required to check use of receive.fsckobjects vs.
fetch.fsckobjects, both of which default to transfer.fsckobjects.
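The defaulting rule can be sketched as a two-step lookup. A plain Map stands in for the repository configuration, and the class and method names are hypothetical rather than JGit API. The final default of false matches git-core's documented default for transfer.fsckObjects.

```java
import java.util.Map;

// Hypothetical lookup; a Map of lowercase keys stands in for the real config.
class FsckConfig {
    // `section` is "receive" or "fetch"; falls back to transfer.fsckobjects.
    static boolean fsckObjects(Map<String, Boolean> config, String section) {
        Boolean specific = config.get(section + ".fsckobjects");
        if (specific != null) {
            return specific; // the direction-specific key wins when present
        }
        Boolean shared = config.get("transfer.fsckobjects");
        return shared != null && shared; // unset everywhere means no checking
    }
}
```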
Shawn Pearce [Tue, 29 Dec 2015 23:11:21 +0000 (15:11 -0800)]
PackWriter: use lib.ObjectIdSet to avoid wrapper
Hoist ObjectIdSet up to lib as part of the public API and add
the interface to some common types like PackIndex and JGit custom
ObjectId map types. This cleans up wrapper code in a number of
places by allowing direct use of the types as an ObjectIdSet.
Future commits can now rely on ObjectIdSet as a simple read-only
type to check a set of objects from a number of storage options.
Shawn Pearce [Fri, 18 Dec 2015 22:31:20 +0000 (14:31 -0800)]
DirCache: Do not create duplicate tree entries
If a file (e.g. "A") and a subtree file (e.g. "A/foo.c") both appear
in the DirCache this cache should not be written out as a tree object.
The "A" file and "A" subtree conflict with each other in the same tree
and will fail fsck.
Detect this condition during DirCacheBuilder and DirCacheEditor
finish() so the application can be halted early before it updates a
DirCache that might later write an invalid tree structure.
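The condition being detected can be illustrated on a plain list of paths. This is a sketch with hypothetical names, not the check performed inside DirCacheBuilder or DirCacheEditor: it reports a path that appears both as a file and as the directory prefix of another entry.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical check over plain path strings.
class TreeConflictCheck {
    // Returns a path used both as a file and as a directory, or null if none.
    static String findConflict(List<String> paths) {
        Set<String> files = new HashSet<>(paths);
        for (String path : paths) {
            // Walk every parent directory of this path: "A/b/c" -> "A", "A/b".
            for (int i = path.indexOf('/'); i >= 0; i = path.indexOf('/', i + 1)) {
                String dir = path.substring(0, i);
                if (files.contains(dir)) {
                    return dir; // "A" is a file, yet "A/..." needs it to be a tree
                }
            }
        }
        return null;
    }
}
```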
Shawn Pearce [Mon, 28 Dec 2015 19:43:07 +0000 (11:43 -0800)]
DirCacheEditor: Replace file-with-tree and tree-with-file
If a PathEdit tries to store a file where a subtree was, or a subtree
where a file was, replace the entry in the DirCache with the new
name(s). This supports switching between file and tree entry types
using a DirCacheEditor.
Add new unit tests to cover the conditions where these can happen.
Shawn Pearce [Thu, 24 Dec 2015 22:27:50 +0000 (14:27 -0800)]
AddCommand: Use NameConflictTreeWalk to identify file-dir changes
Adding a path that already exists but is changing type such as
from symlink to subdirectory requires a NameConflictTreeWalk to
match up the two different entry types that share the same name.
NameConflictTreeWalk needs a bug fix to pop conflicting entries
when PathFilterGroup aborts the walk early so that it does not
allow DirCacheBuilderIterator to copy conflicting entries into
the output cache.
Andrey Loskutov [Mon, 28 Dec 2015 22:27:09 +0000 (23:27 +0100)]
Don't treat command termination due to '-h' option as a fatal error
Signal early command termination due to the '-h' or '--help' option via
TerminatedByHelpException. This allows tests using
CLIGitCommand to differentiate between unexpected command parsing errors
and expected command cancellation "on help" (which also allows
validation of expected/unexpected help messages).
Additional side effect: jgit now supports the git style of handling the
help option: any unexpected command line options before help are reported
as errors, but those after help are ignored.
Andrey Loskutov [Mon, 28 Dec 2015 14:42:04 +0000 (15:42 +0100)]
Simplify development of commands: added main() to CLIGitCommand
This will execute git commands (with arguments) specified on the command
line, handy for developing/debugging a sequence of arbitrary git
commands working on the same repository.
The git working dir path can be specified via Java system property
"git_work_tree". If not specified, current directory will be used.
Shawn Pearce [Thu, 24 Dec 2015 23:46:19 +0000 (15:46 -0800)]
DirCacheEditor: Cleanup DeleteTree constructor
Neaten up formatting and avoid strings, which prevents the need for
NLS comment tags. Instead check the last character using char
literal, and append a char literal instead of a string.
Andrey Loskutov [Wed, 16 Dec 2015 23:12:04 +0000 (00:12 +0100)]
Allow checkout paths without specifying branch name
The JGit CLI should allow this: checkout -- <path>
Currently, even if "a" is a valid path in the git repo, the jgit CLI can't
check it out:
$jgit checkout -- a
error: pathspec 'a' did not match any file(s) known to git.
The fix also addresses the "unnamed" zombie "[VAL ...]" argument
shown on the command line.
Before fix:
$jgit -h
jgit checkout name [VAL ...] [-- path ... ...] [--force (-f)] [--help
(-h)] [--orphan] [-b]
Matthias Sohn [Fri, 14 Aug 2015 12:03:57 +0000 (14:03 +0200)]
Fix InterruptTimer leak in BasePackConnection
When setting timeout on push, BasePackConnection creates a timer, which
will be terminated when push finishes. But, when using
SmartHttpPushConnection, it dropped the first timer created in the
constructor and then created another timer in doPush. If new threads are
created faster than the gc collects then this may stop the service if
it's hitting the max process limit. Hence don't create a new timer if it
already exists.
Bug: 474947
Change-Id: I6746ffe4584ad919369afd5bdbba66fe736be314
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Jonathan Nieder [Tue, 15 Dec 2015 03:57:24 +0000 (19:57 -0800)]
Do not let PathFilter.create("a/b") match 'a' unless 'a' is a subtree
PathFilter and PathFilterGroup form JGit's implementation of git's
path-limiting feature in commands like log and diff. To save time
when traversing trees, a path specification
foo/bar/baz
tells the tree walker not to traverse unrelated trees like qux/. It
does that by returning false from include when the tree walker is
visiting qux and true when it is visiting foo.
Unfortunately that test was implemented to be slightly over-eager: it
doesn't only return true when asked whether to visit a subtree "foo"
but when asked about a plain file "foo" as well. As a result, diffs
and logs restricted to some-file/non-existing-suffix unexpectedly
match against some-file:
Gitiles +log has the same bug and benefits from the same fix.
Callers know not to worry about what subtrees are included in the tree
walk because shouldBeRecursive() returns true in this case, so this
behavior change should be safe. This also better matches the behavior
of C git:
$ empty=$(git mktree </dev/null)
$ git diff-tree --abbrev $empty HEAD -- LICENSE/no-such-file
$ git diff-tree --abbrev $empty HEAD -- tools/no-such-file
:000000 040000 0000000... b62648d... A tools
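The corrected rule can be sketched as a predicate over a filter path and a tree entry. This is an illustration with hypothetical names, not the PathFilter implementation: a parent of the filter path is included only when the entry is a subtree.

```java
// Hypothetical predicate illustrating the fixed include() behavior.
class PathLimit {
    static boolean include(String filterPath, String entryPath, boolean entryIsSubtree) {
        if (entryPath.equals(filterPath)) {
            return true; // exact match, file or tree
        }
        if (entryPath.startsWith(filterPath + "/")) {
            return true; // entry lies inside the filtered path
        }
        // A leading component of the filter only matches when it can be
        // descended into; a plain file "foo" must not match "foo/bar".
        return entryIsSubtree && filterPath.startsWith(entryPath + "/");
    }
}
```

Under this rule the two diff-tree cases above behave as shown: the file LICENSE is not matched by LICENSE/no-such-file, while the subtree tools is still visited for tools/no-such-file.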
Jonathan Nieder [Tue, 15 Dec 2015 03:22:25 +0000 (19:22 -0800)]
Add tests for PathFilterGroup.Single
Expand the existing PathFilterGroup tests to check which paths the
tree entry matches. This expands test coverage by ensuring that
PathFilterGroup's simpler code path to match against a single
PathFilter works correctly.
While at it, move the check on tree entry d/e/f/g.y into two separate
tests: one to check that it doesn't match any of the configured paths,
and another to check that it does not throw StopWalkException to end
the walk early.
Matthias Sohn [Fri, 27 Nov 2015 10:46:21 +0000 (11:46 +0100)]
Fix push with jgit pgm failing with "unauthorized"
Pushing with the JGit command line to e.g. GitHub failed with "unauthorized"
since HttpUrlConnection calls the configured authenticator implicitly.
The problem is that during a push two requests are sent to the server,
first a GET and then a POST (containing the pack data). The first GET
request sent anonymously is rejected with 401 (unauthorized). When an
Authenticator is installed the java.net classes will use the
Authenticator to ask the user for credentials and retry the request.
But this happens under the hood and JGit level code doesn't see that
this happens.
The next request is the POST but since JGit thinks the first GET request
went through anonymously it doesn't add authentication headers to the
POST request. This POST of course also fails with 401 but since this
request contains a lot of body-data streamed from JGit (the pack file!)
the java.net classes can't simply retry the request with authorization
headers. The whole process fails.
Fix this by using Apache httpclient which doesn't use Authenticator to
retrieve credentials. Instead initialize TransportCommand to use the
default credential provider if no other credentials provider was set
explicitly. org.eclipse.jgit.pgm.Main sets this default for the JGit
command line client.
Change-Id: Ic4e0f8b60d4bd6e69d91eae0c7e1b44cdf851b00
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Fri, 27 Nov 2015 10:26:38 +0000 (11:26 +0100)]
Enable retrieval of credentials from .netrc for AwtCredentialsProvider
This was done for ConsoleCredentialsProvider earlier. We need the
AwtCredentialsProvider for debugging the jgit command line since there is no
console in Eclipse. Hence also add support for .netrc here.
Change-Id: Ibbd45b73efc663821866754454cea65e6d03f832
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix possible arithmetic overflow when setting a timeout
BasePackPushConnection#readStringLongTimeout() was setting a timeout 10
times bigger than some other timeout or the pack transfer time. This
could lead to negative integer values when we hit an arithmetic
overflow. Add a check for this situation and set the timeout to
Integer.MAX_VALUE when overflow happens.
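The overflow and one way to guard against it can be sketched as follows. The helper is hypothetical and assumes a non-negative timeout; it widens to long before multiplying and clamps, which is one possible form of the check and not necessarily the one used in BasePackPushConnection.

```java
// Hypothetical helper: scales a non-negative timeout by 10 without wrapping.
class TimeoutScale {
    static int tenTimes(int timeout) {
        long scaled = 10L * timeout; // 64-bit multiply cannot overflow here
        return scaled > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) scaled;
    }

    public static void main(String[] args) {
        int big = 300_000_000;
        System.out.println(10 * big);      // wraps to a negative int
        System.out.println(tenTimes(big)); // clamped to Integer.MAX_VALUE
    }
}
```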
Andrey Loskutov [Fri, 27 Nov 2015 23:15:36 +0000 (00:15 +0100)]
Null-annotated Ref class and fixed related compiler errors
This change fixes all compiler errors in JGit and replaces possible
NPEs by throwing appropriate exceptions, avoiding multiple "Nullable
return" method calls, or returning early from the method.
Shawn Pearce [Mon, 14 Dec 2015 04:26:01 +0000 (20:26 -0800)]
push: Do not blindly overwrite peer
If an application uses PushConnection directly on the native Git wire
protocols JGit should send along the application's expected oldId, not
the advertised value. This allows the remote peer to compare-and-swap
since it was not tested inside JGit.
Discovered when I tried to use a PushConnection (bypassing the
standard PushProcess) and the client blindly overwrote the remote
reference, even though my app had supplied the wrong ObjectId for
the expectedOldObjectId. This was not expected and cost me over an
hour of debugging, plus "corruption" in the remote repository.
By passing along the exact expectedOldObjectId from the app the
remote side can do the check that the application skipped, and
avoid data loss.
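The remote-side check that this change relies on is a compare-and-swap on the reference value. The sketch below models it with an in-memory map and hypothetical names; it does not use the PushConnection API or the wire protocol.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory model of the remote side's reference store.
class RemoteRefs {
    private final Map<String, String> refs = new HashMap<>();

    // Updates `name` only if its current value equals `expectedOld`
    // (null means the reference is expected not to exist yet).
    boolean update(String name, String expectedOld, String newId) {
        if (!Objects.equals(refs.get(name), expectedOld)) {
            return false; // the caller's view is stale or wrong: reject
        }
        refs.put(name, newId);
        return true;
    }

    String get(String name) {
        return refs.get(name);
    }
}
```

If the client substituted the advertised value for the application's expected old ID, the comparison would always succeed and the wrong expectation would go unnoticed, which is the blind overwrite described above.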
Doug Kelly [Wed, 9 Dec 2015 22:36:37 +0000 (16:36 -0600)]
Accept UTF8 BOM with BlobBasedConfig
In I1f5dc07182dbf6bba2a9f4807fdd25b475da4ead, FileBasedConfig got
support for reading a configuration with UTF8 BOM. Apply the same
support to BlobBasedConfig, to make SubmoduleWalk able to parse
.gitmodules configurations with BOM.
Change-Id: I25b5474779952fe2c076180b96fc2869eef190a8
Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>