Throw InvalidObjectIdException from ObjectId.fromString("tooshort")
ObjectId.fromString already throws InvalidObjectIdException for most
malformed object ids, but for this kind it previously threw
IllegalArgumentException. Since InvalidObjectIdException is a child of
IllegalArgumentException, callers that catch IllegalArgumentException
will continue to work.
Change-Id: I24e1422d51607c86a1cb816a495703279e461f01
Signed-off-by: Jonathan Nieder <jrn@google.com>
Add methods that allow to unregister repositories from the
RepositoryCache individually.
Bug: 470234
Change-Id: Ib918a634d829c9898072ae7bdeb22b099a32b1c9
Signed-off-by: Tobias Oberlies <tobias.oberlies@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This may be used by e.g. a custom reflog implementation to record
this information along with the ref update.
Change-Id: I44adbfad704b76f9c1beced6e1ce82eaf71410d2
Add a separate type for the identity in a push certificate
These differ subtly from a PersonIdent, because they can contain
anything that is a valid User ID passed to gpg --local-user. Upstream
git push --signed will just take the configuration value from
user.signingkey and pass that verbatim in both --local-user and the
pusher field of the certificate. This does not necessarily contain an
email address, which means the parsing implementation ends up being
substantially different from RawParseUtils.parsePersonIdent.
Nonetheless, we try hard to match PersonIdent behavior in
questionable cases.
Change-Id: I37714ce7372ccf554b24ddbff56aa61f0b19cbae
Revert "Config: Distinguish between empty and null strings"
This reverts commit 96eb3ee3976e7e9e3e118851fa614cce8a1f7d88, which
broke Gerrit tests that set a config value to 'null', serialize the
result, deserialize, and expect 'null' from Config.getString[1].
The intent of that commit was to make it possible to distinguish between
an absent and an empty config value, which we'll have to do with a new
method.
Revert the behavior change. Keep the tests from 428cb23f2de8, since
they test the behavior more precisely than the old tests did.
[1] https://gerrit-review.googlesource.com/68452
Change-Id: Ie8042f380ea0e34e3203e1991aa0feb2e6e44641
Signed-off-by: Jonathan Nieder <jrn@google.com>
Change-Id: I351cee0ba2e77e3360846ac0c5368da3a322725c
Reported-by: Markus Keller <markus_keller@ch.ibm.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
exactRef(ref1, ref2, ref3) requests multiple specific refs in a single
lookup, which may be faster in some backends than looking them up one by
one.
firstExactRef generalizes getRef by finding the first existing ref from
the list of refs named. Its main purpose is for the default
implementation of getRef (finding the first existing ref in a search
path). Hopefully it can be useful for other operations that look for
refs in a search path (e.g., git log --notes=<name>), too.
Change-Id: I5c6fcf1d3920f6968b8b97f3d4c3a267258c4b86
Signed-off-by: Jonathan Nieder <jrn@google.com>
Introduce exactRef to read a ref whose exact name is known
Unlike getRef(name), the new exactRef method does not walk the search
path. This should produce a less confusing result than getRef when the
exact ref name is known: it will not try to resolve refs/foo/bar to
refs/heads/refs/foo/bar even when refs/foo/bar does not exist.
It can be faster than both getRefs(ALL).get(name) and getRef(name)
because it only needs to examine a single ref.
A follow-up change will introduce a findRef synonym to getRef and
deprecate getRef to make the choice a caller is making more obvious
(exactRef or findRef, with the same semantics as getRefs(ALL).get and
getRefs(ALL).findRef).
Change-Id: If1bd09bcfc9919e7976a4d77f13184ea58dcda52
Signed-off-by: Jonathan Nieder <jrn@google.com>
Config: Allow ending a file with "=" and no newline
This is a perfectly valid construction according to C git:
$ echo -en '[a]\nx =' > foo.config
$ git config -f foo.config a.x; echo $?
0
Change-Id: Icfcf8304adb43c79e2b8b998f8d651b2a94f6acb
Config: Distinguish between empty and null strings
The C git API and command line tools distinguish between a key having
the empty string as a value and no key being present in the config
file:
$ echo -e '[a]\nx =' > foo.config
$ git config -f foo.config a.x; echo $?
0
$ git config -f foo.config a.y; echo $?
1
Make JGit make the same distinction. This is in line with the current
Javadoc of getString, which claims to return "a String value from the
config, null if not found". It is more reasonable to interpret "x ="
in the above example as "found" rather than "missing".
We need to maintain the special handling of a key name with no "="
resolving to a boolean true, but "=" with an empty string is still not
a valid boolean.
Change-Id: If0dbb7470c524259de0b167148db87f81be2d04a
ObjectReader release method was replaced by close method but
WindowCursor was still implementing release method.
To prevent the same mistake again, make ObjectReader close method
abstract to force sub classes to implement it.
Change-Id: I50d0d1d19a26e306fd0dba77b246a95a44fd6584
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
Use AutoClosable to close resources in bundle org.eclipse.jgit
- use try-with-resource where possible
- replace use of deprecated release() by close()
Change-Id: I0f139c3535679087b7fa09649166bca514750b81
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In 77030a5e, AutoClosable was implemented on classes that use release().
This caused a resource leak because the ObjectReader.close method was
not calling the now deprecated release method, which is the method that
sub classes implements to release resources.
Change-Id: I247651ec8fd7ca9941d256ca46d14cc43cc35c6e
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
This was added a very long time ago to support the failed
DHT storage implementation. Since then no storage system
was able to make use of this API, but it pollutes internals
of the walkers.
Kill the API on ObjectReader and drop the invocations from
the walker code.
Change-Id: I36608afdac13a6c3084d7c7e0af5e0cb22900332
Add fsck.allowInvalidPersonIdent to accept invalid author/committers
A larger than expected number of real-world repositories found on
the Internet contain invalid author, committer and tagger lines
in their history. Many of these seem to be caused by users misusing
the user.name and user.email fields, e.g.:
[user]
name = Au Thor <author@example.com>
email = author@example.com
that some version of Git (or a reimplementation thereof) copied
directly into the object header. These headers are not valid and
are rejected by a strict fsck, making it impossible to transfer
the repository with JGit/EGit.
Another form is an invalid committer line with double negative for
the time zone, e.g.
committer Au Thor <a@b> 1288373970 --700
The real world is messy. :(
Allow callers and users to weaken the fsck settings to accept these
sorts of breakages if they really want to work on a repo that has
broken history. Most routines will still function fine, however
commit timestamp sorting in RevWalk may become confused by a corrupt
committer line and sort commits out of order. This is mostly fine if
the corrupted chain is shorter than the slop window.
Change-Id: I6d529542c765c131de590f4f7ef8e7c1c8cb9db9
Avoid storing large packs in block cache during reuse
When a large pack (> 30% of the block cache) is being reused by
copying it pollutes the block cache with noise by storing blocks
that are never referenced again.
Avoid this by streaming the file directly from its channel onto
the output stream.
Change-Id: I2e53de27f3dcfb93de68b1fad45f75ab23e79fe7
Revert "CommitBuilder should check for duplicate parents"
This reverts commit 6bc48cdc62.
Until git v1.7.10.2~29^2~1 (builtin/merge.c: reduce parents early,
2012-04-17), C git merge would make merge commits with duplicate parents
when asked to with a series of commands like the following:
git checkout origin/master
git merge --no-ff origin/master
Nowadays "git merge" removes redundant parents more aggressively
(whenever one parent is an ancestor of another and not just when
duplicates exist) but merges with duplicate parents are still permitted
and can be created with git fast-import or git commit-tree and history
viewers need to be able to cope with them.
CommitBuilder is an interface analagous to commit-tree, so it should
allow duplicate parents. (That said, an option to automatically remove
redundant parents would be useful.)
Reported-by: Dave Borowitz <dborowitz@google.com>
Change-Id: Ia682238397eb1de8541802210fa875fdd50f62f0
Signed-off-by: Jonathan Nieder <jrn@google.com>
When setting the parents of a commit with setParentIds() or
addParentId() it should be checked that we don't have duplicate parents.
An IllegalArgumentException should be thrown in this case.
Change-Id: I9fa9f31149b7732071b304bca232f037146de454
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Remove AutoCloseable from internal PackFile and friends
PackFile is held by the block cache and cannot be auto closed in a
try-with-resources statement. Remove the interface as JGit does
explicit management of the instances.
ObjectDatabase and RefDatabase are internal details of Repository
and are managed with the Repository. Marking them AutoCloseable
provides no value to the library or an application using the API.
Change-Id: Ibee19eadd66233e6666b601583daa1834a7778f1
This hook uses the file .git/COMMIT_EDITMSG to receive and potentially
modify the commit message.
Change-Id: Ibe2faadfb5d3932a5a3da2252d8156c4c04856c7
Signed-off-by: Laurent Delaigue <laurent.delaigue@obeo.fr>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
According to [1] user name and email are taken first from the
environment variables:
GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL
GIT_COMMITTER_NAME, GIT_COMMITTER_EMAIL
In case (some of) these environment variables are not set, the
information is taken from the git configuration.
JGit doesn not yet support the environment variables GIT_AUTHOR_DATE and
GIT_COMMITTER_DATE.
[1] https://www.kernel.org/pub/software/scm/git/docs/git-commit-tree.html#_commit_information
Bug: 460586
Change-Id: I3ba582b4ae13674cf319652b5b13ebcbb96dd8ec
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Implement AutoClosable interface on classes that used release()
Implement AutoClosable and deprecate the old release() method to give
JGit consumers some time to adapt.
Bug: 428039
Change-Id: Id664a91dc5a8cf2ac401e7d87ce2e3b89e221458
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Introduce hook support into the FS implementations
This introduces the background plumbing necessary to run git hooks from
JGit. This implementation will be OS-dependent as it aims to be
compatible with existing hooks, mostly written in Shell. It is
compatible with unix systems and windows as long as an Unix emulator
such as Cygwin is in its PATH.
Change-Id: I1f82a5205138fd8032614dd5b52aef14e02238ed
Signed-off-by: Laurent Goubet <laurent.goubet@obeo.fr>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Core classes to parse and process .gitattributes files including
support for reading attributes in WorkingTreeIterator and the
dirCacheIterator.
The implementation follows the git ignore implementation. It supports
lazy reading attributes while walking the working tree.
Bug: 342372
CQ: 9078
Change-Id: I05f3ce1861fbf9896b1bcb7816ba78af35f3ad3d
Also-by: Marc Strapetz <marc.strapetz@syntevo.com>
Also-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Also-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Gunnar Wagenknecht <gunnar@wagenknecht.org>
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Trim author/committer name and email in commit header
C Git trims name and email before inserting them into the commit header
so that " A U Thor " and " author@example.com " becomes
"A U Thor <author@example.com>" with a single separating space.
This changes PersonIdent#toExternalString() to trim name and email
before concatenating them.
Change-Id: Idd77b659d0db957626824f6632e2da38d7731625
Signed-off-by: Rüdiger Herrmann <ruediger.herrmann@gmx.de>
ObjectChecker: Disallow names potentially mapping to ".git" on HFS+
Mac's HFS+ folds concatentations of ".git" and ignorable Unicode
characters [1] to ".git" [2]. Hence we need to disallow all names which
could potentially be a shortname for ".git". Example: in an empty
directory create a folder ".g\U+200Cit". Now you can't create another
folder ".git".
The following characters are ignorable Unicode which are ignored on
HFS+:
unicode hex name
-------------------------------------------------
U+200C 0xe2808c ZERO WIDTH NON-JOINER
U+200D 0xe2808d ZERO WIDTH JOINER
U+200E 0xe2808e LEFT-TO-RIGHT MARK
U+200F 0xe2808f RIGHT-TO-LEFT MARK
U+202A 0xe280aa LEFT-TO-RIGHT EMBEDDING
U+202B 0xe280ab RIGHT-TO-LEFT EMBEDDING
U+202C 0xe280ac POP DIRECTIONAL FORMATTING
U+202D 0xe280ad LEFT-TO-RIGHT OVERRIDE
U+202E 0xe280ae RIGHT-TO-LEFT OVERRIDE
U+206A 0xe281aa INHIBIT SYMMETRIC SWAPPING
U+206B 0xe281ab ACTIVATE SYMMETRIC SWAPPING
U+206C 0xe281ac INHIBIT ARABIC FORM SHAPING
U+206D 0xe281ad ACTIVATE ARABIC FORM SHAPING
U+206E 0xe281ae NATIONAL DIGIT SHAPES
U+206F 0xe281af NOMINAL DIGIT SHAPES
U+FEFF 0xefbbbf ZERO WIDTH NO-BREAK SPACE
[1] http://www.unicode.org/versions/Unicode7.0.0/ch05.pdf#G40025http://www.unicode.org/reports/tr31/#Layout_and_Format_Control_Characters
[2] http://dubeiko.com/development/FileSystems/HFSPLUS/tn1150.html#UnicodeSubtleties
Change-Id: Ib6a1dd090b2649bdd8ec16387c994ed29de2860d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Windows creates shortnames for all non-8.3 files (see [1]). Hence we
need to disallow all names which could potentially be a shortname for
".git". Example: in an empty directory create a folder "GIT~1". Now you
can't create another folder ".git".
The path "GIT~1" may map to ".git" on Windows. A potential victim to
such an attack first has to initialize a git repository in order to
receive any git commits. Hence the .git folder created by init will get
the shortname "GIT~1". ".git" will only get a different shortname if the
user has created a file "GIT~1" before initialization of the git
repository.
[1] http://en.wikipedia.org/wiki/8.3_filename
Change-Id: I9978ab8f2d2951c46c1b9bbde57986d64d26b9b2
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Windows treats "foo." and "foo " as "foo". The ".git" directory is
special, as it contains metadata for a local Git repository. Disallow
variations that Windows considers to be the same.
Change-Id: I28eb48859a95a89111b4987c91de97557e3bb539
Always ignore case when forbidding .git in ObjectChecker
The component name ".GIT" inside a tree entry could confuse a
case insensitive filesystem into looking at a submodule and
not a directory entry.
Disallow any case permutations of ".git" to prevent this
confusion from entering a repository and showing up at a
later date on a case insensitive system.
Change-Id: Iaa3f768931d0d5764bf07ac5f6f3ff2b1fdda01b
When updating a submodule (e.g. during recursive clone) the repository
for the submodule should be located at <gitdir>/modules/<submodule-path>
whereas the working tree of the submodule should be located at
<working-tree>/<submodule-path> (<gitdir> and <working-tree> are
associated to the containing repository). Since CloneCommand has learned
about specifying a separate gitdir this is easy to implement in
SubmoduleUpdateCommand.
Change-Id: I9b56a3dfa50f97f6975c2bb7c97b36296f331b64
Allow explicit configuration of git directory in InitCommand
Native git's "init" command allows to specify the location of the .git
folder with the option "--separate-git-dir". This allows for example to
setup repositories with a non-standard layout. E.g. .git folder under
/repos/a.git and the worktree under /home/git/a. Both directories
contain pointers to the other side: /repos/a.git/config contains
core.worktree=/home/git/a . And /home/git/a/.git is a file containing
"gitdir: /repos/a.git". This commit adds that option to InitCommand.
This feature is needed to support the new submodule layout where the
.git folder of the submodules is under .git/modules/<submodule>.
Change-Id: I0208f643808bf8f28e2c979d6e33662607775f1f
Without explicitly closing repos we can't delete the test repositories
on windows.
Change-Id: Id5fa17bd764cbf28703c2f21639d7e969289c2d6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Move checkPath from DirCacheCheckout to ObjectChecker
The bulk of the "is this sane" logic is inside of ObjectChecker. The
only caller for the version in DirCacheCheckout is an obtuse usage for
the static isValidRefName() method in Repository.
Deprecate the weird single use method in DirCacheCheckout and move all
code for checking a sequence of path components into ObjectChecker,
where it makes sense alongside the existing code that checks a single
component at a time.
Reuse a single ObjectChecker for the local platform, to avoid looking
up the system properties on each path string considered.
Change-Id: Iae6e769f2bfcad05c166e70ff255f9cf9fcdc87e
Detect buffering failures while writing rebase todo file
By routing writes through SafeBufferedOutputStream the caller can be
alerted to any flush at close failures while writing or appending to
the rebase todo script.
Switch the character encoding to be done at the line granularity, as
this is sufficiently long enough that encoding overheads will not be a
bottleneck, but short enough that the amount of temporary data will
not cause memory problems for the JVM.
Change-Id: Ice5ec10a7cbadc58486d481b92940056f9ffc43a
BaseRepositoryBuilder.findGitDir() was not searching correctly for bare
repositories. E.g. when running org.eclipse.jgit.pgm.Log and the current
directory was that of a bare git repository an error "fatal: error:
can't find git directory" was raised. With this fix RepositoryBuilder
will also check whether the given directory is the root of a bare
repository.
Bug: 450193
Change-Id: I4d4ad42e24ca397745adb0f3385caee3bcf3a186
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
JGit's ObjectDirectory implements the optimization that it remembers the
pack folders (.git/objects/pack) lastModified timestamp and doesn't
check for new packfiles in this folder if the lastModified attribute has
not changed.
In environments using NFS this can cause trouble. If multiple JGit
instances from multiple machines work on the same repository and one
instance creates a new ref and a new packfile (e.g. by doing a fetch)
then the other machines may detect the new ref but can't resolve the
referenced object because it doesn't detect that pack folder has a new
packfile. That's because NFS may cache file/folder metadata for quite a
long time and the pack folders modification time is not updated although
a new packfile is there and could be read.
The new config parameter core.trustfolderstat controls this behaviour.
The default is true and jgits behaviours is unchanged. But if this
parameter is set to false then jgit doesn't trust the pack directories
lastmodified anymore. Instead it will always iterate through the content
of that folder to detect new packfiles.
Change-Id: Ie3b4e92933286aa9916070a22422e629b3147f54
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix possible NPE in IndexDiff when not all submodules are cloned
The latest changes to IndexDiff just assumed that all configured
submodules are allways cloned. If a configured submodule did not exist
an exception was thrown. This is fixed by this commit.
Bug: 450567
Change-Id: Iabe3b196d998c19483082e5720038ebddaeb1890
Implement atomic refs update, if possible by database
Inspired by the series[1], this implements the possibility to
have atomic ref transactions.
If the database supports atomic ref update capabilities, we'll
advertise these. If the client wishes to use this feature, either
all refs will be updated or none at all.
[1] http://thread.gmane.org/gmane.comp.version-control.git/259019/focus=259024
Change-Id: I7b5d19c21f3b5557e41b9bcb5d359a65ff1a493d
Signed-off-by: Stefan Beller <sbeller@google.com>