aboutsummaryrefslogtreecommitdiffstats
path: root/org.eclipse.jgit/resources/org/eclipse/jgit
Commit message (Collapse)AuthorAgeFilesLines
* Minor improvements in git config file inclusionsThomas Wolf2018-01-281-0/+1
| | | | | | | | | | | * Section and key names in git config files are case-insensitive. * If an include directive is invalid, include the line in the exception message. * If inclusion of the included file fails, put the file name into the exception message so that the user knows in which file the problem is. Change-Id: If920943af7ff93f5321b3d315dfec5222091256c Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* Progress reporting for checkoutMarkus Duft2018-01-231-0/+1
| | | | | | | | | | | | | | The reason for the change is LFS: when using a lot of LFS files, checkout can take quite some time on larger repositories. To avoid "hanging" UI, provide progress reporting. Also implement (partial) progress reporting for cherry-pick, reset, revert which are using checkout internally. The feature is also useful without LFS, so it is independent of it. Change-Id: I021e764241f3c107eaf2771f6b5785245b146b42 Signed-off-by: Markus Duft <markus.duft@ssi-schaefer.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Enforce DFS blockLimit is a multiple of blockSizeDave Borowitz2018-01-221-0/+2
| | | | Change-Id: I2821124ff88d7d1812a846ed20f3828fc9123b38
* Add a command to deinitialize submodulesDavid Turner2017-12-271-0/+2
| | | | | Change-Id: Iaaefc2cbafbf083d6ab158b1c378ec69cc76d282 Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Use submodule name instead of path as key in configDavid Turner2017-12-271-0/+1
| | | | | | | | | When a submodule is moved, the "name" field remains the same, while the "path" field changes. Git uses the "name" field in .git/config when a submodule is initialized, so this patch makes JGit do so too. Change-Id: I48d8e89f706447b860c0162822a8e68170aae42b Signed-off-by: David Turner <dturner@twosigma.com>
* Config: Rewrite subsection and value escaping and parsingDave Borowitz2017-12-181-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, Config was using the same method for both escaping and parsing subsection names and config values. The goal was presumably code savings, but unfortunately, these two pieces of the git config format are simply different. In git v2.15.1, Documentation/config.txt says the following about subsection names: "Subsection names are case sensitive and can contain any characters except newline (doublequote `"` and backslash can be included by escaping them as `\"` and `\\`, respectively). Section headers cannot span multiple lines. Variables may belong directly to a section or to a given subsection." And, later in the same documentation section, about values: "A line that defines a value can be continued to the next line by ending it with a `\`; the backquote and the end-of-line are stripped. Leading whitespaces after 'name =', the remainder of the line after the first comment character '#' or ';', and trailing whitespaces of the line are discarded unless they are enclosed in double quotes. Internal whitespaces within the value are retained verbatim. Inside double quotes, double quote `"` and backslash `\` characters must be escaped: use `\"` for `"` and `\\` for `\`. The following escape sequences (beside `\"` and `\\`) are recognized: `\n` for newline character (NL), `\t` for horizontal tabulation (HT, TAB) and `\b` for backspace (BS). Other char escape sequences (including octal escape sequences) are invalid." The main important differences are that subsection names have a limited set of supported escape sequences, and do not support newlines at all, either escaped or unescaped. Arguably, it would be easy to support escaped newlines, but C git simply does not: $ git config -f foo.config $'foo.bar\nbaz.quux' value error: invalid key (newline): foo.bar baz.quux I468106ac was an attempt to fix one bug in escapeValue, around leading whitespace, without having to rewrite the whole escaping/parsing code. Unfortunately, because escapeValue was used for escaping subsection names as well, this made it possible to write invalid config files, any time Config#toText is called with a subsection name with trailing whitespace, like {foo }. Rather than pile hacks on top of hacks, fix it for real by largely rewriting the escaping and parsing code. In addition to fixing escape sequences, fix (and write tests for) a few more issues in the old implementation: * Now that we can properly parse it, always emit newlines as "\n" from escapeValue, rather than the weird (but still supported) syntax with a non-quoted trailing literal "\n\" before the newline. In addition to producing more readable output and matching the behavior of C git, this makes the escaping code much simpler. * Disallow '\0' entirely within both subsection names and values, since due to Unix command line argument conventions it is impossible to pass such values to "git config". * Properly preserve intra-value whitespace when parsing, rather than collapsing it all to a single space. Change-Id: I304f626b9d0ad1592c4e4e449a11b136c0f8b3e3
* Fix typo in key of a JGitText externalized stringMatthias Sohn2017-12-101-1/+1
| | | | Change-Id: I0d22e24a0aa3b17339ef68849554f7c99b350dde Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Merge branch 'stable-4.9'Matthias Sohn2017-12-101-0/+2
|\ | | | | | | | | | | | | * stable-4.9: Fix IllegalThreadStateException if stderr closed without exiting Change-Id: I8a6a6788c2bb000171233b88d9592ed0640ad15e
| * Fix IllegalThreadStateException if stderr closed without exitingDmitry Pavlenko2017-12-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | If some process executed by FS#readPipe lived for a while after closing stderr, FS#GobblerThread#run failed with an IllegalThreadStateException exception when accessing p.exitValue() for the process which is still alive. Add Process#waitFor calls to wait for the process completion. Bug: 528335 Change-Id: I87e0b6f9ad0b995dbce46ddfb877e33eaf3ae5a6 Signed-off-by: Dmitry Pavlenko <pavlenko@tmatesoft.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Merge branch 'stable-4.9'Matthias Sohn2017-11-111-0/+1
|\| | | | | | | | | | | | | | | | | * stable-4.9: Work around a Jsch bug: ensure the user name is set from URI Reintroduce protected method which removal broke EMF Compare Change-Id: I335587eee279f91bd36c9ba9fc149b17a6db6110 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * Work around a Jsch bug: ensure the user name is set from URIThomas Wolf2017-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | JSch unconditionally overrides the user name given in the connection URI by the one found in ~/.ssh/config (if that does specify one for the used host). If the SSH config file has a different user name, we'll end up using the wrong name, which typically results in an authentication failure or in Eclipse/EGit asking for a password for the wrong user. Unfortunately there is no way to prevent or circumvent this Jsch behavior up front; it occurs already in the Session constructor at com.jcraft.jsch.Session() and the Session.applyConfig() method. And while there is a Session.setUserName() that would enable us to correct this, that latter method has package visibility only. So resort to reflection to invoke that setUserName() method to ensure that Jsch uses the user name from the URI, if there is one. Bug: 526778 Change-Id: Ia327099b5210a037380b2750a7fd76ff25c41a5a Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* | Merge branch 'stable-4.9'David Pursehouse2017-11-022-2/+2
|\| | | | | | | | | | | | | | | | | | | | | * stable-4.9: PackInserter: Implement newReader() Move some strings from DfsText to JGitText FileRepository: Add pack-based inserter implementation ObjectDirectory: Factor a method to close open pack handles ObjectDirectory: Remove last modified check in insertPack Change-Id: Ifc9ed6f5d8336bc978818a64eae122bceb933e5d
| * Move some strings from DfsText to JGitTextDave Borowitz2017-11-012-2/+2
| | | | | | | | Change-Id: I60050e5127d12b6139d81859dba929fcfaabe504
* | Support symbolic references in ReceiveCommandShawn Pearce2017-10-181-0/+3
|/ | | | | | | | | | | | | | | | | | | | Allow creating symbolic references with link, and deleting them or switching to ObjectId with unlink. How this happens is up to the individual RefDatabase. The default implementation detaches RefUpdate if a symbolic reference is involved, supporting these command instances on RefDirectory. Unfortunately the packed-refs file does not support storing symrefs, so atomic transactions involving more than one symref command are failed early. Updating InMemoryRepository is deferred until reftable lands, as I plan to switch InMemoryRepository to use reftable for its internal storage representation. Change-Id: Ibcae068b17a2fc6d958f767f402a570ad88d9151 Signed-off-by: Minh Thai <mthai@google.com> Signed-off-by: Terry Parker <tparker@google.com>
* Ensure ReflogWriter only works with a RefDirectoryDave Borowitz2017-09-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The ReflogWriter constructor just took a Repository and called getDirectory() on it to figure out the reflog dirs, but not all Repository instances use this storage format for reflogs, so it's incorrect to attempt to use ReflogWriter when there is not a RefDirectory directly involved. In practice, ReflogWriter was mostly only used by the implementation of RefDirectory, so enforcing this is mostly just shuffling around calls in the same internal package. The one exception is StashDropCommand, which writes to a reflog lock file directly. This was a reasonable implementation decision, because there is no general reflog interface in JGit beyond using (Batch)RefUpdate to write new entries to the reflog. So to implement "git stash drop <N>", which removes an arbitrary element from the reflog, it's fair to fall back to the RefDirectory implementation. Creating and using a more general interface is well beyond the scope of this change. That said, the old behavior of writing out the reflog file even if that's not the reflog format used by the given Repository is clearly wrong. Fail fast in this case instead. Change-Id: I9bd4b047bc3e28a5607fd346ec2400dde9151730
* Handle SSL handshake failures in TransportHttpThomas Wolf2017-09-131-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a https connection could not be established because the SSL handshake was unsuccessful, TransportHttp would unconditionally throw a TransportException. Other https clients like web browsers or also some SVN clients handle this more gracefully. If there's a problem with the server certificate, they inform the user and give him a possibility to connect to the server all the same. In git, this would correspond to dynamically setting http.sslVerify to false for the server. Implement this using the CredentialsProvider to inform and ask the user. We offer three choices: 1. skip SSL verification for the current git operation, or 2. skip SSL verification for the server always from now on for requests originating from the current repository, or 3. always skip SSL verification for the server from now on. For (1), we just suppress SSL verification for the current instance of TransportHttp. For (2), we store a http.<uri>.sslVerify = false setting for the original URI in the repo config. For (3), we store the http.<uri>.sslVerify setting in the git user config. Adapt the SmartClientSmartServerSslTest such that it uses this mechanism instead of setting http.sslVerify up front. Improve SimpleHttpServer to enable setting it up also with HTTPS support in anticipation of an EGit SWTbot UI test verifying that cloning via HTTPS from a server that has a certificate that doesn't validate pops up the correct dialog, and that cloning subsequently proceeds successfully if the user decides to skip SSL verification. Bug: 374703 Change-Id: Ie1abada9a3d389ad4d8d52c2d5265d2764e3fb0e Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* Support http.<url>.* configsThomas Wolf2017-09-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Git has a rather elaborate mechanism to specify HTTP configuration options per URL, based on pattern matching the URL against "http" subsection names.[1] The URLs used for this matching are always the original URLs; redirected URLs do not participate. * Scheme and host must match exactly case-insensitively. * An optional user name must match exactly. * Ports must match exactly after default ports have been filled in. * The path of a subsection, if any, must match a segment prefix of the path of the URL. * Matches with user name take precedence over equal-length path matches without, but longer path matches are preferred over shorter matches with user name. Implement this for JGit. Factor out the HttpConfig from TransportHttp and implement the matching and override mechanism. The set of supported settings is still the same; JGit currently supports only followRedirects, postBuffer, and sslVerify, plus the JGit-specific maxRedirects key. Add tests for path normalization and prefix matching only on segment separators, and use the new mechanism in SmartClientSmartServerSslTest to disable sslVerify selectively for only the test server URLs. Compare also bug 374703 and bug 465492. With this commit it would be possible to set sslVerify to false for only the git server using a self-signed certificate instead of having to switch it off globally via http.sslVerify. [1] https://git-scm.com/docs/git-config Change-Id: I42a3c2399cb937cd7884116a2a32fcaa7a418fcb Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* Cleanup: message reporting for HTTP redirect handlingThomas Wolf2017-08-231-6/+5
| | | | | | | | | | | | | | | | | | The addition of "tooManyRedirects" in commit 7ac1bfc ("Do authentication re-tries on HTTP POST") was an error I didn't catch after rebasing that change. That message had been renamed in the earlier commit e17bfc9 ("Add support to follow HTTP redirects") to "redirectLimitExceeded". Also make sure we always use the TransportException(URIish, ...) constructor; it'll prefix the message given with the sanitized URI. Change messages to remove the explicit mention of that URI inside the message. Adapt tests that check the expected exception message text. For the info logging of redirects, remove a potentially present password component in the URI to avoid leaking it into the log. Change-Id: I517112404757a9a947e92aaace743c6541dce6aa Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* Do authentication re-tries on HTTP POSTThomas Wolf2017-08-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is at least one git server out there (GOGS) that does not require authentication on the initial GET for info/refs?service=git-receive-pack but that _does_ require authentication for the subsequent POST to actually do the push. This occurs on GOGS with public repositories; for private repositories it wants authentication up front. Handle this behavior by adding 401 handling to our POST request. Note that this is suboptimal; we'll re-send the push data at least twice if an authentication failure on POST occurs. It would be much better if the server required authentication up-front in the GET request. Added authentication unit tests (using BASIC auth) to the SmartClientSmartServerTest: - clone with authentication - clone with authentication but lacking CredentialsProvider - clone with authentication and wrong password - clone with authentication after redirect - clone with authentication only on POST, but not on GET Also tested manually in the wild using repositories at try.gogs.io. That server offers only BASIC auth, so the other paths (DIGEST, NEGOTIATE, fall back from DIGEST to BASIC) are untested and I have no way to test them. * public repository: GET unauthenticated, POST authenticated Also tested after clearing the credentials and then entering a wrong password: correctly asks three times during the HTTP POST for user name and password, then gives up. * private repository: authentication already on GET; then gets applied correctly initially to the POST request, which succeeds. Also fix the authentication to use the credentials for the redirected URI if redirects had occurred. We must not present the credentials for the original URI in that case. Consider a malicious redirect A->B: this would allow server B to harvest the user credentials for server A. The unit test for authentication after a redirect also tests for this. Bug: 513043 Change-Id: I97ee5058569efa1545a6c6f6edfd2b357c40592a Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* reftable: scan and lookup reftable filesShawn Pearce2017-08-171-0/+4
| | | | | | | | | | | | | ReftableReader provides sequential scanning support over all references, a range of references within a subtree (such as "refs/heads/"), and lookup of a single reference. Reads can be accelerated by an index block, if it was created by the writer. The BlockSource interface provides an abstraction to read from the reftable's backing storage, supporting a future commit to connect to JGit DFS and the DfsBlockCache. Change-Id: Ib0dc5fa937d0c735f2a9ff4439d55c457fea7aa8
* reftable: create and write reftable filesShawn Pearce2017-08-171-0/+3
| | | | | | | | This is a simple writer to create reftable formatted files. Follow-up commits will add support for reading from reftable, debugging utilities, and tests. Change-Id: I3d520c3515c580144490b0b45433ea175a3e6e11
* Add support to follow HTTP redirectsThomas Wolf2017-08-171-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git-core follows HTTP redirects so JGit should also provide this. Implement config setting http.followRedirects with possible values "false" (= never), "true" (= always), and "initial" (only on GET, but not on POST).[1] We must do our own redirect handling and cannot rely on the support that the underlying real connection may offer. At least the JDK's HttpURLConnection has two features that get in the way: * it does not allow cross-protocol redirects and thus fails on http->https redirects (for instance, on Github). * it translates a redirect after a POST to a GET unless the system property "http.strictPostRedirect" is set to true. We don't want to manipulate that system setting nor require it. Additionally, git has its own rules about what redirects it accepts;[2] for instance, it does not allow a redirect that adds query arguments. We handle response codes 301, 302, 303, and 307 as per RFC 2616.[3] On POST we do not handle 303, and we follow redirects only if http.followRedirects == true. Redirects are followed only a certain number of times. There are two ways to control that limit: * by default, the limit is given by the http.maxRedirects system property that is also used by the JDK. If the system property is not set, the default is 5. (This is much lower than the JDK default of 20, but I don't see the value of following so many redirects.) * this can be overwritten by a http.maxRedirects git config setting. The JGit http.* git config settings are currently all global; JGit has no support yet for URI-specific settings "http.<pattern>.name". Adding support for that is well beyond the scope of this change. Like git-core, we log every redirect attempt (LOG.info) so that users may know about the redirection having occurred. Extends the test framework to configure an AppServer with HTTPS support so that we can test cloning via HTTPS and redirections involving HTTPS. [1] https://git-scm.com/docs/git-config [2] https://kernel.googlesource.com/pub/scm/git/git/+/6628eb41db5189c0cdfdced6d8697e7c813c5f0f [3] https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html CQ: 13987 Bug: 465167 Change-Id: I86518cb76842f7d326b51f8715e3bbf8ada89859 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
* Add dfs fsck implementationZhen Chen2017-07-261-0/+4
| | | | | | | | | | | | JGit already had some fsck-like classes like ObjectChecker which can check for an individual object. The read-only FsckPackParser which will parse all objects within a pack file and check it with ObjectChecker. It will also check the pack index file against the object information from the pack parser. Change-Id: Ifd8e0d28eb68ff0b8edd2b51b2fa3a50a544c855 Signed-off-by: Zhen Chen <czhen@google.com>
* ReceiveCommand: Explicitly check constructor preconditionsDave Borowitz2017-07-171-0/+6
| | | | | | | | | | | | | | | | | | Some downstream code checks whether a ReceiveCommand is a create or a delete based on the type field. Other downstream code (in particular a good chunk of Gerrit code I wrote) checks the same thing by comparing oldId/newId to zeroId. Unfortunately, there were no strict checks in the constructor that ensures that zeroId is only set for oldId/newId if the type argument corresponds, so a caller that passed mismatched IDs and types would observe completely undefined behavior as a result. This is and always has been a misuse of the API; throw IllegalArgumentException so the caller knows that it is a misuse. Similarly, throw from the constructor if oldId/newId are null. The non-nullness requirement was already documented. Fix RefDirectoryTest to not do the wrong thing. Change-Id: Ie2d0bfed8a2d89e807a41925d548f0f0ce243ecf
* Merge branch 'stable-4.7' into stable-4.8Matthias Sohn2017-06-071-0/+2
|\ | | | | | | | | | | | | * stable-4.7: Run auto GC in the background Change-Id: I5e25765f65d833f13cbe99696ef33055d7f5c4cf Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * Run auto GC in the backgroundDavid Turner2017-06-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running an automatic GC on a FileRepository, when the caller passes a NullProgressMonitor, run the GC in a background thread. Use a thread pool of size 1 to limit the number of background threads spawned for background gc in the same application. In the next minor release we can make the thread pool configurable. In some cases, the auto GC limit is lower than the true number of unreachable loose objects, so auto GC will run after every (e.g) fetch operation. This leads to the appearance of poor fetch performance. Since these GCs will never make progress (until either the objects become referenced, or the two week timeout expires), blocking on them simply reduces throughput. In the event that an auto GC would make progress, it's still OK if it runs in the background. The progress will still happen. This matches the behavior of regular git. Git (and now jgit) uses the lock file for gc.log to prevent simultaneous runs of background gc. Further, it writes errors to gc.log, and won't run background gc if that file is present and recent. If gc.log is too old (according to the config gc.logexpiry), it will be ignored. Change-Id: I3870cadb4a0a6763feff252e6eaef99f4aa8d0df Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | RepoCommand: Add linkfile support.Dan Willemsen2017-04-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Android wants them to work, and we're only interested in them for bare repos, so add them just for that. Make sure to use symlinks instead of just using the copyfile implementation. Some scripts look up where they're actually located in order to find related files, so they need the link back to their project. Change-Id: I929b69b2505f03036f69e25a55daf93842871f30 Signed-off-by: Dan Willemsen <dwillemsen@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jeff Gaston <jeffrygaston@google.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
* | Merge branch 'stable-4.7'Matthias Sohn2017-04-111-0/+1
|\| | | | | | | | | | | | | | | | | * stable-4.7: Cleanup and test trailing slash handling in ManifestParser ManifestParser: Throw exception if remote does not have fetch attribute Change-Id: Ia9dc3110bcbdae05175851ce647ffd11c542f4c0 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * ManifestParser: Throw exception if remote does not have fetch attributeHan-Wen Nienhuys2017-04-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In the repo manifest documentation [1] the fetch attribute is marked as "#REQUIRED". If the fetch attribute is not specified, this would previously result in NullPointerException. Throw a SAXException instead. [1] https://gerrit.googlesource.com/git-repo/+/master/docs/manifest-format.txt Change-Id: Ib8ed8cee6074fe6bf8f9ac6fc7a1664a547d2d49 Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
* | Support creating Mergers without a RepositoryDave Borowitz2017-04-051-0/+1
|/ | | | | | | | | All that's really required to run a merge operation is a single ObjectInserter, from which we can construct a RevWalk, plus a Config that declares a diff algorithm. Provide some factory methods that don't take Repository. Change-Id: Ib884dce2528424b5bcbbbbfc043baec1886b9bbd
* Explain in error message how to recover from lock failureMatthias Sohn2017-03-221-1/+1
| | | | | Bug: 483897 Change-Id: I70f8d9c82c1efe2928f072a2fb69461160f7c5f7 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Move SHA1 compress/recompress files to resource folderDavid Ostrovsky2017-03-182-0/+284
| | | | | | | | | | | | | This fixes Bazel build: in srcs attribute of java_library rule //org.eclipse.jgit:jgit: file '//org.eclipse.jgit:src/org/eclipse/jgit/util/sha1/SHA1.recompress' is misplaced here (expected .java, .srcjar or .properties). Another option that was considered is to exclude the non source files. Change-Id: I7083f27a4a49bf6681c85c7cf7b08a83c9a70c77 Signed-off-by: David Ostrovsky <david@ostrovsky.org>
* Merge branch 'stable-4.6'Matthias Sohn2017-03-161-2/+2
|\ | | | | | | | | | | | | * stable-4.6: Don't remove pack when FileNotFoundException is transient Change-Id: I82941a98385cda27c89e1e6750b7b6db4e39f414 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * Merge branch 'stable-4.5' into stable-4.6Matthias Sohn2017-03-161-2/+2
| |\ | | | | | | | | | | | | | | | | | | | | | * stable-4.5: Don't remove pack when FileNotFoundException is transient Change-Id: Ic17c542d78a4cad48ff1ed77dcdc853a4ef2dc06 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| | * Don't remove pack when FileNotFoundException is transientLuca Milanesio2017-03-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The FileNotFoundException is typically raised in three conditions: 1. file doesn't exist 2. incompatible read vs. read/write open modes 3. filesystem locking 4. temporary lack of resources (e.g. too many open files) 1. is already managed, 2. would never happen as packs are not overwritten while with 3. and 4. it is worth logging the exception and retrying to read the pack again. Log transient errors using an exponential backoff strategy to avoid flooding the logs with the same error if consecutive retries to access the pack fail repeatedly. Bug: 513435 Change-Id: I03c6f6891de3c343d3d517092eaa75dba282c0cd Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | | SHA-1: collision detection supportShawn Pearce2017-02-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update SHA1 class to include a Java port of sha1dc[1]'s ubc_check, which can detect the attack pattern used by the SHAttered[2] authors. Given the shattered example files that have the same SHA-1, this modified implementation can identify there is risk of collision given only one file in the pair: $ jgit ... [main] WARN org.eclipse.jgit.util.sha1.SHA1 - SHA-1 collision 38762cf7f55934b34d179ae6a4c80cadccbb7f0a When JGit detects probability of a collision the SHA1 class now warns on the logger, reporting the object's SHA-1 hash, and then throws a Sha1CollisionException to the caller. From the paper[3] by Marc Stevens, the probability of a false positive identification of a collision is about 14 * 2^(-160), sufficiently low enough for any detected collision to likely be a real collision. git-core[4] may adopt sha1dc before the system migrates to an entirely new hash function. This commit enables JGit to remain compatible with that move to sha1dc, and help protect users by warning if similar attacks as SHAttered are identified. Performance declined about 8% (detection off), now: MessageDigest 238.41 MiB/s MessageDigest 244.52 MiB/s MessageDigest 244.06 MiB/s MessageDigest 242.58 MiB/s SHA1 216.77 MiB/s (was ~240.83 MiB/s) SHA1 220.98 MiB/s SHA1 221.76 MiB/s SHA1 221.34 MiB/s This decline in throughput is attributed to the step loop unrolling in compress(), which was necessary to easily fit the UbcCheck logic into the hash function. Using helper functions s1-s4 reduces the code explosion, providing acceptable throughput. With detection enabled (default): SHA1 detectCollision 180.12 MiB/s SHA1 detectCollision 181.59 MiB/s SHA1 detectCollision 181.64 MiB/s SHA1 detectCollision 182.24 MiB/s sha1dc (native C) ~206.28 MiB/s sha1dc (native C) ~204.47 MiB/s sha1dc (native C) ~203.74 MiB/s Average time across 100,000 calls to hash 4100 bytes (such as a commit or tree) for the various algorithms available to JGit also shows SHA1 is slower than MessageDigest, but by an acceptable margin: MessageDigest 17 usec SHA1 18 usec SHA1 detectCollision 22 usec Time to index-pack for git.git (217982 objects, 69 MiB) has increased: MessageDigest SHA1 w/ detectCollision ------------- ----------------------- 20.12s 25.25s 19.87s 25.48s 20.04s 25.26s avg 20.01s 25.33s +26% Being implemented in Java with these additional safety checks is clearly a penalty, but throughput is still acceptable given the increased security against object name collisions. [1] https://github.com/cr-marcstevens/sha1collisiondetection [2] https://shattered.it/ [3] https://marc-stevens.nl/research/papers/C13-S.pdf [4] https://public-inbox.org/git/20170223230621.43anex65ndoqbgnf@sigill.intra.peff.net/ Change-Id: I9fe4c6d8fc5e5a661af72cd3246c9e67b1b9fee6
* | | Limit receive commandsShawn Pearce2017-02-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Place a configurable upper bound on the amount of command data received from clients during `git push`. The limit is applied to the encoded wire protocol format, not the JGit in-memory representation. This allows clients to flexibly use the limit; shorter reference names allow for more commands, longer reference names permit fewer commands per batch. Based on data gathered from many repositories at $DAY_JOB, the average reference name is well under 200 bytes when encoded in UTF-8 (the wire encoding). The new 3 MiB default receive.maxCommandBytes allows about 11,155 references in a single `git push` invocation. A Gerrit Code Review system with six-digit change numbers could still encode 29,399 references in the 3 MiB maxCommandBytes limit. Change-Id: I84317d396d25ab1b46820e43ae2b73943646032c Signed-off-by: David Pursehouse <david.pursehouse@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | | Repository: Include repository name when logging corrupt use countDavid Pursehouse2017-01-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Logging the repository name makes it easier to track down what is incorrectly closing a repository. Change-Id: I42a8bdf766c0e67f100adbf76d9616584e367ac2 Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
* | | Fix possible InvalidObjectIdException in ObjectDirectoryMarc Strapetz2017-01-151-0/+1
|/ / | | | | | | | | | | | | | | | | | | ObjectDirectory.getShallowCommits should throw an IOException instead of an InvalidArgumentException if invalid SHAs are present in .git/shallow (as this file is usually edited by a human). Change-Id: Ia3a39d38f7aec4282109c7698438f0795fbec905 Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
* | Define MonotonicClock interface for advanced timestampsShawn Pearce2016-11-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | MonotonicClock can be implemented to provide more certainity about time than the standard System.currentTimeMillis() can provide. This can be used by classes such as PersonIdent and Ketch to rely on more certainity about time moving in a strictly ascending order. Gerrit Code Review can also leverage this interface through its embedding of JGit and use MonotonicClock and ProposedTimestamp to provide stronger assurance that NoteDb time is moving forward. Change-Id: I1a3cbd49a39b150a0d49b36d572da113ca83a786
* | Add {get,set}GitwebDescription to RepositoryShawn Pearce2016-11-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This method pair allows the caller to read and modify the description file that is traditionally used by gitweb and cgit when rendering a repository on the web. Gerrit Code Review has offered this feature for years as part of its GitRepositoryManager interface, but its fundamentally a feature of JGit and its Repository abstraction. git-core typically initializes a repository with a default value inside the description file. During getDescription() this string is converted to null as it is never a useful description. Change-Id: I0a457026c74e9c73ea27e6f070d5fbaca3439be5
* | Switch JSchSession to simple isolated OutputStreamShawn Pearce2016-11-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Work around issues with JSch not handling interrupts by isolating the JSch interactions onto another thread. Run write and flush on a single threaded Executor using simple Callable operations wrapping the method calls, waiting on the future to determine the outcome before allowing the caller to continue. If any operation was interrupted the state of the stream becomes fuzzy at close time. The implementation tries to interrupt the pending write or flush, but this is very likely to corrupt the stream object, so exceptions are ignored during such a dirty close. Change-Id: I42e3ba3d8c35a2e40aad340580037ebefbb99b53
* | Check that DfsBlockCache#blockSize is a power of 2Philipp Marx2016-11-111-0/+1
| | | | | | | | | | | | | | | | | | | | In case a value is used which isn’t a power of 2 there will be a high chance of java.lang.ArrayIndexOutBoundsException and org.eclipse.jgit.errors.CorruptObjectException due to a mismatching assumption for the DfsBlockCache#blockSizeShift parameter. Change-Id: Ib348b3704edf10b5f93a3ffab4fa6f09cbbae231 Signed-off-by: Philipp Marx <smigfu@googlemail.com>
* | Fix possible SIOOBE in RefDirectory.parsePackedRefsMarc Strapetz2016-10-211-0/+1
| | | | | | | | | | | | | | | | | | This SIOOBE happens reproducibly when trying to access a repository containing Cygwin symlinks Change-Id: I25f103fcc723bac7bfaaeee333a86f11627a92c7 Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
* | TreeFormatter: disallow empty filenames in treesDavid Turner2016-10-191-0/+1
| | | | | | | | | | | | | | | | | | Git barfs on these (and they don't make any sense), so we certainly shouldn't write them. Change-Id: I3faf8554a05f0fd147be2e63fbe55987d3f88099 Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
* | Add support for built-in smudge filtersChristian Halstrick2016-09-201-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | JGit supports smudge filters defined in repository configuration. The filters are implemented as external programs filtering content by accepting the original content (as seen in git's object database) on stdin and which emit the filtered content on stdout. This content is then written to the file in the working tree. To run such a filter JGit has to start an external process and pump data into/from this process. This commit adds support for built-in smudge filters which are implemented in Java and which are executed by jgit's main thread. When a filter is defined in the configuration as "jgit://builtin/<filterDriverName>/smudge" then JGit will lookup in a static map whether a builtin filter is registered under this name. If found such a filter is called to do the filtering. The functionality in this commit requires that a program using JGit explicitly calls the JGit API to register built-in implementations for specific smudge filters. In follow-up commits configuration parameters will be added which trigger such registrations. Change-Id: Ia743aa0dbed795e71e5792f35ae55660e0eb3c24
* Add support for post-commit hooksMartin Goellnitz2016-09-131-0/+1
| | | | | Change-Id: I6691b454404dd4db3c690ecfc7515de765bc2ef7 Signed-off-by: Martin Goellnitz <m.goellnitz@outlook.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Don't log error if system git config does not existMatthias Sohn2016-09-051-1/+1
| | | | | | | | | | - enhance FS.readPipe to throw an exception if the external command fails to enable the caller to handle the command failure - reduce log level to warning if system git config does not exist - improve log message Bug: 476639 Change-Id: I94ae3caec22150dde81f1ea8e1e665df55290d42 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Shallow fetch/clone: Make --depth mean the total history depthTerry Parker2016-08-051-0/+1
| | | | | | | | | | | | | | | | | | | | | cgit changed the --depth parameter to mean the total depth of history rather than the depth of ancestors to be returned [1]. JGit still uses the latter meaning, so update it to match cgit. depth=0 still means a non-shallow clone. depth=1 now means only the wants rather than the wants and their direct parents. This is accomplished by changing the semantic meaning of "depth" in UploadPack and PackWriter to mean the total depth of history desired, while keeping "depth" in DepthWalk.{RevWalk,ObjectWalk} to mean the depth of traversal. Thus UploadPack and PackWriter always initialize their DepthWalks with "depth-1". [1] upload-pack: fix off-by-one depth calculation in shallow clone https://code.googlesource.com/git/+/682c7d2f1a2d1a5443777237450505738af2ff1a Change-Id: I87ed3c0f56c37e3491e367a41f5e555c4207ff44 Signed-off-by: Terry Parker <tparker@google.com>
* Shallow fetch: Respect "shallow" linesTerry Parker2016-08-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When fetching from a shallow clone, the client sends "have" lines to tell the server about objects it already has and "shallow" lines to tell where its local history terminates. In some circumstances, the server fails to honor the shallow lines and fails to return objects that the client needs. UploadPack passes the "have" lines to PackWriter so PackWriter can omit them from the generated pack. UploadPack processes "shallow" lines by calling RevWalk.assumeShallow() with the set of shallow commits. RevWalk creates and caches RevCommits for these shallow commits, clearing out their parents. That way, walks correctly terminate at the shallow commits instead of assuming the client has history going back behind them. UploadPack converts its RevWalk to an ObjectWalk, maintaining the cached RevCommits, and passes it to PackWriter. Unfortunately, to support shallow fetches the PackWriter does the following: if (shallowPack && !(walk instanceof DepthWalk.ObjectWalk)) walk = new DepthWalk.ObjectWalk(reader, depth); That is, when the client sends a "deepen" line (fetch --depth=<n>) and the caller has not passed in a DepthWalk.ObjectWalk, PackWriter throws away the RevWalk that was passed in and makes a new one. The cleared parent lists prepared by RevWalk.assumeShallow() are lost. Fortunately UploadPack intends to pass in a DepthWalk.ObjectWalk. It tries to create it by calling toObjectWalkWithSameObjects() on a DepthWalk.RevWalk. But it doesn't work: because DepthWalk.RevWalk does not override the standard RevWalk#toObjectWalkWithSameObjects implementation, the result is a plain ObjectWalk instead of an instance of DepthWalk.ObjectWalk. The result is that the "shallow" information is thrown away and objects reachable from the shallow commits can be omitted from the pack sent when fetching with --depth from a shallow clone. Multiple factors collude to limit the circumstances under which this bug can be observed: 1. Commits with depth != 0 don't enter DepthGenerator's pending queue. That means a "have" cannot have any effect on DepthGenerator unless it is also a "want". 2. DepthGenerator#next() doesn't call carryFlagsImpl(), so the uninteresting flag is not propagated to ancestors there even if a "have" is also a "want". 3. JGit treats a depth of 1 as "1 past the wants". Because of (2), the only place the UNINTERESTING flag can leak to a shallow commit's parents is in the carryFlags() call from markUninteresting(). carryFlags() only traverses commits that have already been parsed: commits yet to be parsed are supposed to inherit correct flags from their parent in PendingGenerator#next (which doesn't happen here --- that is (2)). So the list of commits that have already been parsed becomes relevant. When we hit the markUninteresting() call, all "want"s, "have"s, and commits to be unshallowed have been parsed. carryFlags() only affects the parsed commits. If the "want" is a direct parent of a "have", then it carryFlags() marks it as uninteresting. If the "have" was also a "shallow", then its parent pointer should have been null and the "want" shouldn't have been marked, so we see the bug. If the "want" is a more distant ancestor then (2) keeps the uninteresting state from propagating to the "want" and we don't see the bug. If the "shallow" is not also a "have" then the shallow commit isn't parsed so (2) keeps the uninteresting state from propagating to the "want so we don't see the bug. Here is a reproduction case (time flowing left to right, arrows pointing to parents). "C" must be a commit that the client reports as a "have" during negotiation. That can only happen if the server reports it as an existing branch or tag in the first round of negotiation: A <-- B <-- C <-- D First do git clone --depth 1 <repo> which yields D as a "have" and C as a "shallow" commit. Then try git fetch --depth 1 <repo> B:refs/heads/B Negotiation sets up: have D, shallow C, have C, want B. But due to this bug B is marked as uninteresting and is not sent. Change-Id: I6e14b57b2f85e52d28cdcf356df647870f475440 Signed-off-by: Terry Parker <tparker@google.com>