When the client is clearly making a smart HTTP request to our smart
HTTP server, return any errors like RepositoryNotFoundException or
ServiceNotEnabledException inside of the payload as a Git level ERR
message, rather than an HTTP error code.
This prevents the C Git command line client from retrying a failed
"$URL/info/refs?service=git-upload-pack" request without the smart
service URL, only to fail again with "403 Forbidden" when the dumb
as-is service has been disabled by the server configuration, or is
unavailable because the repository is not on the local filesystem.
Change-Id: I57e8756d5026e885e0ca615979bfcd729703be6c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
UploadPack: Add a PreUploadHook to monitor and control behavior
Embedding applications can use this hook to watch actions within
UploadPack and possibly reject them. This could be useful to prevent
clones of a large repository from this server, or to stop abusive
negotiation rounds that offer thousands of objects in a single batch.
Change-Id: Id96f1885ac4d61f22c80b6418fff54184b7348ba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Allow application filters on smart HTTP operations
Permit applications embedding GitServlet to wrap the
info/refs?service=$name and /$name operations with a
servlet Filter.
To help applications inspect state of the operation,
expose the UploadPack or ReceivePack object into a
request attribute. This can be useful for logging,
or to implement throttling of requests like Gerrit
Code Review uses to prevent server overload.
Change-Id: Ib8773c14e2b7a650769bd578aad745e6651210cb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Using a resolver and factory pattern for the anonymous git:// Daemon
class makes transport.Daemon more useful on non-file storage systems,
or in embedded applications where the caller wants more precise
control over the work tasks constructed within the daemon.
Rather than defining new interfaces, move the existing HTTP ones
into transport.resolver and make them generic on the connection
handle type. For HTTP, continue to use HttpServletRequest, and
for transport.Daemon use DaemonClient.
To remain compatible with transport.Daemon, FileResolver needs to
learn how to use multiple base directories, and how to export any
Repository instance at a fixed name.
Change-Id: I1efa6b2bd7c6567e983fbbf346947238ea2e847e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Update a number of calling sites of RevWalk to ensure the walker's
internal ObjectReader is released after the walk is no longer used.
Because the ObjectReader is likely to hold onto a native resource
like an Inflater, we don't want to leak them outside of their
useful scope.
Where possible we also try to share ObjectReaders across several
walk pools, or between a walker and a PackWriter. This permits
the ObjectReader to actually do some caching if it felt inclined
to do so.
Not everything was updated, we'll probably need to come back and
update even more call sites, but these are some of the biggest
offenders. Test cases in particular aren't updated. My plan is to
move most storage-agnostic tests onto some purely in-memory storage
solution that doesn't do compression.
Change-Id: I04087ec79faeea208b19848939898ad7172b6672
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
UploadPack: Permit flushing progress messages under smart HTTP
If UploadPack invokes flush() on the output stream we pass it, its
most likely the progress messages coming down the side band stream.
As pack generation can take a while, we want to push that down
at the client as early as we can, to keep the connection alive,
and to let the user know we are still working on their behalf.
Ensure we dump the temporary buffer whenever flush() is invoked,
otherwise the messages don't get sent in a timely fashion to the
user agent (in this case, git fetch).
We specifically don't implement flush() for ReceivePack right now,
as that protocol currently does not provide progress messages to
the user, but it does invoke flush several times, as the different
streams include '0000' type flush-pkts to denote various end points.
Change-Id: I797c90a2c562a416223dc0704785f61ac64e0220
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The strings are externalized into the root resource bundles.
The resource bundles are stored under the new "resources" source
folder to get proper maven build.
Strings from tests are, in general, not externalized. Only in
cases where it was necessary to make the test pass the strings
were externalized. This was typically necessary in cases where
e.getMessage() was used in assert and the exception message was
slightly changed due to reuse of the externalized strings.
Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
http.server: Use TemporaryBuffer and compress some responses
The HTTP server side code now uses the same approach that the smart
HTTP client code uses when preparing a request body. The payload
is streamed into a TemporaryBuffer of limited size. If the entire
data fits, its compressed with gzip if the user agent supports that,
and a Content-Length header is used to transmit the fixed length
body to the peer. If however the data overflows the limited memory
segment, its streamed uncompressed to the peer.
One might initially think that larger contents which overflow
the buffer should also be compressed, rather than sent raw, since
they were deemed "large". But usually these larger contents are
actually a pack file which has been already heavily compressed by
Git specific routines. Trying to deflate that with gzip is probably
going to take up more space, not less, so the compression overhead
isn't worthwhile.
This buffer and compress optimization helps repositories with a
large number of references, as their text based advertisements
compress well. For example jgit's own native repository currently
requires 32,628 bytes for its full advertisement of 489 references.
Most repositories have fewer references, and thus could compress
their entire response in one buffer.
Change-Id: I790609c9f763339e0a1db9172aa570e29af96f42
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Clients can request smart fetch support by examining the info/refs URL
with the service parameter set to the magic git-upload-pack string:
GET /$GIT_DIR/info/refs?service=git-upload-pack HTTP/1.1
The response is formatted with the upload pack capabilities, using
the standard packet line formatter. A special header line is put
in front of the standard upload-pack advertisement to let clients
know the service was recognized and is supported.
If the requested service is disabled an authorization status code is
returned, allowing the user agent to retry once they have obtained
credentials from a human, in case authentication is required by
the configured UploadPackFactory implementation.
Change-Id: Ib0f1a458c88b4b5509b0f882f55f83f5752bc57a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Clients can request smart push support by examining the info/refs URL
with the service parameter set to the magic git-receive-pack string:
GET /$GIT_DIR/info/refs?service=git-receive-pack HTTP/1.1
The response is formatted with the receive pack capabilities, using
the standard packet line formatter. A special header block is put
in front of the standard receive-pack advertisement to let clients
know the service was recognized and is supported.
If the requested service is disabled an authorization status code is
returned, allowing the user agent to retry once they have obtained
credentials from a human, in case authentication is required by
the configured ReceivePackFactory implementation.
Change-Id: Ie4f6e0c7b68a68ec4b7cdd5072f91dd406210d4f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is a simple HTTP server that provides the minimum server side
support required for dumb (non-git aware) transport clients.
We produce the info/refs and objects/info/packs file on the fly
from the local repository state, but otherwise serve data as raw
files from the on-disk structure.
In the future we could better optimize the FileSender class and the
servlets that use it to take advantage of direct file to network
APIs in more advanced servlet containers like Jetty.
Our glue package borrows the idea of a micro embedded DSL from
Google Guice and uses it to configure a collection of Filters
and HttpServlets, all of which are matched against requests using
regular expressions. If a subgroup exists in the pattern, it is
extracted and used for the path info component of the request.
Change-Id: Ia0f1a425d07d035e344ae54faf8aeb04763e7487
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Per CQ 3448 this is the initial contribution of the JGit project
to eclipse.org. It is derived from the historical JGit repository
at commit 3a2dd9921c.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>