UploadPackServlet: Use uploadWithExceptionPropagation
As UploadPackErrorHandler's Javadoc says, UploadPackServlet should have
called uploadWithExceptionPropagation and let UploadPackErrorHandler to
handle the exception. Fix UploadPackServlet.
Change-Id: I1f9686495fcf3ef28598ccdff3e6f76a16c8bca3
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
http: Allow specifying a custom error handler for UploadPack
By abstracting the error handler, this lets a user customize the error
handler for UploadPack. A customized error handler can show a custom
error message to the clients based on the exception thrown from the
hook, create a monitoring system for server errors, or do custom
logging.
Change-Id: Idd3b87d6bd471fef807c0cf1183e904b2886157e
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
This reverts the workaround introduced by
1c6c73c5a9b8dd700be45d658f165a464265dba7, which is a patch for dealing
with a buggy C Git client v1.7.5 in 2012. We'll stop supporting very old
C Git clients.
Change-Id: I94999a39101c96f210b5eca3c2f620c15eb1ac1b
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Teach UploadPack to support protocol v2 with non-bidirectional pipes,
and add support to the HTTP protocol for v2. This is only activated if
the repository's config has "protocol.version" equal to 2.
Change-Id: I093a14acd2c3850b8b98e14936a716958f35a848
Helped-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Enable and fix 'Should be tagged with @Override' warning
Set missingOverrideAnnotation=warning in Eclipse compiler preferences
which enables the warning:
The method <method> of type <type> should be tagged with @Override
since it actually overrides a superclass method
Justification for this warning is described in:
http://stackoverflow.com/a/94411/381622
Enabling this causes in excess of 1000 warnings across the entire
code-base. They are very easy to fix automatically with Eclipse's
"Quick Fix" tool.
Fix all of them except 2 which cause compilation failure when the
project is built with mvn; add TODO comments on those for further
investigation.
Change-Id: I5772061041fd361fe93137fd8b0ad356e748a29c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Add HTTP status code to ServiceMayNotContinueException
The exception can be thrown in a various reason, and sometimes 403
Forbidden is not appropriate. Make the HTTP status code customizable.
Change-Id: If2ef6f454f7479158a4e28a12909837db483521c
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Use message from ServiceNotAuthorizedException, ServiceNotEnabledException
When sending an error response due to ServiceNotAuthorizedException or
ServiceNotEnabledException, usually we send a default message. In the
ServiceNotEnabledException case, we use
403 Git access forbidden
except in a dumb-HTTP-specific filter where we use the servlet
container's default 403 response:
403 Forbidden
In the ServiceNotAuthorizedException case, we use the servlet
container's default 401 response:
401 Unauthorized
There is one exception: a ServiceNotEnabledException when handling a
smart HTTP /info/refs request uses the message from the exception:
403 Service not enabled
Be more consistent by always using the message from the exception. This
way, authors of a RepositoryResolver, UploadPackFactory, or
ReceivePackFactory can provide a more detailed message when appropriate.
The defaults are
401 Unauthorized
403 Service not enabled
Change-Id: Id1fe1c2042fb96487c3671c1965c8a65c4b8e1b8
Signed-off-by: Jonathan Nieder <jrn@google.com>
Add repository name to failures in HTTP server log
If UploadPack or ReceivePack has an exception record an identifier
associated with the repository as part of the log message. This can
help the HTTP admin track down the offending repository and take
action to repair the root cause.
Change-Id: I58f22b33cdb40994f044a26fba9fe965b45be51d
Since git-core ff5effd (v1.7.12.1) the native wire protocol transmits
the server and client implementation and version strings using
capability "agent=git/1.7.12.1" or similar.
Support this in JGit and hang the implementation data off UploadPack
and ReceivePack. On HTTP transports default to the User-Agent HTTP
header until the client overrides this with the optional capability
string in the first line.
Extract the user agent string into a UserAgent class under transport
where it can be specified to a different value if the application's
build process has broken the Implementation-Version header in the
JGit package.
Change-Id: Icfc6524d84a787386d1786310b421b2f92ae9e65
Enable streaming compression for any response that is bigger than
the 32 KiB buffer used by SmartOutputStream. This is useful on the
info/refs file which can have many branches and tags listed, and
is often bigger than 32 KiB, but also compresses by at least 50%.
Disable streaming compression on large git-upload-pack responses,
as these are usually highly compressed Git pack data. Trying to
compress these with gzip will only waste CPU time and additional
transfer space with the gzip wrapper. Small git-upload-pack data
is usually text based negotiation responses and can be squeezed
smaller with a little bit of CPU usage.
Change-Id: Ia13e63ed334f594d5e1ab53a97240eb2e8f550e2
I have unfortunately introduced a few bugs in the native Git client
over the years. 1.7.5 is unable to send chunked requests correctly,
resulting in corrupt data at the server. Ban this client whenever
it uses chunked encoding with an error message.
Prior to some more recent versions, git push over HTTP failed to
report status information and error messages due to a race within
the client and its helper process. Check for these bad versions and
send errors as messages before the status report, enabling users
to see the failures on their terminal.
Change-Id: Ic62d6591cbd851d21dbb3e9b023d655eaecb0624
Modify refs in UploadPack/ReceivePack using a hook interface
This is intended to replace the RefFilter interface (but does not yet,
for backwards compatibility). That interface required lots of extra
scanning and copying in filter cases such as only advertising a subtree
of the refs directory. Instead, provide a hook that can be executed
right before ref advertisement, using the public methods on
UploadPack/ReceivePack to explicitly set the map of advertised refs.
Change-Id: I0067019a191c8148af2cfb71a675f2258c5af0ca
The HTTP RFCs require a server to fully consume the request body before
it can return a non-error status code, which is any code below 400.
JGit returns most Git level errors inside of an HTTP 200 OK response,
and sometimes this happens before the entire request was consumed from
the servlet container. In such cases the body must be skipped or read
until EOF is reached, ensuring the HTTP keep-alive semantics will work
for the next request on the same TCP connection.
HTTP status codes >= 400 may be returned without consuming the body,
and a servlet container must set "Connection: close" in the response
headers when this happens, since the state of the request body is not
well defined with an early abort.
With the introduction of sendError() in GitSmartHttpTools there are
only a handful of locations that need to worry about the request body
being consumed, so sprinkle the call in as necessary.
Change-Id: I5381e110585f780c01a764df8e27c80aacf5146e
Error messages are typically short, below the 32 KiB in-memory buffer
size of the SmartOutputStream. When an error is queued up for sending
to a client and an exception is thrown up into the servlet handler we
discarded the message and sent nothing to the client, as the messages
were stuck inside of the SmartOutputStream buffer.
Hoist the creation of the output stream above the invocation of try
block of the service, and use close() in the few catch blocks that
assume there are buffered messages ready for transmission. This will
ensure errors from unpacking a stream in ReceivePack are sent off to
a client correctly, as previously these were causing no status report
to arrive at the client side as the data was stuck in the buffer.
Change-Id: I5534b560697731121f48979ae077aa7c95b8e39c
The GitSmartHttpTools class started as utility functions to help report
useful error messages to users of the android.googlesource.com service.
Now that the GitServlet and GitFilter classes support filters before a
git-upload-pack or git-receive-pack request, server implementors may
these routines helpful to report custom messages to clients. Using the
sendError() method to return an HTTP 200 OK with error text embedded in
the payload prevents native Git clients from retrying the action with a
dumb Git or WebDAV HTTP request.
Refactor some of the existing code to use these new error functions and
protocol constants. The new sendError() function is very close to being
identical to the old error handling code in RepositoryFilter, however we
now use the POST Content-Type rather than the Accept HTTP header to check
if the client will accept the error data in the response body rather than
using the HTTP status code. This is a more reliable way of checking for
native Git clients, as the Accept header was not always populated with the
correct string in older versions of Git smart HTTP.
Change-Id: I828ac2deb085af12b6689c10f86662ddd39bd1a2
If an internal exception occurs while packing and the request
needs to abort, the HTTP response might already be committed due
to progress message having already been delivered to the client.
This prevents UploadPackServlet from resetting the response and
sending back an HTTP 500 response.
Try to catch all exceptions and report internal errors over the
sideband stream or as an ERR command during the initial ACK/NAK
negotiation phase. This allows JGit to transmit an error message
that the user will receive on their console without needing to
worry about resetting the (already gone) HTTP response.
Change-Id: Ie393fb8bb55d2b79ab1276adf71c781c1807f9fe
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Smart HTTP clients may request both multi_ack_detailed and no-done in
the same request to prevent the client from needing to send a "done"
line to the server in response to a server's "ACK %s ready".
For smart HTTP, this can save 1 full HTTP RPC in the fetch exchange,
improving overall latency when incrementally updating a client that
has not diverged very far from the remote repository.
Unfortuantely this capability cannot be enabled for the traditional
bi-directional connections. multi_ack_detailed has the client sending
more "have" lines at the same time that the server is creating the
"ACK %s ready" and writing out the PACK stream, resulting in some race
conditions and/or deadlock, depending on how the pipe buffers are
implemented. For very small updates, a server might actually be able
to send "ACK %s ready", then the PACK, and disconnect before the
client even finishes sending its first batch of "have" lines. This
may cause the client to fail with a broken pipe exception. To avoid
all of these potential problems, "no-done" is restricted only to the
smart HTTP variant of the protocol.
Change-Id: Ie0d0a39320202bc096fec2e97cb58e9efd061b2d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When the client is clearly making a smart HTTP request to our smart
HTTP server, return any errors like RepositoryNotFoundException or
ServiceNotEnabledException inside of the payload as a Git level ERR
message, rather than an HTTP error code.
This prevents the C Git command line client from retrying a failed
"$URL/info/refs?service=git-upload-pack" request without the smart
service URL, only to fail again with "403 Forbidden" when the dumb
as-is service has been disabled by the server configuration, or is
unavailable because the repository is not on the local filesystem.
Change-Id: I57e8756d5026e885e0ca615979bfcd729703be6c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
UploadPack: Add a PreUploadHook to monitor and control behavior
Embedding applications can use this hook to watch actions within
UploadPack and possibly reject them. This could be useful to prevent
clones of a large repository from this server, or to stop abusive
negotiation rounds that offer thousands of objects in a single batch.
Change-Id: Id96f1885ac4d61f22c80b6418fff54184b7348ba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Allow application filters on smart HTTP operations
Permit applications embedding GitServlet to wrap the
info/refs?service=$name and /$name operations with a
servlet Filter.
To help applications inspect state of the operation,
expose the UploadPack or ReceivePack object into a
request attribute. This can be useful for logging,
or to implement throttling of requests like Gerrit
Code Review uses to prevent server overload.
Change-Id: Ib8773c14e2b7a650769bd578aad745e6651210cb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Using a resolver and factory pattern for the anonymous git:// Daemon
class makes transport.Daemon more useful on non-file storage systems,
or in embedded applications where the caller wants more precise
control over the work tasks constructed within the daemon.
Rather than defining new interfaces, move the existing HTTP ones
into transport.resolver and make them generic on the connection
handle type. For HTTP, continue to use HttpServletRequest, and
for transport.Daemon use DaemonClient.
To remain compatible with transport.Daemon, FileResolver needs to
learn how to use multiple base directories, and how to export any
Repository instance at a fixed name.
Change-Id: I1efa6b2bd7c6567e983fbbf346947238ea2e847e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Update a number of calling sites of RevWalk to ensure the walker's
internal ObjectReader is released after the walk is no longer used.
Because the ObjectReader is likely to hold onto a native resource
like an Inflater, we don't want to leak them outside of their
useful scope.
Where possible we also try to share ObjectReaders across several
walk pools, or between a walker and a PackWriter. This permits
the ObjectReader to actually do some caching if it felt inclined
to do so.
Not everything was updated, we'll probably need to come back and
update even more call sites, but these are some of the biggest
offenders. Test cases in particular aren't updated. My plan is to
move most storage-agnostic tests onto some purely in-memory storage
solution that doesn't do compression.
Change-Id: I04087ec79faeea208b19848939898ad7172b6672
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
UploadPack: Permit flushing progress messages under smart HTTP
If UploadPack invokes flush() on the output stream we pass it, its
most likely the progress messages coming down the side band stream.
As pack generation can take a while, we want to push that down
at the client as early as we can, to keep the connection alive,
and to let the user know we are still working on their behalf.
Ensure we dump the temporary buffer whenever flush() is invoked,
otherwise the messages don't get sent in a timely fashion to the
user agent (in this case, git fetch).
We specifically don't implement flush() for ReceivePack right now,
as that protocol currently does not provide progress messages to
the user, but it does invoke flush several times, as the different
streams include '0000' type flush-pkts to denote various end points.
Change-Id: I797c90a2c562a416223dc0704785f61ac64e0220
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The strings are externalized into the root resource bundles.
The resource bundles are stored under the new "resources" source
folder to get proper maven build.
Strings from tests are, in general, not externalized. Only in
cases where it was necessary to make the test pass the strings
were externalized. This was typically necessary in cases where
e.getMessage() was used in assert and the exception message was
slightly changed due to reuse of the externalized strings.
Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
http.server: Use TemporaryBuffer and compress some responses
The HTTP server side code now uses the same approach that the smart
HTTP client code uses when preparing a request body. The payload
is streamed into a TemporaryBuffer of limited size. If the entire
data fits, its compressed with gzip if the user agent supports that,
and a Content-Length header is used to transmit the fixed length
body to the peer. If however the data overflows the limited memory
segment, its streamed uncompressed to the peer.
One might initially think that larger contents which overflow
the buffer should also be compressed, rather than sent raw, since
they were deemed "large". But usually these larger contents are
actually a pack file which has been already heavily compressed by
Git specific routines. Trying to deflate that with gzip is probably
going to take up more space, not less, so the compression overhead
isn't worthwhile.
This buffer and compress optimization helps repositories with a
large number of references, as their text based advertisements
compress well. For example jgit's own native repository currently
requires 32,628 bytes for its full advertisement of 489 references.
Most repositories have fewer references, and thus could compress
their entire response in one buffer.
Change-Id: I790609c9f763339e0a1db9172aa570e29af96f42
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Clients can request smart fetch support by examining the info/refs URL
with the service parameter set to the magic git-upload-pack string:
GET /$GIT_DIR/info/refs?service=git-upload-pack HTTP/1.1
The response is formatted with the upload pack capabilities, using
the standard packet line formatter. A special header line is put
in front of the standard upload-pack advertisement to let clients
know the service was recognized and is supported.
If the requested service is disabled an authorization status code is
returned, allowing the user agent to retry once they have obtained
credentials from a human, in case authentication is required by
the configured UploadPackFactory implementation.
Change-Id: Ib0f1a458c88b4b5509b0f882f55f83f5752bc57a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Clients can request smart push support by examining the info/refs URL
with the service parameter set to the magic git-receive-pack string:
GET /$GIT_DIR/info/refs?service=git-receive-pack HTTP/1.1
The response is formatted with the receive pack capabilities, using
the standard packet line formatter. A special header block is put
in front of the standard receive-pack advertisement to let clients
know the service was recognized and is supported.
If the requested service is disabled an authorization status code is
returned, allowing the user agent to retry once they have obtained
credentials from a human, in case authentication is required by
the configured ReceivePackFactory implementation.
Change-Id: Ie4f6e0c7b68a68ec4b7cdd5072f91dd406210d4f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is a simple HTTP server that provides the minimum server side
support required for dumb (non-git aware) transport clients.
We produce the info/refs and objects/info/packs file on the fly
from the local repository state, but otherwise serve data as raw
files from the on-disk structure.
In the future we could better optimize the FileSender class and the
servlets that use it to take advantage of direct file to network
APIs in more advanced servlet containers like Jetty.
Our glue package borrows the idea of a micro embedded DSL from
Google Guice and uses it to configure a collection of Filters
and HttpServlets, all of which are matched against requests using
regular expressions. If a subgroup exists in the pattern, it is
extracted and used for the path info component of the request.
Change-Id: Ia0f1a425d07d035e344ae54faf8aeb04763e7487
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Per CQ 3448 this is the initial contribution of the JGit project
to eclipse.org. It is derived from the historical JGit repository
at commit 3a2dd9921c.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>