I have unfortunately introduced a few bugs in the native Git client
over the years. 1.7.5 is unable to send chunked requests correctly,
resulting in corrupt data at the server. Ban this client whenever
it uses chunked encoding with an error message.
Prior to some more recent versions, git push over HTTP failed to
report status information and error messages due to a race within
the client and its helper process. Check for these bad versions and
send errors as messages before the status report, enabling users
to see the failures on their terminal.
Change-Id: Ic62d6591cbd851d21dbb3e9b023d655eaecb0624
The HTTP RFCs require a server to fully consume the request body before
it can return a non-error status code, which is any code below 400.
JGit returns most Git level errors inside of an HTTP 200 OK response,
and sometimes this happens before the entire request was consumed from
the servlet container. In such cases the body must be skipped or read
until EOF is reached, ensuring the HTTP keep-alive semantics will work
for the next request on the same TCP connection.
HTTP status codes >= 400 may be returned without consuming the body,
and a servlet container must set "Connection: close" in the response
headers when this happens, since the state of the request body is not
well defined with an early abort.
With the introduction of sendError() in GitSmartHttpTools there are
only a handful of locations that need to worry about the request body
being consumed, so sprinkle the call in as necessary.
Change-Id: I5381e110585f780c01a764df8e27c80aacf5146e
Allow application filters on smart HTTP operations
Permit applications embedding GitServlet to wrap the
info/refs?service=$name and /$name operations with a
servlet Filter.
To help applications inspect state of the operation,
expose the UploadPack or ReceivePack object into a
request attribute. This can be useful for logging,
or to implement throttling of requests like Gerrit
Code Review uses to prevent server overload.
Change-Id: Ib8773c14e2b7a650769bd578aad745e6651210cb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Some clients coming through proxies may advertise a different
Accept-Encoding, for example "Accept-Encoding: gzip(proxy)".
Matching by substring causes us to identify this as a false positive;
that the client understands gzip encoding and will inflate the
response before reading it.
In this particular case however it doesn't. Its the reverse proxy
server in front of JGit letting us know the proxy<->JGit link can
be gzip compressed, while the client<->proxy part of the link is not:
client <-- no gzip --> proxy <-- gzip --> JGit
Use a more standard method of parsing by splitting the value into
tokens, and only using gzip if one of the tokens is exactly the
string "gzip". Add a unit test to make sure this isn't broken in
the future.
Change-Id: Ib4c40f9db177322c7a2640808a6c10b3c4a73819
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Some clients coming through proxies may advertise a different
Accept-Encoding, for example "Accept-Encoding: gzip(proxy)".
Matching by substring causes us to identify this as a false positive;
that the client understands gzip encoding and will inflate the
response before reading it.
In this particular case however it doesn't. Its the reverse proxy
server in front of JGit letting us know the proxy<->JGit link can
be gzip compressed, while the client<->proxy part of the link is not:
client <-- no gzip --> proxy <-- gzip --> JGit
Use a more standard method of parsing by splitting the value into
tokens, and only using gzip if one of the tokens is exactly the
string "gzip". Add a unit test to make sure this isn't broken in
the future.
Change-Id: I30cda8a6d11ad235b56457adf54a2d27095d964e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The strings are externalized into the root resource bundles.
The resource bundles are stored under the new "resources" source
folder to get proper maven build.
Strings from tests are, in general, not externalized. Only in
cases where it was necessary to make the test pass the strings
were externalized. This was typically necessary in cases where
e.getMessage() was used in assert and the exception message was
slightly changed due to reuse of the externalized strings.
Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
http.server: Use TemporaryBuffer and compress some responses
The HTTP server side code now uses the same approach that the smart
HTTP client code uses when preparing a request body. The payload
is streamed into a TemporaryBuffer of limited size. If the entire
data fits, its compressed with gzip if the user agent supports that,
and a Content-Length header is used to transmit the fixed length
body to the peer. If however the data overflows the limited memory
segment, its streamed uncompressed to the peer.
One might initially think that larger contents which overflow
the buffer should also be compressed, rather than sent raw, since
they were deemed "large". But usually these larger contents are
actually a pack file which has been already heavily compressed by
Git specific routines. Trying to deflate that with gzip is probably
going to take up more space, not less, so the compression overhead
isn't worthwhile.
This buffer and compress optimization helps repositories with a
large number of references, as their text based advertisements
compress well. For example jgit's own native repository currently
requires 32,628 bytes for its full advertisement of 489 references.
Most repositories have fewer references, and thus could compress
their entire response in one buffer.
Change-Id: I790609c9f763339e0a1db9172aa570e29af96f42
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This is a simple HTTP server that provides the minimum server side
support required for dumb (non-git aware) transport clients.
We produce the info/refs and objects/info/packs file on the fly
from the local repository state, but otherwise serve data as raw
files from the on-disk structure.
In the future we could better optimize the FileSender class and the
servlets that use it to take advantage of direct file to network
APIs in more advanced servlet containers like Jetty.
Our glue package borrows the idea of a micro embedded DSL from
Google Guice and uses it to configure a collection of Filters
and HttpServlets, all of which are matched against requests using
regular expressions. If a subgroup exists in the pattern, it is
extracted and used for the path info component of the request.
Change-Id: Ia0f1a425d07d035e344ae54faf8aeb04763e7487
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>