BasePackConnection: Check for expected length of ref advertisement
When a server sends a ref advertisement using protocol v2 it contains
lines other than ref names and sha1s. Attempting to get the sha1 out
of such a line using the substring method can result in a SIOOB error
when it doesn't actually contain the sha1 and ref name.
Add a check that the line is of the expected length, and subsequently
that the extracted object id is valid, and if not throw an exception.
Change-Id: Id92fe66ff8b6deb2cf987d81929f8d0602c399f4
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Remove it from
* package private functions.
* try blocks
* for loops
this was done with the following python script:
$ cat f.py
import sys
import re
import os
def replaceFinal(m):
return m.group(1) + "(" + m.group(2).replace('final ', '') + ")"
methodDecl = re.compile(r"^([\t ]*[a-zA-Z_ ]+)\(([^)]*)\)")
def subst(fn):
input = open(fn)
os.rename(fn, fn + "~")
dest = open(fn, 'w')
for l in input:
l = methodDecl.sub(replaceFinal, l)
dest.write(l)
dest.close()
for root, dirs, files in os.walk(".", topdown=False):
for f in files:
if not f.endswith('.java'):
continue
full = os.path.join(root, f)
print full
subst(full)
Change-Id: If533a75a417594fc893e7c669d2c1f0f6caeb7ca
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Enable and fix warnings about redundant specification of type arguments
Since the introduction of generic type parameter inference in Java 7,
it's not necessary to explicitly specify the type of generic parameters.
Enable the warning in Eclipse, and fix all occurrences.
Change-Id: I9158caf1beca5e4980b6240ac401f3868520aad0
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
When setting timeout on push, BasePackConnection creates a timer, which
will be terminated when push finishes. But, when using
SmartHttpPushConnection, it dropped the first timer created in the
constructor and then created another timer in doPush. If new threads are
created faster than the gc collects then this may stop the service if
it's hitting the max process limit. Hence don't create a new timer if it
already exists.
Bug: 474947
Change-Id: I6746ffe4584ad919369afd5bdbba66fe736be314
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Since git-core ff5effd (v1.7.12.1) the native wire protocol transmits
the server and client implementation and version strings using
capability "agent=git/1.7.12.1" or similar.
Support this in JGit and hang the implementation data off UploadPack
and ReceivePack. On HTTP transports default to the User-Agent HTTP
header until the client overrides this with the optional capability
string in the first line.
Extract the user agent string into a UserAgent class under transport
where it can be specified to a different value if the application's
build process has broken the Implementation-Version header in the
JGit package.
Change-Id: Icfc6524d84a787386d1786310b421b2f92ae9e65
A few classes such as Constanrs are marked with @SuppressWarnings, as are
toString() methods with many liternal, but otherwise $NLS-n$ is used for
string containing text that should not be translated. A few literals may
fall into the gray zone, but mostly I've tried to only tag the obvious
ones.
Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590
The strings are externalized into the root resource bundles.
The resource bundles are stored under the new "resources" source
folder to get proper maven build.
Strings from tests are, in general, not externalized. Only in
cases where it was necessary to make the test pass the strings
were externalized. This was typically necessary in cases where
e.getMessage() was used in assert and the exception message was
slightly changed due to reuse of the externalized strings.
Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
When listing branches, EGit only reads the advertisement and
then disconnects. When it closes down the pack channel the remote
side is waiting for the client to send our list of commands, or a
flush-pkt to let it know there is nothing to do.
However if an error thread is open watching the SSH stderr stream,
we ask for it to finish before we send the flush-pkt. Unfortunately
the thread won't terminate until the main output stream closes,
which is waiting for the flush-pkt. A classic network deadlock.
If the output stream needs a flush-pkt we send it before we wait
for the error stream to close. If the flush-pkt is rejected, we
close down the output stream early, assuming that the remote side
is broken and we will get error information soon.
Change-Id: I8d078a339077756220c113f49d206b1bf295d434
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Reduce multi-level buffered streams in transport code
Some transports actually provide stream buffering on their own,
without needing to be wrapped up inside of a BufferedInputStream in
order to smooth out system calls to read or write. A great example
of this is the JSch SSH client, or the Apache MINA SSHD server.
Both use custom buffering to packetize the streams into the encrypted
SSH channel, and wrapping them up inside of a BufferedInputStream
or BufferedOutputStream is relatively pointless.
Our SideBandOutputStream implementation also provides some fairly
large buffering, equal to one complete side-band packet on the main
data channel. Wrapping that inside of a BufferedOutputStream just to
smooth out small writes from PackWriter causes extra data copies, and
provides no advantage. We can save some memory and some CPU cycles
by letting PackWriter dump directly into the SideBandOutputStream's
internal buffer array.
Instead we push the buffering streams down to be as close to the
network socket (or operating system pipe) as possible. This allows
us to smooth out the smaller reads/writes from pkt-line messages
during advertisement and negotation, but avoid copying altogether
when the stream switches to larger writes over a side band channel.
Change-Id: I2f6f16caee64783c77d3dd1b2a41b3cc0c64c159
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Catch and report "ERR message" during remote advertisements
GitHub broke the native git protocol a while ago by interjecting an
"ERR message" line into the upload-pack or receive-pack advertisement
list. This didn't match the expected pattern, so it caused existing
C Git clients to abort with a protocol exception.
These days, C Git clients actually look for this message and abort
with a more graceful notice to the end-user. JGit should do the
same, including setting up a custom exception type that makes it
easier for higher-level UIs to identify a message from the remote
site and present it to the user.
Change-Id: I51ab62a382cfaf1082210e8bfaa69506fd0d9786
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Rewrite reference handling to be abstract and accurate
This commit actually does three major changes to the way references
are handled within JGit. Unfortunately they were easier to do as
a single massive commit than to break them up into smaller units.
Disambiguate symbolic references:
---------------------------------
Reporting a symbolic reference such as HEAD as though it were
any other normal reference like refs/heads/master causes subtle
programming errors. We have been bitten by this error on several
occasions, as have some downstream applications written by myself.
Instead of reporting HEAD as a reference whose name differs from
its "original name", report it as an actual SymbolicRef object
that the application can test the type and examine the target of.
With this change, Ref is now an abstract type with different
subclasses for the different types.
In the classical example of "HEAD" being a symbolic reference to
branch "refs/heads/master", the Repository.getAllRefs() method
will now return:
Map<String, Ref> all = repository.getAllRefs();
SymbolicRef HEAD = (SymbolicRef) all.get("HEAD");
ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master");
assertSame(master, HEAD.getTarget());
assertSame(master.getObjectId(), HEAD.getObjectId());
assertEquals("HEAD", HEAD.getName());
assertEquals("refs/heads/master", master.getName());
A nice side-effect of this change is the storage type of the
symbolic reference is no longer ambiguous with the storge type
of the underlying reference it targets. In the above example,
if master was only available in the packed-refs file, then the
following is also true:
assertSame(Ref.Storage.LOOSE, HEAD.getStorage());
assertSame(Ref.Storage.PACKED, master.getStorage());
(Prior to this change we returned the ambiguous storage of
LOOSE_PACKED for HEAD, which was confusing since it wasn't
actually true on disk).
Another nice side-effect of this change is all intermediate
symbolic references are preserved, and are therefore visible
to the application when they walk the target chain. We can
now correctly inspect chains of symbolic references.
As a result of this change the Ref.getOrigName() method has been
removed from the API. Applications should identify a symbolic
reference by testing for isSymbolic() and not by using an arcane
string comparsion between properties.
Abstract the RefDatabase storage:
---------------------------------
RefDatabase is now abstract, similar to ObjectDatabase, and a
new concrete implementation called RefDirectory is used for the
traditional on-disk storage layout. In the future we plan to
support additional implementations, such as a pure in-memory
RefDatabase for unit testing purposes.
Optimize RefDirectory:
----------------------
The implementation of the in-memory reference cache, reading, and
update routines has been completely rewritten. Much of the code
was heavily borrowed or cribbed from the prior implementation,
so copyright notices have been left intact as much as possible.
The RefDirectory cache no longer confuses symbolic references
with normal references. This permits the cache to resolve the
value of a symbolic reference as late as possible, ensuring it
is always current, without needing to maintain reverse pointers.
The cache is now 2 sorted RefLists, rather than 3 HashMaps.
Using sorted lists allows the implementation to reduce the
in-memory footprint when storing many refs. Using specialized
types for the elements allows the code to avoid additional map
lookups for auxiliary stat information.
To improve scan time during getRefs(), the lists are returned via
a copy-on-write contract. Most callers of getRefs() do not modify
the returned collections, so the copy-on-write semantics improves
access on repositories with a large number of packed references.
Iterator traversals of the returned Map<String,Ref> are performed
using a simple merge-join of the two cache lists, ensuring we can
perform the entire traversal in linear time as a function of the
number of references: O(PackedRefs + LooseRefs).
Scans of the loose reference space to update the cache run in
O(LooseRefs log LooseRefs) time, as the directory contents
are sorted before being merged against the in-memory cache.
Since the majority of stable references are kept packed, there
typically are only a handful of reference names to be sorted,
so the sorting cost should not be very high.
Locking is reduced during getRefs() by taking advantage of the
copy-on-write semantics of the improved cache data structure.
This permits concurrent readers to pull back references without
blocking each other. If there is contention updating the cache
during a scan, one or more updates are simply skipped and will
get picked up again in a future scan.
Writing to the $GIT_DIR/packed-refs during reference delete is
now fully atomic. The file is locked, reparsed fresh, and written
back out if a change is necessary. This avoids all race conditions
with concurrent external updates of the packed-refs file.
The RefLogWriter class has been fully folded into RefDirectory
and is therefore deleted. Maintaining the reference's log is
the responsiblity of the database implementation, and not all
implementations will use java.io for access.
Future work still remains to be done to abstract the ReflogReader
class away from local disk IO.
Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
During fetch over http:// clients now try to take advantage of
the info/refs?service=git-upload-pack URL to determine if the
remote side will support a standard upload-pack command stream.
If so each block of 32 have lines is sent in one POST request,
prefixed by all of the 'want' lines and any previously discovered
common bases as 'have' lines.
During push over http:// clients now try to take advantage of
the info/refs?service=git-receive-pack URL to determine if the
remote side will support a standard receive-pack command stream.
If so, commands are sent along with their pack in a single HTTP
POST request.
Bug: 291002
Change-Id: I8c69b16ac15c442e1a4c3bd60b4ea1a47882b851
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Per CQ 3448 this is the initial contribution of the JGit project
to eclipse.org. It is derived from the historical JGit repository
at commit 3a2dd9921c.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>