When writing new packs it should be allowed to specify objects as "have"
(objects which should not be included in the pack) which do not exist in
the local repository.
This works with the traditional PackWriter, but when PackWriter was
working on a repository with bitmap indexes and used
PackWriterBitmapWalker then this feature was broken. Non-existing "have"
objects lead to MissingObjectExceptions. That broke push and Gerrit
replication. When the replication target had branches unknown to the
replication source then the source repository wanted to build pack files
where "have" included branch-tips which were unknown in the source
repository.
Bug: 427107
Change-Id: I6b6598a1ec49af68aa77ea6f1f06e827982ea4ac
Also-by: Matthias Sohn <matthias.sohn@sap.com>
Windows: Hide the .git directory if hidedotfiles is set to non-false
Other .git files are not hidden with this patch
Change-Id: Idf63ca08d08f3a77c33f5848d02074f8d6a75758
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Prevent NPE if no CredentialsProvider is registered
If the git server requires authentication and no CredentialsProvider is
registered TransportHttp.connect() would throw an NPE since it tries to
reset the credentials provider. Instead throw a TransportException
explaining the problem.
Change-Id: Ib274e7d9c43bba301089975423de6a05ca5169f6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
According to http://stackoverflow.com/a/8381338, the maximum array
size is not Integer.MAX_VALUE, but Integer.MAX_VALUE - 8
Change-Id: I6ddc7470368acd20abf0885c53c89a982bb0f176
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Cleanup use of java.util.Inflater, fixing rare infinite loops
The native implementation of inflate() can set finished to return
true at the same time as it copies the last bytes into the buffer.
Check for finished on each iteration, terminating as soon as libz
knows the stream was completely inflated.
If not finished, it is likely input is required before the next
native call could do any useful work. Most invocations are passing
in a buffer large enough to store the entire result. A partial return
from inflate() will need more input before it can continue. Checking
right away that needsInput() is true saves a native call to determine
no bytes can be inflated without more input.
This should fix a rare infinite loop condition inside of inflation
when an object ends exactly at the end of a block boundary, and
the next block contains only the 20 byte trailing SHA-1.
When the stream is finished each new attempt to inflate() returns
n == 0, as no additional bytes were output. The needsInput() test
tries to add the length of the footer block to itself, but then loops
back around an reloads the same block as the block is smaller than
a full block size. A zero length input is set to the inflater,
which triggers needsInput() condition again.
Change-Id: I95d02bfeab4bf995a254d49166b4ae62d1f21346
Revert "Add a method to DfsOutputStream to read as an InputStream"
This reverts commit b646578d89.
openInputStream() is never used in JGit, nor is it used by any
known working DFS implementation. The method was added as a
utility for reading back from a DfsInserter, but the final
implementation of that feature does not requrire this method.
Change-Id: I075ad95e40af49c92b554480f8993ef5658f7684
Add a method to ObjectInserter to read back inserted objects
In the DFS implementation, flushing an inserter writes a new pack to
the storage system and is potentially very slow, but was the only way
to ensure previously-inserted objects were available. For some tasks,
like performing a series of three-way merges, the total size of all
inserted objects may be small enough to avoid flushing the in-memory
buffered data.
DfsOutputStream already provides a read method to read back from the
not-yet-flushed data, so use this to provide an ObjectReader in the
DFS case.
In the file-backed case, objects are written out loosely on the fly,
so the implementation can just return the existing WindowCursor.
Change-Id: I454fdfb88f4d215e31b7da2b2a069853b197b3dd
DfsInserter: buffer up to streamFileThreshold from InputStream
Since 2badedcbe0 in-core merges can write up to 10 MiB
into a TemporaryBuffer.Heap strategy, where the data is stored
as a chain of byte[] blocks.
Support the inserter reading up to the streamFileThreshold (default 50
MiB) from the supplied input stream and hash the content to determine
if the merged result blob is already present in the repository. This
allows the inserter to avoid creating duplicate objects in more cases,
reducing repository pack file churn.
Change-Id: I38967e2a0cff14c0a856cdb46a2c8fedbeb21ed5
Use bitcheck to check for presence of OPT_FULL option
Previously an equality check was performed so an exception would
be thrown if any other options were set.
Change-Id: I36b60e2c0a8aef9fcfe663055dba520192996872
Fixed message for exception thrown during recursive merge
During recursive merge jgit potentially has to merge multiple
common ancestors. If this fails because there are conflicts then
the exception thrown for that should have a message which states
this clearly. Previously a wrong message was given ("More than 200
merge bases ...")
Change-Id: Ia3c058d5575decdefd50390ed83b63668d31c1d1
Allow retrying connecting SshSession in case of an exception
Connecting to an SshSession may fail due to different reasons. Jsch for
example often throws an com.jcraft.jsch.JschException: verify: false.[1]
The issue is still not fixed in JSch 0.1.51.
In such a case it is worth retrying to connect. The number of connection
attempts can be configured using ssh_config parameter
"ConnectionAttempts" [2].
Don't retry if the user canceled authentication.
[1] http://sourceforge.net/p/jsch/bugs/58/
[2] http://linux.die.net/man/5/ssh_config
Bug: 437656
Change-Id: I6dd2a3786b7d3f15f5a46821d8edac987a57e381
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
DeltaTask$Block.partitionTask was doing an infinite loop if number of
threads was greater than the totalWeight. The weightPerThread was 0
which was causing the infinite loop. Set the weightPerThread to a
minimal value of one.
Bug: 420915
Change-Id: Ia8e3ad956d53d8193937b7fa1bc19aafde9767ff
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
Allow to include untracked files in stash operations.
Unstashed changes are saved in a commit which is added as an additional
parent to the stash commit.
This behaviour is fully compatible with C Git stashing of untracked
files.
Bug: 434411
Change-Id: I2af784deb0c2320bb57bc4fd472a8daad8674e7d
Signed-off-by: Andreas Hermann <a.v.hermann@gmail.com>
Fix for reflog corruption caused by multiline message
If a client passes a multiline message as argument to ReflogWriter.log()
the Reflog gets corrupted and cannot be parsed. ReflogWriter.log() is
invoked implicitly from various commands such as StashCreate, Rebase and
many more. However the message is not always filtered for line feeds.
Such an example is the StashCreateOperation of EGit which passes
unchecked user input as commit message. If a multiline comment is pasted
to the stash create dialog, the reflog gets corrupted.
ReflogWriter now replaces line endings in log message with spaces.
Bug: 435509
Change-Id: I3010cc902e13bee4d7b6696dfd11ab51062739d3
Signed-off-by: Andreas Hermann <a.v.hermann@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
By specifying a mainline parent, a merge is cherry picked as if this
parent was its only parent. If no mainline parent is given, cherry
picking merges is not allowed, as before.
Change-Id: I391cb73bf8f49e2df61428c17b40fae8c86a8b76
Signed-off-by: Konrad Kügler <swamblumat-eclipsebugs@yahoo.de>
Streaming packed deltas is so slow that it never feasibly completes
(it will take hours for it to stream a few hundred megabytes on
relatively fast systems with a large amount of storage). This
was indicated as a "failed experiment" by Shawn in the following
mailing list post:
http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg01674.html
Change-Id: Idc12f59e37b122f13856d7b533a5af9d8867a8a5
Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
When working on a non-bare repository with a detached HEAD jgit's GC was
packing the ref named "HEAD" into the packed-refs file and deleted the
loose ref (the file .git/HEAD!). This made the repo unusable for native
git. This is fixed by telling jgit to only pack refs starting from
"refs/"
Change-Id: I50018aa006f18b244d2cae2ff78b5ffe1b821d63
PostReceiveHooks can make use of this information to, for example,
update a cached size of the Git repository.
Change-Id: I2bf1200959a50531e2155a7609c96035ba45b10d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Revert "Add getPackFile to ReceivePack to make PostReceiveHook more
usable"
This reverts commit 2670fd427c.
By returning an instance of File from the ReceivePack.getPackFile the
abstraction of the persistence implementation was broken.
Change-Id: I28e3ebf3a659a7cbc94be51bba9e1ad338f2b786
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Some of our Windows users have reported sporadic file system access
problems related to ObjectDirectory(Inserter) file deletion code in
combination with antiviral/firewall tools. For one of these users the
problem was fairly reproducible and changing deletion to RETRY solved
his problem.
Change-Id: I1e4001d5557fca693b7bac401268599467cb0c9e
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Add getPackFile to ReceivePack to make PostReceiveHook more usable
Having access to the pack file that was created by the ReceivePack
may be useful for post receive hooks. For example, a hook may want
to check the size of the received pack and the created index.
Change-Id: I4d51758e4565d32c9f8892242947eb72644b847d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Possibility to limit the max pack size on receive-pack
The maxPackSizeLimit, when set, will reject a pack if it exceeds
that limit.
This feature is intended to provide a mechanism to control disk space
quota on Git repositories.
Change-Id: I83d8db670875c395f8171461b402083323e623a5
CQ: 7896
Move Apache httpclient based HTTP support to a separate bundle
This move avoids that all consumers of org.eclipse.jgit depend on Apache
httpclient. Also add another feature to make this optional for OSGi
consumers as well.
Change-Id: I5ef5e00c53678b9e1d7cfd54bbca3ff6f1c1c967
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add an implementation for HttpConnection using Apache HttpClient
This change implements the http connection abstraction with the help of
org.apache.http.client.HttpClient. The default implementation used by
JGit is still the JDK HttpURLConnection. But now JGit users have the
possibility to switch completely to org.apache.httpclient. The reason
for this is that in certain (e.g. cloud) environments you are forced to
use the org.apache classes.
Change-Id: I0b357f23243ed13a014c79ba179fa327dfe318b2
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The change includes comparing symbolic links between disk and index,
adding symbolic links to the index, creating/modifying links on
checkout. The behavior is controlled by the core.symlinks setting, just
as C Git does. When a new repository is created core.symlinks will be
set depending on the capabilities of the operating system and Java
runtime.
If core.symlinks is set to true, the assumption is that symlinks are
supported, which may result in runtime errors if this turns out not to
be the case.
Measuring the cost of jgit status on a repository with ~70000 files,
of which ~30000 are tracked reveals a penalty of about 10% for using
the Java7 (really NIO2) support module.
Bug: 354367
Change-Id: I12f0fdd9d26212324a586896ef7eb1f6ff89c39c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix MissingObjectException race in ObjectDirectory
Johannes Carlsson identified a race condition[1] that can lead to
spurious MissingObjectExceptions at read time. If two threads are
active inside of ObjectDirectory looking for a packed object and the
packList is currently the empty NO_PACKS list, thread A will find
no object and eventually consider tryAgain1(). If thread A is put
to sleep and this point and thread B also does not find the object,
loads the packs, when thread A wakes up its tryAgain1 would return
false and the thread never considers the packs.
Rework the internal API of ObjectDirectory to keep a handle on the
exact PackList that was iterated by thread A, allowing it to always
retry walking through the packs if the new PackList is different.
This had some ripple effect into the CachedObjectDirectory and
the shared FileObjectDatabase interface. The new code should be
slightly easier to follow, especially from the perspective of the
CachedObjectDirectory trying to minimize the number of open system
calls it makes to files matching "$GIT_DIR/objects/??/?x{38}".
[1] http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg02401.html
Change-Id: I9a1c9d6ad6cb38404b7b9178167b714077561353
Package was renamed, so I had to update the imports. Also, I verified
bitmap serialization was still compatible.
Change-Id: I161ad3875b963b56001beab477ef8d072accee4f
More helpful InvalidPathException messages (include reason)
Instead of just a generic "Invalid path: $path", add a reason for the
cases where it's not obvious what the problem is (e.g. "aux" being
reserved on Windows).
Bug: 413915
Change-Id: Ia6436bd2560e4f049c92d9aac907cb87348605e0
Signed-off-by: Robin Stocker <robin@nibor.org>
Cache SimpleDateFormat in GitDateParser per locale
Otherwise switching to another locale yields wrong results when parsing
date strings in GitDateParser. Since the MockSystemReader explicitly
uses english locale the tests need to specify the locale to be used when
parsing date strings.
Bug: 420772
Change-Id: I313ef6b1e9ef3bfb43d929ce34712ebd21f2cd9c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Do not update the ref hot bit when checking isIndexLoaded
DfsPackFile.isIndexLoaded() uses the DfsBlockCache.Ref.get() method
to check if the index loaded. However, using the get() method marks
a hot bit in the cache, which can cause the index to never be unloaded
and seem hotter than it really is. Add a has() method which only
checks if the value is not null and does not update the hot bit.
Change-Id: I7e9ed216f6e273e8f5d79ae573973197654419b4
Don't delete .idx file if .pack file can't be deleted
If during an garbage collection old packfiles are deleted it could
happen that on certain platforms the index file can be deleted but the
packfile can't be deleted (because someone locked the file). This led
to repositories with packfiles without corresponding index files. Those
zombie-packfiles potentially consume a lot of space on disk and it is
never tried to delete them again. Try to avoid this situation by
deleting packfiles first and don't try to delete the other files if we
can't delete the packfile. This gives us the chance to delete the
packfile during next GC.
This commit only improves the situation - there is still the chance for
orphan files during packfile deletion. We don't have an atomic delete
of multiple files .
Change-Id: I0a19ae630186f07d0cc7fe9df246fa1cedeca8f6
Add Squash/Fixup support for rebase interactive in RebaseCommand
The rebase command now supports squash and fixup. Both actions are not
allowed as the first step of the rebase.
In JGit, before any rebase step is performed, the next commit is
already cherry-picked. This commit keeps that behaviour. In case of
squash or fixup a soft reset to the parent is perfomed afterwards.
CQ: 7684
Bug: 396510
Change-Id: I3c4190940b4d7f19860e223d647fc78705e57203
Signed-off-by: Tobias Pfeifer <to.pfeifer@web.de>
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Enhance reading of git-rebase-todo formatted files
Reading and writing files formatted like the git-rebase-todo files was
hidden in the RebaseCommand. Certain constructs (like leading tabs and
spaces) have not been handled as in native git. Also the upcoming
rebase interactive feature in EGit needs reading/writing these files
independently from a RebaseCommand.
Therefore reading and writing those files has been moved to the
Repository class. RebaseCommand gets smaller because of that and doesn't
have to deal with reading/writing files.
Additional tests for empty todo-list files, or files containing comments
have been added.
Change-Id: I323f3619952fecdf28ddf50139a88e0bea34f5ba
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Also-by: Tobias Pfeifer <to.pfeifer@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Propagate IOException where possible when getting refs.
Currently, Repository.getAllRefs() and Repository.getTags() silently
ignores an IOException and instead returns an empty map. Repository
is a public API and as such cannot be changed until the next major
revision change. Where possible, update the internal jgit APIs to
use the RefDatabase directly, since it propagates the error.
Change-Id: I4e4537d8bd0fa772f388262684c5c4ca1929dc4c
Ignore bitmap indexes that do not match the pack checksum
If `git gc` creates a new pack with the same file name, the
pack checksum may not match that in the .bitmap. Fix the PackFile
implementaion to silently ignore invalid bitmap indexes.
Fixes Issue https://code.google.com/p/gerrit/issues/detail?id=2131
Change-Id: I378673c00de32385ba90f4b639cb812f9574a216
Remove unneeded packs when compacting with no new objects
Previously, the DfsPackCompactor exited without pruning the existing
packs, when no new packs were created.
Change-Id: I5e3b6f8c789706c7a982e6ae93cf7c3d4346797c
OpenJDK 7 does not benefit from using an inflate stride on the input
array. The implementation of java.util.zip.Inflater supplies the
entire input byte[] to libz, with no regards for the bounds supplied.
Slicing at 512 byte increments in DfsBlock no longer has any benefit.
In OpenJDK 6 the native portion of Inflater used GetByteArrayRegion
to obtain a copy of the input buffer for libz. In this use case
supplying a small stride made sense, it avoided allocating space
for and copying data past the end of the object's compressed stream.
In OpenJDK 7 the native code uses GetPrimitiveArrayCritical,
which tries to avoid copying by freezing Java garbage collection
and accessing the byte[] contents in place. On OpenJDK 7 derived
JVMs it is likely more efficient to supply the entire DfsBlock.
Since OpenJDK 5 and 6 are deprecated and replaced by OpenJDK 7
it is reasonable to suggest any consumers running JGit with DFS
support use an OpenJDK 7 derived JVM. However, JGit still targets
local filesystem support on Java 5, so it is still not reasonble to
apply this same simplification to the internal.storage.file package.
See: JDK-6751338 (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6751338)
Change-Id: Ib248b6d383da5c8aa887d9c355a0df6f3e2247a5
Previously it took 1200ms to create a reverse index (sorted by offset).
Using a simple bucket sort algorithm, that time is reduced to 450ms.
The bucket index into the offset array is kept, in order to decrease
the binary search window.
Don't keep a copy of the offsets. Instead, use nth position
to lookup the offset in the PackIndex.
Change-Id: If51ab76752622e04a4430d9a14db95ad02f5329d
Currently, the offset can only be retrieved by ObjectId or iterating all
of the entries. Add a method to lookup the offset by position in the
index sorted by SHA1.
Change-Id: I45e9ac8b752d1dab47b202753a1dcca7122b958e