Buck does not create a target directory for the build output, this
is Maven specific and the project unit tests should not rely on it.
Instead follow the pattern used by org.eclipse.jgit.test which is to
create a temporary directory in the system temporary folder, and
configure the Maven surefire plugin to use the target directory.
Change-Id: Iebe5093332343a90f51080614e083aac0d29c26d
JGit 3.0: move internal classes into an internal subpackage
This breaks all existing callers once. Applications are not supposed
to build against the internal storage API unless they can accept API
churn and make necessary updates as versions change.
Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9
A pack bitmap index is an additional index of compressed
bitmaps of the object graph. Furthermore, a logical API of the index
functionality is included, as it is expected to be used by the
PackWriter.
Compressed bitmaps are created using the javaewah library, which is a
word-aligned compressed variant of the Java bitset class based on
run-length encoding. The library only works with positive integer
values. Thus, the maximum number of ObjectIds in a pack file that
this index can currently support is limited to Integer.MAX_VALUE.
Every ObjectId is given an integer mapping. The integer is the
position of the ObjectId in the complete ObjectId list, sorted
by offset, for the pack file. That integer is what the bitmaps
use to reference the ObjectId. Currently, the new index format can
only be used with pack files that contain a complete closure of the
object graph e.g. the result of a garbage collection.
The index file includes four bitmaps for the Git object types i.e.
commits, trees, blobs, and tags. In addition, a collection of
bitmaps keyed by an ObjectId is also included. The bitmap for each entry
in the collection represents the full closure of ObjectIds reachable
from the keyed ObjectId (including the keyed ObjectId itself). The
bitmaps are further compressed by XORing the current bitmaps against
prior bitmaps in the index, and selecting the smallest representation.
The XOR'd bitmap and offset from the current entry to the position
of the bitmap to XOR against is the actual representation of the entry
in the index file. Each entry contains one byte, which is currently
used to note whether the bitmap should be blindly reused.
Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
Include supported extensions in PackFile constructor.
Previously a PackFile class was assumed to only support a .pack and .idx
file. Update the constructor to enumerate the supported extensions for
the pack file. This will allow the bitmap code to only be executed if
the bitmap extension file is known to exist.
Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf
File system time stamps and System.currentTimeMillis() may not
necessarily be running on the same clock so add some slack.
Bug: 396662
Change-Id: I25204d9e3181e15368da2902447518c6ce205017
Update DfsGarbageCollector to not read back a pack index.
Previously, the Dfs GC excluded objects from packs by passing a
previously written index to the PackWriter. Reading back a file on
Dfs is slow. Instead, allow the PackWriter to expose the objects
included in a pack and forward that to invocations of excludeObjects() .
Change-Id: I377cb4ab07f62cf790505e1eeb0b2efe81897c79
Fix concurrent creation of fan-out object directories
If multiple threads attempted to insert loose objects into the same new
fan-out directory, the creation of that directory was subject to a race
condition that could lead to an unnecessary IOException being thrown -
because an inserter could not 'create' a directory that had just been
generated by a different thread. All we require is that the directory
does indeed *exist*, so not being able to _create_ it is not actually a
fatal problem. Setting 'skipExisting' to 'true' on the call to mkdir()
fixes the issue.
I found this issue as a real world occurrence while working on The BFG
Repo Cleaner (https://github.com/rtyley/bfg-repo-cleaner), a tool which
concurrently performs a lot of object creation.
In order to demonstrate the problem here I've added a small test case
which reliably reproduces the issue on the few different hardware
systems I've tried. The error thrown when the race-condition arises is
this:
java.io.IOException: Creating directory /home/roberto/repo.git/objects/e6 failed
at org.eclipse.jgit.util.FileUtils.mkdir(FileUtils.java:182)
at org.eclipse.jgit.storage.file.ObjectDirectory.insertUnpackedObject(ObjectDirectory.java:590)
at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insertOneObject(ObjectDirectoryInserter.java:113)
at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insert(ObjectDirectoryInserter.java:91)
at org.eclipse.jgit.lib.ObjectInserter.insert(ObjectInserter.java:329)
Change-Id: I88eac49bc600c56ba9ad290e6133d8a7113125ab
Remove packIndex field from FileObjDatabase openPack method.
Previously, the FileObjDatabase required both the pack file path and
index file path to be passed to openPack(). A future change to add
a bitmap index will add a .bitmap file parallel to the pack file
(similar to the .idx file). Update the PackFile to support
automatically loading pack index extensions based on the pack file
path.
Change-Id: Ifc8fc3e57f4afa177ba5a88df87334dbfa799f01
These test classes heavily rely on Tree and associated classes. They
are convenient for building test cases and hence not yet replaced, but
there is a deprecation warning at about every line, which is not helpful.
Change-Id: Ia7cc8f3bb980dc03055b94748b6c7529a82ea5a5
For streams that should not be closed, i.e. don't own an underlying
stream, and in-memory streams that do not need to be closed we just
suppress the warning. This mostly apply to test cases. GC is enough.
For streams with external resources (i.e. files) we add the necessary
call to close().
Change-Id: I4d883ba2e7d07f199fe57ccb3459ece00441a570
Suppress boxing warnings where we know they are ok
Invoke the wrapper types' valueOf via static imports.
For booleans used in asserts, add a new assert in
the JUnit utility package since out current version of JUnit
does not have the assert(boolean, boolean) method.
Change-Id: I9099bd8efbc8c133479344d51ce7dabed8958a2b
Some GC tests were sporadically failing. The reason was that they used
the setExpireAgeMillis method to define object expiration before
invoking the prune method. Depending on the CPU load during the test
run, the prune method may reach an object (which is considered
non-expired by the test) too late and actually prune it.
To make the test stable we now use the setExpire(Date expire) method and
define a time instant before which objects are considered to be expired.
This way the outcome of the prune method doesn't depend on the CPU load.
Change-Id: Ifc3323ca55ae56dbccdbc90a282ec3cf18ad7297
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Change-Id: Id5b578f7040c6c896ab9386a6b5ed62b0f495ed5
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
JGit was not able to lookup refs which had the name of files which exist
in the .git folder. When JGit was looking up a ref named X it has a
fixed set of directories where it searched for files named X
(ignore packed refs for now). First directory to search for is .git. In
case of the ref named 'config' it searched there for this file, found it
(it's the .git/config file with the repo configuration in it), parsed
it, found it is an invalid ref and stopped searching. It never looked
for a file .git/refs/heads/config.
I changed JGit in a way that when it finds a file in GIT_DIR which
corresponds to a ref name and if this file doesn't contain a valid ref
then it will ignore the InvalidObjectIdException and continue searching.
Change-Id: Ic26a329fb1624a5b2b2494c78bac4bd76817c100
Bug: 381574
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Implements a garbage collector for FileRepositories. Main ideas are
copied from the garbage collector for DFS based repos
(DfsGarbageCollector). Added functionalities are
- pruning loose objects
- handling of the index
- packing refs
- handling of reflogs (objects referenced from reflog will not be
pruned/)
These are features of a GC which are not handled in this change and
which should come with subsequent changes:
- unpacking packed objects into loose objects (to support that pruning
packed objects doesn't delete them until they are older than two weeks)
- expiration of reflogs
- support for configuration parameters (e.g. gc.pruneExpire)
Change-Id: I14ea5cb7e0fd1b5c50b994fd77f4e05bfbb9d911
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
LockFileTest was failing on Windows because we couldn't delete the lock
file of the index. The reason was that a LockFile instance still had an
open handle to the lock file preventing us to delete the file (in
contrast to the behavior on other platforms).
Change-Id: I1d50442b7eb8a27f98f69ad77c5e24a9698a7b66
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Support gitdir: refs in BaseRepositoryBuilder.findGitDir
This allows findGitDir to be used for repositories containing
a .git file with a gitdir: ref to the repository's directory
such as submodule repositories that point to a folder under the
parent repository's .git/modules folder
Change-Id: I2f1ec7215a2208aa90511c065cadc7e816522f62
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
PackWriter supports excluding objects from being written to the pack.
You may specify a PackIndex which lists all those objects which should
not go into the new pack. This feature was broken because not all
commits have been checked whether they should be excluded or not. For
other object types the exclude algorithm worked. This commit adds the
missing check.
Change-Id: Id0047098393641ccba784c58b8325175c22fcece
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Only increment mod count if packed-refs file changes
Previously if a packed-refs file was racily clean then there
was a 2.5 second window in which each call to getPackedRefs
would increment the mod count causing a RefsChangedEvent to be
fired since the FileSnapshot would report the file as modified.
If a RefsChangedListener called getRef/getRefs from the
onRefsChanged method then a StackOverflowError could occur
since the stack could be exhausted before the 2.5 second
window expired and the packed-refs file would no longer
report being modified.
Now a SHA-1 is computed of the packed-refs file and the
mod count is only incremented when the packed refs are
successfully set and the id of the new packed-refs file
does not match the id of the old packed-refs file.
Change-Id: I8cab6e5929479ed748812b8598c7628370e79697
This allows repositoryies with a missing repositoryformatversion
config value to be successfully opened but still throws exceptions
when the value is a non-long or greater than zero.
git-core attempts to parse this config value as a long as well
and defaults to 0 if the value is missing.
Bug: 368697
Change-Id: I4a93117afca37e591e8e0ab4d2f2eef4273f0cc9
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Make sure all bytes are written to files on close, or get an error.
Java's BufferedOutputStream swallows any errors that occur when flushing
the buffer in close().
This class overrides close to make sure an error during the final
flush is reported back to the caller.
Change-Id: I74a82b31505fadf8378069c5f6554f1033c28f9b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This will allow recovery from a LockFailedException where
the file associated with an exception is passed to FileUtils.unlock
to attempt an unlock on the file so the operation can be retried
Change-Id: I580166d386126bfb54a318a65253070a6e325936
Deprecate GitIndex more by using only DirCache internally.
This includes merging ReadTreeTest into DirCacheCheckoutTest and
converting IndexDiffTest to use DirCache only. The GitIndex specific
T0007GitIndex test remains.
GitIndex is deprecated. Let us speed up its demise by focusing the
DirCacheCheckout tests to using DirCache instead.
This also add explicit deprecation comments to methods that depend
on GitIndex in Repository and TreeEntry. The latter is deprecated in
itself.
Change-Id: Id89262f7fbfee07871f444378f196ded444f2783
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Add a helper for parsing branch switch info out of a reflog entry
Change-Id: I91c7e08c4afd2562df2226887a933d93c78a0371
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
During parsing these are used with contains(). If they are a List
type, the contains operation is not efficient. Some callers such
as UploadPack often pass a List here, so convert to Set when the
type isn't efficient for contains().
Change-Id: If948ae3bf1f46e756bd2d5db14795e12ba7a6207
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Fix reading of ref names containing characters that sort before /
A set of ref names like ('a/b' and 'a+b') would cause the RefDirectory
to think that the set of refs have changed because it traversed the
'a' directory in the subtree before looking at 'a+b', but it then
compared with the know refs which are sorted with 'a+b' first.
Fix this by traversing the refs tree in another order. Treat a directory
as if they ends with a '/' before deciding on the order to traverse
the refs tree.
Bug: 348834
Change-Id: I23377f8df00c7252bf27dbcfba5da193c5403917
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
It's useful to have ReflogEntry refactored out so it can be
used by clients via the JGit API.
Change-Id: I03044df9af9f9547777545b7c9b93bdf5f8b7cb5
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Let RefDirectory use FileSnapShot to handle fast updates
Since this change may affect performance and memory consumption on every
access to a loose ref I explicitly made it a RFC to collect opinions.
Previously RefDirectory.scanRef() was not detecting an update of a
loose ref when the update didn't changed the modification time of
the backing file. RefDirectory cached loose refs and the way to detect
outdated cache entries was to compare lastmodification timestamp on the
file representing the ref. If two updates to the same ref happen faster
than the filesystem-timer granularity (for linux this is 2 seconds)
there is the possiblity that we don't detect the update.
Because of this bug EGit's PushOperationTest only works with 2 second
sleeps inside.
This change let RefDirectory use FileSnapshot to detect such situations.
FileSnapshot helps to remember when a file was last read from disk and
therefore enables to decide when to load a file from disk although
modification time has not changed.
Change-Id: I03b9a137af097ec69c4c5e2eaa512d2bdd7fe080
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
RefDirectory did not correctly follow the contract of RefList. The
contract says if you use add() method of RefList builder, you MUST
sort() it afterwards, and later every other method assumes that list
is properly sorted (especially the binary search in the find() and
get() methods). Instead RefDirectory class tried to scan the refs
recursively while sorting every folder in the process before
processing and did not call sort().
For example, when scanning the contents of refs/tags project1 string
is smaller than project1-*, so it will recursively go into the folder
and add these tags first and only then will add project-* ones. This
will result in a broken list (any project1-* string is less than
project1/* one, but they all appear after them in the list), that's
why binary search will fail making loose RefList and the whole local
RefMap completely unusable.
Change-Id: Ibad90017e3b2435b1396b69a22520db4b1b022bb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Instead of tracking the length and modification time by hand, rely
on FileSnapshot to tell RefDirectory when the $GIT_DIR/packed-refs
file has been changed or should be re-read from disk.
Change-Id: I067d268dfdca1d39c72dfa536b34e6a239117cc3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Fix: possible IndexOutOfBoundsException in ReflogReader
java.lang.IndexOutOfBoundsException
at java.nio.ByteBuffer.wrap(ByteBuffer.java:352)
at org.eclipse.jgit.util.RawParseUtils.decodeNoFallback(RawParseUtils.java:913)
at org.eclipse.jgit.util.RawParseUtils.decode(RawParseUtils.java:880)
at org.eclipse.jgit.util.RawParseUtils.decode(RawParseUtils.java:839)
at org.eclipse.jgit.storage.file.ReflogReader$Entry.<init>(ReflogReader.java:102)
at org.eclipse.jgit.storage.file.ReflogReader.getReverseEntries(ReflogReader.java:183)
at org.eclipse.jgit.storage.file.ReflogReader.getReverseEntries(ReflogReader.java:162)
Change-Id: I22a18bc7193962e5018c40a75337f9976b585c40
PackWriter: Rename getObjectsNumber to getObjectCount
This better matches with PackFile and CachedPack's methods
that return the same value.
Change-Id: Idb9b7c71d2048dd2344a62c2cde20b4e34529ab7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
PackWriter incorrectly returned 0 from getObjectsNumber() when the
pack has not been written yet. This caused dumb transports like
amazon-s3:// and sftp:// to abort early and never write out a pack,
under the assumption that the pack had no objects.
Until the pack header is written to the output stream, compute the
current object count each time it is requested. Once the header is
started, use the object count from the stats object.
Change-Id: I041a2368ae0cfe6f649ec28658d41a6355933900
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
[findbugs] Do not ignore exceptional return value of mkdir
java.io.File.mkdir() and mkdirs() report failure as an exceptional
return value false. Fix the code which silently ignored this
exceptional return value.
Change-Id: I41244f4b9d66176e68e2c07e2329cf08492f8619
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Refactor IndexPack to not require local filesystem
By moving the logic that parses a pack stream from the network (or
a bundle) into a type that can be constructed by an ObjectInserter,
repository implementations have a chance to inject their own logic
for storing object data received into the destination repository.
The API isn't completely generic yet, there are still quite a few
assumptions that the PackParser subclass is storing the data onto
the local filesystem as a single file. But its about the simplest
split of IndexPack I can come up with without completely ripping
the code apart.
Change-Id: I5b167c9cc6d7a7c56d0197c62c0fd0036a83ec6c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Eclipse has some problem re-running single JUnit tests if
the tests are in Junit 3 format, but the JUnit 4 launcher
is used. This was quite unnecessary and the move was not
completed. We still have no JUnit4 test.
This completes the extermination of JUnit3. Most of the
work was global searce/replace using regular expression,
followed by numerous invocarions of quick-fix and organize
imports and verification that we had the same number of
tests before and after.
- Annotations were introduced.
- All references to JUnit3 classes removed
- Half-good replacement for getting the test name. This was
needed to make the TestRngs work. The initialization of
TestRngs was also made lazily since we can not longer find
out the test name in runtime in the @Before methods.
- Renamed test classes to end with Test, with the exception
of TestTranslateBundle, which fails from Maven
- Moved JGitTestUtil to the junit support bundle
Change-Id: Iddcd3da6ca927a7be773a9c63ebf8bb2147e2d13
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
We cannot use SystemReader to get the time, unless we do that consistently,
which is harder to do and be sure we are really testing what we want.
Then we need to update our lastRead variable whenever we conclude that
our file is not racily clean according to lastRead. It may well be clean,
but we do not know that until we check the system clock again.
Finally add a test for this class.
Change-Id: I1894b032b9bd359d1b5325e5472d48e372599e4c
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>