If we use the default system reader FileStoreAttributes cannot persist
attributes in userConfig when tests run in Bazel due to sandboxing.
Hence we need to ensure that all tests use MockSystemReader (and
especially a mocked userConfig).
Change-Id: Ic1ad8e2ec5a150c5433434a5f6667d6c4674c87d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Persist minimal racy threshold and allow manual configuration
To enable persisting the minimal racy threshold per FileStore add a
new config option to the user global git configuration:
- Config section is "filesystem"
- Config subsection is concatenation of
- Java vendor (system property "java.vendor")
- Java version (system property "java.version")
- FileStore's name, on Windows we use the attribute volume:vsn instead
since the name is not necessarily unique.
- separated by '|'
e.g.
"AdoptOpenJDK|1.8.0_212-b03|/dev/disk1s1"
The same prefix is used as for filesystem timestamp resolution, so
both values are stored in the same config section
- The config key for minmal racy threshold is "minRacyThreshold" as a
time value, supported time units are those supported by
DefaultTypedConfigGetter#getTimeUnit
- measure for 3 seconds to limit runtime which depends on hardware, OS
and Java version being used
If the minimal racy threshold is configured for a given FileStore the
configured value is used instead of measuring it.
When the minimal racy threshold was measured it is persisted in the user
global git configuration.
Rename FileStoreAttributeCache to FileStoreAttributes since this class
is now declared public in order to enable exposing all attributes in one
object.
Example:
[filesystem "AdoptOpenJDK|11.0.3|/dev/disk1s1"]
timestampResolution = 7000 nanoseconds
minRacyThreshold = 3440 microseconds
Change-Id: I22195e488453aae8d011b0a8e3276fe3d99deaea
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Also-By: Marc Strapetz <marc.strapetz@syntevo.com>
Measure minimum racy interval to auto-configure FileSnapshot
By running FileSnapshotTest#detectFileModified we found that the sum of
measured filesystem timestamp resolution and measured clock resolution
may yield a too small interval after a file has been modified which we
need to consider racily clean. In our tests we didn't find this behavior
on all systems we tested on, e.g. on MacOS using APFS and Java 8 and 11
this effect was not observed.
On Linux (SLES 15, kernel 4.12.14-150.22-default) we collected the
following test results using Java 8 and 11:
In 23-98% of 10000 test runs (depending on filesystem type and Java
version) the test failed, which means the effective interval which needs
to be considered racily clean after a file was modified is larger than
the measured file timestamp resolution.
"delta" is the observed interval after a file has been modified but
FileSnapshot did not yet detect the modification:
"resolution" is the measured sum of file timestamp resolution and clock
resolution seen in Java.
Java version filesystem failures resolution min delta max delta
1.8.0_212-b04 btrfs 98.6% 1 ms 3.6 ms 6.6 ms
1.8.0_212-b04 ext4 82.6% 3 ms 1.1 ms 4.1 ms
1.8.0_212-b04 xfs 23.8% 4 ms 3.7 ms 3.9 ms
1.8.0_212-b04 zfs 23.1% 3 ms 4.8 ms 5.0 ms
11.0.3+7 btrfs 98.1% 3 us 0.7 ms 4.7 ms
11.0.3+7 ext4 98.1% 6 us 0.7 ms 4.7 ms
11.0.3+7 xfs 98.5% 7 us 0.1 ms 8.0 ms
11.0.3+7 zfs 98.4% 7 us 0.7 ms 5.2 ms
Mac OS
1.8.0_212 APFS 0% 1 s
11.0.3+7 APFS 0% 6 us
The observed delta is not distributed according to a normal gaussian
distribution but rather random in the observed range between "min delta"
and "max delta".
Run this test after measuring file timestamp resolution in
FS.FileAttributeCache to auto-configure JGit since it's unclear what
mechanism is causing this effect.
In FileSnapshot#isRacyClean use the maximum of the measured timestamp
resolution and the measured "delta" as explained above to decide if a
given FileSnapshot is to be considered racily clean. Add a 30% safety
margin to ensure we are on the safe side.
Change-Id: I1c8bb59f6486f174b7bbdc63072777ddbe06694d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Measure filesystem timestamp resolution already in test setup
This helps to avoid some time critical tests can't prepare the test
fixture intended since measuring timestamp resolution takes time.
Change-Id: Ib34023e682a106070ca97e98ef16789a4dfb97b4
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
- use Path instead of File
- create test directories, files and output stream using Files methods
- delete unused list "files"
Change-Id: I8c5c601eca9f613efb5618d33b262277df92a06a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Use constants from StandardCharsets instead of hard-coded strings
Instead of hard-coding the charset strings "US-ASCII", "UTF-8", and
"ISO-8859-1", use the corresponding constants from StandardCharsets.
UnsupportedEncodingException is not thrown when the StandardCharset
constants are used, so remove the now redundant handling.
Because the encoding names are no longer hard-coded strings, also
remove redundant $NON-NLS warning suppressions.
Also replace existing usages of the constants with static imports.
Change-Id: I0a4510d3d992db5e277f009a41434276f95bda4e
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
ConfigTest#pathToString is not visible to FileBasedConfigTest when
bulding with bazel.
Move it to FileUtils rather than messing about with the bazel build
rules to make it visible.
Change-Id: Idcfd4822699dac9dc4a426088a929a9cd31bf53f
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Relative include.path are now resolved against the config's parent
directory. include.path starting with ~/ are resolved against the
user's home directory
Change-Id: I91911ef404126618b1ddd3589294824a0ad919e6
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Buck does not create a target directory for the build output, this
is Maven specific and the project unit tests should not rely on it.
Instead follow the pattern used by org.eclipse.jgit.test which is to
create a temporary directory in the system temporary folder, and
configure the Maven surefire plugin to use the target directory.
Change-Id: Iebe5093332343a90f51080614e083aac0d29c26d
JGit 3.0: move internal classes into an internal subpackage
This breaks all existing callers once. Applications are not supposed
to build against the internal storage API unless they can accept API
churn and make necessary updates as versions change.
Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9
A pack bitmap index is an additional index of compressed
bitmaps of the object graph. Furthermore, a logical API of the index
functionality is included, as it is expected to be used by the
PackWriter.
Compressed bitmaps are created using the javaewah library, which is a
word-aligned compressed variant of the Java bitset class based on
run-length encoding. The library only works with positive integer
values. Thus, the maximum number of ObjectIds in a pack file that
this index can currently support is limited to Integer.MAX_VALUE.
Every ObjectId is given an integer mapping. The integer is the
position of the ObjectId in the complete ObjectId list, sorted
by offset, for the pack file. That integer is what the bitmaps
use to reference the ObjectId. Currently, the new index format can
only be used with pack files that contain a complete closure of the
object graph e.g. the result of a garbage collection.
The index file includes four bitmaps for the Git object types i.e.
commits, trees, blobs, and tags. In addition, a collection of
bitmaps keyed by an ObjectId is also included. The bitmap for each entry
in the collection represents the full closure of ObjectIds reachable
from the keyed ObjectId (including the keyed ObjectId itself). The
bitmaps are further compressed by XORing the current bitmaps against
prior bitmaps in the index, and selecting the smallest representation.
The XOR'd bitmap and offset from the current entry to the position
of the bitmap to XOR against is the actual representation of the entry
in the index file. Each entry contains one byte, which is currently
used to note whether the bitmap should be blindly reused.
Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
Include supported extensions in PackFile constructor.
Previously a PackFile class was assumed to only support a .pack and .idx
file. Update the constructor to enumerate the supported extensions for
the pack file. This will allow the bitmap code to only be executed if
the bitmap extension file is known to exist.
Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf
File system time stamps and System.currentTimeMillis() may not
necessarily be running on the same clock so add some slack.
Bug: 396662
Change-Id: I25204d9e3181e15368da2902447518c6ce205017
Update DfsGarbageCollector to not read back a pack index.
Previously, the Dfs GC excluded objects from packs by passing a
previously written index to the PackWriter. Reading back a file on
Dfs is slow. Instead, allow the PackWriter to expose the objects
included in a pack and forward that to invocations of excludeObjects() .
Change-Id: I377cb4ab07f62cf790505e1eeb0b2efe81897c79
Fix concurrent creation of fan-out object directories
If multiple threads attempted to insert loose objects into the same new
fan-out directory, the creation of that directory was subject to a race
condition that could lead to an unnecessary IOException being thrown -
because an inserter could not 'create' a directory that had just been
generated by a different thread. All we require is that the directory
does indeed *exist*, so not being able to _create_ it is not actually a
fatal problem. Setting 'skipExisting' to 'true' on the call to mkdir()
fixes the issue.
I found this issue as a real world occurrence while working on The BFG
Repo Cleaner (https://github.com/rtyley/bfg-repo-cleaner), a tool which
concurrently performs a lot of object creation.
In order to demonstrate the problem here I've added a small test case
which reliably reproduces the issue on the few different hardware
systems I've tried. The error thrown when the race-condition arises is
this:
java.io.IOException: Creating directory /home/roberto/repo.git/objects/e6 failed
at org.eclipse.jgit.util.FileUtils.mkdir(FileUtils.java:182)
at org.eclipse.jgit.storage.file.ObjectDirectory.insertUnpackedObject(ObjectDirectory.java:590)
at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insertOneObject(ObjectDirectoryInserter.java:113)
at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insert(ObjectDirectoryInserter.java:91)
at org.eclipse.jgit.lib.ObjectInserter.insert(ObjectInserter.java:329)
Change-Id: I88eac49bc600c56ba9ad290e6133d8a7113125ab
Remove packIndex field from FileObjDatabase openPack method.
Previously, the FileObjDatabase required both the pack file path and
index file path to be passed to openPack(). A future change to add
a bitmap index will add a .bitmap file parallel to the pack file
(similar to the .idx file). Update the PackFile to support
automatically loading pack index extensions based on the pack file
path.
Change-Id: Ifc8fc3e57f4afa177ba5a88df87334dbfa799f01
These test classes heavily rely on Tree and associated classes. They
are convenient for building test cases and hence not yet replaced, but
there is a deprecation warning at about every line, which is not helpful.
Change-Id: Ia7cc8f3bb980dc03055b94748b6c7529a82ea5a5
For streams that should not be closed, i.e. don't own an underlying
stream, and in-memory streams that do not need to be closed we just
suppress the warning. This mostly apply to test cases. GC is enough.
For streams with external resources (i.e. files) we add the necessary
call to close().
Change-Id: I4d883ba2e7d07f199fe57ccb3459ece00441a570
Suppress boxing warnings where we know they are ok
Invoke the wrapper types' valueOf via static imports.
For booleans used in asserts, add a new assert in
the JUnit utility package since out current version of JUnit
does not have the assert(boolean, boolean) method.
Change-Id: I9099bd8efbc8c133479344d51ce7dabed8958a2b
Some GC tests were sporadically failing. The reason was that they used
the setExpireAgeMillis method to define object expiration before
invoking the prune method. Depending on the CPU load during the test
run, the prune method may reach an object (which is considered
non-expired by the test) too late and actually prune it.
To make the test stable we now use the setExpire(Date expire) method and
define a time instant before which objects are considered to be expired.
This way the outcome of the prune method doesn't depend on the CPU load.
Change-Id: Ifc3323ca55ae56dbccdbc90a282ec3cf18ad7297
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Change-Id: Id5b578f7040c6c896ab9386a6b5ed62b0f495ed5
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
JGit was not able to lookup refs which had the name of files which exist
in the .git folder. When JGit was looking up a ref named X it has a
fixed set of directories where it searched for files named X
(ignore packed refs for now). First directory to search for is .git. In
case of the ref named 'config' it searched there for this file, found it
(it's the .git/config file with the repo configuration in it), parsed
it, found it is an invalid ref and stopped searching. It never looked
for a file .git/refs/heads/config.
I changed JGit in a way that when it finds a file in GIT_DIR which
corresponds to a ref name and if this file doesn't contain a valid ref
then it will ignore the InvalidObjectIdException and continue searching.
Change-Id: Ic26a329fb1624a5b2b2494c78bac4bd76817c100
Bug: 381574
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Implements a garbage collector for FileRepositories. Main ideas are
copied from the garbage collector for DFS based repos
(DfsGarbageCollector). Added functionalities are
- pruning loose objects
- handling of the index
- packing refs
- handling of reflogs (objects referenced from reflog will not be
pruned/)
These are features of a GC which are not handled in this change and
which should come with subsequent changes:
- unpacking packed objects into loose objects (to support that pruning
packed objects doesn't delete them until they are older than two weeks)
- expiration of reflogs
- support for configuration parameters (e.g. gc.pruneExpire)
Change-Id: I14ea5cb7e0fd1b5c50b994fd77f4e05bfbb9d911
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
LockFileTest was failing on Windows because we couldn't delete the lock
file of the index. The reason was that a LockFile instance still had an
open handle to the lock file preventing us to delete the file (in
contrast to the behavior on other platforms).
Change-Id: I1d50442b7eb8a27f98f69ad77c5e24a9698a7b66
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Support gitdir: refs in BaseRepositoryBuilder.findGitDir
This allows findGitDir to be used for repositories containing
a .git file with a gitdir: ref to the repository's directory
such as submodule repositories that point to a folder under the
parent repository's .git/modules folder
Change-Id: I2f1ec7215a2208aa90511c065cadc7e816522f62
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
PackWriter supports excluding objects from being written to the pack.
You may specify a PackIndex which lists all those objects which should
not go into the new pack. This feature was broken because not all
commits have been checked whether they should be excluded or not. For
other object types the exclude algorithm worked. This commit adds the
missing check.
Change-Id: Id0047098393641ccba784c58b8325175c22fcece
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Only increment mod count if packed-refs file changes
Previously if a packed-refs file was racily clean then there
was a 2.5 second window in which each call to getPackedRefs
would increment the mod count causing a RefsChangedEvent to be
fired since the FileSnapshot would report the file as modified.
If a RefsChangedListener called getRef/getRefs from the
onRefsChanged method then a StackOverflowError could occur
since the stack could be exhausted before the 2.5 second
window expired and the packed-refs file would no longer
report being modified.
Now a SHA-1 is computed of the packed-refs file and the
mod count is only incremented when the packed refs are
successfully set and the id of the new packed-refs file
does not match the id of the old packed-refs file.
Change-Id: I8cab6e5929479ed748812b8598c7628370e79697
This allows repositoryies with a missing repositoryformatversion
config value to be successfully opened but still throws exceptions
when the value is a non-long or greater than zero.
git-core attempts to parse this config value as a long as well
and defaults to 0 if the value is missing.
Bug: 368697
Change-Id: I4a93117afca37e591e8e0ab4d2f2eef4273f0cc9
Signed-off-by: Chris Aniszczyk <zx@twitter.com>
Make sure all bytes are written to files on close, or get an error.
Java's BufferedOutputStream swallows any errors that occur when flushing
the buffer in close().
This class overrides close to make sure an error during the final
flush is reported back to the caller.
Change-Id: I74a82b31505fadf8378069c5f6554f1033c28f9b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This will allow recovery from a LockFailedException where
the file associated with an exception is passed to FileUtils.unlock
to attempt an unlock on the file so the operation can be retried
Change-Id: I580166d386126bfb54a318a65253070a6e325936
Deprecate GitIndex more by using only DirCache internally.
This includes merging ReadTreeTest into DirCacheCheckoutTest and
converting IndexDiffTest to use DirCache only. The GitIndex specific
T0007GitIndex test remains.
GitIndex is deprecated. Let us speed up its demise by focusing the
DirCacheCheckout tests to using DirCache instead.
This also add explicit deprecation comments to methods that depend
on GitIndex in Repository and TreeEntry. The latter is deprecated in
itself.
Change-Id: Id89262f7fbfee07871f444378f196ded444f2783
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>