Increase the safety factor to 2.5x for extra safety if max of measured
timestamp resolution and measured minimal racy threshold is < 100ms, use
1.25 otherwise since for large filesystem resolution values the
influence of finite resolution of the system clock should be negligible.
Before, not yet using the newly introduced minRacyThreshold measurement,
the threshold was 1.1x FS resolution, and we could issue the
following sequence of events,
start
create-file
read-file (currentTime)
end
which had the following timestamps:
create-file 1564589081
start 1564589082
read 1564589082
end 1564589082
In this case, the difference between create-file and read is 5ms,
which exceeded the 4ms FS resolution, even though the events together
took just 2ms of runtime.
Reproduce with:
bazel test --runs_per_test=100 \
//org.eclipse.jgit.test:org_eclipse_jgit_internal_storage_file_FileSnapshotTest
The file system timestamp resolution is 4ms in this case.
This code assumes that the kernel and the JVM use the same clock that
is synchronized with the file system clock. This seems plausible,
given the resolution of System.currentTimeMillis() and the latency for
a gettimeofday system call (typically ~1us), but it would be good to
justify this with specifications.
Also cover a source of flakiness: if the test runs under extreme load,
then we could have
start
create-file
<long delay>
read
end
which would register as an unmodified file. Avoid this by skipping the
test if end-start is too big.
[msohn]:
- downported from master to stable-5.1
- skip test if resolution is below 10ms
- adjust safety factor to 1.25 for resolutions above 100ms
Change-Id: I87d2cf035e01c44b7ba8364c410a860aa8e312ef
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix FileSnapshot#save(long) and FileSnapshot#save(Instant)
Use the fallback timestamp resolution as already described in the
javadoc of these methods. Using zero file timestamp resolution doesn't
make sense.
Change-Id: Iaad2a0f99c3be3678e94980a0a368181b6aed38c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Persist minimal racy threshold and allow manual configuration
To enable persisting the minimal racy threshold per FileStore add a
new config option to the user global git configuration:
- Config section is "filesystem"
- Config subsection is concatenation of
- Java vendor (system property "java.vendor")
- Java version (system property "java.version")
- FileStore's name, on Windows we use the attribute volume:vsn instead
since the name is not necessarily unique.
- separated by '|'
e.g.
"AdoptOpenJDK|1.8.0_212-b03|/dev/disk1s1"
The same prefix is used as for filesystem timestamp resolution, so
both values are stored in the same config section
- The config key for minmal racy threshold is "minRacyThreshold" as a
time value, supported time units are those supported by
DefaultTypedConfigGetter#getTimeUnit
- measure for 3 seconds to limit runtime which depends on hardware, OS
and Java version being used
If the minimal racy threshold is configured for a given FileStore the
configured value is used instead of measuring it.
When the minimal racy threshold was measured it is persisted in the user
global git configuration.
Rename FileStoreAttributeCache to FileStoreAttributes since this class
is now declared public in order to enable exposing all attributes in one
object.
Example:
[filesystem "AdoptOpenJDK|11.0.3|/dev/disk1s1"]
timestampResolution = 7000 nanoseconds
minRacyThreshold = 3440 microseconds
Change-Id: I22195e488453aae8d011b0a8e3276fe3d99deaea
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Also-By: Marc Strapetz <marc.strapetz@syntevo.com>
Measure minimum racy interval to auto-configure FileSnapshot
By running FileSnapshotTest#detectFileModified we found that the sum of
measured filesystem timestamp resolution and measured clock resolution
may yield a too small interval after a file has been modified which we
need to consider racily clean. In our tests we didn't find this behavior
on all systems we tested on, e.g. on MacOS using APFS and Java 8 and 11
this effect was not observed.
On Linux (SLES 15, kernel 4.12.14-150.22-default) we collected the
following test results using Java 8 and 11:
In 23-98% of 10000 test runs (depending on filesystem type and Java
version) the test failed, which means the effective interval which needs
to be considered racily clean after a file was modified is larger than
the measured file timestamp resolution.
"delta" is the observed interval after a file has been modified but
FileSnapshot did not yet detect the modification:
"resolution" is the measured sum of file timestamp resolution and clock
resolution seen in Java.
Java version filesystem failures resolution min delta max delta
1.8.0_212-b04 btrfs 98.6% 1 ms 3.6 ms 6.6 ms
1.8.0_212-b04 ext4 82.6% 3 ms 1.1 ms 4.1 ms
1.8.0_212-b04 xfs 23.8% 4 ms 3.7 ms 3.9 ms
1.8.0_212-b04 zfs 23.1% 3 ms 4.8 ms 5.0 ms
11.0.3+7 btrfs 98.1% 3 us 0.7 ms 4.7 ms
11.0.3+7 ext4 98.1% 6 us 0.7 ms 4.7 ms
11.0.3+7 xfs 98.5% 7 us 0.1 ms 8.0 ms
11.0.3+7 zfs 98.4% 7 us 0.7 ms 5.2 ms
Mac OS
1.8.0_212 APFS 0% 1 s
11.0.3+7 APFS 0% 6 us
The observed delta is not distributed according to a normal gaussian
distribution but rather random in the observed range between "min delta"
and "max delta".
Run this test after measuring file timestamp resolution in
FS.FileAttributeCache to auto-configure JGit since it's unclear what
mechanism is causing this effect.
In FileSnapshot#isRacyClean use the maximum of the measured timestamp
resolution and the measured "delta" as explained above to decide if a
given FileSnapshot is to be considered racily clean. Add a 30% safety
margin to ensure we are on the safe side.
Change-Id: I1c8bb59f6486f174b7bbdc63072777ddbe06694d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Repeat the test 10000 times to get statistics if measured
fsTimestampResolution is working in practice to detect racy git
situations.
Add a class to compute statistics for this test. Log delta between
lastModified and time when FileSnapshot failed to detect modification.
This happens if the racy git limit determined by measuring filesystem
timestamp resolution and clock resolution is too small. If it would be
correct FileSnapshot would always detect modification or mark it
modified if time since modification is smaller than the racy git limit.
Change-Id: Iabe7af1a7211ca58480f8902d4fa4e366932fc77
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We should not use configuration when creating FileSnapshot when
accessing FileBasedConfig.
Change-Id: Ic521632870f18bb004751642b9d30648dd94049a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Use Instant instead of milliseconds for filesystem timestamp handling
This enables higher file timestamp resolution on filesystems like ext4,
Mac APFS (1ns) or NTFS (100ns) providing high timestamp resolution on
filesystem level.
Note:
- on some OSes Java 8,9 truncate milliseconds, see
https://bugs.openjdk.java.net/browse/JDK-8177809, fixed in Java 10
- UnixFileAttributes truncates timestamp resolution to microseconds when
converting the internal representation to FileTime exposed in the API,
see https://bugs.openjdk.java.net/browse/JDK-8181493
- WindowsFileAttributes also provides only microsecond resolution
Change-Id: I25ffff31a3c6f725fc345d4ddc2f26da3b88f6f2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Checking lastModified is time critical hence debug trace is the only way
to analyze issues since debugging is impractical.
Also add configuration for buffering of log4j output to reduce runtime
impact when debug trace is on. Limit buffer to 1MiB and comment this
configuration out since we may not always want to use buffering.
Change-Id: Ib1a0537b67c8dc3fac994a77b42badd974ce6c97
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Persist filesystem timestamp resolution and allow manual configuration
To enable persisting filesystem timestamp resolution per FileStore add a
new config section to the user global git configuration:
- Config section is "filesystem"
- Config subsection is concatenation of
- Java vendor (system property "java.vm.vendor")
- runtime version (system property "java.vm.version")
- FileStore's name
- separated by '|'
e.g.
"AdoptOpenJDK|1.8.0_212-b03|/dev/disk1s1"
The prefix is needed since some Java versions do not expose the full
timestamp resolution of the underlying filesystem. This may also
depend on the underlying operating system hence concrete key values
may not be portable.
- Config key for timestamp resolution is "timestampResolution" as a time
value, supported time units are those supported by
DefaultTypedConfigGetter#getTimeUnit
If timestamp resolution is already configured for a given FileStore
the configured value is used instead of measuring the resolution.
When timestamp resolution was measured it is persisted in the user
global git configuration.
Example:
[filesystem "AdoptOpenJDK|1.8.0_212-b03|/dev/disk1s1"]
timestampResolution = 1 seconds
If locking the git config file fails retry saving the resolution up to 5
times in order to workaround races with another thread.
In order to avoid stack overflow use the fallback filesystem timestamp
resolution when loading FileBasedConfig which creates itself a
FileSnapshot to help checking if the config changed.
Note:
- on some OSes Java 8,9 truncate to milliseconds or seconds, see
https://bugs.openjdk.java.net/browse/JDK-8177809, fixed in Java 10
- UnixFileAttributes up to Java 12 truncates timestamp resolution to
microseconds when converting the internal representation to FileTime
exposed in the API, see https://bugs.openjdk.java.net/browse/JDK-8181493
- WindowsFileAttributes also provides only microsecond resolution up to
Java 12
Hence do not attempt to manually configure a higher timestamp resolution
than supported by the Java version being used at runtime.
Bug: 546891
Bug: 548188
Change-Id: Iff91b8f9e6e5e2295e1463f87c8e95edf4abbcf8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add a unittest.
In commit I5485db55 ("Fix FileSnapshot's consideration of file size"),
the special casing of UNKNOWN_SIZE was forgotten.
This change, together with I493f3b57b ("Measure file timestamp
resolution used in FileSnapshot") introduced a regression that would
occasionally surface in Gerrit integration tests marked UseLocalDisk,
with the symptom that creating the Admin user in NoteDb failed with a
LOCK_FAILURE.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Change-Id: I7ffd972581f815c144f810481103c7985af5feb0
Error Prone: Increase severity of NonOverridingEquals to ERROR
Error Prone reports the warning on several classes:
[NonOverridingEquals] equals method doesn't override Object.equals;
if this is a type-specific helper for a method that does override
Object.equals, either inline it into the callers or rename it to
avoid ambiguity.
See https://errorprone.info/bugpattern/NonOverridingEquals
Most of these are in the public API, so we can't rename or inline them
without breaking the API. FileSnapshot is not part of the public API,
but clients may be using it anyway, so we also shouldn't change that.
Suppress all the warnings instead. Having the check at severity ERROR
will at least make sure we don't introduce any new occurrences.
Change-Id: I92345c11256f06b4fa03ccc13337f72af5a43591
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Reference comparison with EMPTY and MISSING_FILE is intended; these
are static instances used as markers, and will always be the same
instances.
Change-Id: Ic27f5b797bdb9370cf8f6b3b7bb3f1523d4a454c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Extend FileSnapshot for packfiles to also use checksum to detect changes
If the attributes of FileSnapshot don't detect modification of a
packfile read the packfile's checksum and compare it against the
checksum cached in the loaded packfile.
Since reading the checksum needs less IO than reloading the complete
packfile this may help to reduce the overhead to detect modficiation
when a gc completes while ObjectDirectory scans for packfiles in another
thread.
Bug: 546891
Change-Id: I9811b497eb11b8a85ae689081dc5d949ca8c4be5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Wait opening new packfile until it can't be racy anymore
If
- pack.waitPreventRacyPack = true (default is false)
- packfile size > pack.minSizePreventRacyPack (default is 100 MB)
wait after a new packfile was written and before it is opened until it
cannot be racy anymore.
If a new packfile is accessed while it's still racy at least the pack's
index will be reread by ObjectDirectory.scanPacksImpl(). Hence it may
save resources to wait one tick of the file system timer to avoid this
reloading. On filesystems with a coarse timestamp resolution it may be
beneficial to skip this wait for small packfiles.
Bug: 546891
Change-Id: I0e8bf3d7677a025edd2e397dd2c9134ba59b1a18
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Capture reason for result of FileSnapshot#isModified
This allows to verify the expected behavior in
FileSnapshotTest#testSimulatePackfileReplacement and enables extending
FileSnapshot for packfiles to read the packfile's checksum as another
criterion to detect modifications without reading the full content.
Also add another field capturing the result of the last check if
lastModified was racily clean.
Remove unnecessary determination of raciness in the constructor. It was
determined twice in all relevant cases.
Change-Id: I100a2f49d7949693d7b72daa89437e166f1dc107
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Include filekey file attribute when comparing FileSnapshots
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Some filesystems expose unique file identifiers, e.g. inodes in Posix
filesystems which are named filekeys in Java's BasicFileAttributes. Use
them as another means to detect file modifications based on stat
information.
Running git gc on a repository yields a new packfile with the same id as
a packfile which existed before the gc if these packfiles contain the
same set of objects. The content of the old and the new packfile might
differ if a different PackConfig was used when writing the packfile.
Considering filekeys in FileSnapshot may help to detect such packfile
modifications.
Bug: 546891
Change-Id: I711a80328c55e1a31171d540880b8e80ec1fe095
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Measure file timestamp resolution used in FileSnapshot
FileSnapshot.notRacyClean() assumed a worst case filesystem timestamp
resolution of 2.5 sec (FAT has a resolution of 2 sec). Instead measure
timestamp resolution to avoid unnecessary IO caused by false positives
in detecting the racy git problem caused by finite filesystem timestamp
resolution [1].
Cache the measured resolution per FileStore since timestamp resolution
depends on the respective filesystem type. If timestamp resolution
cannot be measured or fails due to an exception fallback to the worst
case FAT timestamp resolution and avoid caching this value.
Add a 10% safety margin in FileSnapshot.notRacyClean(), though running
FsTest.testFsTimestampResolution() 1000 times which is not using a
safety margin didn't fail on Mac using APFS and Java 8, 11, 12.
Measured Java file timestamp resolution: [2]
[1] https://github.com/git/git/blob/master/Documentation/technical/racy-git.txt
[2] https://docs.google.com/spreadsheets/d/1imy0y6WmRqBf0kjCxzxj2X7M50eIVfa7oaUIzEOHmjo
Bug: 546891
Change-Id: I493f3b57b6b306285ffa7d392339d253e5966ab8
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* fix equals() and hashCode() methods to consider size
* fix toString() to show size
Change-Id: I5485db55eda5110121efd65d86c7166b3b2e93d0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Read and consider file size also, so that differing file size can help
to more accurately detect file changes without reading the file content.
Use bulk read to avoid multiple stat calls to retrieve file attributes.
Change-Id: I974288fff78ac78c52245d9218b5639603f67a46
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
FileSnapshot.isModified may have reported a file to be clean although it
was actually dirty.
Imagine you have a FileSnapshot on file f. lastmodified and lastread are
both t0. Now time is t1 and you
1) modify the file
2) update the FileSnapshot of the file (lastModified=t1, lastRead=t1)
3) modify the file again
4) wait 3 seconds
5) ask the Filesnapshot whether the file is dirty or not. It erroneously
answered it's clean.
Any file which has been modified longer than 2.5 seconds ago was
reported to be clean. As the test shows that's not always correct.
The real-world problem fixed by this change is the following:
* A gerrit server using JGit to serve git repositories is processing
fetch requests while simultaneously a native git garbage collection
runs on the repo.
* At time t1 native git writes temporary files in the pack folder
setting the mtime of the pack folder to t1.
* A fetch request causes JGit to search for new packfiles and JGit
remembers this scan in a Filesnapshot on the packs folder. Since the gc
is not finished JGit doesn't see any new packfiles.
* The fetch is processed and the gc ends while the filesystem timer is
still t1. GC writes a new packfile and deletes the old packfile.
* 3 seconds later another request arrives. JGit does not yet know about
the new packfile but is also not rescanning the pack folder because it
cached that the last scan happened at time t1 and pack folder's mtime is
also t1. Now JGit will not be able to resolve any object contained in
this new pack. This behavior may be persistent if objects referenced by
the ref/meta/config branch are affected so gerrit can't read permissions
stored in the refs/meta/config branch anymore and will not allow any
pushes anymore. The pack folder will not change its mtime and therefore
no rescan will take place.
Change-Id: I3efd0ccffeb97b01207dc3e7a6b85c6b06928fad
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This fixes the tests failed in JDK8.
FS uses java.nio API to get file attributes. The timestamps obtained
from that API are more precise than the ones from
java.io.File#lastModified() since Java8.
This difference accidentally makes JGit detect newly added files as
smudged. Use the precised timestamp to avoid this false positive.
Bug: 500058
Change-Id: I9e587583c85cb6efa7562ad6c5f26577869a2e7c
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
JGit 3.0: move internal classes into an internal subpackage
This breaks all existing callers once. Applications are not supposed
to build against the internal storage API unless they can accept API
churn and make necessary updates as versions change.
Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9
Let RefDirectory use FileSnapShot to handle fast updates
Since this change may affect performance and memory consumption on every
access to a loose ref I explicitly made it a RFC to collect opinions.
Previously RefDirectory.scanRef() was not detecting an update of a
loose ref when the update didn't changed the modification time of
the backing file. RefDirectory cached loose refs and the way to detect
outdated cache entries was to compare lastmodification timestamp on the
file representing the ref. If two updates to the same ref happen faster
than the filesystem-timer granularity (for linux this is 2 seconds)
there is the possiblity that we don't detect the update.
Because of this bug EGit's PushOperationTest only works with 2 second
sleeps inside.
This change let RefDirectory use FileSnapshot to detect such situations.
FileSnapshot helps to remember when a file was last read from disk and
therefore enables to decide when to load a file from disk although
modification time has not changed.
Change-Id: I03b9a137af097ec69c4c5e2eaa512d2bdd7fe080
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Instead of tracking the length and modification time by hand, rely
on FileSnapshot to tell RefDirectory when the $GIT_DIR/packed-refs
file has been changed or should be re-read from disk.
Change-Id: I067d268dfdca1d39c72dfa536b34e6a239117cc3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
During unit tests and most likely elsewhere, updates come too fast for
a simple timestamp comparison (with one seconds resolution) to work.
I.e. DirCache thinks it hasn't changed.
Use FileSnapshot instead which has more advanced logic.
Change-Id: Ib850f84398ef7d4b8a8a6f5a0ae6963e37f2b470
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
We cannot use SystemReader to get the time, unless we do that consistently,
which is harder to do and be sure we are really testing what we want.
Then we need to update our lastRead variable whenever we conclude that
our file is not racily clean according to lastRead. It may well be clean,
but we do not know that until we check the system clock again.
Finally add a test for this class.
Change-Id: I1894b032b9bd359d1b5325e5472d48e372599e4c
Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
FileBasedConfig: Use FileSnapshot for isOutdated()
Relying only on the last modified time for a file can be tricky.
The "racy git" problem may cause some modifications to be missed.
Use the new FileSnapshot code to track when a configuration file
has been modified, and needs to be reloaded in memory.
Change-Id: Ib6312fdd3b2403eee5af3f8ae711294b0e5f9035
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Pulling the last modified checking logic out of ObjectDirectory
makes it possible to reuse this code for other files, such as
the $GIT_DIR/config or $GIT_DIR/packed-refs files.
Change-Id: If2f27a89fc3b7adde7e65ff40bbca5d55b98b772
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>