Fix PackInvalidException when fetch and repack run concurrently
We are running several servers with jGit. We need to run repack from
time to time to keep the repos performant. I.e. after push we test how
many small packs are in the repo and when a threshold is reached we run
the repack.
After upgrading jGit version we've found that if someone does the clone
at the time repack is running the clone sometimes (not always) fails
because the repack removes .pack file used by the clone. Server
exception and client error attached.
I've tracked down the cause and it seems to be introduced between jGit
5.2 (which we upgraded from) and 5.3 and being caused by this commit:
Move throw of PackInvalidException outside the catch -
afef866a44
The problem is that when the throw was inside of the try block the last
catch block catched the exception and called openFailed(false) method.
It is true that it called it with invalidate = false, which is wrong.
The real problem though is that with the throw outside of the try block
the openFail is not called at all and the fields activeWindows and
activeCopyRawData are not set to 0. Which affects the later called tests
like: if (++activeCopyRawData == 1 && activeWindows == 0).
The fix for this is relatively simple keeping the throw outside of the
try block and still having the invalid field set to true. I did
exhaustive testing of the change running concurrent clones and pushes
indefinitely and with the patch applied it never fails while without the
patch it takes relatively short to get the error.
See: https://www.eclipse.org/lists/jgit-dev/msg04014.html
Bug: 569349
Change-Id: I9dbf8801c8d3131955ad7124f42b62095d96da54
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
GC#deleteOrphans: handle failure to list files in pack directory
- log an error
- either there is no list or it is incomplete hence return immediately
Change-Id: Ieee5378ca06304056b9ccc30c1acd5f52360052d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
If pack or index files are guarded by a pack lock (.keep file)
deleteOrphans() should not touch the respective files protected by the
lock file. Otherwise it may interfere with PackInserter concurrently
inserting a new pack file and its index.
The problem was caused by the following race.
All mentioned files are located in "objects/pack/".
File endings relevant in "pack" dir:
.pack
.keep
.idx
.bitmap
When ReceivePack receives a pack file it executes the following steps:
ReceivePack.service():
receivePackAndCheckConnectivity():
receivePack():
receive the pack
parse the pack, returns packLock (.keep file)
PackInserter.flush():
write tmpPck file: "insert_<random>.pack"
write tmpIdx file: "insert_<random>.idx"
real pack name: "pack-<SHA1>.pack"
real index name: "pack-<SHA1>.idx"
atomic rename tmpPack to realPack
atomic rename tmpIdx to tmpIdx
execute commands
unlock pack by removing .keep file
trigger auto gc if enabled
When PackInserter.flush() renames the temporary pack to the final
"pack-xxx.pack" file the temporary pack index file "insert_xxx.idx"
has no matching .pack file with the same base name for a short interval.
If deleteOrphans() ran during that interval it deduced the pack index
file was orphaned. Subsequently the missing pack index caused
MissingObjectExceptions since objects contained in the pack couldn't be
looked up anymore.
Bug: https://bugs.chromium.org/p/gerrit/issues/detail?id=13544
Change-Id: I559c81e4b1d7c487f92a751bd78b987d32c98719
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
WindowCache: add metric for cached bytes per repository
Since ObjectDatabase and PackFile don't know their repository use the
packfile's grand-grand-parent directory as an identifier for the
repository the packfile resides in.
Remove metric for a repository if the number of cached bytes for the
repository drops to 0 in order to ensure the map of cached bytes per
repository doesn't contain repositories which have no data cached in the
WindowCache.
Change-Id: I969ab8029db0a292e7585cbb36ca0baa797da20b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
WindowCache: add option to use strong refs to reference ByteWindows
Java GC evicts all SoftReferences when the used heap size comes close to
the maximum heap size. This means peaks in heap memory consumption can
flush the complete WindowCache which was observed to have negative
impact on performance of upload-pack in Gerrit.
Hence add a boolean option core.packedGitUseStrongRefs to allow using
strong references to reference packfile pages cached in the WindowCache.
If this option is set to true Java gc can no longer flush the
WindowCache to free memory if the used heap comes close to the maximum
heap size. On the other hand this provides more predictable performance.
Bug: 553573
Change-Id: I9de406293087ab0fa61130c8e0829775762ece8d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Replace usage of ArrayIndexOutOfBoundsException in treewalk
Using exceptions during normal operations - for example with the
desire of expanding an array in the failure case - can have a
severe performance impact. When exceptions are instantiated,
a stack trace is collected. Generating stack trace can be expensive.
Compared to that, checking an array for length - even if done many
times - is cheap since this is a check that can run in just a
handful of CPU cycles.
Change-Id: Ifaf10623f6a876c9faecfa44654c9296315adfcb
Signed-off-by: Patrick Hiesel <hiesel@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add the following statistics
- cache hit count and hit ratio
- cache miss count and miss ratio
- count of successful and failed loads
- rate of failed loads
- load, eviction and request count
- average and total load time
Use LongAdder instead of AtomicLong to implement counters in order to
improve scalability.
Optionally expose these metrics via JMX, they are registered with the
platform MBean server if the config option jmx.WindowCacheStats = true
in the user or system level git config.
Bug: 553573
Change-Id: Ia2d5246ef69b9c2bd594a23934424bc5800774aa
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Do not rely on ArrayIndexOutOfBoundsException to detect end of input
In the Config#StringReader we relied on ArrayIndexOutOfBoundsException
to detect the end of the input. Creation of exception with (deep) stack
trace can significantly degrade performance in case when we read
thousands of config files, like in the case when Gerrit reads all
external ids from the NoteDb.
Use the buf.length to detect the end of the input.
Change-Id: I12266f25751373a870ce3fa623cf2a95d882d521
WorkingTreeIterator: handle different timestamp resolutions
Older JGit stored only milliseconds timestamps in the index. Newer
JGit may get finer timestamps from the file system. This leads to
slow index diffs when a new JGit runs against an index produced
by older JGit because many timestamps will differ and JGit will
then do many content checks. See [1].
Handle this migration case by only comparing milliseconds if the
index entry has only millisecond precision.
The inverse may also occur; also compare only milliseconds if the
file timestamp has only millisecond precision.
Do the same also for microsecond resolution. On Windows, NTFS may
provide 100ns resolution and may be used by external programs writing
the index, but Java's WindowsFileAttributes may provide only
microseconds.
File timestamp precision in Java depends not only on the Java APIs
used by different JGit versions but may also change when running the
same Java code on different VMs. And of course the resolution may
vary among operating and file systems. Moreover, timestamp precision
in the index depends on the program that wrote the index. Canonical
git may use a different resolution, maybe even different between git
versions.
[1] https://www.eclipse.org/forums/index.php/t/1100344/
Change-Id: Idfd08606c883cb98787b2138f9baf0cc89a57b56
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix WorkingTreeIterator.compareMetadata() for CheckStat.MINIMAL
If CheckStat is MINIMAL or timestamps have no nanosecond part
WorkingTreeIterator.compareMetaData only checks the second part of
timestamps and ignores nanoseconds which may have ended up in the index
by using native git.
If
fileLastModified.getEpochSecond() == cacheLastModified.getEpochSecond()
we currently proceed comparing fileLastModified and cacheLastModified
with full precision which is wrong since we determined that we detected
reduced timestamp resolution.
Fix this and also handle smudged index entries for CheckStat.MINIMAL.
Change-Id: I6149885903ac63d79b42d234cc02aa4e19578f3c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
BatchRefUpdate: repro racy atomic update, and fix it
PackedBatchRefUpdate was creating a new packed-refs list that was
potentially unsorted. This would be papered over when the list was
read back from disk in parsePackedRef, which detects unsorted ref
lists on reading, and sorts them. However, the BatchRefUpdate also
installed the new (unsorted) list in-memory in
RefDirectory#packedRefs.
With the timestamp granularity code committed to stable-5.1, we can
more often accurately decide that the packed-refs file is clean, and
will return the erroneous unsorted data more often. Unluckily timed
delays also cause the file to be clean, hence this problem was
exacerbated under load.
The symptom is that refs added by a BatchRefUpdate would stop being
visible directly after they were added. In particular, the Gerrit
integration tests uses BatchRefUpdate in its setup for creating the
Admin group, and then tries to read it out directly afterward.
The tests recreates one failure case. A better approach would be to
revise RefList.Builder, so it detects out-of-order lists and
automatically sorts them.
Fixes https://bugs.eclipse.org/bugs/show_bug.cgi?id=548716 and
https://bugs.chromium.org/p/gerrit/issues/detail?id=11373.
Bug: 548716
Change-Id: I613c8059964513ce2370543620725b540b3cb6d1
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Return a new instance from openSystemConfig and openUserConfig
Move the handling of cached user and system config to getSystemConfig
and getUserConfig methods and revert the implementation of
openSystemConfig and openUserConfig to the old stateless
implementation.
This ensures the open methods respect the passed-in parent config, which
may be different on each invocation. Additionally, returning a new
instance matches the behavior of the previous implementation of the
default system reader, which downstream callers may be depending on.
Move the implementation of the new caching methods getSystemConfig and
getUserConfig up to SystemReader. This avoids that we break the ABI for
subclasses of SystemReader.
Also see [1] which fixed a similar problem with Gerrit's custom
SystemReader.
[1] https://gerrit-review.googlesource.com/c/gerrit/+/225458
Change-Id: If54a2491932d8fc914d4649cb73c9e837c5b8ad0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Use AtomicReferences to cache user and system level configs
This ensures that only one instance of user and one instance of system
config is set.
Change-Id: Idd00150f91d2d40af79499dd7bf8ad5940f87c4e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
deleteChildren was called on directory instead of gitDir, leading to a
potential null pointer exception if the git directory existed initially.
Bug: 550340
Change-Id: Iafc3b2961253a99862a59e81c7371f7bc564b412
Signed-off-by: Adrien Bustany <adrien-xx-eclipse@bustany.org>
Avoid sign extension when comparing mtime with Instant#getEpochSecond
Ensure we use the same type when comparing seconds since the epoch.
This does not prevent that in 2038 timestamps in seconds since the epoch
stored in a 32 bit integer will overflow. Integer.MAX_VALUE translates
to 2038-01-19T03:14:07Z. After this date we'll have an issue since we
store seconds since the epoch in a 32 bit integer in some places.
Bug: 319142
Change-Id: If0c03003d40b480f044686e2f7a2f62c9f4e2fe1
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix deprecation in DirCache caused by Instant based DirCacheEntry
Replace the two int variables smudge_s and smudge_ns by an Instant and
use the new method DirCacheEntry.mightBeRacilyClean(Instant).
Change-Id: Id70adbb0856a64909617acf65da1bae8e2ae934a
Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Cache user global and system-wide git configurations
So far the git configuration and the system wide git configuration were
always reloaded when jgit accessed these global configuration files to
access global configuration options which are not in the context of a
single git repository. Cache these configurations in SystemReader and
only reload them if their file metadata observed using FileSnapshot
indicates a modification.
Change-Id: I092fe11a5d95f1c5799273cacfc7a415d0b7786c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
The existing javadoc was copied from another method and not adapted.
Change-Id: I39a7e5d719b2c379de9bd1a4710a55a73700c6f0
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Improve retry handling when saving FileStoreAttributes fails
- fix handling of interrupts in FileStoreAttributes#saveToConfig
- increase retry wait time to 100ms
- don't wait after last retry
- dont retry if failure is caused by another exception than
LockFailedException
Change-Id: I108c012717d2bcce71f2c6cb9cf0879de704ebc2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Make supportsAtomicCreateNewFile return true as default
The method org.eclipse.jgit.util.FS.supportsAtomicCreateNewFile()
should default to true as mentioned in docs [1]
org.eclipse.jgit.util.FS_POSIX.supportsAtomicCreateNewFile() method
will set the value to false if the git config
core.supportsatomiccreatenewfile is not set.
It should default to true if the configuration is undefined.
[1]
4169a95a65/org.eclipse.jgit/src/org/eclipse/jgit/util/FS_POSIX.java (L372)
Bug: 544164
Change-Id: I16ccf989a89da2cf4975c200b3228b25ba4c0d55
Signed-off-by: Vishal Devgire <vishaldevgire@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Remove FileBasedConfig.load(boolean) introduced in d45219ba
We can't add this method to the super class StoredConfig since that
abstracts from filesystem storage. MockSystemReader.MockConfig is a
StoredConfig and is also used by tests for dfs based storage. Hence
remove this leaky abstraction.
This implies we always use the fallback FileStoreAttributes which means
a config file modification is considered racy within the first 2
seconds. This should not be an issue since typically configs change
rarely and re-reading a config within the racy period is relatively
cheap since configs are small.
Change-Id: Ia2615addc24a7cadf3c566ee842c6f4f07e159a5
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Increase the safety factor to 2.5x for extra safety if max of measured
timestamp resolution and measured minimal racy threshold is < 100ms, use
1.25 otherwise since for large filesystem resolution values the
influence of finite resolution of the system clock should be negligible.
Before, not yet using the newly introduced minRacyThreshold measurement,
the threshold was 1.1x FS resolution, and we could issue the
following sequence of events,
start
create-file
read-file (currentTime)
end
which had the following timestamps:
create-file 1564589081
start 1564589082
read 1564589082
end 1564589082
In this case, the difference between create-file and read is 5ms,
which exceeded the 4ms FS resolution, even though the events together
took just 2ms of runtime.
Reproduce with:
bazel test --runs_per_test=100 \
//org.eclipse.jgit.test:org_eclipse_jgit_internal_storage_file_FileSnapshotTest
The file system timestamp resolution is 4ms in this case.
This code assumes that the kernel and the JVM use the same clock that
is synchronized with the file system clock. This seems plausible,
given the resolution of System.currentTimeMillis() and the latency for
a gettimeofday system call (typically ~1us), but it would be good to
justify this with specifications.
Also cover a source of flakiness: if the test runs under extreme load,
then we could have
start
create-file
<long delay>
read
end
which would register as an unmodified file. Avoid this by skipping the
test if end-start is too big.
[msohn]:
- downported from master to stable-5.1
- skip test if resolution is below 10ms
- adjust safety factor to 1.25 for resolutions above 100ms
Change-Id: I87d2cf035e01c44b7ba8364c410a860aa8e312ef
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
In LockFile#waitForStatChange wait in units of file time resolution
Since we now measure file time resolution we can use it to replace the
hard coded wait time of 25ms. FileSnapshot#equals will return true until
the mtime of the old (o) and the new FileSnapshot (n) differ by at least
one file time resolution.
Change-Id: Icb713a80ce9eb929242ed083406bfb6650c72223
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Cache FileStoreAttributeCache entries since looking up FileStore for a
file may be expensive on some platforms.
Implement a simple LRU cache based on ConcurrentHashMap using a simple
long counter to order access to cache entries.
Change-Id: I4881fa938ad2f17712c05da857838073a2fc4ddb
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Also-By: Marc Strapetz <marc.strapetz@syntevo.com>
Fix FileSnapshot#save(long) and FileSnapshot#save(Instant)
Use the fallback timestamp resolution as already described in the
javadoc of these methods. Using zero file timestamp resolution doesn't
make sense.
Change-Id: Iaad2a0f99c3be3678e94980a0a368181b6aed38c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Persist minimal racy threshold and allow manual configuration
To enable persisting the minimal racy threshold per FileStore add a
new config option to the user global git configuration:
- Config section is "filesystem"
- Config subsection is concatenation of
- Java vendor (system property "java.vendor")
- Java version (system property "java.version")
- FileStore's name, on Windows we use the attribute volume:vsn instead
since the name is not necessarily unique.
- separated by '|'
e.g.
"AdoptOpenJDK|1.8.0_212-b03|/dev/disk1s1"
The same prefix is used as for filesystem timestamp resolution, so
both values are stored in the same config section
- The config key for minmal racy threshold is "minRacyThreshold" as a
time value, supported time units are those supported by
DefaultTypedConfigGetter#getTimeUnit
- measure for 3 seconds to limit runtime which depends on hardware, OS
and Java version being used
If the minimal racy threshold is configured for a given FileStore the
configured value is used instead of measuring it.
When the minimal racy threshold was measured it is persisted in the user
global git configuration.
Rename FileStoreAttributeCache to FileStoreAttributes since this class
is now declared public in order to enable exposing all attributes in one
object.
Example:
[filesystem "AdoptOpenJDK|11.0.3|/dev/disk1s1"]
timestampResolution = 7000 nanoseconds
minRacyThreshold = 3440 microseconds
Change-Id: I22195e488453aae8d011b0a8e3276fe3d99deaea
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Also-By: Marc Strapetz <marc.strapetz@syntevo.com>
Measure minimum racy interval to auto-configure FileSnapshot
By running FileSnapshotTest#detectFileModified we found that the sum of
measured filesystem timestamp resolution and measured clock resolution
may yield a too small interval after a file has been modified which we
need to consider racily clean. In our tests we didn't find this behavior
on all systems we tested on, e.g. on MacOS using APFS and Java 8 and 11
this effect was not observed.
On Linux (SLES 15, kernel 4.12.14-150.22-default) we collected the
following test results using Java 8 and 11:
In 23-98% of 10000 test runs (depending on filesystem type and Java
version) the test failed, which means the effective interval which needs
to be considered racily clean after a file was modified is larger than
the measured file timestamp resolution.
"delta" is the observed interval after a file has been modified but
FileSnapshot did not yet detect the modification:
"resolution" is the measured sum of file timestamp resolution and clock
resolution seen in Java.
Java version filesystem failures resolution min delta max delta
1.8.0_212-b04 btrfs 98.6% 1 ms 3.6 ms 6.6 ms
1.8.0_212-b04 ext4 82.6% 3 ms 1.1 ms 4.1 ms
1.8.0_212-b04 xfs 23.8% 4 ms 3.7 ms 3.9 ms
1.8.0_212-b04 zfs 23.1% 3 ms 4.8 ms 5.0 ms
11.0.3+7 btrfs 98.1% 3 us 0.7 ms 4.7 ms
11.0.3+7 ext4 98.1% 6 us 0.7 ms 4.7 ms
11.0.3+7 xfs 98.5% 7 us 0.1 ms 8.0 ms
11.0.3+7 zfs 98.4% 7 us 0.7 ms 5.2 ms
Mac OS
1.8.0_212 APFS 0% 1 s
11.0.3+7 APFS 0% 6 us
The observed delta is not distributed according to a normal gaussian
distribution but rather random in the observed range between "min delta"
and "max delta".
Run this test after measuring file timestamp resolution in
FS.FileAttributeCache to auto-configure JGit since it's unclear what
mechanism is causing this effect.
In FileSnapshot#isRacyClean use the maximum of the measured timestamp
resolution and the measured "delta" as explained above to decide if a
given FileSnapshot is to be considered racily clean. Add a 30% safety
margin to ensure we are on the safe side.
Change-Id: I1c8bb59f6486f174b7bbdc63072777ddbe06694d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We should not list the complete cache but only show the cache entry at
hand.
Change-Id: I22be2a4dcbf0145155e23f2389bfcf5662cf23a6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Repeat the test 10000 times to get statistics if measured
fsTimestampResolution is working in practice to detect racy git
situations.
Add a class to compute statistics for this test. Log delta between
lastModified and time when FileSnapshot failed to detect modification.
This happens if the racy git limit determined by measuring filesystem
timestamp resolution and clock resolution is too small. If it would be
correct FileSnapshot would always detect modification or mark it
modified if time since modification is smaller than the racy git limit.
Change-Id: Iabe7af1a7211ca58480f8902d4fa4e366932fc77
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Measure stored timestamp resolution instead of time to touch file
Measure granularity of timestamps stored in the filesystem by setting
and then getting lastModified timestamp until the read value changed.
Increase increment exponentially to limit number of iterations starting
with 1 microsecond since Java's FileTime (up to Java 12) truncates
timestamps to 1 microsecond resolution. The chosen algorithm yields 2000
steps between 1 ms and 2.5 s.
Also measure clock resolution and add that for the total timestamp
resolution. This avoids systematic measurement errors introduced by
doing IO to touch a file.
Change-Id: I9b37138619422452373e298d9d8c7cb2c384db3f
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
We should not use configuration when creating FileSnapshot when
accessing FileBasedConfig.
Change-Id: Ic521632870f18bb004751642b9d30648dd94049a
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Use Instant instead of milliseconds for filesystem timestamp handling
This enables higher file timestamp resolution on filesystems like ext4,
Mac APFS (1ns) or NTFS (100ns) providing high timestamp resolution on
filesystem level.
Note:
- on some OSes Java 8,9 truncate milliseconds, see
https://bugs.openjdk.java.net/browse/JDK-8177809, fixed in Java 10
- UnixFileAttributes truncates timestamp resolution to microseconds when
converting the internal representation to FileTime exposed in the API,
see https://bugs.openjdk.java.net/browse/JDK-8181493
- WindowsFileAttributes also provides only microsecond resolution
Change-Id: I25ffff31a3c6f725fc345d4ddc2f26da3b88f6f2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>