Externalize warning message in RefDirectory.delete()
Change-Id: Icec16c01853a3f5ea016d454b3d48624498efcce
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
(cherry picked from commit 5e68fe245f)
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Suppress warning for trying to delete non-empty directory
This is actually a fairly common occurrence; deleting the parent
directories can work only if the file deleted was the last one
in the directory.
Bug: 537872
Change-Id: I86d1d45e1e2631332025ff24af8dfd46c9725711
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
(cherry picked from commit d9e767b431)
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
FS_POSIX.createNewFile(File) failed to properly implement atomic file
creation on NFS using the algorithm [1]:
- name of the hard link must be unique to prevent that two processes
using different NFS clients try to create the same link. This would
render nlink useless to detect if there was a race.
- the hard link must be retained for the lifetime of the file since we
don't know when the state of the involved NFS clients will be
synchronized. This depends on NFS configuration options.
To fix these issues we need to change the signature of createNewFile
which would break API. Hence deprecate the old method
FS.createNewFile(File) and add a new method createNewFileAtomic(File).
The new method returns a LockToken which needs to be retained by the
caller (LockFile) until all involved NFS clients synchronized their
state. Since we don't know when the NFS caches are synchronized we need
to retain the token until the corresponding file is no longer needed.
The LockToken must be closed after the LockFile using it has been
committed or unlocked. On Posix, if core.supportsAtomicCreateNewFile =
false this will delete the hard link which guarded the atomic creation
of the file. When acquiring the lock fails ensure that the hard link is
removed.
[1] https://www.time-travellers.org/shane/papers/NFS_considered_harmful.html
also see file creation flag O_EXCL in
http://man7.org/linux/man-pages/man2/open.2.html
Change-Id: I84fcb16143a5f877e9b08c6ee0ff8fa4ea68a90d
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
GC: Avoid logging errors when deleting non-empty folders
I88304d34c and Ia555bce00 modified the way errors are handled when
trying to delete non-empty reference folders. Before, this error was
silently ignored as it was considered an expected output. Now, every
failed folder delete is logged which can be noisy.
Ignore the DirectoryNotEmptyException but log any other error avoiding
deletion of an eligible folder.
Signed-off-by: Hector Oswaldo Caballero <hector.caballero@ericsson.com>
Change-Id: I194512f67885231d62c03976ae683e5cc450ec7c
Suppress warning for trying to delete non-empty directory
This is actually a fairly common occurrence; deleting the parent
directories can work only if the file deleted was the last one
in the directory.
Bug: 537872
Change-Id: I86d1d45e1e2631332025ff24af8dfd46c9725711
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Since I3870cadb4, GC task was always delegated to an executor even when
background option was set to false. This was an issue because if more
than one GC object was instantiated and executed in parallel, only one GC
was actually running because of the single thread executor.
Change-Id: I8c587d22d63c1601b7d75914692644a385cd86d6
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
Remove completely the empty directories under refs/<namespace>
including the first level partition of the changes, when they are
completely empty.
Bug: 536777
Change-Id: I88304d34cc42435919c2d1480258684d993dfdca
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Use java.nio to delete path to get detailed errors
Get the full IOException of the reason why a directory
cannot be removed during GC.
Change-Id: Ia555bce009fa48087a73d677f1ce3b9c0b685b57
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
After packaging references, the folders containing these references are
not deleted. In a busy repository, this causes operations to slow down
as traversing the references tree becomes longer.
Delete empty reference folders after the loose references have been
packed.
To avoid deleting a folder that was just created by another concurrent
operation, only delete folders that were not modified in the last 30
seconds.
Signed-off-by: Hector Oswaldo Caballero <hector.caballero@ericsson.com>
Change-Id: Ie79447d6121271cf5e25171be377ea396c7028e0
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Log as warning when an attempt to remove a directory
fails. This helps troubleshooting some bugs like the GC leaving
behind empty directories.
Change-Id: Idb94ce17f8be9668a970c7ecae31436bf434073c
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
From the javadoc for Files.list:
"The returned stream encapsulates a DirectoryStream. If timely disposal
of file system resources is required, the try-with-resources construct
should be used to ensure that the stream's close method is invoked
after the stream operations are completed."
This is the only call to Files#newDirectoryStream that is not already in
a try-with-resources.
Change-Id: I91e6c56b5d74e8435457ad6ed9e6b4b24d2aa14e
(cherry picked from commit 1c16ea4601)
The intent with the setCompressionLevel and checkExisting methods (which
are already public) is for callers to be able to call them, but they
can't do that if the class itself is not public.
Change-Id: I014044fec3bfa1d33775500345efd60eb5d45bde
PackInserter: Ensure objects are written at the end of the pack
When interleaving reads and writes from an unflushed pack, we forgot to
reset the file pointer back to the end of the file before writing more
new objects. This had at least two unfortunate effects:
* The pack data was potentially corrupt, since we could overwrite
previous portions of the file willy-nilly.
* The CountingOutputStream would report more bytes read than the size
of the file, which stored the wrong PackedObjectInfo, which would
cause EOFs during reading.
We already had a test in PackInserterTest which was supposed to catch
bugs like this, by interleaving reads and writes. Unfortunately, it
didn't catch the bug, since as an implementation detail we always read a
full buffer's worth of data from the file when inflating during
readback. If the size of the file was less than the offset of the object
we were reading back plus one buffer (8192 bytes), we would completely
accidentally end up back in the right place in the file.
So, add another test for this case where we read back a small object
positioned before a large object. Before the fix, this test exhibited
exactly the "Unexpected EOF" error reported at crbug.com/gerrit/7668.
Change-Id: I74f08f3d5d9046781d59e5bd7c84916ff8225c3b
Replace explicit calls to initCause where possible
Where the exception being thrown has a constructor that takes a
Throwable, use that instead of instantiating the exception and then
explicitly calling initCause.
Change-Id: I06a0df407ba751a7af8c1c4a46f9e2714f13dbe3
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
CorruptObjectException has a constructor that takes Throwable and
calls initCause with it. Use that instead of instantiating the
exception and explicitly calling initCause.
Change-Id: I1f2747d6c4cc5249e93401b9787eb4ceb50cb995
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Use new StoredObjectRepresentationNotAvailableException constructor
In 5e7eed4 a new StoredObjectRepresentationNotAvailableException
constructor was added, that takes a Throwable to initialize the
exception cause.
Update more call sites to use this constructor instead of first
instantiating it and explicitly calling initCause().
All callers now use the new constructor, so annotate the other one as
deprecated.
Change-Id: I6d2a7e289a95f0360ddebf904cfd8b6c18fef10c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
When we are cloning we have no refs at all yet, and there cannot
(or at least should not) be any other thread doing something with
refs yet.
Locking loose refs is thus not needed, since there are no loose
refs yet and nothing should be trying to create them concurrently.
Let's skip the whole loose ref locking when we are cloning a repository.
As a result, JGit will write the refs directly to the packed-refs
file, and will not create the refs/remotes/ directories nor the
lock files underneath when cloning and packed refs are used. Since
no lock files are created, any problems on case-insensitive file
systems with tag or branch names that differ only in case are avoided
during cloning.
Detect if we are cloning based on the following heuristics:
* HEAD is a dangling symref
* There is no loose ref
* There is no packed-refs file
Note, however, that there may still be problems with such tag or
branch names later on. This is primarily a five-minutes-past-twelve
stop-gap measure to resolve the referenced bug, which affects the
Oxygen.2 release.
Bug: 528497
Change-Id: I57860c29c210568165276a123b855e462b6a107a
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
When a GC operation is interrupted, temporary packs and indexes can be
left on the pack folder. In big, busy repositories this can lead to
significant amounts of wasted disk space if this interruption is done
with a certain frequency.
Remove stale temporary packs and indexes at the end of the GC process so
they do not accumulate. To avoid interfering with a possible concurrent
JGit GC process in the same repository, only delete temporary files that
are older than one day.
Change-Id: If9b6c1e57fac8a6a0ecc0a703089634caba4caae
Signed-off-by: Hector Caballero <hector.caballero@ericsson.com>
When running on NFS there was a chance that JGits LockFile
semantic is broken because File#createNewFile() may allow
multiple clients to create the same file in parallel. This
change provides a fix which is only used when the new config
option core.supportsAtomicCreateNewFile is set to false. The
default for this option is true. This option can only be set in the
global or the system config file. The repository config file is not
taken into account in this case.
If the config option core.supportsAtomicCreateNewFile is true
then File#createNewFile() is trusted and the behaviour doesn't
change.
But if core.supportsAtomicCreateNewFile is set to false then after
successful creation of the lock file a hardlink to that lock file is
created and the attribute nlink of the lock file is checked to be 2. If
multiple clients manage to create the same lock file nlink would be
greater than 2 showing the error.
This expensive workaround is described in
https://www.time-travellers.org/shane/papers/NFS_considered_harmful.html
section III.d) "Exclusive File Creation"
Change-Id: I3d2cc48d8eb280d5f7039eb94da37804f903be6a
Honor trustFolderStats also when reading packed-refs
Then list of packed refs was cached in RefDirectory based on mtime of
the packed-refs file. This may fail on NFS when attributes are cached.
A cached mtime of the packed-refs file could cause JGit to trust the
cached content of this file and to overlook that the file is modified.
Honor the config option trustFolderStats and always read the packed-refs
content if the option is false. By default this option is set to true
and this fix is not active.
Change-Id: I2b65cfaa8f4aba2efbf8a5e865d3f09f927e2eec
So far, in order to get the pack directory it was necessary to resolve
it from the object directory. This resolution is already done when
creating the object directory, so simplify the call by just adding a
getter to the pack directory.
Change-Id: I69e783141dc6739024e8b3d5acc30843edd651a7
Signed-off-by: Hector Caballero <hector.caballero@ericsson.com>
When invoking File.toPath(), an (unchecked) InvalidPathException may be
thrown which should be converted to a checked IOException.
For now, we will replace File.toPath() by FileUtils.toPath() only for
code which can already handle IOExceptions.
Change-Id: I0f0c5fd2a11739e7a02071adae9a5550985d4df6
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Applications that use ObjectInserters to create lots of individual
objects may prefer to avoid cluttering up the object directory with
loose objects. Add a specialized inserter implementation that produces a
single pack file no matter how many objects. This inserter is loosely
based on the existing DfsInserter implementation, but is simpler since
we don't need to buffer blocks in memory before writing to storage.
An alternative for such applications would be to write out the loose
objects and then repack just those objects later. This operation is not
currently supported with the GC class, which always repacks existing
packs when compacting loose objects. This in turn requires more
CPU-intensive reachability checks and extra I/O to copy objects from old
packs to new packs.
So, the choice was between implementing a new variant of repack, or not
writing loose objects in the first place. The latter approach is likely
less code overall, and avoids unnecessary I/O at runtime.
The current implementation does not yet support newReader() for reading
back objects.
Change-Id: I2074418f4e65853b7113de5eaced3a6b037d1a17
ObjectDirectory: Remove last modified check in insertPack
GC explicitly handles the case where a new pack has the same name as an
existing pack due to it containing the exact same set of objects. In
this case, the pack passed to insertPack will have the same name as an
existing pack, but it will also almost certainly have a later mtime than
the existing pack.
The loop in insertPack tried to short-circuit when inserting a new pack,
to avoid walking more of the pack list than necessary. Unfortunately,
this means it will never get to the check for an identical name,
resulting in a duplicate entry for the same PackFile in the pack list.
Remove the short-circuit so that insertPack does not insert a duplicate
entry.
Change-Id: I00711b28594622ad3bd104332334e8a3592cda7f
Allow creating symbolic references with link, and deleting them or
switching to ObjectId with unlink. How this happens is up to the
individual RefDatabase.
The default implementation detaches RefUpdate if a symbolic reference
is involved, supporting these command instances on RefDirectory.
Unfortunately the packed-refs file does not support storing symrefs,
so atomic transactions involving more than one symref command are
failed early.
Updating InMemoryRepository is deferred until reftable lands, as I
plan to switch InMemoryRepository to use reftable for its internal
storage representation.
Change-Id: Ibcae068b17a2fc6d958f767f402a570ad88d9151
Signed-off-by: Minh Thai <mthai@google.com>
Signed-off-by: Terry Parker <tparker@google.com>
ReflogWriter: Align auto-creation defaults with C git
Per git-config(1), core.logAllRefUpdates auto-creates reflogs for HEAD
and for refs under heads, notes, tags, and for HEAD. Add notes and
remove stash from ReflogWriter#shouldAutoCreateLog. Explicitly force
writing reflogs for refs/stash at call sites, now that this is
supported.
Change-Id: I3a46d2c2703b7c243e0ee2bbf6948279800c485c
Support force writing reflog on a per-update basis
Even if a repository has core.logAllRefUpdates=true, ReflogWriter does
not create reflog files unless the refs are under a hard-coded list of
prefixes, or unless the forceWrite bit is set. Expose the forceWrite bit
on a per-update basis in RefUpdate/BatchRefUpdate/ReceiveCommand,
creating RefLogWriters as necessary.
Change-Id: Ifc851fba00f76bf56d4134f821d0576b37810f80
Ensure ReflogWriter only works with a RefDirectory
The ReflogWriter constructor just took a Repository and called
getDirectory() on it to figure out the reflog dirs, but not all
Repository instances use this storage format for reflogs, so it's
incorrect to attempt to use ReflogWriter when there is not a
RefDirectory directly involved. In practice, ReflogWriter was mostly
only used by the implementation of RefDirectory, so enforcing this is
mostly just shuffling around calls in the same internal package.
The one exception is StashDropCommand, which writes to a reflog lock
file directly. This was a reasonable implementation decision, because
there is no general reflog interface in JGit beyond using
(Batch)RefUpdate to write new entries to the reflog. So to implement
"git stash drop <N>", which removes an arbitrary element from the
reflog, it's fair to fall back to the RefDirectory implementation.
Creating and using a more general interface is well beyond the scope of
this change.
That said, the old behavior of writing out the reflog file even if
that's not the reflog format used by the given Repository is clearly
wrong. Fail fast in this case instead.
Change-Id: I9bd4b047bc3e28a5607fd346ec2400dde9151730
Fix missing RefsChangedEvent when packed refs are used
With atomic ref updates using packed refs, JGit did not fire a
RefsChangedEvent. This resulted in a user-visible regression in
EGit: the UI would not update after a "Fetch from upstream...".
Presumably it would also make Gerrit miss out on ref changes?
Strengthen the BatchRefUpdateTest by also asserting the expected
number of RefsChangedEvents, and ensure modCnt is incremented in
RefDirectory.commitPackedRefs() when refs really changed (as opposed
to some internal housekeeping operation, such as packing loose refs).
Bug: 521296
Change-Id: Ia985bda1d99f45a5f89c8020ca4845e7a66e743e
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Remove workaround for bug in Java's ReferenceQueue
Sun's Java 5, 6, 7 implementation had a bug [1] where a Reference can be
enqueued and dequeued twice on the same reference queue due to a race
condition within ReferenceQueue.enqueue(Reference).
This bug was fixed for Java 8 [2] hence remove the workaround.
[1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6837858
[2] http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/858c75eb83b5
Change-Id: I2deeb607e3d237f9f825a207533acdee305c7e73
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix exception handling for opening bitmap index files
When creating a new PackFile instance it is specified whether this pack
has an associated bitmap index file or not. This information is cached
and the public method getBitmapIndex() will always assume a bitmap index
file must exist if the cached data tells so. But it may happen that the
packfiles are repacked during a gc in a different process causing the
packfile, bitmap-index and index file to be deleted. Since JGit still
has an open FileHandle on the packfile this file is not really deleted
and can still be accessed. But index and bitmap index file are deleted.
Fix getBitmapIndex() to invalidate the cached packfile instance if such
a situation occurs.
This problem showed up when a gerrit server was serving repositories
which where garbage collected with native git regularly. Fetch and
clone commands for certain repositories failed permanently after a
native git gc had deleted old bitmap index files.
Change-Id: I8e620bec74dd3f310ba42024f9a657062f868f0e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This matches the proposal that has been discussed at length on
git-core mailing list and seems to be the accepted convention.
Change-Id: I9f6ab15144826893d1e2a4b48a2d657d6dd445ec
Happily, most anonymous SectionParser implementations can be replaced
with FooConfig::new, as long as the constructor takes a single Config
arg. Many of these, the non-public ones, can in turn be inlined. A few
remaining SectionParsers can be lambdas.
Change-Id: I3f563e752dfd2007dd3a48d6d313d20e2685943a
RefDirectory: Add in-process fair lock for atomic updates
In a server scenario such as Gerrit Code Review, there may be many
atomic BatchRefUpdates contending for locks on both the packed-refs file
and some subset of loose refs. We already retry lock acquisition to
improve this situation slightly, but we can do better by using an
in-process lock. This way, instead of retrying and potentially exceeding
their timeout, different threads sharing the same Repository instance
can wait on a fair lock without having to touch the disk lock. Since a
server is probably already using RepositoryCache anyway, there is a high
likelihood of reusing the Repository instance.
Change-Id: If5dd1dc58f0ce62f26131fd5965a0e21a80e8bd3
RefDirectory: Retry acquiring ref locks with backoff
If a repo frequently uses PackedBatchRefUpdates, there is likely to be
contention on the packed-refs file, so it's not appropriate to fail
immediately the first time we fail to acquire a lock. Add some logic to
RefDirectory to support general retrying of lock acquisition.
Currently, there is a hard-coded wait starting at 100ms and backing off
exponentially to 1600ms, for about 3s of total wait. This is no worse
than the hard-coded backoff that JGit does elsewhere, e.g. in
FileUtils#delete. One can imagine a scheme that uses per-repository
configuration of backoff, and the current interface would support this
without changing any callers.
Change-Id: I4764e11270d9336882483eb698f67a78a401c251
On-disk reflogs are not stored in the packed-refs file, so we cannot
ensure atomic updates. We choose the lesser evil of dropping failed
reflog updates on the floor, rather than throwing an exception even
though the underlying ref updates succeeded.
Add tests for reflogs to BatchRefUpdateTest.
Change-Id: Ia456ba9e36af8e01fde81b19af46a72378e614cd
The existing packed-refs file provides a mechanism for implementing
atomic multi-ref updates without any changes to the on-disk format or
lockfile protocol. We just need to make sure that there are no loose
refs involved in the transaction, which we can achieve by packing the
refs while holding locks on all loose refs. Full details of the
algorithm are in the PackedBatchRefUpdate javadoc.
This change does not implement reflog support, which will come in a
later change.
Change-Id: I09829544a0d4e8dbb141d28c748c3b96ef66fee1
The RefDirectory implementation of doDelete never considered whether to
delete a symref or its leaf, because the detachingSymbolicRef bit was
never exposed from RefUpdate. The behavior was thus incorrectly to
always delete the symref, never the leaf.
There was no test for this behavior. The only thing that attempted to be
a test was testDeleteHeadInBareRepo, but this test was broken for
reasons unrelated to this bug. Specifically, it set the leaf to point to
a completely nonexistent object, and then asserted that deleting HEAD
resulted in NO_CHANGE. The only reason this test ever passed is because
of a quirk of updateImpl, which treats a missing object as the same as
null. This quirk aside, the test wasn't really testing the right thing.
Turn this into a real test by writing out a real object and pointing the
leaf at that.
Also, add a test for the detachingSymbolicRef case, i.e. deleting the
symref and leaving the leaf alone.
Change-Id: Ib96d2a35b4f99eba0734725486085fc6f9d78aa5