BatchRefUpdate: repro racy atomic update, and fix it
PackedBatchRefUpdate was creating a new packed-refs list that was
potentially unsorted. This would be papered over when the list was
read back from disk in parsePackedRef, which detects unsorted ref
lists on reading, and sorts them. However, the BatchRefUpdate also
installed the new (unsorted) list in-memory in
RefDirectory#packedRefs.
With the timestamp granularity code committed to stable-5.1, we can
more often accurately decide that the packed-refs file is clean, and
will return the erroneous unsorted data more often. Unluckily timed
delays also cause the file to be clean, hence this problem was
exacerbated under load.
The symptom is that refs added by a BatchRefUpdate would stop being
visible directly after they were added. In particular, the Gerrit
integration tests uses BatchRefUpdate in its setup for creating the
Admin group, and then tries to read it out directly afterward.
The tests recreates one failure case. A better approach would be to
revise RefList.Builder, so it detects out-of-order lists and
automatically sorts them.
Fixes https://bugs.eclipse.org/bugs/show_bug.cgi?id=548716 and
https://bugs.chromium.org/p/gerrit/issues/detail?id=11373.
Bug: 548716
Change-Id: I613c8059964513ce2370543620725b540b3cb6d1
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The only usage of this test iterator was removed in df637928d. Hence
delete this iterator and associated test.
Change-Id: I47710133ec3edc675c21db210960c024982668c6
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
(cherry picked from commit a024759cf5)
This test case assumed file system timestamp resolution of 1 second. On
filesystems with a finer resolution this test fails since the index
entry is only smudged if the file index entry's lastModified and the
lastModified of the git index itself are within the same filesystem
timer tick. Fix this by ensuring that these timestamps are identical
which should work for any filesystem timer resolution.
Bug: 548188
Change-Id: Id84d59e1cfeb48fa008f8f27f2f892c4f73985de
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
(cherry picked from commit 1159f9dd7c)
Change RacyGitTests to create a racy git situation in a stable way
By using File#setLastModified, we can create a racy git situation
stably.
Tested with --runs_per_test=100
Bug: 526111
Change-Id: I60b3632d353e19f335668325aa603640be423f58
Signed-off-by: Masaya Suzuki <masayasuzuki@google.com>
(cherry picked from commit df637928d2)
Fix pack files scan when filesnapshot isn't modified
Do not reload packfiles when their associated filesnapshot is not
modified on disk compared to the one currently stored in memory.
Fix the regression introduced by fef78212 which, in conjunction with
core.trustfolderstats = false, caused any lookup of objects inside
the packlist to loop forever when the object was not found in the pack
list.
Bug: 546190
Change-Id: I38d752ebe47cefc3299740aeba319a2641f19391
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fix GC to delete empty fanout directories after repacking
The prune method did not delete empty fanout directories when loose
objects moved to a new pack file but only when loose unreferenced
objects were pruned.
Change-Id: Ia068f4914c54d9cf9f40b75e8ea50759402b5000
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Due to finite filesystem timestamp resolution the last modified
timestamp of files cannot detect file changes which happened in the
immediate past (less than one filesystem timer tick ago).
Read and consider file size also, so that differing file size can help
to more accurately detect file changes without reading the file content.
Use bulk read to avoid multiple stat calls to retrieve file attributes.
Change-Id: I974288fff78ac78c52245d9218b5639603f67a46
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
when multiple match options are given in git describe the result must
not depend on the order of the match options. JGit wrongly picked the
first match using the match options in the order they were defined. Fix
this by concatenating the streams of matching tags for all match options
and then choosing the first match on the concatenated stream sorted in
tie break order.
See https://git-scm.com/docs/git-describe#git-describe---matchltpatterngt
Change-Id: Id01433d35fa16fb4c30526605bee041ac1d954b2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Correct behaviour as git 1.7.1.1 is to resolve tie-breakers to choose
the most recent tag.
https://github.com/git/git/blob/master/Documentation/RelNotes/1.7.1.1.txt:
* "git describe" did not tie-break tags that point at the same commit
correctly; newer ones are preferred by paying attention to the
tagger date now.
Bug: 538610
Change-Id: Ib0b2a301997bb7f75935baf7005473f4de952a64
Signed-off-by: Håvard Wall <haavardw@gmail.com>
The main concern are submodule urls starting with '-' that could pass as
options to an unguarded tool.
Pass through the parser the ids of blobs identified as .gitmodules
files in the ObjectChecker. Load the blobs and parse/validate them
in SubmoduleValidator.
Change-Id: Ia0cc32ce020d288f995bf7bc68041fda36be1963
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
ObjectChecker: Report .gitmodules files found in the pack
In order to validate .gitmodules files, we first need to find them
in the incoming pack.
Do it in the ObjectChecker stage. Check in the tree objects if they
point to a .gitmodules file and report the tree id and the .gitmodules
blob id.
This can be used later to check if the file is in the root of the
project and if the contents are good.
While we're here, make isMacHFSGit more accurate by detecting variants
of filenames that vary in case.
[jn: tweaked NTFS and HFS+ checking; added more tests]
Change-Id: I70802e7d2c1374116149de4f89836b9498f39582
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
SubmoduleAddCommand: Reject submodule URIs that look like cli options
In C git versions before 2.19.1, the submodule is fetched by running
"git clone <uri> <path>". A URI starting with "-" would be interpreted
as an option, causing security problems. See CVE-2018-17456.
Refuse to add submodules with URIs, names or paths starting with "-",
that could be confused with command line arguments.
[jn: backported to JGit 4.7.y, bringing portions of Masaya Suzuki's
dotdot check code in v5.1.0.201808281540-m3~57 (Add API to specify
the submodule name, 2018-07-12) along for the ride]
Change-Id: I2607c3acc480b75ab2b13386fe2cac435839f017
Signed-off-by: Ivan Frade <ifrade@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This test was never being run. Since it was introduced it was
named "notest.." which meant it didn't run with JUnit3, and
since it is not annotated @Test it also doesn't run with JUnit4.
When compiling with Bazel 0.6.0, error-prone raises an error
that the public method is not annotated with @Ignore or @Test.
Given that the test has never been run anyway, we can just
remove it.
Bug: 525415
Change-Id: Ie9a54f89fe42e0c201f547ff54ff1d419ce37864
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
If packed refs are used, duplicate updates result in an exception
because JGit tries to lock the same lock file twice. With non-atomic
ref updates, this used to work, since the same ref would simply be
locked and updated twice in succession.
Let's be more lenient in this case and remove duplicates before
trying to do the ref updates. Silently skip duplicate updates
for the same ref, if they both would update the ref to the same
object ID. (If they don't, behavior is undefined anyway, and we
still throw an exception.)
Add a test that results in a duplicate ref update for a tag.
Bug: 529400
Change-Id: Ide97f20b219646ac24c22e28de0c194a29cb62a5
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Fetch(Process): should tolerate duplicate refspecs
Bug: 529314
Change-Id: I91eaeda8a988d4786908fba6de00478cfc47a2a2
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Change-Id: I2442394fb7eae5b3715779555477dd27b274ee83
Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Remove completely the empty directories under refs/<namespace>
including the first level partition of the changes, when they are
completely empty.
Bug: 536777
Change-Id: I88304d34cc42435919c2d1480258684d993dfdca
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
After packaging references, the folders containing these references are
not deleted. In a busy repository, this causes operations to slow down
as traversing the references tree becomes longer.
Delete empty reference folders after the loose references have been
packed.
To avoid deleting a folder that was just created by another concurrent
operation, only delete folders that were not modified in the last 30
seconds.
Signed-off-by: Hector Oswaldo Caballero <hector.caballero@ericsson.com>
Change-Id: Ie79447d6121271cf5e25171be377ea396c7028e0
Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
ObjectIdSerializer: Support serialization of known non-null ObjectId
The implementation of ObjectIdSerializer, added in change I7599cf8bd,
is not equivalent to the original implementation in Gerrit [1].
The Gerrit implementation provides separate methods to (de)serialize
instances of ObjectId that are known to be non-null. In these methods,
no "marker" is written to the stream. Replacing Gerrit's implementation
with ObjectIdSerializer [2] broke persistent caches because it started
writing markers where they were not expected [3].
Since ObjectIdSerializer is included in JGit 4.11 we can't change the
existing #write and #read methods. Keep those as-is, but extend the
Javadoc to clarify that they support possibly null ObjectId instances.
Add new methods #writeWithoutMarker and #readWithoutMarker to support
the cases where the ObjectId is known to be non-null and the marker
should not be written to the serialization stream.
Also:
- Replace the hard-coded `0` and `1` markers with constants that can
be linked from the Javadocs.
- Include the marker value in the "Invalid flag before ObjectId"
exception message.
[1] https://gerrit-review.googlesource.com/c/gerrit/+/9792
[2] https://gerrit-review.googlesource.com/c/gerrit/+/165851
[3] https://gerrit-review.googlesource.com/c/gerrit/+/165952
Change-Id: Iaf84c3ec32ecf83efffb306fdb4940cc85740f3f
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Fix DiffFormatter for diffs against working tree with autocrlf=true
The WorkingTreeSource produced an ObjectLoader that returned
inconsistent sizes: the file size in getSize(), but then a
correctly filtered smaller stream in openStream(). This resulted
either in an IOE "short read of block" or in an EOFException
depending on the resulting filtered size.
Fix this by ensuring that getSize() does return the size of the
filtered stream.
Bug: 530106
Change-Id: I7c7c85036047dc10030ed29c1d5a6c7f34f2bdff
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Replace hard-coded "UTF-8" string with the constant.
Change-Id: Ie812add2df28e984090563ec7c6e2c0366616424
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>