reftable: enforce ascending order in sortAndWriteRefs
MergedReftableTest#scanDuplicates tests whether we can write duplicate
keys in a merged reftable. Apparently, the first key appearing should
get precedence, and this works because the sort() algorithm on ordered
collections is stable.
This is potentially confusing behavior, because you can write data
into the table that cannot be retrieved (Merged table can only have
one entry per key), and the APIs such as exactRef() only return a
single value.
Make this consistent with behavior introduced in I04f55c481 "reftable:
enforce ordering for ref and log writes" by considering a duplicate key
in sortAndWriteRefs as a fatal runtime error.
Change-Id: I1eedd18f028180069f78c5c467169dcfe1521157
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
The effect of assert is defined by compiler flags, so this code
introduced a potential vector for corruption.
Change-Id: I12197432e4351a5bd4aa24d352a19937721845c3
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Enable and fix "Statement unnecessarily nested within else clause" warnings
Since [1] the gerrit project includes jgit as a submodule, and has this
warning enabled, resulting in 100s of warnings in the console.
Also enable the warning here, and fix them.
At the same time, add missing braces around adjacent and nearby one-line
blocks.
[1] https://gerrit-review.googlesource.com/c/gerrit/+/227897
Change-Id: I81df3fc7ed6eedf6874ce1a3bedfa727a1897e4c
Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
This introduces ReftableBatchRefUpdate and ReftableDatabase, as
generic classes, with some code moved to DfsReftableBatchRefUpdate and
DfsReftableDatabase.
Clarify thread-safety requirements by asserting locked status in
accessors, and acquiring locks in callers. This does not fix threading
problems, because ReftableBatchRefUpdate already wraps the whole
transaction in a lock.
This also fixes a number of bugs in ReftableBatchRefUpdate:
* non-atomic updates should not bail on first failure
* isNameConflicting should also check for conflicts between names that
are added and removed in the BatchRefUpdate.
Change-Id: I5ec91173ea9a0aa19da444c8c0b2e0f4e8f88798
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
On changing a ref, the old SHA1 is not updated in the object => ref
mapping. This means search by object ID may still turn up a ref from
deeper within the stack. To fix this, check all refs produced by the
merged iterator against the merged reftables.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Change-Id: I41e9cd395b0608eedeeaead0a9fd997238d747c9
MergedReftable is not used as an AutoCloseable, because closing tables
is currently handled by DfsReftableStack#close.
Encode that a MergedReftable is a list of ReftableReaders. The previous
code suggested that we could form nested trees of MergedReftables,
which is not how we use reftables.
Change-Id: Icbe2fee8a5a12373f45fc5f97d8b1a2b14231c96
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
This makes the intended use of the classes more clear. It also
simplifies generic functions that write reftables: they only need a
ReftableWriter as argument, as the stream is carried within the
ReftableWriter.
Change-Id: Idbb06f89ae33100f0c0b562cc38e5b3b026d5181
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
This makes maxUpdateIndex() available in MergedReftable, so we can
know generically at which index to create the next reftable in a
stack.
Change-Id: Ia2314bc57c8b5dd7e69d5e61096fdce1d35abd11
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
reftable: add OutputStream argument to ReftableWriter constructor
This lets us write reftables generically with functions that take
just ReftableWriter argument
Change-Id: I7285951f62f9bd4c78e8f0de194c077d51fa4e51
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
reftable: read file footer in ReftableReader#allRefs
allRefs determined the end of the ref block without accounting for
index or log blocks. This could cause other blocks to be interpreted
as ref blocks, leading to "invalid block" error messages.
Change-Id: I7b9323e7d5e0e7d64535b3ec1efd576aed1e9870
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Previously, the API did not enforce ordering of writes. Misuse of
this API would lead to data effectively being lost.
Guard against that with IllegalArgumentException, and add a test.
Change-Id: I04f55c481d60532fc64d35fa32c47037a03988ae
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
reftable: fix seeking to refs in reflog implementation
Small reftables omit the log index. Currently,
ReftableWriter#shouldHaveIndex does this if there is a single-block
log, but other writers could decide on different criteria.
In the case that the log index is missing, we have to linearly search
for the right block. It is never appropriate to use binary search on
blocks for log data, as the blocks are compressed and therefore
irregularly sized.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Change-Id: Id59874edf6bf45c7dec502d9465888e077ffe198
Now the reference carries its updateIndex, so the cursor doesn't need
to expose it.
Change-Id: Icbfca46f92a13f3d8215ad10b2a166a6f40b0b0f
Signed-off-by: Ivan Frade <ifrade@google.com>
RefDatabase/Ref: Add versioning to reference database
In DFS implementations the reference table can fall out of sync, but
it is not possible to check this situation in the current API.
Add a property to the Refs indicating the order of its updates. This
version is set only by RefDatabase implementations that support
versioning (e.g reftable based).
Caller is responsible to check if the reference db creates versioned
refs before accessing getUpdateIndex(). E.g:
Ref ref = refdb.exactRef(...);
if (refdb.hasVersioning()) {
ref.getUpdateIndex();
}
Change-Id: I0d5ec8e8df47c730301b2e12851a6bf3dac9d120
Signed-off-by: Ivan Frade <ifrade@google.com>
Make Reftable seek* and has* method names more consistent
Make the method names more consistent and their semantics simpler:
hasRef and seekRef to look up a single exact reference by name and
hasRefsByPrefix and seekRefsByPrefix to look up multiple references by
name prefix.
In particular, splitting hasRef into two separate methods for its
different uses makes DfsReftableDatabase.isNameConflicting easier to
follow.
[jn: fleshed out commit message]
Change-Id: I71106068ff3ec4f7e14dd9eb6ee6b5fab8d14d0b
Signed-off-by: Minh Thai <mthai@google.com>
Signed-off-by: Jonathan Nieder <jrn@google.com>
Reftable implementation of RefDatabase.getRefsByPrefix() should be
more performant, as references are filtered directly by prefix;
instead of fetching the whole subtree then filter by prefix.
Change-Id: If4f5f8c08285ea1eaec9efb83c3d864cea7a1321
Signed-off-by: Minh Thai <mthai@google.com>
MergedReftable to skip shadowed refs in same reftable
This would allow compact and GC process to clean up duplicate ref names in the reftables.
Change-Id: I2b9df0bf72dba63cc3525e374982e60559a776c2
Signed-off-by: Minh Thai <mthai@google.com>
ReftableCompactor should accept 0 for minUpdateIndex
Do not use 0 as the unset value for minUpdateIndex, as input reftables
may have minUpdateIndex starting at 0.
Change-Id: Ie040a6b73d4a5eba5521e51d0ee4580713c84a3e
Signed-off-by: Minh Thai <mthai@google.com>
Add an update_index to every reference in a reftable, storing the
exact transaction that last modified the reference. This is necessary
to fix some merge race conditions.
Consider updates at T1, T3 are present in two reftables. Compacting
these will create a table with range [T1,T3]. If T2 arrives during
or after the compaction its impossible for readers to know how to
merge the [T1,T3] table with the T2 table.
With an explicit update_index per reference, MergedReftable is able to
individually sort each reference, merging individual entries at T3
from [T1,T3] ahead of identically named entries appearing in T2.
Change-Id: Ie4065d4176a5a0207dcab9696ae05d086e042140
resolve(Ref) helps callers recursively chase symbolic references and
is a useful function when wrapping a Reftable inside a RefDatabase, as
RefCursor does not resolve symbolic references during iteration.
Change-Id: I1ba143f403773497972e225dc92c35ecb989e154
Transactions may wish to merge several tables together as part of an
operation. Setting a byte limit allows the transaction to consider
only some recent tables, bounding the cost of the compaction.
Change-Id: If037f2cbdc174ff1a215d5917178b33cde4ddaba
A compaction of reftables is just copying the results of a
MergedReftable into a ReftableWriter. Wrap this up into a utility.
Change-Id: I6f5677d923e9628993a2d8b4b007a9b8662c9045
MergedReftable combines multiple reference tables together in a stack,
allowing higher/later tables to shadow earlier/lower tables. This
forms the basis of a transaction system, where each transaction writes
a new reftable containing only the modified references, and readers
perform a merge on the fly to get the latest value.
Change-Id: Ic2cb750141e8c61a8b2726b2eb95195acb6ddc83
ReftableReader provides sequential scanning support over all
references, a range of references within a subtree (such as
"refs/heads/"), and lookup of a single reference. Reads can be
accelerated by an index block, if it was created by the writer.
The BlockSource interface provides an abstraction to read from the
reftable's backing storage, supporting a future commit to connect
to JGit DFS and the DfsBlockCache.
Change-Id: Ib0dc5fa937d0c735f2a9ff4439d55c457fea7aa8
This is a simple writer to create reftable formatted files. Follow-up
commits will add support for reading from reftable, debugging
utilities, and tests.
Change-Id: I3d520c3515c580144490b0b45433ea175a3e6e11