| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes all the javadoc warnings, stops ignoring doclint 'missing'
category and fails the build on javadoc warnings for public and
protected classes and class members.
Since javadoc doesn't allow access specifiers when specifying doclint
configuration we cannot set `-Xdoclint:all,-missing/private`
hence there is no simple way to skip private elements from doclint.
Therefore we check javadoc using the Eclipse Java compiler
(which is used by default) and javadoc configuration in
`.settings/org.eclipse.jdt.core.prefs` files.
This allows more fine grained configuration.
We can reconsider this when javadoc starts supporting access specifiers
in the doclint configuration.
Below are detailled explanations for most modifications.
@inheritDoc
===========
doclint complains about explicits `{@inheritDoc}` when the parent does
not have any documentation. As far as I can tell, javadoc defaults to
inherit comments and should only be used when one wants to append extra
documentation from the parent. Given the parent has no documentation,
remove those usages which doclint complains about.
In some case I have moved up the documentation from the concrete class
up to the abstract class.
Remove `{@inheritDoc}` on overriden methods which don't add additional
documentation since javadoc defaults to inherit javadoc of overridden
methods.
@value to @link
===============
In PackConfig, DEFAULT_SEARCH_FOR_REUSE_TIMEOUT and similar are forged
from Integer.MAX_VALUE and are thus not considered constants (I guess
cause the value would depends on the platform). Replace it with a link
to `Integer.MAX_VALUE`.
In `StringUtils.toBoolean`, @value was used to refer to the
`stringValue` parameter. I have replaced it with `{@code stringValue}`.
{@link <url>} to <a>
====================
@link does not support being given an external URL. Replaces them with
HTML `<a>`.
@since: being invalid
=====================
org.eclipse.jgit/src/org/eclipse/jgit/util/Equality.java has an invalid
tag `@since: ` due to the extra `:`. Javadoc does not complain about it
with version 11.0.18+10 but does with 11.0.19.7. It is invalid
regardless.
invalid HTML syntax
===================
- javadoc doesn't allow <br/>, <p/> and </p> anymore, use <br> and <p>
instead
- replace <tt>code</tt> by {@code code}
- <table> tags don't allow summary attribute, specify caption as
<caption>caption</caption> to fix this
doclint visibility issue
========================
In the private abstract classes `BaseDirCacheEditor` and
`BasePackConnection` links to other methods in the abstract class are
inherited in the public subclasses but doclint gets confused and
considers them unreachable. The HTML documentation for the sub classes
shows the relative links in the sub classes, so it is all correct. It
must be a bug somewhere in javadoc.
Mute those warnings with: @SuppressWarnings("doclint:missing")
Misc
====
Replace `<` and `>` with HTML encoded entities (`< and `>`).
In `SshConstants` I went enclosing a serie of -> arrows in @literal.
Additional tags
===============
Configure maven-javad0c-plugin to allow the following additional tags
defined in https://openjdk.org/jeps/8068562:
- apiNote
- implSpec
- implNote
Missing javadoc
===============
Add missing @params and descriptions
Change-Id: I840056389aa59135cfb360da0d5e40463ce35bd0
Also-By: Matthias Sohn <matthias.sohn@sap.com>
|
|
|
|
|
|
|
|
|
|
| |
This is the format given by the Eclipse legal doc generator [1].
[1] https://www.eclipse.org/projects/tools/documentation.php?id=technology.jgit
Bug: 548298
Change-Id: I8d8cabc998ba1b083e3f0906a8d558d391ffb6c4
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
|
|
|
|
| |
Change-Id: Ia655f45153bcf1d422ffffce6dcf914847e14c4c
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
|
|
In practice the DHT storage layer has not been performing as well as
large scale server environments want to see from a Git server.
The performance of the DHT schema degrades rapidly as small changes
are pushed into the repository due to the chunk size being less than
1/3 of the pushed pack size. Small chunks cause poor prefetch
performance during reading, and require significantly longer prefetch
lists inside of the chunk meta field to work around the small size.
The DHT code is very complex (>17,000 lines of code) and is very
sensitive to the underlying database round-trip time, as well as the
way objects were written into the pack stream that was chunked and
stored on the database. A poor pack layout (from any version of C Git
prior to Junio reworking it) can cause the DHT code to be unable to
enumerate the objects of the linux-2.6 repository in a completable
time scale.
Performing a clone from a DHT stored repository of 2 million objects
takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row
for each object being cloned. This is very difficult for some DHTs to
scale, even at 5000 rows/second the lookup stage alone takes 6 minutes
(on local filesystem, this is almost too fast to bother measuring).
Some servers like Apache Cassandra just fall over and cannot complete
the 2 million lookups in rapid fire.
On a ~400 MiB repository, the DHT schema has an extra 25 MiB of
redundant data that gets downloaded to the JGit process, and that is
before you consider the cost of the OBJECT_INDEX table also being
fully loaded, which is at least 223 MiB of data for the linux kernel
repository. In the DHT schema answering a `git clone` of the ~400 MiB
linux kernel needs to load 248 MiB of "index" data from the DHT, in
addition to the ~400 MiB of pack data that gets sent to the client.
This is 193 MiB more data to be accessed than the native filesystem
format, but it needs to come over a much smaller pipe (local Ethernet
typically) than the local SATA disk drive.
I also never got around to writing the "repack" support for the DHT
schema, as it turns out to be fairly complex to safely repack data in
the repository while also trying to minimize the amount of changes
made to the database, due to very common limitations on database
mutation rates..
This new DFS storage layer fixes a lot of those issues by taking the
simple approach for storing relatively standard Git pack and index
files on an abstract filesystem. Packs are accessed by an in-process
buffer cache, similar to the WindowCache used by the local filesystem
storage layer. Unlike the local file IO, there are some assumptions
that the storage system has relatively high latency and no concept of
"file handles". Instead it looks at the file more like HTTP byte range
requests, where a read channel is a simply a thunk to trigger a read
request over the network.
The DFS code in this change is still abstract, it does not store on
any particular filesystem, but is fairly well suited to the Amazon S3
or Apache Hadoop HDFS. Storing packs directly on HDFS rather than
HBase removes a layer of abstraction, as most HBase row reads turn
into an HDFS read.
Most of the DFS code in this change was blatently copied from the
local filesystem code. Most parts should be refactored to be shared
between the two storage systems, but right now I am hesistent to do
this due to how well tuned the local filesystem code currently is.
Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb
|