Set "potentialNullReference" to "error" level and fixed all issues
There should be no functional change, the logic updated only to make
code simple so that compiler can understand what is going for. Removed
all @SuppressWarnings("null") annotations since they cannot be used if
"org.eclipse.jdt.core.compiler.problem.potentialNullReference" option is
set to the "error" level.
Bug: 470647
Change-Id: Ie93c249fa46e792198d362e531d5cbabaf41fdc4
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Root commits are commits with zero parents. If a commmit has no
parents it is the first commit in the repository. In general the root
commits should be unique for any given project, as the first commit
will be created at a different time, by a different user with its own
message. These root commits can be used as a "fingerprint" to
identify disjoint histories.
Change-Id: Id891dbc1f17c816cea404569578bb7635ff85cdb
Fix NPE in DfsGarbageCollector and further reduce memory
DfsGarbageCollector asks PackWriter for the set of objects packed
after the bitmap index is written out. This is now null as it was
cleared to release memory. Instead use PackBitmapIndexBuilder to
build the set as it also has the objects.
Reduce memory in PackBitmapIndexBuilder by fully discarding the
ObjectToPack instances. This was the original intent of commit
4bb523475d ("PackWriter: shed memory while creating bitmaps")
but failed as the instances were still held live here.
Switch to BlockList instead of ObjectToPack[]. This allows the
JVM to allocate many smaller arrays instead of one contiguous
array with 5.2M reference pointers. In a tight heap the smaller
allocations are more feasible.
Reduce the initial EWAHCompressedBitmaps for the 4 type maps. On
average a typical repository is 30% commits, 30% trees and 30% blobs.
These bitmaps are typically very dense. PackWriter orders objects by
commit, tree, blob when writing the file so these should always be a
very dense run of 1s with some 0s before and after. So even the 1/3rd
allocation is likely too large, but the later trim() will reduce the
internal buffer anyway.
Change-Id: If0b80a31cb00894f1485ff8f53ef7ae5a759a046
Once bitmap creation begins the internal maps required for packing are
no longer necessary. On a repository with 5.2M objects this can save
more than 438 MiB of memory by allowing the ObjectToPack instances to
get garbage collected away.
Downside is the PackWriter cannot be used for any further opertions
except to write the bitmap index. This is an acceptable trade-off as
in practice nobody uses the PackWriter after the bitmaps are built.
Change-Id: Ibfaf84b22fa0590896a398ff659a91fcf03d7128
Bitmap builder: actually compress EWAH bitmaps in memory
For construction performance each new EWAHBitmap is allocated at the
roughly worst-case size the bitmap would need if all of the words must
be literal and no run length compression is available. In practice
this is far larger than is required, wasting heap memory while the
bitmaps are computed.
Trim down each bitmap to its minimum required size. This copies the
internal array to a new smaller array, allowing the GC to reclaim the
prior larger array for reuse.
A single bitmap of 5.2M bits is only 79 KiB of memory without this
trim call but 15,000 such bitmaps is 1.1 GiB. Trimming can help fit
a larger number of bitmaps during processing.
Change-Id: I2bd19a786189db5b01c4c96f209b83de50e10c3b
Do not retain commit body during bitmap generation
The bitmap preparer only needs commit graph topology; it does not use
the message body. Allow the RevWalk to free the body after the commit
has been parsed to save memory.
Change-Id: I97d4a440c9fc313873fd224bd05b9d9e3dc575db
Catch unexpected PatternSyntaxException and convert it to
InvalidPatternException. Log such errors, do not silently ignore them.
Bug: 463581
Change-Id: Id0936d9816769ec0cfae1898beda0f7a3c146e67
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Inspired by a proposal from gitolite[1], where we store a file in
a tree for each ref name, and the contents of the file is the latest
push cert to affect that ref.
The main modification from that proposal (other than lacking the
out-of-git batching) is to append "@{cert}" to filenames, which allows
storing certificates for both refs/foo and refs/foo/bar. Those
refnames cannot coexist at the same time in a repository, but we do
not want to discard the push certificate responsible for deleting the
ref, which we would have to do if refs/foo in the push cert tree
changed from a tree to a blob.
The "@{cert}" syntax is at least somewhat consistent with
gitrevisions(7) wherein @{...} describe operators on ref names.
As we cannot (currently) atomically update the push cert ref with the
refs that were updated, this operation is inherently racy. Kick the can
down the road by pushing this burden on callers.
[1] cf062b8bb6/contrib/hooks/repo-specific/save-push-signatures
Change-Id: Id3eb32416f969fba4b5e4d9c4b47053c564b0ccd
Avoid double-colon in InvalidObjectIdException description
The invalidId message in JGitText and the asAscii bad id both contain a
colon, so the resulting message would say
Invalid id: : a78987c98798ufa
Fix it by keeping the colon in the translated message and not adding
another colon programmatically.
Noticed by code inspection.
Change-Id: I13972eebde27a4128828e6c64517666f0ba6288b
Signed-off-by: Jonathan Nieder <jrn@google.com>
For loose objects an expiration date can be set which will save too
young objects from being deleted. Add the same for packfiles. Packfiles
which are too young are not deleted.
Bug: 468024
Change-Id: I3956411d19b47aaadc215dab360d57fa6c24635e
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Introduce PostUploadHook to replace UploadPackLogger
UploadPackLogger is incorrectly named--it can be used to trigger any
post upload action, such as GC/compaction. This change introduces
PostUploadHook/PostUploadHookChain to replace
UploadPackLogger/UploadPackLoggerChain and deprecates the latter.
It also introduces PackStatistics as a replacement for
PackWriter.Statistics, since the latter is not public API.
It changes PackWriter to use PackStatistics and reimplements
PackWriter.Statistics to delegate to PackStatistics.
Change-Id: Ic51df1613e471f568ffee25ae67e118425b38986
Signed-off-by: Terry Parker <tparker@google.com>
- Consistently return structured data, such as actual ReceiveCommands,
which is more useful for callers that are doing things other than
verifying the signature, e.g. recording the set of commands.
- Store the certificate version field, as this is required to be part
of the signed payload.
- Add a toText() method to recreate the actual payload for signature
verification. This requires keeping track of the un-chomped command
strings from the original protocol stream.
- Separate the parser from the certificate itself, so the actual
PushCertificate object can be immutable. Make a fair attempt at deep
immutability, but this is not possible with the current mutable
ReceiveCommand structure.
- Use more detailed error messages that don't involve NON-NLS strings.
- Document null return values more thoroughly. Instead of having the
undocumented behavior of throwing NPE from certain methods if they
are not first guarded by enabled(), eliminate enabled() and return
null from those methods.
- Add tests for parsing a push cert from a section of pkt-line stream
using a real live stream captured with Wireshark (which, it should
be noted, uncovered several simply incorrect statements in C git's
Documentation/technical/pack-protocol.txt).
This is a slightly breaking API change to classes that were
technically public and technically released in 4.0. However, it is
highly unlikely that people were actually depending on public
behavior, since there were no public methods to create
PushCertificates with anything other than null field values, or a
PushCertificateParser that did anything other than infinite loop or
throw exceptions when reading.
Change-Id: I5382193347a8eb1811032d9b32af9651871372d0
Fix public API issues introduced in I1baeedcc6946.
Move ObjectCountCallback and WriteAbortedException to package
org.eclipse.jgit.transport, so that they'll become public API.
Change-Id: I95e3cfaa49f3f7371e794d5c253cf6981f87cae0
Signed-off-by: Yuxuan 'fishy' Wang <fishywang@google.com>
Added callback in PackWriter and BundleWriter for the caller to get the
count of objects to write, and a chance to abort the write operation.
Change-Id: I1baeedcc6946b1093652de4a707fe597a577e526
Signed-off-by: Yuxuan 'fishy' Wang <fishywang@google.com>
Return -1 from PackWriter.Statistics.getBitmapIndexMises() when no
bitmap indices were found, to differentiate it from the case where
the bitmap indices contained all of the want/have commits.
Change-Id: I78d4600b462c19f62b347217a0b2c19eaaf3a14b
Signed-off-by: Terry Parker <tparker@google.com>
Clarify description of ServiceNotAuthorizedException
This exception's detail message states
Service not permitted
and according to the Javadoc it indicates that the current user does not
have access to the service. In practice, though, callers handle this
exception by presenting a '401 Unauthorized' response to the client,
meaning that the user is unauthenticated and should authenticate.
Clarify the documentation and detail message to match the practice.
The exception message is not used anywhere except logs. No
client-visible effect intended.
Change-Id: I2c6be9cb74af932f0dcb121a381a64f2ad876766
Signed-off-by: Jonathan Nieder <jrn@google.com>
ObjectReader release method was replaced by close method but
WindowCursor was still implementing release method.
To prevent the same mistake again, make ObjectReader close method
abstract to force sub classes to implement it.
Change-Id: I50d0d1d19a26e306fd0dba77b246a95a44fd6584
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
FS: Extract GobblerThread into a private static class
The primary goal is to improve exception readability. Since this is a
standalone thread, just logging the stack trace of the caught
exception is not very useful:
java.io.IOException: Stream closed
at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
at java.io.BufferedInputStream.read(BufferedInputStream.java:258)
at org.eclipse.jgit.util.FS$2.run(FS.java:451)
Providing a named class eliminates the "FS$2", and including the
command name provides a little more context in the error message.
A future improvement might include the stack trace that created the
GobblerThread as well.
Change-Id: Ibf16d15b47a85b6f41844a177e398c2fc94f27b0
RevWalks to find commits that are not in bitmap indices are expensive.
Track the count of commits that are enumerated via RevWalks as "bitmap
index misses" in the PackWriter.Statistics class.
Change-Id: Ie0135a0a0aeba2dfb6df78839d545006629f16cb
Signed-off-by: Terry Parker <tparker@google.com>
Don't invalidate pack file on InterruptedIOException
If the thread reading a pack file is interrupted don't invalidate that
pack file.
This could happen when Gerrit invoked JGit for computing a diff in one
thread and waited for the call to finish from another thread, with a
timeout. When the timeout was reached the "diff" thread was interrupted.
If it happened to be in an IO operation, reading a pack file, an
InterruptedIOException was thrown and the pack file was marked as
invalid and removed from the pack list.
Invalidating the pack in that case could cause the project disappearing in
Gerrit as discussed in [1] and [2].
[1] https://groups.google.com/forum/#!topic/repo-discuss/CYYoHfDxCfA
[2] https://groups.google.com/forum/#!topic/repo-discuss/ZeGWPyyJlrM
Change-Id: I2eb1f98370936b5be541d96d70c3973cbfc39238
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Use AutoClosable to close resources in bundle org.eclipse.jgit
- use try-with-resource where possible
- replace use of deprecated release() by close()
Change-Id: I0f139c3535679087b7fa09649166bca514750b81
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
The Java GC doesn't always clear these before running out of memory
and failing allocations. In practice OpenJDK 7 is leaving these live,
removing any advantage of the SoftReference to attempt to shed memory
when the GC is unable to continue allocating.
Instead follow the pattern of the DfsBlockCache and use hard refs
to the object data. Require applications to configure the cache
size more accurately given expected memory usage.
Change-Id: I87586b3e71b1cba0308a6a278d42e971be4bccd3
The LRU chain management code was broken leading to situations where
the chain was incomplete. This prevented the cache from removing
items when it exceeded its memory target, causing a leak.
One case was repeated hit on the head of the chain. moveToHead(e)
was invoked linking the head back to itself in a cycle orphaning
the rest of the table.
Add some unit tests to cover this and a few other paths.
Change-Id: Ib27486eaa1b1d2bf1c745a56d0a5832bfb029322
Despite being the primary author of RevWalk and ObjectWalk I still
fail to remember to setRetainBody(false) in application code using
an ObjectWalk to examine the graph.
Document the default for RevWalk is setRetainBody(true), where the
application usually wants the commit bodies to display or inspect.
Change the default for ObjectWalk to setRetainBody(false), as nearly
all callers want only the graph shape and do not need the larger text
inside a commit body. This allows some code in JGit to be simplified.
Change-Id: I367e42209e805bd5e1f41b4072aeb2fa98ec9d99
Do not concatenate strings as arguments to StringBuilder.append()
That more or less defeats the purpose of using a StringBuilder.
Change-Id: I519f7bf1c9b6670e63c3714210f834ee845dc69f
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
This error happens on nfs file system when you try to read a file that
was deleted or replaced.
When the error happens because the file was deleted, removing it from
the list is the proper way to handle the error, same use case as
FileNotFoundException. When the error happens because the file was
replaced, removing the file from the list will cause the file to be
re-read so it will get the latest version of the file.
Bug: 462868
Change-Id: I368af61a6cf73706601a3e4df4ef24f0aa0465c5
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
Pack not found and pack corrupted/invalid are handled by the code (pack
is removed from the list) so logging an error and the stacktrace is
misleading because it implies that there is an action to take to fix the
error.
Lower the log level to warn and remove the stacktrace for those 2 types
of errors and keep the error log statement for any other.
Change-Id: I2400fe5fec07ac6d6c244b852cce615663774e6e
Signed-off-by: Hugo Arès <hugo.ares@ericsson.com>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Cached packs are only used when writing over the network or to
a bundle file and reuse validation is always disabled in these
two contexts. The client/consumer of the stream will be SHA-1
checksumming every object.
Reuse validation is most critical during local GC to avoid silently
ignoring corruption by stopping as soon as a problem is found and
leaving everything alone for the end-user to debug and salvage.
Cached packs are not supported during local GC as the bitmap rebuild
logic does not support including a cached pack in the result.
Strip out the validation and force PackWriter to always disable the
cached pack feature if reuseValidation is enabled.
Change-Id: If0d7baf2ae1bf1f7e71bf773151302c9f7887039
This hint allows an underlying implementation to read more bytes when
possible and buffer them locally for future read calls to consume.
Change-Id: Ia986a1bb8640eecb91cfbd515c61fa1ff1574a6f
Avoid storing large packs in block cache during reuse
When a large pack (> 30% of the block cache) is being reused by
copying it pollutes the block cache with noise by storing blocks
that are never referenced again.
Avoid this by streaming the file directly from its channel onto
the output stream.
Change-Id: I2e53de27f3dcfb93de68b1fad45f75ab23e79fe7