If readRef returns null (indicating that the ref does not exist), the
loop continues so we can find the ref later in the search path. And
resolve should never return null, so if we return null it should mean
we exhausted the entire search path and didn't find the ref.
... except that resolve can return null: it does so when it has
followed too many symrefs and concluded that there is a symref loop:
    if (MAX_SYMBOLIC_REF_DEPTH <= depth)
        return null; // claim it doesn't exist
Continue the loop instead of returning null immediately. This makes
the behavior more consistent.
Arguably getRef should throw an exception when a symref loop is
detected. That would be a more invasive change, so if it's a good
idea it will have to wait for another patch.
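The control flow described above can be sketched with a toy model. This is not JGit's RefDirectory: the class name, the string-based ref storage and the search path contents are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy ref database; refs are plain strings, not JGit Ref objects. */
final class RefSearchSketch {
    static final int MAX_SYMBOLIC_REF_DEPTH = 5;
    // Assumed search path, for illustration only.
    static final String[] SEARCH_PATH = { "", "refs/", "refs/tags/", "refs/heads/" };

    /** Ref name -> content: either "ref: <target>" or an object id. */
    final Map<String, String> stored = new HashMap<>();

    /** Returns the stored content, or null if the ref does not exist. */
    String readRef(String name) {
        return stored.get(name);
    }

    /** Follows symrefs; returns null only when a symref loop is suspected. */
    String resolve(String content, int depth) {
        if (!content.startsWith("ref: "))
            return content;
        if (MAX_SYMBOLIC_REF_DEPTH <= depth)
            return null; // claim it doesn't exist
        String target = readRef(content.substring(5));
        if (target == null)
            return content; // unborn branch: keep the unresolved symref
        return resolve(target, depth + 1);
    }

    /** Tries each prefix in turn, skipping candidates that are missing or loop. */
    String getRef(String shortName) {
        for (String prefix : SEARCH_PATH) {
            String content = readRef(prefix + shortName);
            if (content == null)
                continue; // not found under this prefix
            String resolved = resolve(content, 0);
            if (resolved != null)
                return resolved;
            // symref loop: keep searching instead of returning null right away
        }
        return null; // search path exhausted
    }
}
```

With a looping symref stored as refs/x and a normal ref stored as refs/heads/x, getRef("x") in this sketch skips the loop and returns the content of refs/heads/x.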
Change-Id: Icb1c7fafd4f1e34c9b43538e27ab5bbc17ad9eef
Signed-off-by: Jonathan Nieder <jrn@google.com>
Throw IndexReadException if existing index can't be read
If the index file exists but can't be read, for example because of wrong
filesystem permissions, we should throw a specific exception. This allows
EGit to handle this error situation.
Bug: 482607
Change-Id: I50bfcb719c45caac3cb5550a8b16307c2ea9def4
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Tue, 17 Nov 2015 16:26:53 +0000 (17:26 +0100)]
Fix pre-push hook to not set null remoteName as first argument
According to [1], the pre-push hook expects two parameters which provide
the name and location of the destination remote. If a named remote is
not being used, both values should be the same.
We set the first parameter to null in that case, which caused
ProcessBuilder to throw a NullPointerException since its start() method
doesn't accept null arguments.
[1] https://git-scm.com/docs/githooks#_pre_push
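A minimal sketch of the argument handling described above. The class, method and parameter names are hypothetical and do not come from JGit; only the ProcessBuilder behavior is standard Java.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not the JGit hook implementation. */
final class PrePushArgsSketch {
    /** Builds the hook command line: hook path, remote name, remote location. */
    static List<String> hookCommand(String hookPath, String remoteName, String url) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hookPath);
        // Without a named remote, pass the URL as both name and location.
        cmd.add(remoteName != null ? remoteName : url);
        cmd.add(url);
        return cmd;
    }

    /** Returns true if ProcessBuilder.start() rejects a command list containing null. */
    static boolean rejectsNullArgument() {
        List<String> cmd = new ArrayList<>();
        cmd.add("hook-that-is-never-started");
        cmd.add(null);
        try {
            new ProcessBuilder(cmd).start();
            return false;
        } catch (NullPointerException e) {
            return true; // thrown before any process is launched
        } catch (IOException e) {
            return false;
        }
    }
}
```

For an unnamed remote, hookCommand(path, null, url) yields the URL in both argument positions, so no null ever reaches ProcessBuilder.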
Bug: 482393
Change-Id: Idb9b0a48cefac01abfcfdf00f6d173f8fa1d9a7b
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Add an attribute accessor to CanonicalTreeParser and use it in Treewalk
When checking out a branch we need to access the attributes stored
in the tree to be checked out. E.g. directly after a clone we check out
the remote HEAD. In this case the index and working tree are still empty,
so we have to search the tree to be checked out for attributes.
Arthur Daussy [Fri, 31 Oct 2014 16:46:36 +0000 (17:46 +0100)]
Adds the git attributes computation on the treewalk
Adds the getAttributes feature to the tree walk. The computation of
attributes needs to be done by the TreeWalk since it needs both a
WorkingTreeIterator and a DirCacheIterator.
Bug: 342372
CQ: 9120
Change-Id: I5e33257fd8c9895869a128bad3fd1e720409d361
Signed-off-by: Arthur Daussy <arthur.daussy@obeo.fr>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Andrey Loskutov [Mon, 16 Nov 2015 22:41:38 +0000 (23:41 +0100)]
Make jgit annotations accessible to other plugins
Other plugins that want to use JGit nullness annotations in their code
cannot do so if the annotations aren't part of the published API.
Unfortunately, although Eclipse JDT allows custom nullness annotation
types to be configured per project, it does not cope with those
annotations being mixed with other nullness annotations in other
projects. E.g. EGit can configure JGit annotations for NPE
analysis and so "understand" nullness from the JGit API, but then it loses
the ability to use any other nullness annotations to annotate its own code.
Andrey Loskutov [Sun, 15 Nov 2015 21:55:58 +0000 (22:55 +0100)]
Added jgit own NonNull annotation type
The annotation is required, for example, in the Repository case (patch
follows), where almost all non-void methods return Nullable
except a few returning NonNull. I definitely do not favor this style, but
it is a nightmare for clients to guess whether the null check is needed or
not.
Matthias Sohn [Sun, 15 Nov 2015 21:56:55 +0000 (22:56 +0100)]
Handle InternalError during symlink support detection on Windows
When JGit tries to detect symlink support the attempt to create a
symlink may fail with a java.lang.InternalError. This was reported for a
case where the application using JGit runs in Windows XP compatibility
mode using Oracle JDK 1.8. Handle this by assuming symlinks are not
supported in this case.
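A hedged sketch of such a probe. The class and method names are hypothetical and the real detection code in JGit may differ; the point is only that InternalError is caught alongside the usual failure modes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical probe, not JGit's FS implementation. */
final class SymlinkProbeSketch {
    /** Returns true only if a symlink could actually be created in a temp directory. */
    static boolean supportsSymlinks() {
        Path dir = null;
        try {
            dir = Files.createTempDirectory("symlink-probe");
            Path target = Files.createFile(dir.resolve("target"));
            Path link = dir.resolve("link");
            Files.createSymbolicLink(link, target);
            return Files.isSymbolicLink(link);
        } catch (IOException | UnsupportedOperationException
                | SecurityException | InternalError e) {
            // InternalError was reported on Windows XP compatibility mode;
            // treat every failure as "symlinks not supported".
            return false;
        } finally {
            if (dir != null) {
                deleteQuietly(dir.resolve("link"));
                deleteQuietly(dir.resolve("target"));
                deleteQuietly(dir);
            }
        }
    }

    private static void deleteQuietly(Path p) {
        try {
            Files.deleteIfExists(p);
        } catch (IOException e) {
            // best-effort cleanup of the probe files
        }
    }
}
```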
Bug: 471027
Change-Id: I978288754dea0c6fffd3457fad7d4d971e27c6c2
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Fri, 13 Nov 2015 23:59:36 +0000 (00:59 +0100)]
Merge branch 'stable-4.1'
* stable-4.1:
Prepare 4.1.2-SNAPSHOT builds
JGit v4.1.1.201511131810-r
Fallback exactRef: Do not ignore symrefs to unborn branch
RefDirectory.exactRef: Do not ignore symrefs to unborn branch
Change-Id: I66afb303f355aad8a7eaa7a6dff06de70ae9c490
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Jonathan Nieder [Wed, 11 Nov 2015 18:42:46 +0000 (10:42 -0800)]
Fallback exactRef: Do not ignore symrefs to unborn branch
When asked to read a symref pointing to a branch-yet-to-be-born (such
as HEAD in a newly initialized repository), getRef and getRefs provide
different results.
getRef: SymbolicRef[HEAD -> refs/heads/master=00000000]
getRefs and getAdditionalRefs: nothing
exactRef should match the getRef behavior: it is meant to be a
simpler, faster version of getRef that looks up a ref by its exact
name, without resolving it using the search path and without other
semantic changes. But the fallback implementation of exactRef relies on
getRefs and produces null for this case.
Luckily the in-tree RefDatabase implementations override exactRef and
get the correct behavior. But any out-of-tree storage backend that
doesn't inherit from DfsRefDatabase or RefDirectory would still return
null when it shouldn't.
Let the fallback implementation use getRef instead to avoid this.
This means that exactRef would waste some effort traversing the ref
search path when the named ref is not found --- but subclasses tend to
override exactRef for performance already, so in the default
implementation correctness is more important.
Jonathan Nieder [Tue, 10 Nov 2015 23:11:04 +0000 (15:11 -0800)]
RefDirectory.exactRef: Do not ignore symrefs to unborn branch
When asked to read a symref pointing to a branch-yet-to-be-born (such
as HEAD in a newly initialized repository), DfsRepository and
FileRepository return different results.
getRef("HEAD") returns the same as DfsRepository's exactRef in both
backends.
The intended behavior is the DfsRepository one: exactRef() is supposed
to be like getRef(), but more exact because it doesn't need to
traverse the search path.
The discrepancy is because DfsRefDatabase implements exactRef()
directly with the intended semantics, while RefDirectory uses a
fallback implementation built on top of getRefs(). getRefs() skips
symrefs to an unborn branch.
Override the fallback implementation with a correct implementation
that is similar to getRef() to avoid this. A followup change will fix
the fallback.
Change-Id: Ic138a5564a099ebf32248d86b93e2de9ab3c94ee
Reported-by: David Pursehouse <david.pursehouse@sonymobile.com>
Improved-by: Christian Halstrick <christian.halstrick@sap.com>
Bug: 478865
BitmapBuilder is an interface that is implemented by implementors of the
API, not by its clients. According to OSGi semantic versioning rules, a
breaking API change requires only a minor version update if implementors
of the API have to be adapted and the change does not affect clients of the API.
Change-Id: If45d204181ea9bc788b6b57693ca17b1847564c7
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Jonathan Nieder [Mon, 9 Nov 2015 16:48:46 +0000 (08:48 -0800)]
Make BitmapIndexImpl.CompressedBitmap public
PackWriterBitmapPreparer (which is in another package) is already well
aware of the mapping between EWAHCompressedBitmaps and the
higher-level CompressedBitmap objects of the BitmapIndexImpl API.
Making the CompressedBitmap type public makes the translation more
obvious and wouldn't break any abstractions that aren't already
broken. So expose it.
This is all under org.eclipse.jgit.internal so there are no API
stability guarantees --- we can change the API if internals change
(for example if some day there are bitmaps spanning multiple packs).
In particular this means the confusing toBitmap helper can be removed.
Terry Parker [Mon, 9 Nov 2015 22:19:11 +0000 (14:19 -0800)]
Update dependencies to use the JGit-internal @Nullable
Update the project-specific Eclipse settings to replace the use of the
org.eclipse.jdt.annotation.Nullable class with the new JGit-specific
@Nullable annotation. I verified that Eclipse reports errors when the
return value of a method annotated with
@org.eclipse.jgit.annotations.Nullable is dereferenced without a null
check.
Also remove the Maven and MANIFEST.MF dependencies on
org.eclipse.jdt.annotation.
Eclipse null analysis uses three annotations: @Nullable, @NonNull and
@NonNullByDefault. All three are updated in this patch because it is
invalid to set the Eclipse preferences to empty values. So far only
@Nullable has been introduced in org.eclipse.jgit.annotations.
My personal preference is to follow the advice in Effective Java and
avoid the null-return idiom, and to avoid passing null values in
general. This sets the expectation that arguments and return types
are assumed non-null unless otherwise documented. If that is the
expectation, then consistent application of @NonNull is redundant and
hurts readability by cluttering the code, obscuring the occasional
@Nullable annotation that really requires attention.
If the JGit community decides there is value in using the @NonNull and
@NonNullByDefault annotations we can add them--this change configures
Eclipse to use them.
Change-Id: I9af1b786d1b44b9b0d9c609480dc842df79bf698
Signed-off-by: Terry Parker <tparker@google.com>
Terry Parker [Mon, 9 Nov 2015 21:40:39 +0000 (13:40 -0800)]
Add a JGit-internal Nullable type
Commit 847b3d1 enabled annotation-based NPE analysis for JGit.
In so doing, it introduced an undesirable new dependency on
org.eclipse.jdt. Follow Gerrit's lead by adding an internal Nullable type
(see
https://gerrit.googlesource.com/gerrit/+/stable-2.11/gerrit-common/src/main/java/com/google/gerrit/common/Nullable.java).
The javax.annotation.Nullable class uses a retention policy of RUNTIME,
whereas the org.eclipse.jdt.annotation.Nullable class used a policy of
CLASS. Since I'm not aware of tools that make use of the annotation at
runtime I chose the CLASS policy. If in the future there is a benefit to
retaining the annotations in the running binary we can update this
class.
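A minimal sketch of what a CLASS-retention annotation looks like and what that choice means at runtime. The enclosing class, the target list and the findBranch method are assumptions for illustration, not the actual org.eclipse.jgit.annotations source.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical holder class for the sketch. */
final class NullableSketch {
    /** Kept in the .class file for static analysis, but not loaded at runtime. */
    @Documented
    @Retention(RetentionPolicy.CLASS)
    @Target({ ElementType.FIELD, ElementType.METHOD,
            ElementType.PARAMETER, ElementType.LOCAL_VARIABLE })
    @interface Nullable {
    }

    /** Illustrative method whose return value callers must null-check. */
    @Nullable
    static String findBranch(String name) {
        return "master".equals(name) ? "refs/heads/master" : null;
    }

    /** Reports whether reflection can see the annotation on findBranch. */
    static boolean visibleAtRuntime() {
        try {
            Method m = NullableSketch.class.getDeclaredMethod("findBranch", String.class);
            return m.isAnnotationPresent(Nullable.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Here visibleAtRuntime() returns false because of the CLASS policy; switching the policy to RUNTIME would make the annotation visible to reflection.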
Change-Id: I63dc8f9a6dc46b517cbde211bb5e2f8521a54d04
Signed-off-by: Terry Parker <tparker@google.com>
Jonathan Nieder [Mon, 9 Nov 2015 21:14:13 +0000 (13:14 -0800)]
Make BitmapIndexImpl.CompressedBitmap, CompressedBitmapBuilder static
A CompressedBitmap represents a pair (EWAH bit vector, PackIndex
assigning bit positions to git objects). The bit vector is a member
field and the PackIndex is implicit via the 'this' reference to the
outer class.
Make this clearer by making CompressedBitmap a static class and
replacing the 'this' reference by an explicit field.
Likewise for CompressedBitmapBuilder.
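The difference between the two styles can be sketched as follows. All names are hypothetical, and a java.util.BitSet stands in for the EWAH bit vector.

```java
import java.util.BitSet;

/** Hypothetical stand-in for the outer index class. */
final class BitmapOwnerSketch {
    /** Assigns bit positions to object names. */
    final String[] objectsByPosition;

    BitmapOwnerSketch(String... objectsByPosition) {
        this.objectsByPosition = objectsByPosition;
    }

    /** Before: inner class; the position table is reached via the hidden outer reference. */
    final class InnerBitmap {
        final BitSet bits = new BitSet();

        String objectAt(int pos) {
            return objectsByPosition[pos]; // implicit BitmapOwnerSketch.this
        }
    }

    /** After: static nested class; the dependency is an explicit field. */
    static final class StaticBitmap {
        final BitSet bits = new BitSet();
        final BitmapOwnerSketch index;

        StaticBitmap(BitmapOwnerSketch index) {
            this.index = index;
        }

        String objectAt(int pos) {
            return index.objectsByPosition[pos];
        }
    }
}
```

The static form makes the pair (bit vector, position table) visible in the constructor signature, which is the clarity this change is after.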
Change-Id: Id85659fc4fc3ad82034db3370cce4cdbe0c5492c
Suggested-by: Terry Parker <tparker@google.com>
Jonathan Nieder [Sat, 7 Nov 2015 19:53:25 +0000 (11:53 -0800)]
Skip redundant 'OR-reuse' step in tip commit bitmap setup
When creating bitmaps during gc, the bitmaps in tipCommitBitmaps are
built in setupTipCommitBitmaps using the following procedure:
 0. create a bitmap ('reuse') that lists all ancestors of commits
    whose existing bitmaps will be reused. I will call this the
    reused part of history.
 1. initialize a bitmap for each of the pack's "want"s by taking
    a copy of the 'reuse' bitmap and setting the bit corresponding
    to the one wanted commit.
 2. walk through ancestors of wants, excluding the reused part of
    history. Add parents of visited commits to bitmaps that have
    those commits.
 3. AND-NOT each tipCommitBitmap against the 'reuse' bitmap.
 4. sort the bitmaps and AND-NOT each against the previous so they
    partition the new commits.
The OR against 'reuse' in step 1 and the AND-NOT against 'reuse' in
step 3 cancel each other out, except when commits from the reused part
of history are added to a bitmap in step 2. So avoid adding commits from
the reused part of history in step 2 and skip the OR and AND-NOT.
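The cancellation can be checked on a toy example. This sketch uses java.util.BitSet in place of the EWAH bitmaps, and the class and method names are hypothetical.

```java
import java.util.BitSet;

/** Hypothetical demonstration, not the PackWriterBitmapPreparer code. */
final class ReuseCancelSketch {
    /** Old route: start from a copy of 'reuse', add bits, then AND-NOT 'reuse'. */
    static BitSet viaOrAndNot(BitSet reuse, int[] newBits) {
        BitSet b = (BitSet) reuse.clone();
        for (int bit : newBits)
            b.set(bit);
        b.andNot(reuse);
        return b;
    }

    /** New route: never copy 'reuse'; just skip bits that 'reuse' already covers. */
    static BitSet direct(BitSet reuse, int[] newBits) {
        BitSet b = new BitSet();
        for (int bit : newBits)
            if (!reuse.get(bit))
                b.set(bit);
        return b;
    }
}
```

Both routes produce the same bitmap for any input, but the second never copies or subtracts the potentially large 'reuse' bitmap.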
Performance impact (thanks to Terry for measuring):
The initial "selecting bitmaps" phase of garbage collection decreased
from (83 + 81 + 85) / 3 = 83 to (56 + 57 + 56) / 3 = 56.3, meaning
a speedup of nearly 50% for that phase (83 / 56.3 is about 1.47).
Tested-by: Terry Parker <tparker@google.com>
Change-Id: I26ea695809594347575d14a1d8e6721b8608eb9c
Jonathan Nieder [Wed, 4 Nov 2015 06:29:31 +0000 (22:29 -0800)]
Inline PackWriterBitmapWalker.newRevFilter into callers
Instead of using the newRevFilter helper, call the appropriate
RevFilter constructor directly. This means one less hop to find
documentation about what the RevFilter will do.
Jonathan Nieder [Wed, 4 Nov 2015 06:43:45 +0000 (22:43 -0800)]
Convert remaining callers of BitmapBuilder.add to use .addObject
When setupTipCommitBitmaps is called, writeBitmaps does not have any
bitmaps saved, so these calls to .add always add a single commit and
do not OR in a bitmap.
The objects returned by nextObject after a commit walk is finished
are trees and blobs. Non-commit objects do not have bitmaps
associated so the call to .add also can only add a single object.
* changes:
Remove BitmapRevFilter.getCountOfLoadedCommits
Make BitmapBuilder.getBitmapIndex public
Deprecate BitmapBuilder.add and introduce simpler addObject method
Add @Override annotations to BitmapIndexImpl
Rely on bitmap RevFilter to filter walk during bitmap selection
Use 'reused' bitmap to filter walk during bitmap selection
Rely on bitmap RevFilter to filter tip commit setup walk
Use 'reused' bitmap to filter tip commit setup walk
Include ancestors of reused bitmap commits in reuse bitmap again
Jonathan Nieder [Wed, 4 Nov 2015 07:20:14 +0000 (23:20 -0800)]
Make BitmapBuilder.getBitmapIndex public
Every Bitmap in current JGit code has an associated BitmapIndex. Make
it public in BitmapBuilder to make retrieving bitmaps to OR in from
that index easier.
Jonathan Nieder [Wed, 4 Nov 2015 02:54:25 +0000 (18:54 -0800)]
Deprecate BitmapBuilder.add and introduce simpler addObject method
The BitmapIndex.BitmapBuilder.add API is subtle:
    /**
     * Adds the id and the existing bitmap for the id, if one
     * exists, to the bitmap.
     *
     * @return true if the value was not contained or able to be
     *         loaded.
     */
    boolean add(AnyObjectId objectId, int type);
Reading the name of the method does not make it obvious what it will
do. Does it add the named object to the bitmap, or all objects
reachable from it? It depends on whether the BitmapIndex owns an
existing bitmap for that object. I did not notice this subtlety when
skimming the javadoc, either. This resulted in enough confusion to
subtly break the bitmap building code (see change
I30844134bfde0cbabdfaab884c84b9809dd8bdb8 for details).
So discourage use of the add() API by deprecating it.
To replace it, provide an addObject() method that adds a single object.
This way, callers can decide whether to use addObject() or or() based
on the context.
For example,
    if (bitmap.add(c, OBJ_COMMIT)) {
        for (RevCommit p : c.getParents()) {
            rememberToAlsoHandle(p);
        }
    }
can be replaced with
    if (bitmap.contains(c)) {
        // already included
    } else if (index.getBitmap(c) != null) {
        bitmap.or(index.getBitmap(c));
    } else {
        bitmap.addObject(c, OBJ_COMMIT);
        for (RevCommit p : c.getParents()) {
            rememberToAlsoHandle(p);
        }
    }
which is more verbose but makes it clearer that the behavior
depends on the content of index.getBitmaps().
Jonathan Nieder [Wed, 4 Nov 2015 02:38:18 +0000 (18:38 -0800)]
Add @Override annotations to BitmapIndexImpl
This makes it easier to distinguish implementations of interface
methods from helpers internal to org.eclipse.jgit.internal.storage.*.
Using @Override on implementations of interface methods was illegal in
Java 5, but JGit requires newer Java these days.
Jonathan Nieder [Tue, 3 Nov 2015 19:31:21 +0000 (11:31 -0800)]
Rely on bitmap RevFilter to filter walk during bitmap selection
This RevWalk filters out reused bitmap commits via the 'reuse' bitmap.
Avoid possible wasted time and complexity by not also redundantly
marking them UNINTERESTING.
Jonathan Nieder [Tue, 3 Nov 2015 19:24:02 +0000 (11:24 -0800)]
Use 'reused' bitmap to filter walk during bitmap selection
When building fullBitmap in order to determine which ancestor chain to
add this commit to, we were excluding the ancestors of reusedCommits
using markUninteresting. This use of markUninteresting is a bit
wasteful because we already have a bitmap indicating exactly which
commits should be excluded (which can save some walking). Use it.
A separate commit will remove the now-redundant markUninteresting
call.
No behavior change intended (except for performance improvement).
Jonathan Nieder [Tue, 3 Nov 2015 19:16:22 +0000 (11:16 -0800)]
Rely on bitmap RevFilter to filter tip commit setup walk
This RevWalk filters out reused bitmap commits via the 'reuse' bitmap.
Avoid possible wasted time and complexity by not redundantly marking
them UNINTERESTING any more.
Jonathan Nieder [Tue, 3 Nov 2015 19:15:36 +0000 (11:15 -0800)]
Use 'reused' bitmap to filter tip commit setup walk
When garbage collecting, we decide to reuse some bitmaps in older
history from the previous pack to save time. The remainder of commit
selection only involves commits not covered by those bitmaps.
Currently we carry that out in two ways:
1. by building a bitmap representing the already-covered commits,
for easy containment checks and AND-NOT-ing against
2. by marking the reused bitmap commits as uninteresting in the
RevWalk that finds new commits
The mechanism in (2) is less efficient than (1): rw.next() will walk
back from reused bitmap commits to check whether the commit it is
about to emit is an ancestor of them, when using the bitmap from (1)
would let us perform the same check with a single contains() call.
Add a RevFilter teaching the RevWalk to perform that same check
directly using the bitmap from (1).
The next time the RevWalk is used, a different RevFilter is installed
so this does not break that.
A later commit will drop the markUninteresting calls.
No functional change intended except a possible speedup.
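The idea can be sketched on a toy commit graph. A Set of commit names stands in for the 'reuse' bitmap and a plain loop stands in for the RevWalk and RevFilter; none of the names below are JGit API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical walk, not JGit's RevWalk. */
final class ReuseFilterWalkSketch {
    /**
     * Lists commits reachable from tip that are not covered by 'reuse'.
     * Assumes 'reuse' contains the reused commits and all of their ancestors.
     */
    static List<String> newCommits(Map<String, List<String>> parents,
            String tip, Set<String> reuse) {
        List<String> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(tip);
        while (!todo.isEmpty()) {
            String c = todo.poll();
            if (!seen.add(c))
                continue;
            if (reuse.contains(c))
                continue; // single lookup; no need to walk this commit's ancestors
            out.add(c);
            for (String p : parents.getOrDefault(c, Collections.<String>emptyList()))
                todo.add(p);
        }
        return out;
    }
}
```

For a linear history d, c, b, a with a and b in the reuse set, the walk from d in this sketch emits d and c and stops at b without visiting a.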
Mike Williams [Wed, 21 Oct 2015 20:09:58 +0000 (16:09 -0400)]
Insert duplicate objects to prevent race during garbage collection.
Prior to this change, DfsInserter would not insert an object into a pack
if it already existed in another pack in the repository, even if that
pack was unreachable. Consider this sequence of events:
- Object FOO is pushed to a repository.
- Subsequent ref changes make FOO UNREACHABLE_GARBAGE.
- FOO is subsequently re-inserted using a DfsInserter, but skipped
due to existing in UNREACHABLE_GARBAGE.
- The repository is repacked; FOO will not be written into a new pack
because it is not yet reachable from a reference. If the
UNREACHABLE_GARBAGE packs are deleted, FOO disappears.
- A reference is updated to reference FOO. This reference is now broken
as FOO was removed when the repacking process deleted the
UNREACHABLE_GARBAGE pack that stored the only copy of FOO.
The garbage collector can't safely delete the UNREACHABLE_GARBAGE
pack because FOO might be in the middle of being re-inserted/re-packed.
This change writes a duplicate copy of an object if it only exists in
UNREACHABLE_GARBAGE. This "freshens" the object to give it a chance to
survive long enough to be made reachable through a reference.
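A toy model of the new insertion rule. The enum, map and method names are assumptions for illustration and do not mirror the DfsInserter internals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical object store, not DfsInserter. */
final class FreshenInsertSketch {
    /** Simplified pack origins; the real backend distinguishes more sources. */
    enum PackSource { INSERT, GC, UNREACHABLE_GARBAGE }

    /** Object id -> sources of the packs that already contain it. */
    final Map<String, List<PackSource>> packsByObject = new HashMap<>();

    /** Returns true if a new copy of the object was written. */
    boolean insert(String objectId) {
        List<PackSource> existing =
                packsByObject.computeIfAbsent(objectId, k -> new ArrayList<>());
        for (PackSource s : existing)
            if (s != PackSource.UNREACHABLE_GARBAGE)
                return false; // a live pack already has it; skip the write
        // Missing, or present only in garbage: write a fresh copy.
        existing.add(PackSource.INSERT);
        return true;
    }
}
```

An object that exists only in an UNREACHABLE_GARBAGE pack is written again, so deleting the garbage pack later cannot remove its only copy.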
Change-Id: I20f2062230f3af3bccd6f21d3b7342f1152a5532
Signed-off-by: Mike Williams <miwilliams@google.com>
Jonathan Nieder [Wed, 4 Nov 2015 01:45:06 +0000 (17:45 -0800)]
Include ancestors of reused bitmap commits in reuse bitmap again
Until 320a4142 (Update bitmap selection throttling to fully span
active branches, 2015-10-20), setupTipCommitBitmaps contained code
along the following lines:
    for (PackBitmapIndexRemapper.Entry entry : bitmapRemapper) {
        if (!reuse(entry))
            continue;
        RevCommit rc = (RevCommit) rw.peel(rw.parseAny(entry));
This loop OR-ed together bitmaps for commits whose bitmaps would be
reused. A subtle point is the use of the add() method, which ORs in a
bitmap from the BitmapIndex when it exists and falls back to OR-ing in
a single bit when that bitmap does not exist in the BitmapIndex.
Commit 320a4142 removed the addBitmap step, so the bitmap does not
exist in the BitmapIndex and the fallback behavior is triggered.
Simplify and restore the intended behavior by avoiding the subtle
add() method --- use or() directly instead.
Terry Parker [Sat, 31 Oct 2015 00:47:06 +0000 (17:47 -0700)]
[performance] Speed up delta packing
When packing is able to reuse lots of deltas from existing packs, those
objects are marked as "doNotAttemptDelta" and do not contribute to
DeltaTask's computeTopPaths() "totalWeight" calculation.
In the extreme case when all packs are reusable, "totalWeight" will be
zero. DeltaTask.partitionTasks() uses "totalWeight" to determine a
"weightPerThread" size it uses to set up DeltaTasks. When "totalWeight"
is small, partitionTasks() ends up creating a DeltaTask for every
unique path.
For a large repository, the small "weightPerThread" can result in the
creation of >100k tasks (for the MSM 3.10 Linux repository, the count
was ~150k). This makes the "task stealing" mechanism in DeltaTask very
inefficient, because every attempt to steal work does a linear walk
through all tasks, searching for the one with the most work remaining,
which adds up to O(N^2) comparisons overall. For the MSM 3.10 repository when all
deltas were reusable, PackWriter.parallelDeltaSearch() took
(1615+1633+1458)/3 = 1568 seconds.
The error is that DeltaTask treats the weights of objects marked as
"doNotAttemptDelta" inconsistently. It ignores the weights when
calculating "totalWeight" but uses them when partitioning the tasks.
The fix is to also ignore them when partitioning the tasks.
With this patch applied, PackWriter.parallelDeltaSearch() on the
MSM 3.10 repository when all deltas are reused went from taking
1568 seconds to 62ms (a speedup of more than 25,000x).
This patch also fixes a totalWeight initialization error in
DeltaTask.computeTopPaths().
Change-Id: I2ae37efa83bca42b0e716266ae6aa9d182e76d9c
Signed-off-by: Terry Parker <tparker@google.com>
Jonathan Nieder [Tue, 3 Nov 2015 19:42:53 +0000 (11:42 -0800)]
Decrease indentation in setupTipCommitBitmaps
Avoid leaving the reader in suspense by handling the unusual
(!RevCommit) case first. As a nice side effect, there is less nesting
to keep track of in the rest of the loop body.
When the file <git-dir>/hooks/pre-push exists, make sure that it is
executed during a push. The pre-push hook runs during git push, after
the remote refs have been updated but before any objects have been
transferred.
Enhance FS.runProcess() to support stdin-redirection and binary data
In order to support filters in gitattributes, FS.runProcess() is made
public. Support for stdin redirection has been added, as has support for
binary data on stdin/stdout (as used by clean/smudge filters).
Change-Id: Ice2c152e9391368dc5748d7b825a838e3eb755f9
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Andrey Loskutov [Sun, 1 Nov 2015 12:40:02 +0000 (13:40 +0100)]
Cleaned up various readPipe() threading issues
Fixed random errors in discoverGitSystemConfig() on Linux where the
process error stream was closed by readPipe() before or while
GobblerThread was reading from it.
Marked readPipe() as @Nullable and fixed potential NPE in
discoverGitSystemConfig() on readPipe() return value.
Fixed process error output being randomly mixed with other threads' log
messages.
Terry Parker [Thu, 29 Oct 2015 04:59:10 +0000 (21:59 -0700)]
Bitmap generation: Add a test of ordering commits by "chains"
When commits are selected for bitmap generation, they are reordered
so that related "chains" of commits are grouped together. Chains are
"subbranches" of commits that may branch off of and re-merge with the
main line. Grouping by chains means that the XOR difference between
consecutive selected commits will be smaller, resulting in better
run-length compression of the XORed bitmaps.
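The effect of grouping can be illustrated with plain bit sets. This sketch uses java.util.BitSet instead of EWAH bitmaps, and the two chains with their object ranges are made up for the example.

```java
import java.util.BitSet;

/** Hypothetical illustration of XOR sizes under two commit orderings. */
final class ChainOrderSketch {
    /** Bits from..to-1 set, standing in for the objects reachable from a commit. */
    static BitSet range(int from, int to) {
        BitSet b = new BitSet();
        b.set(from, to);
        return b;
    }

    static BitSet union(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.or(b);
        return r;
    }

    /** Sum of set bits in the XOR of each bitmap with its predecessor. */
    static int totalXorBits(BitSet... ordered) {
        int total = 0;
        for (int i = 1; i < ordered.length; i++) {
            BitSet x = (BitSet) ordered[i].clone();
            x.xor(ordered[i - 1]);
            total += x.cardinality();
        }
        return total;
    }

    public static void main(String[] args) {
        // Main-line chain: a2 extends a1.
        BitSet a1 = range(0, 50);
        BitSet a2 = range(0, 60);
        // Side chain sharing only the first 10 objects with the main line.
        BitSet b1 = union(range(0, 10), range(100, 150));
        BitSet b2 = union(range(0, 10), range(100, 160));
        System.out.println("grouped by chain: " + totalXorBits(a1, a2, b1, b2));
        System.out.println("interleaved:      " + totalXorBits(a1, b1, a2, b2));
    }
}
```

With these made-up ranges, the grouped order needs 120 XOR bits in total and the interleaved order needs 300, which is the kind of difference the chain ordering is meant to exploit.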
Add a new testSelectionOrderingWithChains() test in a new
GcCommitSelectionTest test class. Also move related GC commit selection
tests out of GcBasicPackingTest and into GcCommitSelectionTest.
Change-Id: I8e80cac29c4ca8193b41c9898e5436c22a659f11
Signed-off-by: Terry Parker <tparker@google.com>
Andrey Loskutov [Wed, 28 Oct 2015 20:17:43 +0000 (21:17 +0100)]
[performance] Remove synthetic access$ methods in dfs, diff and merge
The Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with direct access to these fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help the compiler avoid generating synthetic access methods
and hope to improve execution performance.
To validate the changes, one can use either javap or the Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question; there should be none.
NB: don't confuse these "synthetic access$" methods with the "public synthetic
bridge" methods generated to allow generic method override return types.
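Besides javap, the check can be sketched with reflection. The classes below are hypothetical. Note that compilers targeting Java 11 or later use nest-based access control and no longer generate access$ methods, so the first count is 1 only when compiled for older releases.

```java
import java.lang.reflect.Method;

/** Hypothetical example classes, not taken from JGit. */
final class SyntheticAccessSketch {
    /** Private field read from an inner class: older compilers add access$NNN here. */
    static final class WithPrivate {
        private int counter = 1;

        final class Reader {
            int read() {
                return counter;
            }
        }
    }

    /** Package-private field: the inner class reads it directly, no accessor needed. */
    static final class WithPackagePrivate {
        int counter = 1;

        final class Reader {
            int read() {
                return counter;
            }
        }
    }

    /** Counts compiler-generated methods declared directly on the given class. */
    static int syntheticMethodCount(Class<?> type) {
        int n = 0;
        for (Method m : type.getDeclaredMethods())
            if (m.isSynthetic())
                n++;
        return n;
    }
}
```

syntheticMethodCount(WithPackagePrivate.class) is 0 on every Java version, which matches the "there should be none" check described above.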
Andrey Loskutov [Wed, 28 Oct 2015 19:52:43 +0000 (20:52 +0100)]
[performance] Remove synthetic access$ methods in lib, util and dircache
The Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with direct access to these fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help the compiler avoid generating synthetic access methods
and hope to improve execution performance.
To validate the changes, one can use either javap or the Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question; there should be none.
NB: don't confuse these "synthetic access$" methods with the "public synthetic
bridge" methods generated to allow generic method override return types.
Andrey Loskutov [Wed, 28 Oct 2015 19:24:12 +0000 (20:24 +0100)]
[performance] Remove synthetic access$ methods in transport package
The Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with direct access to these fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help the compiler avoid generating synthetic access methods
and hope to improve execution performance.
To validate the changes, one can use either javap or the Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question; there should be none.
NB: don't confuse these "synthetic access$" methods with the "public synthetic
bridge" methods generated to allow generic method override return types.
Andrey Loskutov [Tue, 27 Oct 2015 22:51:21 +0000 (23:51 +0100)]
[performance] Remove synthetic access$ methods in pack and file packages
The Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with direct access to these fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help the compiler avoid generating synthetic access methods
and hope to improve execution performance.
To validate the changes, one can use either javap or the Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question; there should be none.
NB: don't confuse these "synthetic access$" methods with the "public synthetic
bridge" methods generated to allow generic method override return types.
Terry Parker [Mon, 19 Oct 2015 22:59:47 +0000 (15:59 -0700)]
Expose bitmap selection parameters via PackConfig
Expose the following bitmap selection parameters via PackConfig:
"bitmapContiguousCommitCount", "bitmapRecentCommitCount",
"bitmapRecentCommitSpan", "bitmapDistantCommitSpan",
"bitmapExcessiveBranchCount", and "bitmapInactiveBranchAge".
The value of bitmapContiguousCommitCount, whereby bitmaps are
created for the most recent N commits in a branch, has never
been verified. If experiments show that they are not valuable,
then we can simplify the implementation so that there is only
a concept of recent and distant commit history (defined by
"bitmapRecentCommitCount"), and the only controls we need are
"bitmapRecentCommitSpan" and "bitmapDistantCommitSpan".
Change-Id: I288bf3f97d6fbfdfcd5dde2699eff433a7307fb9
Signed-off-by: Terry Parker <tparker@google.com>
Terry Parker [Tue, 20 Oct 2015 22:29:38 +0000 (15:29 -0700)]
Update bitmap selection throttling to fully span active branches.
Replace the "bitmapCommitRange" parameter that was recently introduced
with two new parameters: "bitmapExcessiveBranchCount" and
"bitmapInactiveBranchAgeInDays". If the count of branches does not
exceed "bitmapExcessiveBranchCount", then the current algorithm is kept
for all branches.
If the branch count is excessive, then the commit time for the tip
commit for each branch is used to determine if a branch is "inactive".
"Active" branches get full commit selection using the existing
algorithm. "Inactive" branches get fewer bitmaps near the branch tips.
Introduce a "contiguousCommitCount" parameter that always enforces that
the N most recent commits in a branch are selected for bitmaps. The
previous nextSelectionDistance() algorithm created anywhere from 1-100
contiguous bitmaps at branch tips.
For example, consider a branch with commits numbering 0-300, with 0
being the most recent commit. If the most recent 200 commits are not
merge commits and the 200th commit was the last one selected,
nextSelectionDistance() returned 100, causing commits 200-101 to be
ignored. Then a window of size 100 was evaluated, searching for merge
commits. Since no merge commits were found, the next commit (commit 0)
was selected, for a total of 1 commit in the topmost 100 commits.
If instead the 250th commit was selected, then by the same logic
commit 50 was selected. At that point nextSelectionDistance() switched to
selecting consecutive commits, so commits 0-50 in the topmost 100
commits were selected. The "contiguousCommitCount" parameter provides
more determinism by always selecting a constant number of topmost
commits.
Add an optimization to break out of the inner loop of selectCommits() if
all of the commits for the current branch have already been found.
When reusing bitmaps from an existing pack, remove unnecessary
populating and clearing of the writeBitmaps/PackBitmapIndexBuilder.
Add comments to PackWriterBitmapPreparer, rename methods and variables
for readability.
Add tests for bitmap selection with and without merge commits and with
excessive branch pruning triggered.
Note: I will follow up with an additional change that exposes the new
parameters through PackConfig.
Change-Id: I5ccbb96c8849f331c302d9f7840e05f9650c4608 Signed-off-by: Terry Parker <tparker@google.com>
Matthias Sohn [Sun, 18 Oct 2015 16:28:04 +0000 (18:28 +0200)]
Silence Maven complaining about unset versions of reporting plugins
Since we use the reporting plugins only in the parent pom.xml there's no
point in using the new pluginManagement tag in the reporting section
which was introduced to fix
https://issues.apache.org/jira/browse/MSITE-443
Change-Id: I750ca3765e95afb06609a362fb3354afc3b66b90 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Andrei Pozolotin [Fri, 25 Sep 2015 20:55:32 +0000 (20:55 +0000)]
Adding JGitV1 and JGitV2 Walk Encryption
Building on top of https://git.eclipse.org/r/#/c/56391/
Here we preserve compatibility with JetS3t
and add 2 new native JGit encryption implementations.
For reference, see connection configuration files:
* Version 0: jgit-s3-connection-v-0.properties
* Version 1: jgit-s3-connection-v-1.properties
* Version 2: jgit-s3-connection-v-2.properties
Change-Id: I713290bcacbe92d88e5ef28ce137de73dd1abe2f Signed-off-by: Andrei Pozolotin <andrei.pozolotin@gmail.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Andrei Pozolotin [Mon, 21 Sep 2015 22:59:14 +0000 (22:59 +0000)]
Adding AES Walk Encryption support in http://www.jets3t.org/ mode
See previous attempt: https://git.eclipse.org/r/#/c/16674/
Here we preserve as much of JetS3t mode as possible
while allowing the use of new Java 8+ PBE algorithms
such as PBEWithHmacSHA512AndAES_256
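For orientation, a password-based AES round trip with the algorithm
named above can be sketched with the standard javax.crypto API. This is
not the WalkEncryption implementation and does not reproduce the JetS3t
wire format; the class name, the iteration count, and the salt and IV
handling are illustrative assumptions.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.PBEParameterSpec;

// Hypothetical helper, not part of JGit.
final class PbeAesSketch {
    static final String ALGORITHM = "PBEWithHmacSHA512AndAES_256";
    static final int ITERATIONS = 10_000; // illustrative value only

    /**
     * Encrypts or decrypts input with a password-derived AES key.
     *
     * @param mode Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE
     * @param iv   16-byte initialization vector (the AES block size)
     */
    static byte[] crypt(int mode, char[] password, byte[] salt, byte[] iv,
            byte[] input) {
        try {
            SecretKey key = SecretKeyFactory.getInstance(ALGORITHM)
                    .generateSecret(new PBEKeySpec(password));
            Cipher cipher = Cipher.getInstance(ALGORITHM);
            // Salt, iteration count and IV travel in the parameter spec.
            cipher.init(mode, key,
                    new PBEParameterSpec(salt, ITERATIONS, new IvParameterSpec(iv)));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBE operation failed", e);
        }
    }
}
```

The same password, salt, IV and iteration count must be supplied for
decryption, so a real implementation has to store or derive them
alongside the ciphertext.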
Summary of changes:
* change pom.xml to control long tests
* add WalkEncryptionTest.launch to run long tests
* add AmazonS3.Keys to normalize use of constants
* change WalkEncryption to support AES in JetS3t mode
* add WalkEncryptionTest to test remote encryption pipeline
* add support for CI configuration for live Amazon S3 testing
* add log4j based logging for tests in both Eclipse and Maven build
To test locally, check out the review branch, then:
* create amazon test configuration file
  * located in your home dir: ${user.home}
  * named jgit-s3-config.properties
  * file format follows AmazonS3 connection settings file:
    accesskey = your-amazon-access-key
    secretkey = your-amazon-secret-key
    test.bucket = your-bucket-for-testing
* finally:
  * run in Eclipse: WalkEncryptionTest.launch
  * or
  * run in Shell: mvn test --define test=WalkEncryptionTest
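As a rough aid for checking the configuration file before running the
test, the three keys listed above can be read with java.util.Properties.
The helper class below is hypothetical and is not how WalkEncryptionTest
loads its settings; only the file name, its location and the key names
come from the description above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of JGit.
final class S3TestConfigSketch {
    static final String[] REQUIRED = { "accesskey", "secretkey", "test.bucket" };

    /** Default location: ${user.home}/jgit-s3-config.properties */
    static Path defaultLocation() {
        return Paths.get(System.getProperty("user.home"),
                "jgit-s3-config.properties");
    }

    /** Loads the file and verifies that the required keys are present. */
    static Properties load(Path file) {
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return parse(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Parses properties text directly, e.g. for a quick check. */
    static Properties parse(String text) {
        try {
            return parse(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Properties parse(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        for (String key : REQUIRED) {
            if (props.getProperty(key) == null) {
                throw new IllegalArgumentException("missing key: " + key);
            }
        }
        return props;
    }
}
```

Calling S3TestConfigSketch.load(S3TestConfigSketch.defaultLocation())
fails early with a readable message if a key is missing.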
Change-Id: I6f455fd9fb4eac261ca73d0bec6a4e7dae9f2e91 Signed-off-by: Andrei Pozolotin <andrei.pozolotin@gmail.com>
Andrey Loskutov [Sat, 10 Oct 2015 19:02:24 +0000 (22:02 +0300)]
Fixed jgit test failures on Windows
RepoCommandTest was failing because an open file handle was left behind.
IgnoreNodeTest was failing because of problems creating files with
trailing spaces on Windows.
HookTest was failing because of a wrong line delimiter.
Andrey Loskutov [Sun, 16 Aug 2015 11:00:00 +0000 (13:00 +0200)]
Delete non-empty directories before checking out a path
If the checkout path is currently a non-empty directory (and was a link
or a regular file before), this directory will be removed before
performing checkout, but only if the checkout path is specified.
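The removal step can be pictured with plain java.nio.file calls. This is
a sketch under assumptions, not the JGit checkout code: the class and
method names are hypothetical, and it deletes the directory tree
bottom-up without any of the safety checks a real checkout needs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of JGit.
final class ClearCheckoutPathSketch {
    /**
     * Removes checkoutPath if it is currently a directory, so that a file
     * or link can be written in its place. Does nothing otherwise.
     */
    static void deleteIfDirectory(Path checkoutPath) {
        if (!Files.isDirectory(checkoutPath, LinkOption.NOFOLLOW_LINKS)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(checkoutPath)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```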
Terry Parker [Thu, 8 Oct 2015 22:06:37 +0000 (15:06 -0700)]
Limit the range of commits for which bitmaps are created.
A bitmap index contains bitmaps for a set of commits in a pack file.
Creating a bitmap for every commit is too expensive, so heuristics
select the most "important" commits. The most recent commits are the
most valuable. To clone a repository, only the bitmaps for the branch
tips are needed. When fetching, only commits since the last fetch are
needed.
The commit selection heuristics generally work, but for some
repositories the number of selected commits is prohibitively high. One
example is the MSM 3.10 Linux kernel. With over 1 million commits on
2820 branches, the current heuristics resulted in more than 36k selected
commits.
Each uncompressed bitmap for that repository is ~413k, making it
difficult to complete a GC operation in available memory.
The benefit of creating bitmaps over the entire history of a repository
like the MSM 3.10 Linux kernel isn't clear. For that repository, most
history for the last year appears to be in the last 100k commits.
Limiting bitmap commit selection to just those commits reduces the count
of selected commits from ~36k to ~10.5k. Dropping bitmaps for older
commits does not affect object counting times for clones or for fetches
on clients that are reasonably up-to-date.
This patch defines a new "bitmapCommitRange" PackConfig parameter to
limit the commit selection process when building bitmaps. The range
starts with the most recent commit and walks backwards. A range of 10k
considers only the 10000 most recent commits. A range of zero creates
bitmaps only for branch tips. A range of -1 (the default) does not limit
the range--all commits in the pack are used in the commit selection
process.
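The three cases of the range value can be summarized in a small sketch.
The helper below is hypothetical and only models the limit itself: it
assumes the commits are already ordered newest first and that branch
tips receive bitmaps independently of this list, which is how a range of
zero ends up covering only the tips.

```java
import java.util.List;

// Hypothetical helper, not part of JGit.
final class CommitRangeSketch {
    /**
     * Returns the commits considered by the selection process.
     *
     * @param commitsNewestFirst commits in the pack, most recent first
     * @param bitmapCommitRange  -1 for no limit, 0 for none beyond the
     *                           branch tips, N for the N most recent
     */
    static <T> List<T> limit(List<T> commitsNewestFirst, int bitmapCommitRange) {
        if (bitmapCommitRange < 0) {
            return commitsNewestFirst; // default: whole pack is considered
        }
        int end = Math.min(bitmapCommitRange, commitsNewestFirst.size());
        return commitsNewestFirst.subList(0, end);
    }
}
```

With a range of 10000 on a pack holding a million commits, only the
first 10000 entries of the newest-first list would be examined.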
Change-Id: Ied92c70cfa0778facc670e0f14a0980bed5e3bfb Signed-off-by: Terry Parker <tparker@google.com>
Add utility method allowing to check for empty folders in workdir
Previously the method DirCacheCheckoutTest#assertWorkDir() silently
skipped over empty folders. If a test left unexpected empty folders in
the worktree, they went unnoticed. Now empty folders have to be
specified by something like mkmap("<foldername>", "/", ...)
Change-Id: Idb8b270e92daf02ecdc381d148a5958bd83ec057 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>