Hoist ObjectIdSet up to lib as part of the public API and add
the interface to some common types like PackIndex and JGit custom
ObjectId map types. This cleans up wrapper code in a number of
places by allowing direct use of the types as an ObjectIdSet.
Future commits can now rely on ObjectIdSet as a simple read-only
type to check a set of objects from a number of storage options.
Change-Id: Ib62b062421d475bd52abd6c84a73916ef36e084b
[performance] Remove synthetic access$ methods in lib, util and dircache
Java compiler must generate synthetic access methods for private methods
and fields of the enclosing class if they are accessed from inner
classes and vice versa.
While invisible in the code, those synthetic access methods exist in the
bytecode and seem to produce some extra execution overhead at runtime
(compared with the direct access to this fields or methods), see
https://git.eclipse.org/r/58948/.
By removing the "private" access modifier from affected methods and
fields we help compiler to avoid generation of synthetic access methods
and hope to improve execution performance.
To validate changes, one can either use javap or use Bytecode Outline
plugin in Eclipse. In both cases one should look for "synthetic
access$<number>" methods at the end of the class and inner class files
in question - there should be none.
NB: don't mix this "synthetic access$" methods up with "public synthetic
bridge" methods generated to allow generic method override return types.
Change-Id: Ie7b65f251ec4452d5a5ed48aa0f272cf49a9aecd
Signed-off-by: Andrey Loskutov <loskutov@gmx.de>
ObjectIdOwnerMap: More lightweight map for ObjectIds
OwnerMap is about 200 ms faster than SubclassMap, more friendly to the
GC, and uses less storage: testing the "Counting objects" part of
PackWriter on 1886362 objects:
ObjectIdSubclassMap:
load factor 50%
table: 4194304 (wasted 2307942)
ms spent 36998 36009 34795 34703 34941 35070 34284 34511 34638 34256
ms avg 34800 (last 9 runs)
ObjectIdOwnerMap:
load factor 100%
table: 2097152 (wasted 210790)
directory: 1024
ms spent 36842 35112 34922 34703 34580 34782 34165 34662 34314 34140
ms avg 34597 (last 9 runs)
The major difference with OwnerMap is entries must extend from
ObjectIdOwnerMap.Entry, where the OwnerMap has injected its own
private "next" field into each object. This allows the OwnerMap to use
a singly linked list for chaining collisions within a bucket. By
putting collisions in a linked list, we gain the entire table back for
the SHA-1 bits to index their own "private" slot.
Unfortunately this means that each object can appear in at most ONE
OwnerMap, as there is only one "next" field within the object instance
to thread into the map. For types that are very object map heavy like
RevWalk (entity RevObject) and PackWriter (entity ObjectToPack) this
is sufficient, these entity types are only put into one map by their
container. By introducing a new map type, we don't break existing
applications that might be trying to use ObjectIdSubclassMap to track
RevCommits they obtained from a RevWalk.
The OwnerMap uses less memory. Each object uses 1 reference more (so
we're up 1,886,362 references), but the table is 1/2 the size (2^20
rather than 2^21). The table itself wastes only 210,790 slots, rather
than 2,307,942. So OwnerMap is wasting 200k fewer references.
OwnerMap is more friendly to the GC, because it hardly ever generates
garbage. As the map reaches its 100% load factor target, it doubles in
size by allocating additional segment arrays of 2048 entries. (So the
first grow allocates 1 segment, second 2 segments, third 4 segments,
etc.) These segments are hooked into the pre-allocated directory of
1024 spaces. This permits the map to grow to 2 million objects before
the directory itself has to grow. By using segments of 2048 entries,
we are asking the GC to acquire 8,204 bytes in a 32 bit JVM. This is
easier to satisfy then 2,307,942 bytes (for the 512k table that is
just an intermediate step in the SubclassMap). By reusing the
previously allocated segments (they are re-hashed in-place) we don't
release any memory during a table grow.
When the directory grows, it does so by discarding the old one and
using one that is 4x larger (so the directory goes to 4096 entries on
its first grow). A directory of size 4096 can handle up to 8 millon
objects. The second directory grow (16384) goes to 33 million objects.
At that point we're starting to really push the limits of the JVM
heap, but at least its many small arrays. Previously SubclassMap would
need a table of 67108864 entries to handle that object count, which
needs a single contiguous allocation of 256 MiB. That's hard to come
by in a 32 bit JVM. Instead OwnerMap uses 8192 arrays of about 8 KiB
each. This is much easier to fit into a fragmented heap.
Change-Id: Ia4acf5cfbf7e9b71bc7faa0db9060f6a969c0c50
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
ObjectIdSubclassMap: Micro-optimize wrapping at end of table
During a review of the class, Josh Bloch pointed out we can use
"i = (i + 1) & mask" to wrap around at the end of the table, instead
of a conditional with a branch. This is generally faster due to one
less branch that will be mis-predicted by the CPU.
Change-Id: Ic88c00455ebc6adde9708563a6ad4d0377442bba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
ObjectIdSubclassMap: Avoid field loads in inner loops
Ensure the JIT knows the table cannot be changed during the critical
inner loop of get() or insert() by loading the field into a final
local variable. This shouldn't be necessary, but the instance member
is declared non-final (to resizing) and it is not very obvious to the
JIT that the table cannot be modified by AnyObjectId.equals().
Simplify the JIT's decision making by making it obvious, these
values cannot change during the critical inner loop, allowing
for better register allocation.
Change-Id: I0d797533fc5327366f1207b0937c406f02cdaab3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This method is trivial in definition, and is called in only 3
places. Inline the method manually to ensure its really going
to be inlined by the JIT at runtime.
Change-Id: I128522af8167c07d2de6cc210573599038871dda
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
32 is way to small for the map. Most applications using the map
will need to load more than 16 objects just from the root refs
being read from the Repository.
Default the initial size to 2048. This cuts out 6 expansions in
the early life of the table, reducing garbage and rehashing time.
Change-Id: I6dd076ebc0b284f1755855d383b79535604ac547
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
If the table needs to be grown, do it before the current insertion
rather than after. This is a tiny micro-optimization that allows
the compiler to reuse the result of "++size" to compare against
previously pre-computed size at which the table should rehash itself.
Change-Id: Ief6f81b91c10ed433d67e0182f558ca70d58a2b0
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
ObjectIdSubclassMap: Use & rather than % for hashing
Bitwise and is faster than integer modulus operations, and since
the table size is always a power of 2, this is simple to use for
index operation.
Change-Id: I83d01e5c74fd9e910c633a98ea6f90b59092ba29
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
obj_hash doesn't match our naming conventions, camelCaseNames
are the preferred format.
Change-Id: I72da199daccb60a98d17b6af1e498189bf149515
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The new addIfAbsent() method combines get() with add(), but does
it in a single step so that the common case of get() returning null
for a new object can immediately insert the object into the map.
Change-Id: Ib599ab4de13ad67665ccfccf3ece52ba3222bcba
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
ObjectIdSubclassMap: Correct Iterator to throw NoSuchElementException
The Iterator contract says next() shall throw NoSuchElementException
if there are no more items remaining in the iteration. We got this
wrong when I originally wrote the implementation, so fix it.
Change-Id: Iea25e6569ead5c8b3128b8a368c5b2caebec7ecc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
This class behaves like a cross between a Set and a Map, sometimes
we might expect to use the method isEmpty() to test for size() == 0.
So implement it, reducing the surprise folks get when they are given
one of these objects.
Change-Id: I0d68e1243da8e62edf79c6ba4fd925f643e80a88
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
The initial contribution was handled through a CQ, and does not need
to be reported as an individual bug record in the project's IP log.
Its an odd corner case that the EMO IP team doesn't want to see,
even though its technically a contribution written by at least
some non-committers.
The project.skipCommit variable can now be used to mask out any
particular change from the IP log. Currently within JGit we want
to mask only the initial commit, but others could be masked if the
need arises.
Change-Id: I598e08137ddc5913284471ee2aa545f4df685023
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
As discussed on the egit-dev mailing list, we prefer not to have
trailing whitespace in our source code. Correct all currently
offending lines by trimming them.
Change-Id: I002b1d1980071084c0bc53242c8f5900970e6845
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Per CQ 3448 this is the initial contribution of the JGit project
to eclipse.org. It is derived from the historical JGit repository
at commit 3a2dd9921c.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>