
RevObject.java 4.0KB

ObjectIdOwnerMap: More lightweight map for ObjectIds

OwnerMap is about 200 ms faster than SubclassMap, more friendly to the GC, and uses less storage: testing the "Counting objects" part of PackWriter on 1886362 objects:

  ObjectIdSubclassMap:
    load factor 50%
    table: 4194304 (wasted 2307942)
    ms spent 36998 36009 34795 34703 34941 35070 34284 34511 34638 34256
    ms avg 34800 (last 9 runs)

  ObjectIdOwnerMap:
    load factor 100%
    table: 2097152 (wasted 210790)
    directory: 1024
    ms spent 36842 35112 34922 34703 34580 34782 34165 34662 34314 34140
    ms avg 34597 (last 9 runs)

The major difference with OwnerMap is entries must extend from ObjectIdOwnerMap.Entry, where the OwnerMap has injected its own private "next" field into each object. This allows the OwnerMap to use a singly linked list for chaining collisions within a bucket. By putting collisions in a linked list, we gain the entire table back for the SHA-1 bits to index their own "private" slot.

Unfortunately this means that each object can appear in at most ONE OwnerMap, as there is only one "next" field within the object instance to thread into the map. For types that are very object map heavy like RevWalk (entity RevObject) and PackWriter (entity ObjectToPack) this is sufficient, as these entity types are only put into one map by their container. By introducing a new map type, we don't break existing applications that might be trying to use ObjectIdSubclassMap to track RevCommits they obtained from a RevWalk.

The OwnerMap uses less memory. Each object uses 1 reference more (so we're up 1,886,362 references), but the table is 1/2 the size (2^21 rather than 2^22). The table itself wastes only 210,790 slots, rather than 2,307,942. So OwnerMap is wasting 200k fewer references.

OwnerMap is more friendly to the GC, because it hardly ever generates garbage. As the map reaches its 100% load factor target, it doubles in size by allocating additional segment arrays of 2048 entries. (So the first grow allocates 1 segment, second 2 segments, third 4 segments, etc.) These segments are hooked into the pre-allocated directory of 1024 spaces. This permits the map to grow to 2 million objects before the directory itself has to grow. By using segments of 2048 entries, we are asking the GC to acquire 8,204 bytes in a 32 bit JVM. This is easier to satisfy than 2,307,942 bytes (for the 512k table that is just an intermediate step in the SubclassMap). By reusing the previously allocated segments (they are re-hashed in-place) we don't release any memory during a table grow.

When the directory grows, it does so by discarding the old one and using one that is 4x larger (so the directory goes to 4096 entries on its first grow). A directory of size 4096 can handle up to 8 million objects. The second directory grow (16384) goes to 33 million objects. At that point we're starting to really push the limits of the JVM heap, but at least it's many small arrays. Previously SubclassMap would need a table of 67108864 entries to handle that object count, which needs a single contiguous allocation of 256 MiB. That's hard to come by in a 32 bit JVM. Instead OwnerMap uses 8192 arrays of about 8 KiB each. This is much easier to fit into a fragmented heap.

Change-Id: Ia4acf5cfbf7e9b71bc7faa0db9060f6a969c0c50
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago
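
The design described in the commit message (a "next" link stored inside each entry, plus a directory of small fixed-size segments that grows by doubling) can be illustrated with a toy version. This is only a sketch under stated assumptions and not JGit's ObjectIdOwnerMap: the names OwnerMapSketch, ChainedMap, and Item are hypothetical, an int key stands in for the leading SHA-1 bits, and the segment size is 4 rather than 2048 so that growth is easy to observe.

```java
import java.util.Arrays;

/** Toy illustration only; all names here are hypothetical, not JGit API. */
public class OwnerMapSketch {
	/** Each entry carries its own chain link, so it can live in at most one map. */
	static abstract class Entry {
		final int key; // stands in for the leading bits of an object id
		Entry next;    // link owned by the map that holds this entry

		Entry(int key) {
			this.key = key;
		}
	}

	static final class Item extends Entry {
		final String label;

		Item(int key, String label) {
			super(key);
			this.label = label;
		}
	}

	static final class ChainedMap<V extends Entry> {
		static final int SEGMENT_BITS = 2; // 4 slots per segment in this sketch
		static final int SEGMENT_SIZE = 1 << SEGMENT_BITS;

		Entry[][] directory = new Entry[2][];
		int segments = 1; // always a power of two
		int size;

		ChainedMap() {
			directory[0] = new Entry[SEGMENT_SIZE];
		}

		int capacity() {
			return segments * SEGMENT_SIZE;
		}

		@SuppressWarnings("unchecked")
		V get(int key) {
			int h = key & (capacity() - 1);
			Entry e = directory[h >>> SEGMENT_BITS][h & (SEGMENT_SIZE - 1)];
			for (; e != null; e = e.next)
				if (e.key == key)
					return (V) e;
			return null;
		}

		void add(V v) {
			if (size == capacity()) // 100% load factor target
				grow();
			link(v);
			size++;
		}

		/** Push the entry onto the head of its bucket's chain. */
		private void link(Entry e) {
			int h = e.key & (capacity() - 1);
			Entry[] seg = directory[h >>> SEGMENT_BITS];
			int i = h & (SEGMENT_SIZE - 1);
			e.next = seg[i];
			seg[i] = e;
		}

		/** Double the table by adding segments; old segments are reused. */
		private void grow() {
			int oldSegments = segments;
			int newSegments = oldSegments * 2;
			if (newSegments > directory.length) // directory grows 4x when full
				directory = Arrays.copyOf(directory, directory.length * 4);
			for (int s = oldSegments; s < newSegments; s++)
				directory[s] = new Entry[SEGMENT_SIZE];
			segments = newSegments;
			for (int s = 0; s < oldSegments; s++) {
				Entry[] seg = directory[s];
				for (int i = 0; i < SEGMENT_SIZE; i++) {
					Entry e = seg[i];
					seg[i] = null; // detach the chain, then relink each entry
					while (e != null) {
						Entry n = e.next;
						link(e);
						e = n;
					}
				}
			}
		}
	}

	public static void main(String[] args) {
		ChainedMap<Item> map = new ChainedMap<>();
		for (int k = 0; k < 20; k++)
			map.add(new Item(k, "n" + k));
		System.out.println(map.get(7).label + " capacity=" + map.capacity());
	}
}
```

Because the chain link lives in the entry itself, the map allocates no per-entry wrapper objects, and a grow only allocates new small segment arrays. The cost is the restriction noted above: adding the same entry to a second map would overwrite its single "next" field.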
/*
 * Copyright (C) 2008, Shawn O. Pearce <spearce@spearce.org> and others
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Distribution License v. 1.0 which is available at
 * https://www.eclipse.org/org/documents/edl-v10.php.
 *
 * SPDX-License-Identifier: BSD-3-Clause
 */

package org.eclipse.jgit.revwalk;

import java.io.IOException;

import org.eclipse.jgit.errors.IncorrectObjectTypeException;
import org.eclipse.jgit.errors.MissingObjectException;
import org.eclipse.jgit.lib.AnyObjectId;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectIdOwnerMap;

/**
 * Base object type accessed during revision walking.
 */
public abstract class RevObject extends ObjectIdOwnerMap.Entry {
	static final int PARSED = 1;

	int flags;

	RevObject(AnyObjectId name) {
		super(name);
	}

	abstract void parseHeaders(RevWalk walk) throws MissingObjectException,
			IncorrectObjectTypeException, IOException;

	abstract void parseBody(RevWalk walk) throws MissingObjectException,
			IncorrectObjectTypeException, IOException;

	/**
	 * Get Git object type. See {@link org.eclipse.jgit.lib.Constants}.
	 *
	 * @return object type
	 */
	public abstract int getType();

	/**
	 * Get the name of this object.
	 *
	 * @return unique hash of this object.
	 */
	public final ObjectId getId() {
		return this;
	}

	/**
	 * Test to see if the flag has been set on this object.
	 *
	 * @param flag
	 *            the flag to test.
	 * @return true if the flag has been added to this object; false if not.
	 */
	public final boolean has(RevFlag flag) {
		return (flags & flag.mask) != 0;
	}

	/**
	 * Test to see if any flag in the set has been set on this object.
	 *
	 * @param set
	 *            the flags to test.
	 * @return true if any flag in the set has been added to this object; false
	 *         if not.
	 */
	public final boolean hasAny(RevFlagSet set) {
		return (flags & set.mask) != 0;
	}

	/**
	 * Test to see if all flags in the set have been set on this object.
	 *
	 * @param set
	 *            the flags to test.
	 * @return true if all flags of the set have been added to this object;
	 *         false if some or none have been added.
	 */
	public final boolean hasAll(RevFlagSet set) {
		return (flags & set.mask) == set.mask;
	}

	/**
	 * Add a flag to this object.
	 * <p>
	 * If the flag is already set on this object then the method has no effect.
	 *
	 * @param flag
	 *            the flag to mark on this object, for later testing.
	 */
	public final void add(RevFlag flag) {
		flags |= flag.mask;
	}

	/**
	 * Add a set of flags to this object.
	 *
	 * @param set
	 *            the set of flags to mark on this object, for later testing.
	 */
	public final void add(RevFlagSet set) {
		flags |= set.mask;
	}

	/**
	 * Remove a flag from this object.
	 * <p>
	 * If the flag is not set on this object then the method has no effect.
	 *
	 * @param flag
	 *            the flag to remove from this object.
	 */
	public final void remove(RevFlag flag) {
		flags &= ~flag.mask;
	}

	/**
	 * Remove a set of flags from this object.
	 *
	 * @param set
	 *            the flag to remove from this object.
	 */
	public final void remove(RevFlagSet set) {
		flags &= ~set.mask;
	}

	/** {@inheritDoc} */
	@Override
	public String toString() {
		final StringBuilder s = new StringBuilder();
		s.append(Constants.typeString(getType()));
		s.append(' ');
		s.append(name());
		s.append(' ');
		appendCoreFlags(s);
		return s.toString();
	}

	/**
	 * Append a debug description of core RevFlags to a buffer.
	 *
	 * @param s
	 *            buffer to append a debug description of core RevFlags onto.
	 */
	protected void appendCoreFlags(StringBuilder s) {
		s.append((flags & RevWalk.TOPO_QUEUED) != 0 ? 'o' : '-');
		s.append((flags & RevWalk.TEMP_MARK) != 0 ? 't' : '-');
		s.append((flags & RevWalk.REWRITE) != 0 ? 'r' : '-');
		s.append((flags & RevWalk.UNINTERESTING) != 0 ? 'u' : '-');
		s.append((flags & RevWalk.SEEN) != 0 ? 's' : '-');
		s.append((flags & RevWalk.PARSED) != 0 ? 'p' : '-');
	}
}
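
The flag methods above all reduce to bit operations on the single int field flags. The following standalone sketch mirrors that arithmetic so it can be run without JGit on the classpath. The class FlagMaskSketch, the nested Flagged type, and the constants A, B, and C are hypothetical stand-ins: plain int masks replace RevFlag.mask and RevFlagSet.mask, and nothing here is part of the JGit API.

```java
/** Standalone illustration of the flag arithmetic; names are hypothetical. */
public class FlagMaskSketch {
	// Each flag occupies one bit; a "set" of flags is the OR of its members.
	static final int A = 1 << 0;
	static final int B = 1 << 1;
	static final int C = 1 << 2;

	static final class Flagged {
		int flags;

		boolean has(int flagMask) {
			return (flags & flagMask) != 0;
		}

		boolean hasAny(int setMask) {
			return (flags & setMask) != 0; // at least one bit overlaps
		}

		boolean hasAll(int setMask) {
			return (flags & setMask) == setMask; // every bit of the set is present
		}

		void add(int mask) {
			flags |= mask; // no effect if the bits are already set
		}

		void remove(int mask) {
			flags &= ~mask; // no effect if the bits are already clear
		}
	}

	public static void main(String[] args) {
		Flagged o = new Flagged();
		int set = A | B;

		o.add(A);
		System.out.println(o.hasAny(set)); // true
		System.out.println(o.hasAll(set)); // false

		o.add(B);
		System.out.println(o.hasAll(set)); // true

		o.remove(A);
		System.out.println(o.has(A));      // false
		System.out.println(o.hasAny(set)); // true
	}
}
```

Keeping every flag in one int means a test or update is a single AND or OR, at the price of a hard upper bound of 32 distinct flags per object.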