You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

PackBitmapIndexV1.java 6.4KB

Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Limit the range of commits for which bitmaps are created. A bitmap index contains bitmaps for a set of commits in a pack file. Creating a bitmap for every commit is too expensive, so heuristics select the most "important" commits. The most recent commits are the most valuable. To clone a repository only those for the branch tips are needed. When fetching, only commits since the last fetch are needed. The commit selection heuristics generally work, but for some repositories the number of selected commits is prohibitively high. One example is the MSM 3.10 Linux kernel. With over 1 million commits on 2820 branches, the current heuristics resulted in +36k selected commits. Each uncompressed bitmap for that repository is ~413k, making it difficult to complete a GC operation in available memory. The benefit of creating bitmaps over the entire history of a repository like the MSM 3.10 Linux kernel isn't clear. For that repository, most history for the last year appears to be in the last 100k commits. Limiting bitmap commit selection to just those commits reduces the count of selected commits from ~36k to ~10.5k. Dropping bitmaps for older commits does not affect object counting times for clones or for fetches on clients that are reasonably up-to-date. This patch defines a new "bitmapCommitRange" PackConfig parameter to limit the commit selection process when building bitmaps. The range starts with the most recent commit and walks backwards. A range of 10k considers only the 10000 most recent commits. A range of zero creates bitmaps only for branch tips. A range of -1 (the default) does not limit the range--all commits in the pack are used in the commit selection process. Change-Id: Ied92c70cfa0778facc670e0f14a0980bed5e3bfb Signed-off-by: Terry Parker <tparker@google.com>
8 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217
  1. /*
  2. * Copyright (C) 2012, Google Inc. and others
  3. *
  4. * This program and the accompanying materials are made available under the
  5. * terms of the Eclipse Distribution License v. 1.0 which is available at
  6. * https://www.eclipse.org/org/documents/edl-v10.php.
  7. *
  8. * SPDX-License-Identifier: BSD-3-Clause
  9. */
  10. package org.eclipse.jgit.internal.storage.file;
  11. import java.io.DataInput;
  12. import java.io.IOException;
  13. import java.io.InputStream;
  14. import java.text.MessageFormat;
  15. import java.util.Arrays;
  16. import org.eclipse.jgit.internal.JGitText;
  17. import org.eclipse.jgit.lib.AnyObjectId;
  18. import org.eclipse.jgit.lib.Constants;
  19. import org.eclipse.jgit.lib.ObjectId;
  20. import org.eclipse.jgit.lib.ObjectIdOwnerMap;
  21. import org.eclipse.jgit.util.IO;
  22. import org.eclipse.jgit.util.NB;
  23. import com.googlecode.javaewah.EWAHCompressedBitmap;
  24. /**
  25. * Support for the pack bitmap index v1 format.
  26. *
  27. * @see PackBitmapIndex
  28. */
  29. class PackBitmapIndexV1 extends BasePackBitmapIndex {
  30. static final byte[] MAGIC = { 'B', 'I', 'T', 'M' };
  31. static final int OPT_FULL = 1;
  32. private static final int MAX_XOR_OFFSET = 126;
  33. private final PackIndex packIndex;
  34. private final PackReverseIndex reverseIndex;
  35. private final EWAHCompressedBitmap commits;
  36. private final EWAHCompressedBitmap trees;
  37. private final EWAHCompressedBitmap blobs;
  38. private final EWAHCompressedBitmap tags;
  39. private final ObjectIdOwnerMap<StoredBitmap> bitmaps;
  40. PackBitmapIndexV1(final InputStream fd, PackIndex packIndex,
  41. PackReverseIndex reverseIndex) throws IOException {
  42. super(new ObjectIdOwnerMap<StoredBitmap>());
  43. this.packIndex = packIndex;
  44. this.reverseIndex = reverseIndex;
  45. this.bitmaps = getBitmaps();
  46. final byte[] scratch = new byte[32];
  47. IO.readFully(fd, scratch, 0, scratch.length);
  48. // Check the magic bytes
  49. for (int i = 0; i < MAGIC.length; i++) {
  50. if (scratch[i] != MAGIC[i]) {
  51. byte[] actual = new byte[MAGIC.length];
  52. System.arraycopy(scratch, 0, actual, 0, MAGIC.length);
  53. throw new IOException(MessageFormat.format(
  54. JGitText.get().expectedGot, Arrays.toString(MAGIC),
  55. Arrays.toString(actual)));
  56. }
  57. }
  58. // Read the version (2 bytes)
  59. final int version = NB.decodeUInt16(scratch, 4);
  60. if (version != 1)
  61. throw new IOException(MessageFormat.format(
  62. JGitText.get().unsupportedPackIndexVersion,
  63. Integer.valueOf(version)));
  64. // Read the options (2 bytes)
  65. final int opts = NB.decodeUInt16(scratch, 6);
  66. if ((opts & OPT_FULL) == 0)
  67. throw new IOException(MessageFormat.format(
  68. JGitText.get().expectedGot, Integer.valueOf(OPT_FULL),
  69. Integer.valueOf(opts)));
  70. // Read the number of entries (1 int32)
  71. long numEntries = NB.decodeUInt32(scratch, 8);
  72. if (numEntries > Integer.MAX_VALUE)
  73. throw new IOException(JGitText.get().indexFileIsTooLargeForJgit);
  74. // Checksum applied on the bottom of the corresponding pack file.
  75. this.packChecksum = new byte[20];
  76. System.arraycopy(scratch, 12, packChecksum, 0, packChecksum.length);
  77. // Read the bitmaps for the Git types
  78. SimpleDataInput dataInput = new SimpleDataInput(fd);
  79. this.commits = readBitmap(dataInput);
  80. this.trees = readBitmap(dataInput);
  81. this.blobs = readBitmap(dataInput);
  82. this.tags = readBitmap(dataInput);
  83. // An entry is object id, xor offset, flag byte, and a length encoded
  84. // bitmap. The object id is an int32 of the nth position sorted by name.
  85. // The xor offset is a single byte offset back in the list of entries.
  86. StoredBitmap[] recentBitmaps = new StoredBitmap[MAX_XOR_OFFSET];
  87. for (int i = 0; i < (int) numEntries; i++) {
  88. IO.readFully(fd, scratch, 0, 6);
  89. int nthObjectId = NB.decodeInt32(scratch, 0);
  90. int xorOffset = scratch[4];
  91. int flags = scratch[5];
  92. EWAHCompressedBitmap bitmap = readBitmap(dataInput);
  93. if (nthObjectId < 0)
  94. throw new IOException(MessageFormat.format(
  95. JGitText.get().invalidId, String.valueOf(nthObjectId)));
  96. if (xorOffset < 0)
  97. throw new IOException(MessageFormat.format(
  98. JGitText.get().invalidId, String.valueOf(xorOffset)));
  99. if (xorOffset > MAX_XOR_OFFSET)
  100. throw new IOException(MessageFormat.format(
  101. JGitText.get().expectedLessThanGot,
  102. String.valueOf(MAX_XOR_OFFSET),
  103. String.valueOf(xorOffset)));
  104. if (xorOffset > i)
  105. throw new IOException(MessageFormat.format(
  106. JGitText.get().expectedLessThanGot, String.valueOf(i),
  107. String.valueOf(xorOffset)));
  108. ObjectId objectId = packIndex.getObjectId(nthObjectId);
  109. StoredBitmap xorBitmap = null;
  110. if (xorOffset > 0) {
  111. int index = (i - xorOffset);
  112. xorBitmap = recentBitmaps[index % recentBitmaps.length];
  113. if (xorBitmap == null)
  114. throw new IOException(MessageFormat.format(
  115. JGitText.get().invalidId,
  116. String.valueOf(xorOffset)));
  117. }
  118. StoredBitmap sb = new StoredBitmap(
  119. objectId, bitmap, xorBitmap, flags);
  120. bitmaps.add(sb);
  121. recentBitmaps[i % recentBitmaps.length] = sb;
  122. }
  123. }
  124. /** {@inheritDoc} */
  125. @Override
  126. public int findPosition(AnyObjectId objectId) {
  127. long offset = packIndex.findOffset(objectId);
  128. if (offset == -1)
  129. return -1;
  130. return reverseIndex.findPostion(offset);
  131. }
  132. /** {@inheritDoc} */
  133. @Override
  134. public ObjectId getObject(int position) throws IllegalArgumentException {
  135. ObjectId objectId = reverseIndex.findObjectByPosition(position);
  136. if (objectId == null)
  137. throw new IllegalArgumentException();
  138. return objectId;
  139. }
  140. /** {@inheritDoc} */
  141. @Override
  142. public int getObjectCount() {
  143. return (int) packIndex.getObjectCount();
  144. }
  145. /** {@inheritDoc} */
  146. @Override
  147. public EWAHCompressedBitmap ofObjectType(
  148. EWAHCompressedBitmap bitmap, int type) {
  149. switch (type) {
  150. case Constants.OBJ_BLOB:
  151. return blobs.and(bitmap);
  152. case Constants.OBJ_TREE:
  153. return trees.and(bitmap);
  154. case Constants.OBJ_COMMIT:
  155. return commits.and(bitmap);
  156. case Constants.OBJ_TAG:
  157. return tags.and(bitmap);
  158. }
  159. throw new IllegalArgumentException();
  160. }
  161. /** {@inheritDoc} */
  162. @Override
  163. public int getBitmapCount() {
  164. return bitmaps.size();
  165. }
  166. /** {@inheritDoc} */
  167. @Override
  168. public boolean equals(Object o) {
  169. // TODO(cranger): compare the pack checksum?
  170. if (o instanceof PackBitmapIndexV1)
  171. return getPackIndex() == ((PackBitmapIndexV1) o).getPackIndex();
  172. return false;
  173. }
  174. /** {@inheritDoc} */
  175. @Override
  176. public int hashCode() {
  177. return getPackIndex().hashCode();
  178. }
  179. PackIndex getPackIndex() {
  180. return packIndex;
  181. }
  182. private static EWAHCompressedBitmap readBitmap(DataInput dataInput)
  183. throws IOException {
  184. EWAHCompressedBitmap bitmap = new EWAHCompressedBitmap();
  185. bitmap.deserialize(dataInput);
  186. return bitmap;
  187. }
  188. }