BitSet.java 3.6KB

Added read/write support for pack bitmap index.

A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter.

Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE.

Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId.

Currently, the new index format can only be used with pack files that contain a complete closure of the object graph, e.g. the result of a garbage collection.

The index file includes four bitmaps for the Git object types, i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself).

The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused.

Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
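The XOR step described in the commit message can be illustrated with a small, self-contained sketch. This is not the JGit implementation or its on-disk format: it uses `java.util.BitSet` instead of javaewah, and the class `XorDeltaSketch`, its `Entry` type, the `maxOffset` parameter, and the `cost` function (set-bit count as a stand-in for compressed size) are all hypothetical names and assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Illustrative sketch only; not the JGit index format. */
final class XorDeltaSketch {
    /** One stored entry: the bits written, plus how far back the XOR base is (0 = none). */
    static final class Entry {
        final BitSet stored;
        final int xorOffset;

        Entry(BitSet stored, int xorOffset) {
            this.stored = stored;
            this.xorOffset = xorOffset;
        }
    }

    /** Assumed size proxy: number of set bits. A real index would compare compressed sizes. */
    static int cost(BitSet bits) {
        return bits.cardinality();
    }

    /** Picks the cheapest of the raw bitmap or the bitmap XOR'd with one of the last maxOffset bitmaps. */
    static Entry encode(List<BitSet> prior, BitSet current, int maxOffset) {
        BitSet best = (BitSet) current.clone();
        int bestOffset = 0;
        for (int offset = 1; offset <= maxOffset && offset <= prior.size(); offset++) {
            BitSet candidate = (BitSet) current.clone();
            candidate.xor(prior.get(prior.size() - offset));
            if (cost(candidate) < cost(best)) {
                best = candidate;
                bestOffset = offset;
            }
        }
        return new Entry(best, bestOffset);
    }

    /** Reverses encode: XOR the stored bits with the same base to recover the original. */
    static BitSet decode(List<BitSet> prior, Entry entry) {
        BitSet result = (BitSet) entry.stored.clone();
        if (entry.xorOffset > 0)
            result.xor(prior.get(prior.size() - entry.xorOffset));
        return result;
    }

    /** Helper: builds a bitmap with the given positions set. */
    static BitSet bits(int... positions) {
        BitSet b = new BitSet();
        for (int p : positions)
            b.set(p);
        return b;
    }

    public static void main(String[] args) {
        List<BitSet> prior = new ArrayList<>();
        prior.add(bits(0, 1, 2, 3, 4));              // closure of an older commit
        BitSet current = bits(0, 1, 2, 3, 4, 5, 6);  // a child reaches two more objects
        Entry e = encode(prior, current, 10);
        System.out.println(e.xorOffset + " " + e.stored);      // prints "1 {5, 6}"
        System.out.println(decode(prior, e).equals(current));  // prints "true"
    }
}
```

Because a commit's closure usually contains most of its parent's closure, the XOR against a nearby bitmap tends to leave only a few set bits, which is why a delta of this kind can be much smaller than the raw bitmap.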
/*
 * Copyright (C) 2012, Google Inc.
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

package org.eclipse.jgit.storage.file;

import java.util.Arrays;

import javaewah.EWAHCompressedBitmap;

/**
 * A random access BitSet that supports efficient conversions to
 * EWAHCompressedBitmap.
 */
final class BitSet {

	// Backing store: bit N lives in words[N >> 6].
	private long[] words;

	BitSet(int initialCapacity) {
		words = new long[block(initialCapacity) + 1];
	}

	final void clear() {
		Arrays.fill(words, 0);
	}

	final void set(int position) {
		int block = block(position);
		if (block >= words.length) {
			// Growth only happens when block >= 1 (the array always has at
			// least one word), so 2 * block is always large enough.
			long[] buf = new long[2 * block(position)];
			System.arraycopy(words, 0, buf, 0, words.length);
			words = buf;
		}
		words[block] |= mask(position);
	}

	final void clear(int position) {
		int block = block(position);
		// Bits beyond the allocated words are already clear.
		if (block < words.length)
			words[block] &= ~mask(position);
	}

	final boolean get(int position) {
		int block = block(position);
		return block < words.length && (words[block] & mask(position)) != 0;
	}

	final EWAHCompressedBitmap toEWAHCompressedBitmap() {
		EWAHCompressedBitmap compressed = new EWAHCompressedBitmap(
				words.length);
		int runningEmptyWords = 0;
		long lastNonEmptyWord = 0;
		for (long word : words) {
			if (word == 0) {
				runningEmptyWords++;
				continue;
			}

			// Flush the previous literal word, then the run of empty words
			// that separated it from the current word.
			if (lastNonEmptyWord != 0)
				compressed.add(lastNonEmptyWord);

			if (runningEmptyWords > 0) {
				compressed.addStreamOfEmptyWords(false, runningEmptyWords);
				runningEmptyWords = 0;
			}

			lastNonEmptyWord = word;
		}

		// The final literal word is added with its significant bit count;
		// trailing empty words are dropped.
		int bitsThatMatter = 64 - Long.numberOfLeadingZeros(lastNonEmptyWord);
		if (bitsThatMatter > 0)
			compressed.add(lastNonEmptyWord, bitsThatMatter);
		return compressed;
	}

	// Index of the 64-bit word that holds the given bit position.
	private static final int block(int position) {
		return position >> 6;
	}

	// Java uses only the low six bits of the shift count for a long, so this
	// is equivalent to 1L << (position & 63).
	private static final long mask(int position) {
		return 1L << position;
	}
}
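
The word and mask arithmetic in `block` and `mask` can be checked in isolation. The following is a minimal standalone sketch, not part of the file above; the class name `WordAddressingSketch` is hypothetical, and it only reproduces the two helper expressions to show which word and which bit a position maps to.

```java
/** Illustrative sketch of the word/bit addressing used by a long[]-backed bit set. */
final class WordAddressingSketch {
    /** Index of the 64-bit word holding the bit: position / 64 for non-negative positions. */
    static int block(int position) {
        return position >> 6;
    }

    /** Single-bit mask within that word; for a long, Java uses only the low six bits of the shift count. */
    static long mask(int position) {
        return 1L << position;
    }

    public static void main(String[] args) {
        int position = 130; // 130 = 2 * 64 + 2
        System.out.println(block(position));                           // prints 2
        System.out.println(mask(position) == (1L << 2));               // prints true
        System.out.println(mask(position) == (1L << (position & 63))); // prints true
    }
}
```

The absence of an explicit `& 63` in `mask` is therefore not a bug: the Java shift operators apply that reduction implicitly when the left operand is a `long`.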