Added read/write support for pack bitmap index.

A pack bitmap index is an additional index that stores compressed
bitmaps of the object graph. A logical API for the index
functionality is also included, because the PackWriter is expected
to use it.

Compressed bitmaps are created using the javaewah library, which is a
word-aligned compressed variant of the Java BitSet class based on
run-length encoding. The library only works with non-negative integer
values, so the maximum number of ObjectIds in a pack file that
this index can currently support is Integer.MAX_VALUE.
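
To make "word-aligned" and "run-length encoding" concrete, here is a minimal sketch that collapses runs of identical 64-bit words. It is a simplified illustration, not the javaewah on-disk or in-memory format; the class `WordRunLengthSketch` and its `Run` type are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified word-aligned run-length encoder (not the javaewah format). */
final class WordRunLengthSketch {

    /** One run: {@code count} consecutive 64-bit words that all equal {@code word}. */
    static final class Run {
        final long word;
        final int count;

        Run(long word, int count) {
            this.word = word;
            this.count = count;
        }
    }

    /** Collapses consecutive identical words into (word, count) runs. */
    static List<Run> encode(long[] words) {
        List<Run> runs = new ArrayList<>();
        for (long w : words) {
            int last = runs.size() - 1;
            if (last >= 0 && runs.get(last).word == w) {
                // Extend the current run instead of storing the word again.
                runs.set(last, new Run(w, runs.get(last).count + 1));
            } else {
                runs.add(new Run(w, 1));
            }
        }
        return runs;
    }

    /** Expands the runs back into the original word array. */
    static long[] decode(List<Run> runs) {
        int total = 0;
        for (Run r : runs)
            total += r.count;
        long[] words = new long[total];
        int i = 0;
        for (Run r : runs)
            for (int k = 0; k < r.count; k++)
                words[i++] = r.word;
        return words;
    }
}
```

Because the unit of compression is a whole word, long stretches of unset bits (objects that are not reachable) cost one run entry rather than one word each.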

Every ObjectId is given an integer mapping. The integer is the
position of the ObjectId in the complete ObjectId list, sorted
by offset, for the pack file. That integer is what the bitmaps
use to reference the ObjectId. Currently, the new index format can
only be used with pack files that contain a complete closure of the
object graph, e.g. the result of a garbage collection.
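
The integer mapping described above can be sketched as follows. This is an illustration under assumptions, not the code from this change: `OffsetOrderSketch` and `Entry` are hypothetical names, and object ids are represented as plain strings rather than ObjectId instances.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: assign each object its position in pack-offset order. */
final class OffsetOrderSketch {

    /** Stand-in for a pack entry: an object name plus its byte offset in the pack file. */
    static final class Entry {
        final String id;
        final long offset;

        Entry(String id, long offset) {
            this.id = id;
            this.offset = offset;
        }
    }

    /** Maps each id to its index in the list of all entries sorted by offset. */
    static Map<String, Integer> positionsByOffset(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong((Entry e) -> e.offset));
        Map<String, Integer> positions = new HashMap<>();
        for (int i = 0; i < sorted.size(); i++)
            positions.put(sorted.get(i).id, i); // bit i of a bitmap refers to this object
        return positions;
    }
}
```

Since positions are `int` values, this scheme also shows where the Integer.MAX_VALUE limit on objects per pack comes from.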

The index file includes four bitmaps for the Git object types, i.e.
commits, trees, blobs, and tags. In addition, a collection of
bitmaps keyed by an ObjectId is also included. The bitmap for each entry
in the collection represents the full closure of ObjectIds reachable
from the keyed ObjectId (including the keyed ObjectId itself). The
bitmaps are further compressed by XORing the current bitmaps against
prior bitmaps in the index, and selecting the smallest representation.
The XOR'd bitmap and the offset from the current entry to the position
of the bitmap to XOR against are the actual representation of the entry
in the index file. Each entry contains one byte, which is currently
used to note whether the bitmap should be blindly reused.
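
The XOR selection step can be sketched like this. It is a rough illustration only: it uses `java.util.BitSet` and counts set bits as a stand-in for compressed size, whereas the real index would compare the sizes of the compressed bitmaps. The names `XorSelectionSketch`, `Stored`, and `maxDistance` are hypothetical.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of choosing the smallest XOR representation for an entry. */
final class XorSelectionSketch {

    /** What gets stored: the (possibly XOR'd) bits and how many entries back the base is. */
    static final class Stored {
        final BitSet bits;
        final int xorOffset; // 0 means "stored as-is, no XOR base"

        Stored(BitSet bits, int xorOffset) {
            this.bits = bits;
            this.xorOffset = xorOffset;
        }
    }

    /** Tries the last {@code maxDistance} prior bitmaps and keeps the smallest result. */
    static Stored choose(BitSet current, List<BitSet> prior, int maxDistance) {
        BitSet best = (BitSet) current.clone();
        int bestOffset = 0;
        for (int d = 1; d <= maxDistance && d <= prior.size(); d++) {
            BitSet candidate = (BitSet) current.clone();
            candidate.xor(prior.get(prior.size() - d));
            // Assumption: fewer set bits approximates a smaller compressed bitmap.
            if (candidate.cardinality() < best.cardinality()) {
                best = candidate;
                bestOffset = d;
            }
        }
        return new Stored(best, bestOffset);
    }

    /** Reverses the XOR to recover the full reachability bitmap. */
    static BitSet restore(Stored stored, List<BitSet> prior) {
        BitSet out = (BitSet) stored.bits.clone();
        if (stored.xorOffset > 0)
            out.xor(prior.get(prior.size() - stored.xorOffset));
        return out;
    }
}
```

XOR works well here because nearby commits reach almost the same set of objects, so the difference between two closures is usually far sparser than either closure.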
Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago |
- /*
- * Copyright (C) 2012, Google Inc.
- * and other copyright owners as documented in the project's IP log.
- *
- * This program and the accompanying materials are made available
- * under the terms of the Eclipse Distribution License v1.0 which
- * accompanies this distribution, is reproduced below, and is
- * available at http://www.eclipse.org/org/documents/edl-v10.php
- *
- * All rights reserved.
- *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
- *
- * - Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- *
- * - Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following
- * disclaimer in the documentation and/or other materials provided
- * with the distribution.
- *
- * - Neither the name of the Eclipse Foundation, Inc. nor the
- * names of its contributors may be used to endorse or promote
- * products derived from this software without specific prior
- * written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
- * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
- * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
- * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
- * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
- * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
- * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
- * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
- * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
- * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
- * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- */
-
- package org.eclipse.jgit.storage.file;
-
- import java.util.Arrays;
-
- import javaewah.EWAHCompressedBitmap;
-
- /**
- * A random access BitSet that supports efficient conversions to
- * EWAHCompressedBitmap.
- */
- final class BitSet {
-
-     private long[] words;
-
-     BitSet(int initialCapacity) {
-         words = new long[block(initialCapacity) + 1];
-     }
-
-     final void clear() {
-         Arrays.fill(words, 0);
-     }
-
-     final void set(int position) {
-         int block = block(position);
-         if (block >= words.length) {
-             long[] buf = new long[2 * block(position)];
-             System.arraycopy(words, 0, buf, 0, words.length);
-             words = buf;
-         }
-         words[block] |= mask(position);
-     }
-
-     final void clear(int position) {
-         int block = block(position);
-         if (block < words.length)
-             words[block] &= ~mask(position);
-     }
-
-     final boolean get(int position) {
-         int block = block(position);
-         return block < words.length && (words[block] & mask(position)) != 0;
-     }
-
-     final EWAHCompressedBitmap toEWAHCompressedBitmap() {
-         EWAHCompressedBitmap compressed = new EWAHCompressedBitmap(
-                 words.length);
-         int runningEmptyWords = 0;
-         long lastNonEmptyWord = 0;
-         for (long word : words) {
-             if (word == 0) {
-                 runningEmptyWords++;
-                 continue;
-             }
-
-             if (lastNonEmptyWord != 0)
-                 compressed.add(lastNonEmptyWord);
-
-             if (runningEmptyWords > 0) {
-                 compressed.addStreamOfEmptyWords(false, runningEmptyWords);
-                 runningEmptyWords = 0;
-             }
-
-             lastNonEmptyWord = word;
-         }
-         int bitsThatMatter = 64 - Long.numberOfLeadingZeros(lastNonEmptyWord);
-         if (bitsThatMatter > 0)
-             compressed.add(lastNonEmptyWord, bitsThatMatter);
-         return compressed;
-     }
-
-     private static final int block(int position) {
-         return position >> 6;
-     }
-
-     private static final long mask(int position) {
-         return 1L << position;
-     }
- }
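
The word-index arithmetic in `block` and `mask` is compact enough to be easy to misread, so here is a standalone sketch of it. `WordIndexSketch`, `explicitMask`, and `significantBits` are hypothetical names added for illustration; `block` and `mask` mirror the helpers in the file above. Note that `1L << position` needs no explicit modulo because Java uses only the low six bits of the shift count when the left operand is a `long`.

```java
/** Hypothetical standalone copy of the word-index arithmetic used by BitSet above. */
final class WordIndexSketch {

    /** Index of the 64-bit word that holds the given bit position. */
    static int block(int position) {
        return position >> 6; // same as position / 64 for non-negative positions
    }

    /** Single-bit mask within the word; the shift count is implicitly taken mod 64. */
    static long mask(int position) {
        return 1L << position;
    }

    /** Equivalent mask with the modulo written out explicitly. */
    static long explicitMask(int position) {
        return 1L << (position & 63);
    }

    /** Number of low-order bits needed to hold the highest set bit (0 for an empty word). */
    static int significantBits(long word) {
        return 64 - Long.numberOfLeadingZeros(word);
    }
}
```

The `significantBits` computation matches the `bitsThatMatter` step in `toEWAHCompressedBitmap`, which lets the final word be added with only as many bits as it actually uses.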