From fa4cc2475fb127783e98ddb56b6c1fd155cd2bc4 Mon Sep 17 00:00:00 2001 From: "Shawn O. Pearce" Date: Fri, 1 Jul 2011 18:31:53 -0700 Subject: [PATCH] DFS: A storage layer for JGit In practice the DHT storage layer has not been performing as well as large scale server environments want to see from a Git server. The performance of the DHT schema degrades rapidly as small changes are pushed into the repository, due to the chunk size being less than 1/3 of the pushed pack size. Small chunks cause poor prefetch performance during reading, and require significantly longer prefetch lists inside of the chunk meta field to work around the small size. The DHT code is very complex (>17,000 lines of code) and is very sensitive to the underlying database round-trip time, as well as the way objects were written into the pack stream that was chunked and stored on the database. A poor pack layout (from any version of C Git prior to Junio reworking it) can cause the DHT code to be unable to enumerate the objects of the linux-2.6 repository in any reasonable amount of time. Performing a clone from a DHT stored repository of 2 million objects takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row for each object being cloned. This is very difficult for some DHTs to scale; even at 5000 rows/second the lookup stage alone takes 6 minutes (on a local filesystem, this is almost too fast to bother measuring). Some servers like Apache Cassandra just fall over and cannot complete the 2 million lookups in rapid fire. On a ~400 MiB repository, the DHT schema has an extra 25 MiB of redundant data that gets downloaded to the JGit process, and that is before you consider the cost of the OBJECT_INDEX table also being fully loaded, which is at least 223 MiB of data for the linux kernel repository. In the DHT schema answering a `git clone` of the ~400 MiB linux kernel needs to load 248 MiB of "index" data from the DHT, in addition to the ~400 MiB of pack data that gets sent to the client. This is 193 MiB more data to be accessed than the native filesystem format, but it needs to come over a much smaller pipe (local Ethernet typically) than the local SATA disk drive. I also never got around to writing the "repack" support for the DHT schema, as it turns out to be fairly complex to safely repack data in the repository while also trying to minimize the amount of changes made to the database, due to very common limitations on database mutation rates. This new DFS storage layer fixes a lot of those issues by taking the simple approach of storing relatively standard Git pack and index files on an abstract filesystem. Packs are accessed by an in-process buffer cache, similar to the WindowCache used by the local filesystem storage layer. Unlike the local file IO, there are some assumptions that the storage system has relatively high latency and no concept of "file handles". Instead it looks at the file more like HTTP byte range requests, where a read channel is simply a thunk to trigger a read request over the network. The DFS code in this change is still abstract; it does not store on any particular filesystem, but is fairly well suited to Amazon S3 or Apache Hadoop HDFS. Storing packs directly on HDFS rather than HBase removes a layer of abstraction, as most HBase row reads turn into an HDFS read. Most of the DFS code in this change was blatantly copied from the local filesystem code. Most parts should be refactored to be shared between the two storage systems, but right now I am hesitant to do this due to how well tuned the local filesystem code currently is.
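As a sketch of the access pattern this assumes (the names below are illustrative only; the abstraction actually added by this change is ReadableChannel), a storage backend only has to answer positioned, range-style reads:

    // Hypothetical sketch, not the API introduced by this change: one
    // call is one storage round-trip, like an HTTP "Range: bytes=pos-"
    // request against the pack file.
    interface RangeReader {
        // Read up to dst.remaining() bytes starting at pos, returning
        // the number of bytes actually read, or -1 at end of file.
        int read(long pos, java.nio.ByteBuffer dst) throws java.io.IOException;
    }

The in-process block cache sits in front of such a reader and merges many tiny object reads into a few block-sized requests.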
Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb --- org.eclipse.jgit/META-INF/MANIFEST.MF | 1 + .../jgit/storage/dfs/DfsText.properties | 4 + .../org/eclipse/jgit/lib/ConfigConstants.java | 21 + .../jgit/storage/dfs/DeltaBaseCache.java | 194 ++++ .../eclipse/jgit/storage/dfs/DfsBlock.java | 155 +++ .../jgit/storage/dfs/DfsBlockCache.java | 561 +++++++++ .../jgit/storage/dfs/DfsBlockCacheConfig.java | 212 ++++ .../jgit/storage/dfs/DfsCachedPack.java | 92 ++ .../eclipse/jgit/storage/dfs/DfsConfig.java | 61 + .../jgit/storage/dfs/DfsGarbageCollector.java | 347 ++++++ .../eclipse/jgit/storage/dfs/DfsInserter.java | 393 +++++++ .../jgit/storage/dfs/DfsObjDatabase.java | 403 +++++++ .../storage/dfs/DfsObjectRepresentation.java | 90 ++ .../jgit/storage/dfs/DfsObjectToPack.java | 82 ++ .../jgit/storage/dfs/DfsOutputStream.java | 98 ++ .../jgit/storage/dfs/DfsPackCompactor.java | 317 +++++ .../jgit/storage/dfs/DfsPackDescription.java | 266 +++++ .../eclipse/jgit/storage/dfs/DfsPackFile.java | 1025 +++++++++++++++++ .../eclipse/jgit/storage/dfs/DfsPackKey.java | 55 + .../jgit/storage/dfs/DfsPackParser.java | 450 ++++++++ .../eclipse/jgit/storage/dfs/DfsReader.java | 793 +++++++++++++ .../jgit/storage/dfs/DfsReaderOptions.java | 135 +++ .../jgit/storage/dfs/DfsRefDatabase.java | 450 ++++++++ .../jgit/storage/dfs/DfsRefRename.java | 73 ++ .../jgit/storage/dfs/DfsRefUpdate.java | 157 +++ .../jgit/storage/dfs/DfsRepository.java | 121 ++ .../storage/dfs/DfsRepositoryBuilder.java | 140 +++ .../org/eclipse/jgit/storage/dfs/DfsText.java | 60 + .../jgit/storage/dfs/InMemoryRepository.java | 279 +++++ .../storage/dfs/LargePackedWholeObject.java | 128 ++ .../jgit/storage/dfs/PackInputStream.java | 85 ++ .../ReadAheadRejectedExecutionHandler.java | 61 + .../jgit/storage/dfs/ReadAheadTask.java | 241 ++++ .../jgit/storage/dfs/ReadableChannel.java | 103 ++ .../src/org/eclipse/jgit/util/IO.java | 32 + .../jgit/util/io/CountingOutputStream.java | 90 ++ 36 files changed, 7775 insertions(+) create mode 100644 org.eclipse.jgit/resources/org/eclipse/jgit/storage/dfs/DfsText.properties create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DeltaBaseCache.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlock.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCache.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCacheConfig.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsCachedPack.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsConfig.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsGarbageCollector.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsInserter.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjDatabase.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectRepresentation.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectToPack.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsOutputStream.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackCompactor.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackDescription.java create mode 100644
org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackFile.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackKey.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackParser.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReader.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReaderOptions.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefDatabase.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefRename.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefUpdate.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepository.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepositoryBuilder.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsText.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/InMemoryRepository.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/LargePackedWholeObject.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/PackInputStream.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadRejectedExecutionHandler.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadTask.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadableChannel.java create mode 100644 org.eclipse.jgit/src/org/eclipse/jgit/util/io/CountingOutputStream.java diff --git a/org.eclipse.jgit/META-INF/MANIFEST.MF b/org.eclipse.jgit/META-INF/MANIFEST.MF index 6b8cb68ec5..b6c080f53d 100644 --- a/org.eclipse.jgit/META-INF/MANIFEST.MF +++ b/org.eclipse.jgit/META-INF/MANIFEST.MF @@ -23,6 +23,7 @@ Export-Package: org.eclipse.jgit;version="1.2.0", org.eclipse.jgit.revplot;version="1.2.0", org.eclipse.jgit.revwalk;version="1.2.0", org.eclipse.jgit.revwalk.filter;version="1.2.0", + org.eclipse.jgit.storage.dfs;version="1.2.0", org.eclipse.jgit.storage.file;version="1.2.0", org.eclipse.jgit.storage.pack;version="1.2.0", org.eclipse.jgit.transport;version="1.2.0", diff --git a/org.eclipse.jgit/resources/org/eclipse/jgit/storage/dfs/DfsText.properties b/org.eclipse.jgit/resources/org/eclipse/jgit/storage/dfs/DfsText.properties new file mode 100644 index 0000000000..2c4bd06a33 --- /dev/null +++ b/org.eclipse.jgit/resources/org/eclipse/jgit/storage/dfs/DfsText.properties @@ -0,0 +1,4 @@ +cannotReadIndex=Cannot read index {0} +shortReadOfBlock=Short read of block at {0} in pack {1}; expected {2} bytes, received only {3} +shortReadOfIndex=Short read of index {0} +willNotStoreEmptyPack=Cannot store empty pack diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/lib/ConfigConstants.java b/org.eclipse.jgit/src/org/eclipse/jgit/lib/ConfigConstants.java index e2c23db7df..41862c7e41 100644 --- a/org.eclipse.jgit/src/org/eclipse/jgit/lib/ConfigConstants.java +++ b/org.eclipse.jgit/src/org/eclipse/jgit/lib/ConfigConstants.java @@ -60,6 +60,9 @@ public class ConfigConstants { /** The "diff" section */ public static final String CONFIG_DIFF_SECTION = "diff"; + /** The "dfs" section */ + public static final String CONFIG_DFS_SECTION = "dfs"; + /** The "user" section */ public static final String CONFIG_USER_SECTION = "user"; @@ -93,6 +96,24 @@ public class ConfigConstants { /** The "worktree" key */ public static final String CONFIG_KEY_WORKTREE = "worktree"; + /** The "blockLimit" key */ + public static 
final String CONFIG_KEY_BLOCK_LIMIT = "blockLimit"; + + /** The "blockSize" key */ + public static final String CONFIG_KEY_BLOCK_SIZE = "blockSize"; + + /** The "readAheadLimit" key */ + public static final String CONFIG_KEY_READ_AHEAD_LIMIT = "readAheadLimit"; + + /** The "readAheadThreads" key */ + public static final String CONFIG_KEY_READ_AHEAD_THREADS = "readAheadThreads"; + + /** The "deltaBaseCacheLimit" key */ + public static final String CONFIG_KEY_DELTA_BASE_CACHE_LIMIT = "deltaBaseCacheLimit"; + + /** The "streamFileThreshold" key */ + public static final String CONFIG_KEY_STREAM_FILE_TRESHOLD = "streamFileThreshold"; + /** The "remote" key */ public static final String CONFIG_KEY_REMOTE = "remote"; diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DeltaBaseCache.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DeltaBaseCache.java new file mode 100644 index 0000000000..7b313dab90 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DeltaBaseCache.java @@ -0,0 +1,194 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.lang.ref.SoftReference; + +/** + * Caches recently used objects for {@link DfsReader}. + *
<p>
+ * This cache is not thread-safe. Each reader should have its own cache. + */ +final class DeltaBaseCache { + private static final int TABLE_BITS = 10; + + private static final int MASK_BITS = 32 - TABLE_BITS; + + private static int hash(long position) { + return (((int) position) << MASK_BITS) >>> MASK_BITS; + } + + private int maxByteCount; + + private final Slot[] table; + + private Slot lruHead; + + private Slot lruTail; + + private int curByteCount; + + DeltaBaseCache(DfsReader reader) { + DfsReaderOptions options = reader.getOptions(); + maxByteCount = options.getDeltaBaseCacheLimit(); + table = new Slot[1 << TABLE_BITS]; + } + + Entry get(DfsPackKey key, long position) { + Slot e = table[hash(position)]; + for (; e != null; e = e.tableNext) { + if (e.offset == position && key.equals(e.pack)) { + Entry buf = e.data.get(); + if (buf != null) { + moveToHead(e); + return buf; + } + } + } + return null; + } + + void put(DfsPackKey key, long offset, int objectType, byte[] data) { + if (data.length > maxByteCount) + return; // Too large to cache. + + curByteCount += data.length; + releaseMemory(); + + int tableIdx = hash(offset); + Slot e = new Slot(key, offset, data.length); + e.data = new SoftReference<Entry>(new Entry(data, objectType)); + e.tableNext = table[tableIdx]; + table[tableIdx] = e; + moveToHead(e); + } + + private void releaseMemory() { + while (curByteCount > maxByteCount && lruTail != null) { + Slot currOldest = lruTail; + Slot nextOldest = currOldest.lruPrev; + + curByteCount -= currOldest.size; + unlink(currOldest); + removeFromTable(currOldest); + + if (nextOldest == null) + lruHead = null; + else + nextOldest.lruNext = null; + lruTail = nextOldest; + } + } + + private void removeFromTable(Slot e) { + int tableIdx = hash(e.offset); + Slot p = table[tableIdx]; + + if (p == e) { + table[tableIdx] = e.tableNext; + return; + } + + for (; p != null; p = p.tableNext) { + if (p.tableNext == e) { + p.tableNext = e.tableNext; + return; + } + } + } + + private void moveToHead(final Slot e) { + unlink(e); + e.lruPrev = null; + e.lruNext = lruHead; + if (lruHead != null) + lruHead.lruPrev = e; + else + lruTail = e; + lruHead = e; + } + + private void unlink(final Slot e) { + Slot prev = e.lruPrev; + Slot next = e.lruNext; + + if (prev != null) + prev.lruNext = next; + if (next != null) + next.lruPrev = prev; + } + + static class Entry { + final byte[] data; + + final int type; + + Entry(final byte[] aData, final int aType) { + data = aData; + type = aType; + } + } + + private static class Slot { + final DfsPackKey pack; + + final long offset; + + final int size; + + Slot tableNext; + + Slot lruPrev; + + Slot lruNext; + + SoftReference<Entry> data; + + Slot(DfsPackKey key, long offset, int size) { + this.pack = key; + this.offset = offset; + this.size = size; + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlock.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlock.java new file mode 100644 index 0000000000..cec098eb1b --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlock.java @@ -0,0 +1,155 @@ +/* + * Copyright (C) 2008-2011, Google Inc. + * Copyright (C) 2007, Robin Rosenberg + * Copyright (C) 2006-2008, Shawn O. Pearce + * and other copyright owners as documented in the project's IP log.
+ * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; +import java.security.MessageDigest; +import java.util.zip.CRC32; +import java.util.zip.DataFormatException; +import java.util.zip.Inflater; + +import org.eclipse.jgit.storage.pack.PackOutputStream; + +/** A cached slice of a {@link DfsPackFile}. */ +final class DfsBlock { + /** + * Size in bytes to pass to {@link Inflater} at a time. + *
<p>
+ * Blocks can be large (for example 1 MiB), while compressed objects inside + * of them are very small (for example less than 100 bytes for a delta). JNI + * forces the data supplied to the Inflater to be copied during setInput(), + * so use a smaller stride to reduce the risk that too much unnecessary data + * was moved into the native layer. + */ + private static final int INFLATE_STRIDE = 512; + + final DfsPackKey pack; + + final long start; + + final long end; + + private final byte[] block; + + DfsBlock(DfsPackKey p, long pos, byte[] buf) { + pack = p; + start = pos; + end = pos + buf.length; + block = buf; + } + + int size() { + return block.length; + } + + int remaining(long pos) { + int ptr = (int) (pos - start); + return block.length - ptr; + } + + boolean contains(DfsPackKey want, long pos) { + return pack == want && start <= pos && pos < end; + } + + int copy(long pos, byte[] dstbuf, int dstoff, int cnt) { + int ptr = (int) (pos - start); + return copy(ptr, dstbuf, dstoff, cnt); + } + + int copy(int p, byte[] b, int o, int n) { + n = Math.min(block.length - p, n); + System.arraycopy(block, p, b, o, n); + return n; + } + + int inflate(Inflater inf, long pos, byte[] dstbuf, int dstoff) + throws DataFormatException { + int ptr = (int) (pos - start); + int in = Math.min(INFLATE_STRIDE, block.length - ptr); + if (dstoff < dstbuf.length) + in = Math.min(in, dstbuf.length - dstoff); + inf.setInput(block, ptr, in); + + for (;;) { + int out = inf.inflate(dstbuf, dstoff, dstbuf.length - dstoff); + if (out == 0) { + if (inf.needsInput()) { + ptr += in; + in = Math.min(INFLATE_STRIDE, block.length - ptr); + if (in == 0) + return dstoff; + inf.setInput(block, ptr, in); + continue; + } + return dstoff; + } + dstoff += out; + } + } + + void crc32(CRC32 out, long pos, int cnt) { + int ptr = (int) (pos - start); + out.update(block, ptr, cnt); + } + + void write(PackOutputStream out, long pos, int cnt, MessageDigest digest) + throws IOException { + int ptr = (int) (pos - start); + out.write(block, ptr, cnt); + if (digest != null) + digest.update(block, ptr, cnt); + } + + void check(Inflater inf, byte[] tmp, long pos, int cnt) + throws DataFormatException { + // Unlike inflate() above the exact byte count is known by the caller. + // Push all of it in a single invocation to avoid unnecessary loops. + // + inf.setInput(block, (int) (pos - start), cnt); + while (inf.inflate(tmp, 0, tmp.length) > 0) + continue; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCache.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCache.java new file mode 100644 index 0000000000..6379b70a9a --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCache.java @@ -0,0 +1,561 @@ +/* + * Copyright (C) 2008-2011, Google Inc. + * Copyright (C) 2008, Shawn O. Pearce + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer.
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.ThreadPoolExecutor; +import java.util.concurrent.atomic.AtomicLong; +import java.util.concurrent.atomic.AtomicReferenceArray; +import java.util.concurrent.locks.ReentrantLock; + +import org.eclipse.jgit.JGitText; + +/** + * Caches slices of a {@link DfsPackFile} in memory for faster read access. + *
<p>
+ * The DfsBlockCache serves as a Java based "buffer cache", loading segments of + * a DfsPackFile into the JVM heap prior to use. As JGit often wants to do reads + * of only tiny slices of a file, the DfsBlockCache tries to smooth out these + * tiny reads into larger block-sized IO operations. + *
<p>
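+ * For example, with the default 64 KB block size, a dozen small object reads that land within the same 64 KB span of a pack cost a single storage round-trip; the remaining reads are satisfied from the cached block. + * <p>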
+ * Whenever a cache miss occurs, loading is invoked by exactly one thread for + * the given (DfsPackKey,position) key tuple. This is ensured by an + * array of locks, with the tuple hashed to a lock instance. + *
<p>
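+ * For example, if two readers miss on the same block of the same pack at once, both hash to the same lock; one thread loads the block while the other waits, and the waiter then finds the block already cached once it acquires the lock. + * <p>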
+ * It's too expensive during object access to be accurate with a least recently used (LRU) algorithm. Strictly ordering every read is a lot of overhead that typically doesn't yield a corresponding benefit to the application. This cache implements a clock replacement algorithm, giving each block one chance to have been accessed during a sweep of the cache to save itself from eviction. + *
<p>
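+ * For example, a block read since the previous sweep has its hot flag set; the sweep clears the flag and passes over the block, and evicts it only if it is still cold when the clock hand comes around again. + * <p>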
+ * Entities created by the cache are held under hard references, preventing the + * Java VM from clearing anything. Blocks are discarded by the replacement + * algorithm when adding a new block would cause the cache to exceed its + * configured maximum size. + *
<p>
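+ * For example, a cached block is never reclaimed by the Java garbage collector on its own; it leaves the cache only when a sweep evicts it to make room for a newly loaded block. + * <p>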
+ * The key tuple is passed through to methods as a pair of parameters rather + * than as a single Object, thus reducing the transient memory allocations of + * callers. It is more efficient to avoid the allocation, as we can't be 100% + * sure that a JIT would be able to stack-allocate a key tuple. + *
<p>
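+ * For example, {@code contains(key, position)} and {@code get(key, position)} accept the pack key and offset as two separate arguments, so a lookup allocates no temporary key object. + * <p>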
+ * The internal hash table does not expand at runtime; instead it is fixed in size at cache creation time. The internal lock table used to gate load invocations is also fixed in size. + */ +public final class DfsBlockCache { + private static volatile DfsBlockCache cache; + + static { + reconfigure(new DfsBlockCacheConfig()); + } + + /** + * Modify the configuration of the window cache. + *
<p>
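+ * For example: + * <pre> + * DfsBlockCache.reconfigure(new DfsBlockCacheConfig().setBlockLimit(64 * DfsBlockCacheConfig.MB)); + * </pre> + * <p>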
+ * The new configuration is applied immediately. If the new limits are + * smaller than what is currently cached, older entries will be purged + * as soon as possible to allow the cache to meet the new limit. + * + * @param cfg + * the new window cache configuration. + * @throws IllegalArgumentException + * the cache configuration contains one or more invalid + * settings, usually too low of a limit. + */ + public static void reconfigure(DfsBlockCacheConfig cfg) { + DfsBlockCache nc = new DfsBlockCache(cfg); + DfsBlockCache oc = cache; + cache = nc; + + if (oc != null && oc.readAheadService != null) + oc.readAheadService.shutdown(); + } + + /** @return the currently active DfsBlockCache. */ + public static DfsBlockCache getInstance() { + return cache; + } + + /** Number of entries in {@link #table}. */ + private final int tableSize; + + /** Hash bucket directory; entries are chained below. */ + private final AtomicReferenceArray<HashEntry> table; + + /** Locks to prevent concurrent loads for same (PackFile,position). */ + private final ReentrantLock[] loadLocks; + + /** Maximum number of bytes the cache should hold. */ + private final long maxBytes; + + /** + * Suggested block size to read from pack files in. + *
<p>
+ * If a pack file does not have a native block size, this size will be used. + *
<p>
+ * If a pack file has a native size, a whole multiple of the native size + * will be used until it matches this size. + */ + private final int blockSize; + + /** As {@link #blockSize} is a power of 2, bits to shift for a / blockSize. */ + private final int blockSizeShift; + + /** Number of bytes to read-ahead from current read position. */ + private final int readAheadLimit; + + /** Thread pool to handle optimistic read-ahead. */ + private final ThreadPoolExecutor readAheadService; + + /** Cache of pack files, indexed by description. */ + private final Map<DfsPackDescription, DfsPackFile> packCache; + + /** Number of times a block was found in the cache. */ + private final AtomicLong statHit; + + /** Number of times a block was not found, and had to be loaded. */ + private final AtomicLong statMiss; + + /** Number of blocks evicted due to cache being full. */ + private volatile long statEvict; + + /** Protects the clock and its related data. */ + private final ReentrantLock clockLock; + + /** Current position of the clock. */ + private Ref<?> clockHand; + + /** Number of bytes currently loaded in the cache. */ + private volatile long liveBytes; + + private DfsBlockCache(final DfsBlockCacheConfig cfg) { + tableSize = tableSize(cfg); + if (tableSize < 1) + throw new IllegalArgumentException(JGitText.get().tSizeMustBeGreaterOrEqual1); + + table = new AtomicReferenceArray<HashEntry>(tableSize); + loadLocks = new ReentrantLock[32]; + for (int i = 0; i < loadLocks.length; i++) + loadLocks[i] = new ReentrantLock(true /* fair */); + + int eb = (int) (tableSize * .1); + if (64 < eb) + eb = 64; + else if (eb < 4) + eb = 4; + if (tableSize < eb) + eb = tableSize; + + maxBytes = cfg.getBlockLimit(); + blockSize = cfg.getBlockSize(); + blockSizeShift = Integer.numberOfTrailingZeros(blockSize); + + clockLock = new ReentrantLock(true /* fair */); + clockHand = new Ref<Object>(null, -1, 0, null); + clockHand.next = clockHand; + + readAheadLimit = cfg.getReadAheadLimit(); + readAheadService = cfg.getReadAheadService(); + + packCache = new HashMap<DfsPackDescription, DfsPackFile>(); + + statHit = new AtomicLong(); + statMiss = new AtomicLong(); + } + + /** @return total number of bytes in the cache. */ + public long getCurrentSize() { + return liveBytes; + } + + /** @return 0..100, defining how full the cache is. */ + public long getFillPercentage() { + return getCurrentSize() * 100 / maxBytes; + } + + /** @return 0..100, defining number of cache hits. */ + public long getHitRatio() { + long hits = statHit.get(); + long miss = statMiss.get(); + long total = hits + miss; + if (total == 0) + return 0; + return hits * 100 / total; + } + + /** @return number of evictions performed due to cache being full. */ + public long getEvictions() { + return statEvict; + } + + DfsPackFile getOrCreate(DfsPackDescription dsc, DfsPackKey key) { + // TODO This table grows without bound. It needs to clean up + // entries that aren't in cache anymore, and aren't being used + // by a live DfsObjDatabase reference.
+ synchronized (packCache) { + DfsPackFile pack = packCache.get(dsc); + if (pack != null && pack.invalid()) { + packCache.remove(dsc); + pack = null; + } + if (pack == null) { + if (key == null) + key = new DfsPackKey(); + pack = new DfsPackFile(this, dsc, key); + packCache.put(dsc, pack); + } + return pack; + } + } + + private int hash(int packHash, long off) { + return packHash + (int) (off >>> blockSizeShift); + } + + int getBlockSize() { + return blockSize; + } + + private static int tableSize(final DfsBlockCacheConfig cfg) { + final int wsz = cfg.getBlockSize(); + final long limit = cfg.getBlockLimit(); + if (wsz <= 0) + throw new IllegalArgumentException(JGitText.get().invalidWindowSize); + if (limit < wsz) + throw new IllegalArgumentException(JGitText.get().windowSizeMustBeLesserThanLimit); + return (int) Math.min(5 * (limit / wsz) / 2, Integer.MAX_VALUE); + } + + /** + * Lookup a cached object, creating and loading it if it doesn't exist. + * + * @param pack + * the pack that "contains" the cached object. + * @param position + * offset within pack of the object. + * @param ctx + * current thread's reader. + * @return the object reference. + * @throws IOException + * the reference was not in the cache and could not be loaded. + */ + DfsBlock getOrLoad(DfsPackFile pack, long position, DfsReader ctx) + throws IOException { + final long requestedPosition = position; + position = pack.alignToBlock(position); + + DfsPackKey key = pack.key; + int slot = slot(key, position); + HashEntry e1 = table.get(slot); + DfsBlock v = scan(e1, key, position); + if (v != null) + return v; + + reserveSpace(blockSize); + ReentrantLock regionLock = lockFor(key, position); + regionLock.lock(); + try { + HashEntry e2 = table.get(slot); + if (e2 != e1) { + v = scan(e2, key, position); + if (v != null) { + creditSpace(blockSize); + return v; + } + } + + statMiss.incrementAndGet(); + boolean credit = true; + try { + v = pack.readOneBlock(position, ctx); + credit = false; + } finally { + if (credit) + creditSpace(blockSize); + } + if (position != v.start) { + // The file discovered its blockSize and adjusted. + position = v.start; + slot = slot(key, position); + e2 = table.get(slot); + } + + Ref ref = new Ref(key, position, v.size(), v); + ref.hot = true; + for (;;) { + HashEntry n = new HashEntry(clean(e2), ref); + if (table.compareAndSet(slot, e2, n)) + break; + e2 = table.get(slot); + } + addToClock(ref, blockSize - v.size()); + } finally { + regionLock.unlock(); + } + + // If the block size changed from the default, it is possible the block + // that was loaded is the wrong block for the requested position. + if (v.contains(pack.key, requestedPosition)) + return v; + return getOrLoad(pack, requestedPosition, ctx); + } + + @SuppressWarnings("unchecked") + private void reserveSpace(int reserve) { + clockLock.lock(); + long live = liveBytes + reserve; + if (maxBytes < live) { + Ref prev = clockHand; + Ref hand = clockHand.next; + do { + if (hand.hot) { + // Value was recently touched. Clear + // hot and give it another chance. + hand.hot = false; + prev = hand; + hand = hand.next; + continue; + } else if (prev == hand) + break; + + // No recent access since last scan, kill + // value and remove from clock. 
+ Ref dead = hand; + hand = hand.next; + prev.next = hand; + dead.next = null; + dead.value = null; + live -= dead.size; + statEvict++; + } while (maxBytes < live); + clockHand = prev; + } + liveBytes = live; + clockLock.unlock(); + } + + private void creditSpace(int credit) { + clockLock.lock(); + liveBytes -= credit; + clockLock.unlock(); + } + + private void addToClock(Ref ref, int credit) { + clockLock.lock(); + if (credit != 0) + liveBytes -= credit; + Ref ptr = clockHand; + ref.next = ptr.next; + ptr.next = ref; + clockHand = ref; + clockLock.unlock(); + } + + void put(DfsBlock v) { + put(v.pack, v.start, v.size(), v); + } + + <T> Ref<T> put(DfsPackKey key, long pos, int size, T v) { + int slot = slot(key, pos); + HashEntry e1 = table.get(slot); + Ref<T> ref = scanRef(e1, key, pos); + if (ref != null) + return ref; + + reserveSpace(size); + ReentrantLock regionLock = lockFor(key, pos); + regionLock.lock(); + try { + HashEntry e2 = table.get(slot); + if (e2 != e1) { + ref = scanRef(e2, key, pos); + if (ref != null) { + creditSpace(size); + return ref; + } + } + + ref = new Ref<T>(key, pos, size, v); + ref.hot = true; + for (;;) { + HashEntry n = new HashEntry(clean(e2), ref); + if (table.compareAndSet(slot, e2, n)) + break; + e2 = table.get(slot); + } + addToClock(ref, 0); + } finally { + regionLock.unlock(); + } + return ref; + } + + boolean contains(DfsPackKey key, long position) { + return scan(table.get(slot(key, position)), key, position) != null; + } + + @SuppressWarnings("unchecked") + <T> T get(DfsPackKey key, long position) { + T val = (T) scan(table.get(slot(key, position)), key, position); + if (val == null) + statMiss.incrementAndGet(); + return val; + } + + boolean readAhead(ReadableChannel rc, DfsPackKey key, int size, long pos, + long len, DfsReader ctx) { + if (!ctx.wantReadAhead() || readAheadLimit <= 0 || readAheadService == null) + return false; + + int cap = readAheadLimit / size; + long readAheadEnd = pos + readAheadLimit; + List<ReadAheadTask.BlockFuture> blocks = new ArrayList<ReadAheadTask.BlockFuture>(cap); + while (pos < readAheadEnd && pos < len) { + long end = Math.min(pos + size, len); + if (!contains(key, pos)) + blocks.add(new ReadAheadTask.BlockFuture(key, pos, end)); + pos = end; + } + if (blocks.isEmpty()) + return false; + + ReadAheadTask task = new ReadAheadTask(this, rc, blocks); + ReadAheadTask.TaskFuture t = new ReadAheadTask.TaskFuture(task); + for (ReadAheadTask.BlockFuture b : blocks) + b.setTask(t); + readAheadService.execute(t); + ctx.startedReadAhead(blocks); + return true; + } + + @SuppressWarnings("unchecked") + private <T> T scan(HashEntry n, DfsPackKey pack, long position) { + for (; n != null; n = n.next) { + Ref<T> r = n.ref; + if (r.pack != pack || r.position != position) + continue; + T v = r.get(); + if (v == null) + return null; + statHit.incrementAndGet(); + return v; + } + return null; + } + + @SuppressWarnings("unchecked") + private <T> Ref<T> scanRef(HashEntry n, DfsPackKey pack, long position) { + for (; n != null; n = n.next) { + Ref<T> r = n.ref; + if (r.pack == pack && r.position == position) + return r.get() != null ? r : null; + } + return null; + } + + void remove(DfsPackFile pack) { + synchronized (packCache) { + packCache.remove(pack.getPackDescription()); + } + } + + private int slot(DfsPackKey pack, long position) { + return (hash(pack.hash, position) >>> 1) % tableSize; + } + + private ReentrantLock lockFor(DfsPackKey pack, long position) { + return loadLocks[(hash(pack.hash, position) >>> 1) % loadLocks.length]; + } + + private static HashEntry clean(HashEntry top) { + while (top != null && top.ref.next == null) + top = top.next; + if (top == null) + return null; + HashEntry n = clean(top.next); + return n == top.next ? top : new HashEntry(n, top.ref); + } + + private static final class HashEntry { + /** Next entry in the hash table's chain list. */ + final HashEntry next; + + /** The referenced object. */ + final Ref ref; + + HashEntry(HashEntry n, Ref r) { + next = n; + ref = r; + } + } + + static final class Ref<T> { + final DfsPackKey pack; + final long position; + final int size; + volatile T value; + Ref next; + volatile boolean hot; + + Ref(DfsPackKey pack, long position, int size, T v) { + this.pack = pack; + this.position = position; + this.size = size; + this.value = v; + } + + T get() { + T v = value; + if (v != null) + hot = true; + return v; + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCacheConfig.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCacheConfig.java new file mode 100644 index 0000000000..14c96987c1 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsBlockCacheConfig.java @@ -0,0 +1,212 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_CORE_SECTION; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_DFS_SECTION; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_KEY_BLOCK_LIMIT; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_KEY_BLOCK_SIZE; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_KEY_READ_AHEAD_LIMIT; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_KEY_READ_AHEAD_THREADS; + +import java.util.concurrent.ArrayBlockingQueue; +import java.util.concurrent.RejectedExecutionHandler; +import java.util.concurrent.ThreadFactory; +import java.util.concurrent.ThreadPoolExecutor; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.atomic.AtomicInteger; + +import org.eclipse.jgit.lib.Config; + +/** Configuration parameters for {@link DfsBlockCache}. */ +public class DfsBlockCacheConfig { + /** 1024 (number of bytes in one kibibyte/kilobyte) */ + public static final int KB = 1024; + + /** 1024 {@link #KB} (number of bytes in one mebibyte/megabyte) */ + public static final int MB = 1024 * KB; + + private long blockLimit; + + private int blockSize; + + private int readAheadLimit; + + private ThreadPoolExecutor readAheadService; + + /** Create a default configuration. */ + public DfsBlockCacheConfig() { + setBlockLimit(32 * MB); + setBlockSize(64 * KB); + } + + /** + * @return maximum number of bytes of heap memory to dedicate to caching + * pack file data. Default is 32 MB. + */ + public long getBlockLimit() { + return blockLimit; + } + + /** + * @param newLimit + * maximum number of bytes of heap memory to dedicate to caching + * pack file data. + * @return {@code this} + */ + public DfsBlockCacheConfig setBlockLimit(final long newLimit) { + blockLimit = newLimit; + return this; + } + + /** + * @return size in bytes of a single window mapped or read in from the pack + * file. Default is 64 KB. + */ + public int getBlockSize() { + return blockSize; + } + + /** + * @param newSize + * size in bytes of a single window read in from the pack file. + * @return {@code this} + */ + public DfsBlockCacheConfig setBlockSize(final int newSize) { + blockSize = Math.max(512, newSize); + return this; + } + + /** @return number of bytes to read ahead sequentially by. */ + public int getReadAheadLimit() { + return readAheadLimit; + } + + /** + * @param newSize + * new read-ahead limit, in bytes. + * @return {@code this} + */ + public DfsBlockCacheConfig setReadAheadLimit(final int newSize) { + readAheadLimit = Math.max(0, newSize); + return this; + } + + /** @return service to perform read-ahead of sequential blocks. */ + public ThreadPoolExecutor getReadAheadService() { + return readAheadService; + } + + /** + * @param svc + * service to perform read-ahead of sequential blocks with. If + * not null the {@link RejectedExecutionHandler} must be managed + * by the JGit DFS library and not the application. + * @return {@code this}.
+ */ + public DfsBlockCacheConfig setReadAheadService(ThreadPoolExecutor svc) { + if (svc != null) + svc.setRejectedExecutionHandler(ReadAheadRejectedExecutionHandler.INSTANCE); + readAheadService = svc; + return this; + } + + /** + * Update properties by setting fields from the configuration. + *
<p>
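+ * For example, a configuration supplying all four supported keys (the values shown are illustrative only, not recommendations) might read: + * <pre> + * [core "dfs"] + *   blockLimit = 33554432 + *   blockSize = 65536 + *   readAheadLimit = 262144 + *   readAheadThreads = 2 + * </pre> + * <p>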
+ * If a property is not defined in the configuration, then it is left + * unmodified. + * + * @param rc + * configuration to read properties from. + * @return {@code this} + */ + public DfsBlockCacheConfig fromConfig(final Config rc) { + setBlockLimit(rc.getLong( + CONFIG_CORE_SECTION, + CONFIG_DFS_SECTION, + CONFIG_KEY_BLOCK_LIMIT, + getBlockLimit())); + + setBlockSize(rc.getInt( + CONFIG_CORE_SECTION, + CONFIG_DFS_SECTION, + CONFIG_KEY_BLOCK_SIZE, + getBlockSize())); + + setReadAheadLimit(rc.getInt( + CONFIG_CORE_SECTION, + CONFIG_DFS_SECTION, + CONFIG_KEY_READ_AHEAD_LIMIT, + getReadAheadLimit())); + + int readAheadThreads = rc.getInt( + CONFIG_CORE_SECTION, + CONFIG_DFS_SECTION, + CONFIG_KEY_READ_AHEAD_THREADS, + 0); + + if (0 < getReadAheadLimit() && 0 < readAheadThreads) { + setReadAheadService(new ThreadPoolExecutor( + 1, // Minimum number of threads kept alive. + readAheadThreads, // Maximum threads active. + 60, TimeUnit.SECONDS, // Idle threads wait this long before ending. + new ArrayBlockingQueue<Runnable>(1), // Do not queue deeply. + new ThreadFactory() { + private final String name = "JGit-DFS-ReadAhead"; + private final AtomicInteger cnt = new AtomicInteger(); + private final ThreadGroup group = new ThreadGroup(name); + + public Thread newThread(Runnable body) { + int id = cnt.incrementAndGet(); + Thread thread = new Thread(group, body, name + "-" + id); + thread.setDaemon(true); + thread.setContextClassLoader(getClass().getClassLoader()); + return thread; + } + }, ReadAheadRejectedExecutionHandler.INSTANCE)); + } + return this; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsCachedPack.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsCachedPack.java new file mode 100644 index 0000000000..e0ecc8018a --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsCachedPack.java @@ -0,0 +1,92 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; +import java.util.Set; + +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.storage.pack.CachedPack; +import org.eclipse.jgit.storage.pack.ObjectToPack; +import org.eclipse.jgit.storage.pack.PackOutputStream; +import org.eclipse.jgit.storage.pack.StoredObjectRepresentation; + +/** A DfsPackFile available for reuse as-is. */ +public class DfsCachedPack extends CachedPack { + private final DfsPackFile pack; + + DfsCachedPack(DfsPackFile pack) { + this.pack = pack; + } + + /** @return the description of the pack. */ + public DfsPackDescription getPackDescription() { + return pack.getPackDescription(); + } + + @Override + public Set<ObjectId> getTips() { + return getPackDescription().getTips(); + } + + @Override + public long getObjectCount() throws IOException { + return getPackDescription().getObjectCount(); + } + + @Override + public long getDeltaCount() throws IOException { + return getPackDescription().getDeltaCount(); + } + + @Override + public boolean hasObject(ObjectToPack obj, StoredObjectRepresentation rep) { + return ((DfsObjectRepresentation) rep).pack == pack; + } + + void copyAsIs(PackOutputStream out, boolean validate, DfsReader ctx) + throws IOException { + pack.copyPackAsIs(out, validate, ctx); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsConfig.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsConfig.java new file mode 100644 index 0000000000..fa91b7cddf --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsConfig.java @@ -0,0 +1,61 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission.
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; + +import org.eclipse.jgit.errors.ConfigInvalidException; +import org.eclipse.jgit.lib.StoredConfig; + +final class DfsConfig extends StoredConfig { + @Override + public void load() throws IOException, ConfigInvalidException { + clear(); + } + + @Override + public void save() throws IOException { + // TODO actually store this configuration. + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsGarbageCollector.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsGarbageCollector.java new file mode 100644 index 0000000000..7d8ff3e1a5 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsGarbageCollector.java @@ -0,0 +1,347 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +package org.eclipse.jgit.storage.dfs; + +import static org.eclipse.jgit.storage.dfs.DfsObjDatabase.PackSource.GC; +import static org.eclipse.jgit.storage.dfs.DfsObjDatabase.PackSource.UNREACHABLE_GARBAGE; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import org.eclipse.jgit.lib.AnyObjectId; +import org.eclipse.jgit.lib.Constants; +import org.eclipse.jgit.lib.NullProgressMonitor; +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.lib.ProgressMonitor; +import org.eclipse.jgit.lib.Ref; +import org.eclipse.jgit.revwalk.RevWalk; +import org.eclipse.jgit.storage.dfs.DfsObjDatabase.PackSource; +import org.eclipse.jgit.storage.file.PackIndex; +import org.eclipse.jgit.storage.pack.PackConfig; +import org.eclipse.jgit.storage.pack.PackWriter; +import org.eclipse.jgit.util.io.CountingOutputStream; + +/** Repack and garbage collect a repository. */ +public class DfsGarbageCollector { + private final DfsRepository repo; + + private final DfsRefDatabase refdb; + + private final DfsObjDatabase objdb; + + private final List<DfsPackDescription> newPackDesc; + + private final List<PackWriter.Statistics> newPackStats; + + private final List<DfsPackFile> newPackList; + + private DfsReader ctx; + + private PackConfig packConfig; + + private Map<String, Ref> refsBefore; + + private List<DfsPackFile> packsBefore; + + private Set<ObjectId> allHeads; + + private Set<ObjectId> nonHeads; + + /** Sum of object counts in {@link #packsBefore}. */ + private long objectsBefore; + + /** Sum of object counts in {@link #newPackDesc}. */ + private long objectsPacked; + + private Set<ObjectId> tagTargets; + + /** + * Initialize a garbage collector. + * + * @param repository + * repository objects to be packed will be read from. + */ + public DfsGarbageCollector(DfsRepository repository) { + repo = repository; + refdb = repo.getRefDatabase(); + objdb = repo.getObjectDatabase(); + newPackDesc = new ArrayList<DfsPackDescription>(4); + newPackStats = new ArrayList<PackWriter.Statistics>(4); + newPackList = new ArrayList<DfsPackFile>(4); + + packConfig = new PackConfig(repo); + packConfig.setIndexVersion(2); + } + + /** @return configuration used to generate the new pack file. */ + public PackConfig getPackConfig() { + return packConfig; + } + + /** + * @param newConfig + * the new configuration to use when creating the pack file. + * @return {@code this} + */ + public DfsGarbageCollector setPackConfig(PackConfig newConfig) { + packConfig = newConfig; + return this; + } + + /** + * Create a single new pack file containing all of the live objects. + *
<p>
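+ * A minimal invocation looks like: + * <pre> + * DfsGarbageCollector gc = new DfsGarbageCollector(repository); + * if (!gc.pack(NullProgressMonitor.INSTANCE)) { + *   // A concurrent reference update raced this repack; retry later. + * } + * </pre> + * <p>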
+ * This method safely decides which packs can be expired after the new pack + * is created by validating the references have not been modified in an + * incompatible way. + * + * @param pm + * progress monitor to receive updates on as packing may take a + * while, depending on the size of the repository. + * @return true if the repack was successful without race conditions. False + * if a race condition was detected and the repack should be run + * again later. + * @throws IOException + * a new pack cannot be created. + */ + public boolean pack(ProgressMonitor pm) throws IOException { + if (pm == null) + pm = NullProgressMonitor.INSTANCE; + if (packConfig.getIndexVersion() != 2) + throw new IllegalStateException("Only index version 2"); + + ctx = (DfsReader) objdb.newReader(); + try { + refdb.clearCache(); + objdb.clearCache(); + + refsBefore = repo.getAllRefs(); + packsBefore = Arrays.asList(objdb.getPacks()); + if (packsBefore.isEmpty()) + return true; + + allHeads = new HashSet<ObjectId>(); + nonHeads = new HashSet<ObjectId>(); + tagTargets = new HashSet<ObjectId>(); + for (Ref ref : refsBefore.values()) { + if (ref.isSymbolic() || ref.getObjectId() == null) + continue; + if (isHead(ref)) + allHeads.add(ref.getObjectId()); + else + nonHeads.add(ref.getObjectId()); + if (ref.getPeeledObjectId() != null) + tagTargets.add(ref.getPeeledObjectId()); + } + tagTargets.addAll(allHeads); + + boolean rollback = true; + try { + packHeads(pm); + packRest(pm); + packGarbage(pm); + objdb.commitPack(newPackDesc, toPrune()); + rollback = false; + return true; + } finally { + if (rollback) + objdb.rollbackPack(newPackDesc); + } + } finally { + ctx.release(); + } + } + + /** @return all of the source packs that fed into this compaction. */ + public List<DfsPackDescription> getSourcePacks() { + return toPrune(); + } + + /** @return new packs created by this compaction. */ + public List<DfsPackDescription> getNewPacks() { + return newPackDesc; + } + + /** @return statistics corresponding to the {@link #getNewPacks()}. */ + public List<PackWriter.Statistics> getNewPackStatistics() { + return newPackStats; + } + + private List<DfsPackDescription> toPrune() { + int cnt = packsBefore.size(); + List<DfsPackDescription> all = new ArrayList<DfsPackDescription>(cnt); + for (DfsPackFile pack : packsBefore) + all.add(pack.getPackDescription()); + return all; + } + + private void packHeads(ProgressMonitor pm) throws IOException { + if (allHeads.isEmpty()) + return; + + PackWriter pw = newPackWriter(); + try { + pw.preparePack(pm, allHeads, Collections.<ObjectId> emptySet()); + if (0 < pw.getObjectCount()) + writePack(GC, pw, pm).setTips(allHeads); + } finally { + pw.release(); + } + } + + private void packRest(ProgressMonitor pm) throws IOException { + if (nonHeads.isEmpty() || objectsPacked == getObjectsBefore()) + return; + + PackWriter pw = newPackWriter(); + try { + for (DfsPackFile pack : newPackList) + pw.excludeObjects(pack.getPackIndex(ctx)); + pw.preparePack(pm, nonHeads, allHeads); + if (0 < pw.getObjectCount()) + writePack(GC, pw, pm); + } finally { + pw.release(); + } + } + + private void packGarbage(ProgressMonitor pm) throws IOException { + if (objectsPacked == getObjectsBefore()) + return; + + // TODO(sop) This is ugly. The garbage pack needs to be deleted.
+ List<PackIndex> newIdx = new ArrayList<PackIndex>(newPackList.size()); + for (DfsPackFile pack : newPackList) + newIdx.add(pack.getPackIndex(ctx)); + + PackWriter pw = newPackWriter(); + try { + RevWalk pool = new RevWalk(ctx); + for (DfsPackFile oldPack : packsBefore) { + PackIndex oldIdx = oldPack.getPackIndex(ctx); + pm.beginTask("Finding garbage", (int) oldIdx.getObjectCount()); + for (PackIndex.MutableEntry ent : oldIdx) { + pm.update(1); + ObjectId id = ent.toObjectId(); + if (pool.lookupOrNull(id) != null || anyIndexHas(newIdx, id)) + continue; + + int type = oldPack.getObjectType(ctx, ent.getOffset()); + pw.addObject(pool.lookupAny(id, type)); + } + pm.endTask(); + } + if (0 < pw.getObjectCount()) + writePack(UNREACHABLE_GARBAGE, pw, pm); + } finally { + pw.release(); + } + } + + private static boolean anyIndexHas(List<PackIndex> list, AnyObjectId id) { + for (PackIndex idx : list) + if (idx.hasObject(id)) + return true; + return false; + } + + private static boolean isHead(Ref ref) { + return ref.getName().startsWith(Constants.R_HEADS); + } + + private long getObjectsBefore() { + if (objectsBefore == 0) { + for (DfsPackFile p : packsBefore) + objectsBefore += p.getPackDescription().getObjectCount(); + } + return objectsBefore; + } + + private PackWriter newPackWriter() { + PackWriter pw = new PackWriter(packConfig, ctx); + pw.setDeltaBaseAsOffset(true); + pw.setReuseDeltaCommits(false); + pw.setTagTargets(tagTargets); + return pw; + } + + private DfsPackDescription writePack(PackSource source, PackWriter pw, + ProgressMonitor pm) throws IOException { + DfsOutputStream out; + DfsPackDescription pack = repo.getObjectDatabase().newPack(source); + newPackDesc.add(pack); + + out = objdb.writePackFile(pack); + try { + pw.writePack(pm, pm, out); + } finally { + out.close(); + } + + out = objdb.writePackIndex(pack); + try { + CountingOutputStream cnt = new CountingOutputStream(out); + pw.writeIndex(cnt); + pack.setIndexSize(cnt.getCount()); + } finally { + out.close(); + } + + PackWriter.Statistics stats = pw.getStatistics(); + pack.setPackStats(stats); + pack.setPackSize(stats.getTotalBytes()); + pack.setObjectCount(stats.getTotalObjects()); + pack.setDeltaCount(stats.getTotalDeltas()); + objectsPacked += stats.getTotalObjects(); + newPackStats.add(stats); + newPackList.add(DfsBlockCache.getInstance().getOrCreate(pack, null)); + return pack; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsInserter.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsInserter.java new file mode 100644 index 0000000000..335e284e2b --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsInserter.java @@ -0,0 +1,393 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer.
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.EOFException; +import java.io.IOException; +import java.io.InputStream; +import java.io.OutputStream; +import java.security.MessageDigest; +import java.util.Collections; +import java.util.List; +import java.util.zip.CRC32; +import java.util.zip.Deflater; +import java.util.zip.DeflaterOutputStream; + +import org.eclipse.jgit.lib.Constants; +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.lib.ObjectIdOwnerMap; +import org.eclipse.jgit.lib.ObjectInserter; +import org.eclipse.jgit.storage.file.PackIndex; +import org.eclipse.jgit.storage.file.PackIndexWriter; +import org.eclipse.jgit.transport.PackedObjectInfo; +import org.eclipse.jgit.util.BlockList; +import org.eclipse.jgit.util.IO; +import org.eclipse.jgit.util.NB; +import org.eclipse.jgit.util.TemporaryBuffer; +import org.eclipse.jgit.util.io.CountingOutputStream; + +/** Inserts objects into the DFS. */ +public class DfsInserter extends ObjectInserter { + /** Always produce version 2 indexes, to get CRC data. */ + private static final int INDEX_VERSION = 2; + + private final DfsObjDatabase db; + + private List objectList; + private ObjectIdOwnerMap objectMap; + + private DfsBlockCache cache; + private DfsPackKey packKey; + private DfsPackDescription packDsc; + private PackStream packOut; + private boolean rollback; + + /** + * Initialize a new inserter. + * + * @param db + * database the inserter writes to. 
+ */ + protected DfsInserter(DfsObjDatabase db) { + this.db = db; + } + + @Override + public DfsPackParser newPackParser(InputStream in) throws IOException { + return new DfsPackParser(db, this, in); + } + + @Override + public ObjectId insert(int type, byte[] data, int off, int len) + throws IOException { + ObjectId id = idFor(type, data, off, len); + if (objectMap != null && objectMap.contains(id)) + return id; + if (db.has(id)) + return id; + + long offset = beginObject(type, len); + packOut.compress.write(data, off, len); + packOut.compress.finish(); + return endObject(id, offset); + } + + @Override + public ObjectId insert(int type, long len, InputStream in) + throws IOException { + byte[] buf = buffer(); + if (len <= buf.length) { + IO.readFully(in, buf, 0, (int) len); + return insert(type, buf, 0, (int) len); + } + + long offset = beginObject(type, len); + MessageDigest md = digest(); + md.update(Constants.encodedTypeString(type)); + md.update((byte) ' '); + md.update(Constants.encodeASCII(len)); + md.update((byte) 0); + + while (0 < len) { + int n = in.read(buf, 0, (int) Math.min(buf.length, len)); + if (n <= 0) + throw new EOFException(); + md.update(buf, 0, n); + packOut.compress.write(buf, 0, n); + len -= n; + } + packOut.compress.finish(); + return endObject(ObjectId.fromRaw(md.digest()), offset); + } + + @Override + public void flush() throws IOException { + if (packDsc == null) + return; + + if (packOut == null) + throw new IOException(); + + byte[] packHash = packOut.writePackFooter(); + packDsc.setPackSize(packOut.getCount()); + packOut.close(); + packOut = null; + + sortObjectsById(); + + PackIndex index = writePackIndex(packDsc, packHash, objectList); + db.commitPack(Collections.singletonList(packDsc), null); + rollback = false; + + DfsPackFile p = cache.getOrCreate(packDsc, packKey); + if (index != null) + p.setPackIndex(index); + db.addPack(p); + clear(); + } + + @Override + public void release() { + if (packOut != null) { + try { + packOut.close(); + } catch (IOException err) { + // Ignore a close failure, the pack should be removed. + } finally { + packOut = null; + } + } + if (rollback && packDsc != null) { + try { + db.rollbackPack(Collections.singletonList(packDsc)); + } finally { + packDsc = null; + rollback = false; + } + } + clear(); + } + + private void clear() { + objectList = null; + objectMap = null; + packKey = null; + packDsc = null; + } + + private long beginObject(int type, long len) throws IOException { + if (packOut == null) + beginPack(); + long offset = packOut.getCount(); + packOut.beginObject(type, len); + return offset; + } + + private ObjectId endObject(ObjectId id, long offset) { + PackedObjectInfo obj = new PackedObjectInfo(id); + obj.setOffset(offset); + obj.setCRC((int) packOut.crc32.getValue()); + objectList.add(obj); + objectMap.addIfAbsent(obj); + return id; + } + + private void beginPack() throws IOException { + objectList = new BlockList(); + objectMap = new ObjectIdOwnerMap(); + cache = DfsBlockCache.getInstance(); + + rollback = true; + packDsc = db.newPack(DfsObjDatabase.PackSource.INSERT); + packOut = new PackStream(db.writePackFile(packDsc)); + packKey = new DfsPackKey(); + + // Write the header as though it were a single object pack. + byte[] buf = packOut.hdrBuf; + System.arraycopy(Constants.PACK_SIGNATURE, 0, buf, 0, 4); + NB.encodeInt32(buf, 4, 2); // Always use pack version 2. + NB.encodeInt32(buf, 8, 1); // Always assume 1 object. 
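// The 12 bytes staged above are the standard pack header: the 4-byte
// signature "PACK", a 4-byte big-endian version (always 2 here), and a
// 4-byte big-endian object count. The final count is unknown while
// streaming, so a placeholder of 1 is written and never patched; the
// true count is recorded in the pack index via writePackIndex() below.
// A consumer could decode the two integers with NB.decodeInt32(hdr, 4)
// and NB.decodeInt32(hdr, 8).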
+ packOut.write(buf, 0, 12); + } + + @SuppressWarnings("unchecked") + private void sortObjectsById() { + Collections.sort(objectList); + } + + PackIndex writePackIndex(DfsPackDescription pack, byte[] packHash, + List<PackedObjectInfo> list) throws IOException { + pack.setObjectCount(list.size()); + + // If there are fewer than 58,000 objects, the entire index fits in under + // 2 MiB. Callers will probably need the index immediately, so buffer + // the index in process and load from the buffer. + TemporaryBuffer.Heap buf = null; + PackIndex packIndex = null; + if (list.size() <= 58000) { + buf = new TemporaryBuffer.Heap(2 << 20); + index(buf, packHash, list); + packIndex = PackIndex.read(buf.openInputStream()); + } + + DfsOutputStream os = db.writePackIndex(pack); + try { + CountingOutputStream cnt = new CountingOutputStream(os); + if (buf != null) + buf.writeTo(cnt, null); + else + index(cnt, packHash, list); + pack.setIndexSize(cnt.getCount()); + } finally { + os.close(); + } + return packIndex; + } + + private static void index(OutputStream out, byte[] packHash, + List<PackedObjectInfo> list) throws IOException { + PackIndexWriter.createVersion(out, INDEX_VERSION).write(list, packHash); + } + + private class PackStream extends OutputStream { + private final DfsOutputStream out; + private final MessageDigest md; + private final byte[] hdrBuf; + private final Deflater deflater; + private final int blockSize; + + private long currPos; // Position of currBuf[0] in the output stream. + private int currPtr; // Number of bytes in currBuf. + private byte[] currBuf; + + final CRC32 crc32; + final DeflaterOutputStream compress; + + PackStream(DfsOutputStream out) { + this.out = out; + + hdrBuf = new byte[32]; + md = Constants.newMessageDigest(); + crc32 = new CRC32(); + deflater = new Deflater(Deflater.BEST_COMPRESSION); + compress = new DeflaterOutputStream(this, deflater, 8192); + + int size = out.blockSize(); + if (size <= 0) + size = cache.getBlockSize(); + else if (size < cache.getBlockSize()) + size = (cache.getBlockSize() / size) * size; + blockSize = size; + currBuf = new byte[blockSize]; + } + + long getCount() { + return currPos + currPtr; + } + + void beginObject(int objectType, long length) throws IOException { + crc32.reset(); + deflater.reset(); + write(hdrBuf, 0, encodeTypeSize(objectType, length)); + } + + private int encodeTypeSize(int type, long rawLength) { + long nextLength = rawLength >>> 4; + hdrBuf[0] = (byte) ((nextLength > 0 ? 0x80 : 0x00) | (type << 4) | (rawLength & 0x0F)); + rawLength = nextLength; + int n = 1; + while (rawLength > 0) { + nextLength >>>= 7; + hdrBuf[n++] = (byte) ((nextLength > 0 ?
0x80 : 0x00) | (rawLength & 0x7F)); + rawLength = nextLength; + } + return n; + } + + @Override + public void write(final int b) throws IOException { + hdrBuf[0] = (byte) b; + write(hdrBuf, 0, 1); + } + + @Override + public void write(byte[] data, int off, int len) throws IOException { + crc32.update(data, off, len); + md.update(data, off, len); + writeNoHash(data, off, len); + } + + private void writeNoHash(byte[] data, int off, int len) + throws IOException { + while (0 < len) { + int n = Math.min(len, currBuf.length - currPtr); + if (n == 0) { + flushBlock(); + currBuf = new byte[blockSize]; + continue; + } + + System.arraycopy(data, off, currBuf, currPtr, n); + off += n; + len -= n; + currPtr += n; + } + } + + private void flushBlock() throws IOException { + out.write(currBuf, 0, currPtr); + + byte[] buf; + if (currPtr == currBuf.length) + buf = currBuf; + else + buf = copyOf(currBuf, 0, currPtr); + cache.put(new DfsBlock(packKey, currPos, buf)); + + currPos += currPtr; + currPtr = 0; + currBuf = null; + } + + private byte[] copyOf(byte[] src, int ptr, int cnt) { + byte[] dst = new byte[cnt]; + System.arraycopy(src, ptr, dst, 0, cnt); + return dst; + } + + byte[] writePackFooter() throws IOException { + byte[] packHash = md.digest(); + writeNoHash(packHash, 0, packHash.length); + if (currPtr != 0) + flushBlock(); + return packHash; + } + + @Override + public void close() throws IOException { + deflater.end(); + out.close(); + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjDatabase.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjDatabase.java new file mode 100644 index 0000000000..8c95340756 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjDatabase.java @@ -0,0 +1,403 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.FileNotFoundException; +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicReference; + +import org.eclipse.jgit.lib.ObjectDatabase; +import org.eclipse.jgit.lib.ObjectInserter; +import org.eclipse.jgit.lib.ObjectReader; + +/** Manages objects stored in {@link DfsPackFile} on a storage system. */ +public abstract class DfsObjDatabase extends ObjectDatabase { + private static final PackList NO_PACKS = new PackList(new DfsPackFile[0]); + + /** Sources for a pack file. */ + public static enum PackSource { + /** The pack is created by ObjectInserter due to local activity. */ + INSERT, + + /** + * The pack is created by PackParser due to a network event. + *
<p>
+ * A received pack can be from either a push into the repository or a + * fetch into the repository; the direction doesn't matter. A received + * pack was built by the remote Git implementation and may not match the + * storage layout preferred by this version. Received packs are likely + * to be either compacted or garbage collected in the future. + */ + RECEIVE, + + /** + * Pack was created by this implementation's Git garbage collection. + *
<p>
+ * This source is only used by the {@link DfsGarbageCollector} when it + * builds a pack file by traversing the object graph and copying all + * reachable objects into a new pack stream. + * + * @see DfsGarbageCollector + */ + GC, + + /** + * The pack was created by compacting multiple packs together. + *
<p>
+ * Packs created by compacting multiple packs together aren't nearly as + * efficient as those in a fully garbage collected repository, but may + * save disk space by reducing redundant copies of base objects. + * + * @see DfsPackCompactor + */ + COMPACT, + + /** + * Pack was created by Git garbage collection. + *
<p>
+ * This pack contains only unreachable garbage that was found during the + * last GC pass. It is retained in a new pack until it is safe to prune + * these objects from the repository. + */ + UNREACHABLE_GARBAGE; + } + + private final AtomicReference<PackList> packList; + + private DfsReaderOptions readerOptions; + + /** + * Initialize an object database for a repository. + * + * @param options + * how readers should access the object database. + */ + protected DfsObjDatabase(DfsReaderOptions options) { + this.packList = new AtomicReference<PackList>(NO_PACKS); + this.readerOptions = options; + } + + /** @return configured reader options, such as read-ahead. */ + public DfsReaderOptions getReaderOptions() { + return readerOptions; + } + + @Override + public ObjectReader newReader() { + return new DfsReader(this); + } + + @Override + public ObjectInserter newInserter() { + return new DfsInserter(this); + } + + /** + * List all available pack files in the repository. + * + * @return list of available packs. The returned array is shared with the + * implementation and must not be modified by the caller. + * @throws IOException + * the pack list cannot be initialized. + */ + public DfsPackFile[] getPacks() throws IOException { + return scanPacks(NO_PACKS).packs; + } + + /** + * Generate a new unique name for a pack file. + * + * @param source + * where the pack stream is created. + * @return a unique name for the pack file. Must not collide with any other + * pack file name in the same DFS. + * @throws IOException + * a new unique pack description cannot be generated. + */ + protected abstract DfsPackDescription newPack(PackSource source) + throws IOException; + + /** + * Commit a pack and index pair that was written to the DFS. + *
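Before the commit contract below, it is worth noting what the PackSource values are for: an implementation can key storage policy off a pack's origin. A hypothetical sketch (the store helpers here are invented; a real subclass would need its own DfsPackDescription subtype to remember the source):

DfsOutputStream routeBySource(PackSource source, DfsPackDescription desc) {
    switch (source) {
    case INSERT:
    case RECEIVE:
        return smallPackStore(desc); // many short-lived packs
    case COMPACT:
    case GC:
        return readOptimizedStore(desc); // large, long-lived, read-heavy
    case UNREACHABLE_GARBAGE:
        return coldStore(desc); // rarely read, may be expired later
    default:
        throw new IllegalArgumentException(source.name());
    }
}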
<p>
+ * Committing the pack/index pair makes them visible to readers. The JGit + * DFS code always writes the pack, then the index. This allows a simple + * commit process to do nothing if readers always look for both files to + * exist and the DFS performs atomic creation of the file (e.g. stream to a + * temporary file and rename to target on close). + *
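A sketch of that publish step on a filesystem-like DFS, using java.nio.file for brevity (dir and desc are assumed to be in scope; the index is moved the same way after the pack):

Path tmp = dir.resolve(desc.getPackName() + ".tmp");
Files.move(tmp, dir.resolve(desc.getPackName()),
        StandardCopyOption.ATOMIC_MOVE);
// Readers ignore the pair until both final names exist, so a crash
// between the pack rename and the index rename publishes nothing.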
<p>
+ * During pack compaction or GC the new pack file may be replacing other + * older files. Implementations should remove those older files (if any) as + * part of the commit of the new file. + * + * @param desc + * description of the new packs. + * @param replaces + * if not null, list of packs to remove. + * @throws IOException + * the packs cannot be committed. On failure a rollback must + * also be attempted by the caller. + */ + protected abstract void commitPack(Collection<DfsPackDescription> desc, + Collection<DfsPackDescription> replaces) throws IOException; + + /** + * Try to rollback a pack creation. + *
<p>
+ * JGit DFS always writes the pack first, then the index. If the pack does + * not yet exist, then neither does the index. A safe DFS implementation + * would try to remove both files to ensure they are really gone. + *
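A best-effort sketch of a conforming implementation (deleteQuietly() and log() are hypothetical helpers; note the no-throw rule described next):

@Override
protected void rollbackPack(Collection<DfsPackDescription> desc) {
    for (DfsPackDescription d : desc) {
        try {
            deleteQuietly(d.getPackName()); // pack may not exist yet
            deleteQuietly(d.getIndexName()); // index is written after pack
        } catch (RuntimeException warn) {
            log(warn); // never propagate; a failure is already in progress
        }
    }
}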
<p>
+ * A rollback does not support failures, as it only occurs when there is + * already a failure in progress. A DFS implementor may wish to log + * warnings/error messages when a rollback fails, but should not send new + * exceptions up the Java callstack. + * + * @param desc + * packs to delete. + */ + protected abstract void rollbackPack(Collection<DfsPackDescription> desc); + + /** + * List the available pack files. + *
<p>
+ * The returned list must support random access and must be mutable by the + * caller. It is sorted in place using the natural sorting of the returned + * DfsPackDescription objects. + * + * @return available packs. May be empty if there are no packs. + * @throws IOException + * the packs cannot be listed and the object database is not + * usable by the caller. + */ + protected abstract List<DfsPackDescription> listPacks() throws IOException; + + /** + * Open a pack file for reading. + * + * @param desc + * description of pack to read. This is an instance previously + * obtained from {@link #listPacks()}, but not necessarily from + * the same DfsObjDatabase instance. + * @return channel to read the pack file. + * @throws FileNotFoundException + * the file does not exist. + * @throws IOException + * the file cannot be opened. + */ + protected abstract ReadableChannel openPackFile(DfsPackDescription desc) + throws FileNotFoundException, IOException; + + /** + * Open a pack index for reading. + * + * @param desc + * description of index to read. This is an instance previously + * obtained from {@link #listPacks()}, but not necessarily from + * the same DfsObjDatabase instance. + * @return channel to read the index file. + * @throws FileNotFoundException + * the file does not exist. + * @throws IOException + * the file cannot be opened. + */ + protected abstract ReadableChannel openPackIndex(DfsPackDescription desc) + throws FileNotFoundException, IOException; + + /** + * Open a pack file for writing. + * + * @param desc + * description of pack to write. This is an instance previously + * obtained from {@link #newPack(PackSource)}. + * @return channel to write the pack file. + * @throws IOException + * the file cannot be opened. + */ + protected abstract DfsOutputStream writePackFile(DfsPackDescription desc) + throws IOException; + + /** + * Open a pack index for writing. + * + * @param desc + * description of index to write. This is an instance previously + * obtained from {@link #newPack(PackSource)}. + * @return channel to write the index file. + * @throws IOException + * the file cannot be opened. + */ + protected abstract DfsOutputStream writePackIndex(DfsPackDescription desc) + throws IOException; + + void addPack(DfsPackFile newPack) throws IOException { + PackList o, n; + do { + o = packList.get(); + if (o == NO_PACKS) { + // The repository may not have needed any existing objects to + // complete the current task of creating a pack (e.g. push of a + // pack with no external deltas). Because we don't scan for + // newly added packs on missed object lookups, scan now to + // make sure all older packs are available in the packList. + o = scanPacks(o); + + // It's possible the scan identified the pack we were asked to + // add, as the pack was already committed via commitPack(). + // If this is the case return without changing the list. + for (DfsPackFile p : o.packs) { + if (p == newPack) + return; + } + } + + DfsPackFile[] packs = new DfsPackFile[1 + o.packs.length]; + packs[0] = newPack; + System.arraycopy(o.packs, 0, packs, 1, o.packs.length); + n = new PackList(packs); + } while (!packList.compareAndSet(o, n)); + } + + private PackList scanPacks(final PackList original) throws IOException { + synchronized (packList) { + PackList o, n; + do { + o = packList.get(); + if (o != original) { + // Another thread did the scan for us, while we + // were blocked on the monitor above.
+ // + return o; + } + n = scanPacksImpl(o); + if (n == o) + return n; + } while (!packList.compareAndSet(o, n)); + return n; + } + } + + private PackList scanPacksImpl(PackList old) throws IOException { + DfsBlockCache cache = DfsBlockCache.getInstance(); + Map forReuse = reuseMap(old); + List scanned = listPacks(); + Collections.sort(scanned); + + List list = new ArrayList(scanned.size()); + boolean foundNew = false; + for (DfsPackDescription dsc : scanned) { + DfsPackFile oldPack = forReuse.remove(dsc); + if (oldPack != null) { + list.add(oldPack); + } else { + list.add(cache.getOrCreate(dsc, null)); + foundNew = true; + } + } + + for (DfsPackFile p : forReuse.values()) + p.close(); + if (list.isEmpty()) + return new PackList(NO_PACKS.packs); + if (!foundNew) + return old; + return new PackList(list.toArray(new DfsPackFile[list.size()])); + } + + private static Map reuseMap(PackList old) { + Map forReuse + = new HashMap(); + for (DfsPackFile p : old.packs) { + if (p.invalid()) { + // The pack instance is corrupted, and cannot be safely used + // again. Do not include it in our reuse map. + // + p.close(); + continue; + } + + DfsPackFile prior = forReuse.put(p.getPackDescription(), p); + if (prior != null) { + // This should never occur. It should be impossible for us + // to have two pack files with the same name, as all of them + // came out of the same directory. If it does, we promised to + // close any PackFiles we did not reuse, so close the second, + // readers are likely to be actively using the first. + // + forReuse.put(prior.getPackDescription(), prior); + p.close(); + } + } + return forReuse; + } + + void clearCache() { + packList.set(NO_PACKS); + } + + @Override + public void close() { + // PackList packs = packList.get(); + packList.set(NO_PACKS); + + // TODO Close packs if they aren't cached. + // for (DfsPackFile p : packs.packs) + // p.close(); + } + + private static final class PackList { + /** All known packs, sorted. */ + final DfsPackFile[] packs; + + PackList(final DfsPackFile[] packs) { + this.packs = packs; + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectRepresentation.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectRepresentation.java new file mode 100644 index 0000000000..1b8e3a3d42 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectRepresentation.java @@ -0,0 +1,90 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. 
nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.storage.pack.ObjectToPack; +import org.eclipse.jgit.storage.pack.StoredObjectRepresentation; + +class DfsObjectRepresentation extends StoredObjectRepresentation { + final ObjectToPack object; + + DfsPackFile pack; + + /** + * Position of {@link #pack} in the reader's pack list. Lower numbers are + * newer/more recent packs and less likely to contain the best format for a + * base object. Higher numbered packs are bigger, more stable, and favored + * by PackWriter when selecting representations... but only if they come + * last in the representation ordering. + */ + int packIndex; + + long offset; + + int format; + + long length; + + ObjectId baseId; + + DfsObjectRepresentation(ObjectToPack object) { + this.object = object; + } + + @Override + public int getFormat() { + return format; + } + + @Override + public int getWeight() { + return (int) Math.min(length, Integer.MAX_VALUE); + } + + @Override + public ObjectId getDeltaBase() { + return baseId; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectToPack.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectToPack.java new file mode 100644 index 0000000000..c5243d9945 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsObjectToPack.java @@ -0,0 +1,82 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. 
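Taken together, the eight abstract methods above are the entire storage contract. A deliberately skeletal sketch of a subclass, just to show the shape (the class name and in-memory bookkeeping are invented, java.util.concurrent.atomic.AtomicInteger is assumed imported, and the InMemoryRepository added by this change is the real reference):

abstract class SketchObjDatabase extends DfsObjDatabase {
    private final List<DfsPackDescription> committed =
            new ArrayList<DfsPackDescription>();

    private final AtomicInteger seq = new AtomicInteger();

    SketchObjDatabase(DfsReaderOptions options) {
        super(options);
    }

    @Override
    protected DfsPackDescription newPack(PackSource source) {
        // Any collision-free naming scheme works; ".pack" suffix expected.
        return new DfsPackDescription("pack-" + seq.incrementAndGet() + ".pack")
                .setLastModified(System.currentTimeMillis());
    }

    @Override
    protected void commitPack(Collection<DfsPackDescription> desc,
            Collection<DfsPackDescription> replaces) {
        synchronized (committed) {
            if (replaces != null)
                committed.removeAll(replaces);
            committed.addAll(desc);
        }
    }

    @Override
    protected void rollbackPack(Collection<DfsPackDescription> desc) {
        // Nothing became visible before commit in this sketch.
    }

    @Override
    protected List<DfsPackDescription> listPacks() {
        synchronized (committed) {
            return new ArrayList<DfsPackDescription>(committed);
        }
    }

    // openPackFile, openPackIndex, writePackFile and writePackIndex stay
    // abstract here; they only move bytes to and from the backing store.
}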
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import org.eclipse.jgit.revwalk.RevObject; +import org.eclipse.jgit.storage.pack.ObjectToPack; +import org.eclipse.jgit.storage.pack.StoredObjectRepresentation; + +/** {@link ObjectToPack} for {@link DfsObjDatabase}. */ +class DfsObjectToPack extends ObjectToPack { + /** Pack to reuse compressed data from, otherwise null. */ + DfsPackFile pack; + + /** Position of the pack in the reader's pack list. */ + int packIndex; + + /** Offset of the object's header in {@link #pack}. */ + long offset; + + /** Length of the data section of the object. */ + long length; + + DfsObjectToPack(RevObject obj) { + super(obj); + } + + @Override + protected void clearReuseAsIs() { + super.clearReuseAsIs(); + pack = null; + } + + @Override + public void select(StoredObjectRepresentation ref) { + DfsObjectRepresentation ptr = (DfsObjectRepresentation) ref; + this.pack = ptr.pack; + this.packIndex = ptr.packIndex; + this.offset = ptr.offset; + this.length = ptr.length; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsOutputStream.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsOutputStream.java new file mode 100644 index 0000000000..4346760402 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsOutputStream.java @@ -0,0 +1,98 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; +import java.io.OutputStream; +import java.nio.ByteBuffer; + +/** + * Output stream to create a file on the DFS. + * + * @see DfsObjDatabase#writePackFile(DfsPackDescription) + * @see DfsObjDatabase#writePackIndex(DfsPackDescription) + */ +public abstract class DfsOutputStream extends OutputStream { + /** + * Get the recommended alignment for writing. + *
<p>
+ * Starting a write at multiples of the blockSize is more efficient than + * starting a write at any other position. If 0 or -1 the channel does not + * have any specific block size recommendation. + *
<p>
+ * Channels should not recommend large block sizes. Sizes up to 1-4 MiB may + * be reasonable, but sizes above that may be horribly inefficient. + * + * @return recommended alignment size for randomly positioned writes. Does + * not need to be a power of 2. + */ + public int blockSize() { + return 0; + } + + @Override + public void write(int b) throws IOException { + write(new byte[] { (byte) b }); + } + + @Override + public abstract void write(byte[] buf, int off, int len) throws IOException; + + /** + * Read back a portion of already written data. + *
<p>
+ * The writing position of the output stream is not affected by a read. + * + * @param position + * offset to read from. + * @param buf + * buffer to populate. Up to {@code buf.remaining()} bytes will + * be read from {@code position}. + * @return number of bytes actually read. + * @throws IOException + * reading is not supported, or the read cannot be performed due + * to DFS errors. + */ + public abstract int read(long position, ByteBuffer buf) throws IOException; +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackCompactor.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackCompactor.java new file mode 100644 index 0000000000..2a5882eda2 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackCompactor.java @@ -0,0 +1,317 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
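Since a read leaves the write position untouched, a producer can peek at bytes it already emitted before the file is finalized; a small usage sketch (out is any concrete DfsOutputStream):

ByteBuffer bb = ByteBuffer.allocate(20);
int n = out.read(0, bb); // re-read up to 20 bytes from the file start
bb.flip(); // bb now exposes the n bytes actually read
// ... keep writing; the stream's write position has not moved.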
+ */ + +package org.eclipse.jgit.storage.dfs; + +import static org.eclipse.jgit.storage.dfs.DfsObjDatabase.PackSource.COMPACT; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collections; +import java.util.Comparator; +import java.util.List; + +import org.eclipse.jgit.JGitText; +import org.eclipse.jgit.errors.IncorrectObjectTypeException; +import org.eclipse.jgit.lib.AnyObjectId; +import org.eclipse.jgit.lib.NullProgressMonitor; +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.lib.ProgressMonitor; +import org.eclipse.jgit.revwalk.RevFlag; +import org.eclipse.jgit.revwalk.RevObject; +import org.eclipse.jgit.revwalk.RevWalk; +import org.eclipse.jgit.storage.file.PackIndex; +import org.eclipse.jgit.storage.pack.PackConfig; +import org.eclipse.jgit.storage.pack.PackWriter; +import org.eclipse.jgit.util.BlockList; +import org.eclipse.jgit.util.io.CountingOutputStream; + +/** + * Combine several pack files into one pack. + *
<p>
+ * The compactor combines several pack files together by including all objects + * contained in each pack file into the same output pack. If an object appears + * multiple times, it is only included once in the result. Because the new pack + * is constructed by enumerating the indexes of the source packs, it is quicker + * than doing a full repack of the repository; however, the result is not nearly + * as space efficient, because new delta compression is disabled. + *
<p>
+ * This method is suitable for quickly combining several packs together after + * receiving a number of small fetch or push operations into a repository, + * allowing the system to maintain reasonable read performance without expending + * a lot of time repacking the entire repository. + */ +public class DfsPackCompactor { + private final DfsRepository repo; + + private final List<DfsPackFile> srcPacks; + + private final List<DfsPackDescription> newPacks; + + private final List<PackWriter.Statistics> newStats; + + private int autoAddSize; + + /** + * Initialize a pack compactor. + * + * @param repository + * repository objects to be packed will be read from. + */ + public DfsPackCompactor(DfsRepository repository) { + repo = repository; + autoAddSize = 5 * 1024 * 1024; // 5 MiB + srcPacks = new ArrayList<DfsPackFile>(); + newPacks = new ArrayList<DfsPackDescription>(1); + newStats = new ArrayList<PackWriter.Statistics>(1); + } + + /** + * Add a pack to be compacted. + *
<p>
+ * All of the objects in this pack will be copied into the resulting pack. + * The resulting pack will order objects according to the source pack's own + * description ordering (which is based on creation date), and then by the + * order the objects appear in the source pack. + * + * @param pack + * a pack to combine into the resulting pack. + * @return {@code this} + */ + public DfsPackCompactor add(DfsPackFile pack) { + srcPacks.add(pack); + return this; + } + + /** + * Automatically select packs to be included, and add them. + *
<p>
+ * Packs are selected based on size, smaller packs get included while bigger + * ones are omitted. + * + * @return {@code this} + * @throws IOException + * existing packs cannot be read. + */ + public DfsPackCompactor autoAdd() throws IOException { + DfsObjDatabase objdb = repo.getObjectDatabase(); + for (DfsPackFile pack : objdb.getPacks()) { + DfsPackDescription d = pack.getPackDescription(); + if (d.getPackSize() < autoAddSize) + add(pack); + } + return this; + } + + /** + * Compact the pack files together. + * + * @param pm + * progress monitor to receive updates on as packing may take a + * while, depending on the size of the repository. + * @throws IOException + * the packs cannot be compacted. + */ + public void compact(ProgressMonitor pm) throws IOException { + if (pm == null) + pm = NullProgressMonitor.INSTANCE; + + DfsObjDatabase objdb = repo.getObjectDatabase(); + DfsReader ctx = (DfsReader) objdb.newReader(); + try { + PackConfig pc = new PackConfig(repo); + pc.setIndexVersion(2); + pc.setDeltaCompress(false); + pc.setReuseDeltas(true); + pc.setReuseObjects(true); + + PackWriter pw = new PackWriter(pc, ctx); + try { + pw.setDeltaBaseAsOffset(true); + pw.setReuseDeltaCommits(false); + + addObjectsToPack(pw, ctx, pm); + if (pw.getObjectCount() == 0) + return; + + boolean rollback = true; + DfsPackDescription pack = objdb.newPack(COMPACT); + try { + writePack(objdb, pack, pw, pm); + writeIndex(objdb, pack, pw); + + PackWriter.Statistics stats = pw.getStatistics(); + pw.release(); + pw = null; + + pack.setPackStats(stats); + objdb.commitPack(Collections.singletonList(pack), toPrune()); + newPacks.add(pack); + newStats.add(stats); + rollback = false; + } finally { + if (rollback) + objdb.rollbackPack(Collections.singletonList(pack)); + } + } finally { + if (pw != null) + pw.release(); + } + } finally { + ctx.release(); + } + } + + /** @return all of the source packs that fed into this compaction. */ + public List getSourcePacks() { + return toPrune(); + } + + /** @return new packs created by this compaction. */ + public List getNewPacks() { + return newPacks; + } + + /** @return statistics corresponding to the {@link #getNewPacks()}. */ + public List getNewPackStatistics() { + return newStats; + } + + private List toPrune() { + int cnt = srcPacks.size(); + List all = new ArrayList(cnt); + for (DfsPackFile pack : srcPacks) + all.add(pack.getPackDescription()); + return all; + } + + private void addObjectsToPack(PackWriter pw, DfsReader ctx, + ProgressMonitor pm) throws IOException, + IncorrectObjectTypeException { + // Sort packs by description ordering, this places newer packs before + // older packs, allowing the PackWriter to be handed newer objects + // first and older objects last. + Collections.sort(srcPacks, new Comparator() { + public int compare(DfsPackFile a, DfsPackFile b) { + return a.getPackDescription().compareTo(b.getPackDescription()); + } + }); + + RevWalk rw = new RevWalk(ctx); + RevFlag added = rw.newFlag("ADDED"); + + pm.beginTask(JGitText.get().countingObjects, ProgressMonitor.UNKNOWN); + for (DfsPackFile src : srcPacks) { + List want = new BlockList(); + for (PackIndex.MutableEntry ent : src.getPackIndex(ctx)) { + ObjectId id = ent.toObjectId(); + RevObject obj = rw.lookupOrNull(id); + if (obj == null || !obj.has(added)) + want.add(new ObjectIdWithOffset(id, ent.getOffset())); + } + + // Sort objects by the order they appear in the pack file, for + // two benefits. 
Scanning object type information is faster when + // the pack is traversed in order, and this allows the PackWriter + // to be given the new objects in a relatively sane newest-first + // ordering without additional logic, like unpacking commits and + // walking a commit queue. + Collections.sort(want, new Comparator() { + public int compare(ObjectIdWithOffset a, ObjectIdWithOffset b) { + return Long.signum(a.offset - b.offset); + } + }); + + // Only pack each object at most once into the output file. The + // PackWriter will later select a representation to reuse, which + // may be the version in this pack, or may be from another pack if + // the object was copied here to complete a thin pack and is larger + // than a delta from another pack. This is actually somewhat common + // if an object is modified frequently, such as the top level tree. + for (ObjectIdWithOffset id : want) { + int type = src.getObjectType(ctx, id.offset); + RevObject obj = rw.lookupAny(id, type); + if (!obj.has(added)) { + pm.update(1); + pw.addObject(obj); + obj.add(added); + } + } + } + pm.endTask(); + } + + private void writePack(DfsObjDatabase objdb, DfsPackDescription pack, + PackWriter pw, ProgressMonitor pm) throws IOException { + DfsOutputStream out = objdb.writePackFile(pack); + try { + CountingOutputStream cnt = new CountingOutputStream(out); + pw.writePack(pm, pm, cnt); + pack.setObjectCount(pw.getObjectCount()); + pack.setPackSize(cnt.getCount()); + } finally { + out.close(); + } + } + + private void writeIndex(DfsObjDatabase objdb, DfsPackDescription pack, + PackWriter pw) throws IOException { + DfsOutputStream out = objdb.writePackIndex(pack); + try { + CountingOutputStream cnt = new CountingOutputStream(out); + pw.writeIndex(cnt); + pack.setIndexSize(cnt.getCount()); + } finally { + out.close(); + } + } + + private static class ObjectIdWithOffset extends ObjectId { + final long offset; + + ObjectIdWithOffset(AnyObjectId id, long ofs) { + super(id); + offset = ofs; + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackDescription.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackDescription.java new file mode 100644 index 0000000000..88a1ec345f --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackDescription.java @@ -0,0 +1,266 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. 
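A typical driver for the compactor, using only the API shown above (repo is a concrete DfsRepository; the 5 MiB autoAdd threshold is the default set in the constructor):

DfsPackCompactor compactor = new DfsPackCompactor(repo);
compactor.autoAdd(); // pull in every pack smaller than 5 MiB
compactor.compact(NullProgressMonitor.INSTANCE);
List<DfsPackDescription> created = compactor.getNewPacks();
List<PackWriter.Statistics> stats = compactor.getNewPackStatistics();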
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.util.Set; + +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.storage.pack.PackWriter; + +/** + * Description of a DFS stored pack/index file. + *
<p>
+ * Implementors may extend this class and add additional data members. + *
<p>
+ * Instances of this class are cached with the DfsPackFile, and should not be + * modified once initialized and presented to the JGit DFS library. + */ +public class DfsPackDescription implements Comparable<DfsPackDescription> { + private final String packName; + + private long lastModified; + + private long packSize; + + private long indexSize; + + private long objectCount; + + private long deltaCount; + + private Set<ObjectId> tips; + + private PackWriter.Statistics stats; + + /** + * Initialize a description by pack name. + *
<p>
+ * The corresponding index file is assumed to exist and end with ".idx" + * instead of ".pack". If this is not true, implementors must extend the + * class and override {@link #getIndexName()}. + *
<p>
+ * Callers should also try to fill in other fields if they are reasonably + * free to access at the time this instance is being initialized. + * + * @param name + * name of the pack file. Must end with ".pack". + */ + public DfsPackDescription(String name) { + this.packName = name; + } + + /** @return name of the pack file. */ + public String getPackName() { + return packName; + } + + /** @return name of the index file. */ + public String getIndexName() { + String name = getPackName(); + int dot = name.lastIndexOf('.'); + if (dot < 0) + dot = name.length(); + return name.substring(0, dot) + ".idx"; + } + + /** @return time the pack was created, in milliseconds. */ + public long getLastModified() { + return lastModified; + } + + /** + * @param timeMillis + * time the pack was created, in milliseconds. 0 if not known. + * @return {@code this} + */ + public DfsPackDescription setLastModified(long timeMillis) { + lastModified = timeMillis; + return this; + } + + /** @return size of the pack, in bytes. If 0 the pack size is not yet known. */ + public long getPackSize() { + return packSize; + } + + /** + * @param bytes + * size of the pack in bytes. If 0 the size is not known and will + * be determined on first read. + * @return {@code this} + */ + public DfsPackDescription setPackSize(long bytes) { + packSize = Math.max(0, bytes); + return this; + } + + /** + * @return size of the index, in bytes. If 0 the index size is not yet + * known. + */ + public long getIndexSize() { + return indexSize; + } + + /** + * @param bytes + * size of the index in bytes. If 0 the size is not known and + * will be determined on first read. + * @return {@code this} + */ + public DfsPackDescription setIndexSize(long bytes) { + indexSize = Math.max(0, bytes); + return this; + } + + /** @return number of objects in the pack. */ + public long getObjectCount() { + return objectCount; + } + + /** + * @param cnt + * number of objects in the pack. + * @return {@code this} + */ + public DfsPackDescription setObjectCount(long cnt) { + objectCount = Math.max(0, cnt); + return this; + } + + /** @return number of delta compressed objects in the pack. */ + public long getDeltaCount() { + return deltaCount; + } + + /** + * @param cnt + * number of delta compressed objects in the pack. + * @return {@code this} + */ + public DfsPackDescription setDeltaCount(long cnt) { + deltaCount = Math.max(0, cnt); + return this; + } + + /** @return the tips that created this pack, if known. */ + public Set getTips() { + return tips; + } + + /** + * @param tips + * the tips of the pack, null if it has no known tips. + * @return {@code this} + */ + public DfsPackDescription setTips(Set tips) { + this.tips = tips; + return this; + } + + /** + * @return statistics from PackWriter, if the pack was built with it. + * Generally this is only available for packs created by + * DfsGarbageCollector or DfsPackCompactor, and only when the pack + * is being committed to the repository. + */ + public PackWriter.Statistics getPackStats() { + return stats; + } + + DfsPackDescription setPackStats(PackWriter.Statistics stats) { + this.stats = stats; + return this; + } + + /** + * Discard the pack statistics, if it was populated. 
+ * + * @return {@code this} + */ + public DfsPackDescription clearPackStats() { + stats = null; + return this; + } + + @Override + public int hashCode() { + return getPackName().hashCode(); + } + + @Override + public boolean equals(Object b) { + if (b instanceof DfsPackDescription) + return getPackName().equals(((DfsPackDescription) b).getPackName()); + return false; + } + + /** + * Sort packs according to the optimal lookup ordering. + *
<p>
+ * This method tries to position packs in the order readers should examine + * them when looking for objects by SHA-1. The default tries to sort packs + * with more recent modification dates before older packs, and packs with + * fewer objects before packs with more objects. + * + * @param b + * the other pack. + */ + public int compareTo(DfsPackDescription b) { + // Newer packs should sort first. + int cmp = Long.signum(b.getLastModified() - getLastModified()); + if (cmp != 0) + return cmp; + + // Break ties on smaller index. Readers may get lucky and find + // the object they care about in the smaller index. This also pushes + // big historical packs to the end of the list, due to more objects. + return Long.signum(getObjectCount() - b.getObjectCount()); + } + + @Override + public String toString() { + return getPackName(); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackFile.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackFile.java new file mode 100644 index 0000000000..02ba0a0b20 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackFile.java @@ -0,0 +1,1025 @@ +/* + * Copyright (C) 2008-2011, Google Inc. + * Copyright (C) 2007, Robin Rosenberg + * Copyright (C) 2006-2008, Shawn O. Pearce + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
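To make the natural ordering and the name derivation concrete, a small worked example (values invented; java.util imports assumed):

DfsPackDescription newer = new DfsPackDescription("pack-b.pack")
        .setLastModified(2000).setObjectCount(10);
DfsPackDescription older = new DfsPackDescription("pack-a.pack")
        .setLastModified(1000).setObjectCount(1000000);
List<DfsPackDescription> packs =
        new ArrayList<DfsPackDescription>(Arrays.asList(older, newer));
Collections.sort(packs); // newer sorts first; ties broken by fewer objects
newer.getIndexName(); // "pack-b.idx"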
+ */
+
+package org.eclipse.jgit.storage.dfs;
+
+import java.io.BufferedInputStream;
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.InputStream;
+import java.nio.channels.Channels;
+import java.text.MessageFormat;
+import java.util.Set;
+import java.util.zip.CRC32;
+import java.util.zip.DataFormatException;
+import java.util.zip.Inflater;
+
+import org.eclipse.jgit.JGitText;
+import org.eclipse.jgit.errors.CorruptObjectException;
+import org.eclipse.jgit.errors.LargeObjectException;
+import org.eclipse.jgit.errors.MissingObjectException;
+import org.eclipse.jgit.errors.PackInvalidException;
+import org.eclipse.jgit.errors.StoredObjectRepresentationNotAvailableException;
+import org.eclipse.jgit.lib.AbbreviatedObjectId;
+import org.eclipse.jgit.lib.AnyObjectId;
+import org.eclipse.jgit.lib.Constants;
+import org.eclipse.jgit.lib.ObjectId;
+import org.eclipse.jgit.lib.ObjectLoader;
+import org.eclipse.jgit.storage.file.PackIndex;
+import org.eclipse.jgit.storage.file.PackReverseIndex;
+import org.eclipse.jgit.storage.pack.BinaryDelta;
+import org.eclipse.jgit.storage.pack.PackOutputStream;
+import org.eclipse.jgit.storage.pack.StoredObjectRepresentation;
+import org.eclipse.jgit.util.IO;
+import org.eclipse.jgit.util.LongList;
+
+/**
+ * A Git version 2 pack file representation. A pack file contains Git objects
+ * in delta packed format, yielding high compression when many of the stored
+ * objects are similar to each other.
+ */
+public final class DfsPackFile {
+ /**
+ * File offset used to cache {@link #index} in {@link DfsBlockCache}.
+ *
+ * <p>
+ * To better manage memory, the forward index is stored as a single block in + * the block cache under this file position. A negative value is used + * because it cannot occur in a normal pack file, and it is less likely to + * collide with a valid data block from the file as the high bits will all + * be set when treated as an unsigned long by the cache code. + */ + private static final long POS_INDEX = -1; + + /** Offset used to cache {@link #reverseIndex}. See {@link #POS_INDEX}. */ + private static final long POS_REVERSE_INDEX = -2; + + /** Cache that owns this pack file and its data. */ + private final DfsBlockCache cache; + + /** Description of the pack file's storage. */ + private final DfsPackDescription packDesc; + + /** Unique identity of this pack while in-memory. */ + final DfsPackKey key; + + /** + * Total number of bytes in this pack file. + *
+ * <p>
+ * This field initializes to -1 and gets populated when a block is loaded. + */ + volatile long length; + + /** + * Preferred alignment for loading blocks from the backing file. + *
+ * <p>
+ * It is initialized to 0 and filled in on the first read made from the
+ * file. Block sizes may be odd, e.g. 4091, caused by the underlying DFS
+ * storing 4091 user bytes and 5 bytes of block metadata into a lower
+ * level 4096 byte block on disk.
+ */
+ private volatile int blockSize;
+
+ /** True once corruption has been detected that cannot be worked around. */
+ private volatile boolean invalid;
+
+ /**
+ * Lock for initialization of {@link #index} and {@link #corruptObjects}.
+ *
+ * <p>
+ * This lock ensures only one thread can perform the initialization work. + */ + private final Object initLock = new Object(); + + /** Index mapping {@link ObjectId} to position within the pack stream. */ + private volatile DfsBlockCache.Ref index; + + /** Reverse version of {@link #index} mapping position to {@link ObjectId}. */ + private volatile DfsBlockCache.Ref reverseIndex; + + /** + * Objects we have tried to read, and discovered to be corrupt. + *
+ * <p>
+ * The list is allocated after the first corruption is found, and filled in + * as more entries are discovered. Typically this list is never used, as + * pack files do not usually contain corrupt objects. + */ + private volatile LongList corruptObjects; + + /** + * Construct a reader for an existing, packfile. + * + * @param cache + * cache that owns the pack data. + * @param desc + * description of the pack within the DFS. + * @param key + * interned key used to identify blocks in the block cache. + */ + DfsPackFile(DfsBlockCache cache, DfsPackDescription desc, DfsPackKey key) { + this.cache = cache; + this.packDesc = desc; + this.key = key; + + length = desc.getPackSize(); + if (length <= 0) + length = -1; + } + + /** @return description that was originally used to configure this pack file. */ + public DfsPackDescription getPackDescription() { + return packDesc; + } + + private String getPackName() { + return packDesc.getPackName(); + } + + void setBlockSize(int newSize) { + blockSize = newSize; + } + + void setPackIndex(PackIndex idx) { + long objCnt = idx.getObjectCount(); + int recSize = Constants.OBJECT_ID_LENGTH + 8; + int sz = (int) Math.min(objCnt * recSize, Integer.MAX_VALUE); + index = cache.put(key, POS_INDEX, sz, idx); + } + + PackIndex getPackIndex(DfsReader ctx) throws IOException { + return idx(ctx); + } + + private PackIndex idx(DfsReader ctx) throws IOException { + DfsBlockCache.Ref idxref = index; + if (idxref != null) { + PackIndex idx = idxref.get(); + if (idx != null) + return idx; + } + + if (invalid) + throw new PackInvalidException(getPackName()); + + synchronized (initLock) { + idxref = index; + if (idxref != null) { + PackIndex idx = idxref.get(); + if (idx != null) + return idx; + } + + PackIndex idx; + try { + ReadableChannel rc = ctx.db.openPackIndex(packDesc); + try { + InputStream in = Channels.newInputStream(rc); + int wantSize = 8192; + int bs = rc.blockSize(); + if (0 < bs && bs < wantSize) + bs = (wantSize / bs) * bs; + else if (bs <= 0) + bs = wantSize; + in = new BufferedInputStream(in, bs); + idx = PackIndex.read(in); + } finally { + rc.close(); + } + } catch (EOFException e) { + invalid = true; + IOException e2 = new IOException(MessageFormat.format( + DfsText.get().shortReadOfIndex, packDesc.getIndexName())); + e2.initCause(e); + throw e2; + } catch (IOException e) { + invalid = true; + IOException e2 = new IOException(MessageFormat.format( + DfsText.get().cannotReadIndex, packDesc.getIndexName())); + e2.initCause(e); + throw e2; + } + + setPackIndex(idx); + return idx; + } + } + + private PackReverseIndex getReverseIdx(DfsReader ctx) throws IOException { + DfsBlockCache.Ref revref = reverseIndex; + if (revref != null) { + PackReverseIndex revidx = revref.get(); + if (revidx != null) + return revidx; + } + + synchronized (initLock) { + revref = reverseIndex; + if (revref != null) { + PackReverseIndex revidx = revref.get(); + if (revidx != null) + return revidx; + } + + PackIndex fwdidx = idx(ctx); + PackReverseIndex revidx = new PackReverseIndex(fwdidx); + long objCnt = fwdidx.getObjectCount(); + int sz = (int) Math.min(objCnt * 8, Integer.MAX_VALUE); + reverseIndex = cache.put(key, POS_REVERSE_INDEX, sz, revidx); + return revidx; + } + } + + boolean hasObject(DfsReader ctx, AnyObjectId id) throws IOException { + final long offset = idx(ctx).findOffset(id); + return 0 < offset && !isCorrupt(offset); + } + + /** + * Get an object from this pack. + * + * @param ctx + * temporary working space associated with the calling thread. 
+ * @param id + * the object to obtain from the pack. Must not be null. + * @return the object loader for the requested object if it is contained in + * this pack; null if the object was not found. + * @throws IOException + * the pack file or the index could not be read. + */ + ObjectLoader get(DfsReader ctx, AnyObjectId id) + throws IOException { + long offset = idx(ctx).findOffset(id); + return 0 < offset && !isCorrupt(offset) ? load(ctx, offset) : null; + } + + long findOffset(DfsReader ctx, AnyObjectId id) throws IOException { + return idx(ctx).findOffset(id); + } + + void resolve(DfsReader ctx, Set matches, AbbreviatedObjectId id, + int matchLimit) throws IOException { + idx(ctx).resolve(matches, id, matchLimit); + } + + /** Release all memory used by this DfsPackFile instance. */ + public void close() { + cache.remove(this); + index = null; + reverseIndex = null; + } + + /** + * Obtain the total number of objects available in this pack. This method + * relies on pack index, giving number of effectively available objects. + * + * @param ctx + * current reader for the calling thread. + * @return number of objects in index of this pack, likewise in this pack + * @throws IOException + * the index file cannot be loaded into memory. + */ + long getObjectCount(DfsReader ctx) throws IOException { + return idx(ctx).getObjectCount(); + } + + /** + * Search for object id with the specified start offset in associated pack + * (reverse) index. + * + * @param ctx + * current reader for the calling thread. + * @param offset + * start offset of object to find + * @return object id for this offset, or null if no object was found + * @throws IOException + * the index file cannot be loaded into memory. + */ + ObjectId findObjectForOffset(DfsReader ctx, long offset) throws IOException { + return getReverseIdx(ctx).findObject(offset); + } + + private byte[] decompress(long position, int sz, DfsReader ctx) + throws IOException, DataFormatException { + byte[] dstbuf; + try { + dstbuf = new byte[sz]; + } catch (OutOfMemoryError noMemory) { + // The size may be larger than our heap allows, return null to + // let the caller know allocation isn't possible and it should + // use the large object streaming approach instead. + // + // For example, this can occur when sz is 640 MB, and JRE + // maximum heap size is only 256 MB. Even if the JRE has + // 200 MB free, it cannot allocate a 640 MB byte array. + return null; + } + + if (ctx.inflate(this, position, dstbuf, false) != sz) + throw new EOFException(MessageFormat.format( + JGitText.get().shortCompressedStreamAt, + Long.valueOf(position))); + return dstbuf; + } + + void copyPackAsIs(PackOutputStream out, boolean validate, DfsReader ctx) + throws IOException { + // Pin the first window, this ensures the length is accurate. + ctx.pin(this, 0); + ctx.copyPackAsIs(this, length, validate, out); + } + + void copyAsIs(PackOutputStream out, DfsObjectToPack src, + boolean validate, DfsReader ctx) throws IOException, + StoredObjectRepresentationNotAvailableException { + final CRC32 crc1 = validate ? new CRC32() : null; + final CRC32 crc2 = validate ? new CRC32() : null; + final byte[] buf = out.getCopyBuffer(); + + // Rip apart the header so we can discover the size. 
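+ // Worked example (an assumed input, for illustration): a blob
+ // whose inflated length is 300 encodes its header as 0xBC 0x12.
+ // 0xBC is 1|011|1100: continuation bit set, type OBJ_BLOB (3),
+ // and the low four size bits 0xC (12). 0x12 then contributes
+ // 18 << 4 = 288, so the decode loop below yields 12 + 288 = 300.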
+ // + try { + readFully(src.offset, buf, 0, 20, ctx); + } catch (IOException ioError) { + StoredObjectRepresentationNotAvailableException gone; + gone = new StoredObjectRepresentationNotAvailableException(src); + gone.initCause(ioError); + throw gone; + } + int c = buf[0] & 0xff; + final int typeCode = (c >> 4) & 7; + long inflatedLength = c & 15; + int shift = 4; + int headerCnt = 1; + while ((c & 0x80) != 0) { + c = buf[headerCnt++] & 0xff; + inflatedLength += (c & 0x7f) << shift; + shift += 7; + } + + if (typeCode == Constants.OBJ_OFS_DELTA) { + do { + c = buf[headerCnt++] & 0xff; + } while ((c & 128) != 0); + if (validate) { + crc1.update(buf, 0, headerCnt); + crc2.update(buf, 0, headerCnt); + } + } else if (typeCode == Constants.OBJ_REF_DELTA) { + if (validate) { + crc1.update(buf, 0, headerCnt); + crc2.update(buf, 0, headerCnt); + } + + readFully(src.offset + headerCnt, buf, 0, 20, ctx); + if (validate) { + crc1.update(buf, 0, 20); + crc2.update(buf, 0, 20); + } + headerCnt += 20; + } else if (validate) { + crc1.update(buf, 0, headerCnt); + crc2.update(buf, 0, headerCnt); + } + + final long dataOffset = src.offset + headerCnt; + final long dataLength = src.length; + final long expectedCRC; + final DfsBlock quickCopy; + + // Verify the object isn't corrupt before sending. If it is, + // we report it missing instead. + // + try { + quickCopy = ctx.quickCopy(this, dataOffset, dataLength); + + if (validate && idx(ctx).hasCRC32Support()) { + // Index has the CRC32 code cached, validate the object. + // + expectedCRC = idx(ctx).findCRC32(src); + if (quickCopy != null) { + quickCopy.crc32(crc1, dataOffset, (int) dataLength); + } else { + long pos = dataOffset; + long cnt = dataLength; + while (cnt > 0) { + final int n = (int) Math.min(cnt, buf.length); + readFully(pos, buf, 0, n, ctx); + crc1.update(buf, 0, n); + pos += n; + cnt -= n; + } + } + if (crc1.getValue() != expectedCRC) { + setCorrupt(src.offset); + throw new CorruptObjectException(MessageFormat.format( + JGitText.get().objectAtHasBadZlibStream, + Long.valueOf(src.offset), getPackName())); + } + } else if (validate) { + // We don't have a CRC32 code in the index, so compute it + // now while inflating the raw data to get zlib to tell us + // whether or not the data is safe. 
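+ // Pack entries are zlib wrapped, so inflating the stream to its
+ // end forces the Adler-32 trailer to be verified; finishing
+ // without a DataFormatException, combined with the
+ // getBytesRead() == dataLength test below, catches both
+ // corrupted and truncated entries.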
+ // + Inflater inf = ctx.inflater(); + byte[] tmp = new byte[1024]; + if (quickCopy != null) { + quickCopy.check(inf, tmp, dataOffset, (int) dataLength); + } else { + long pos = dataOffset; + long cnt = dataLength; + while (cnt > 0) { + final int n = (int) Math.min(cnt, buf.length); + readFully(pos, buf, 0, n, ctx); + crc1.update(buf, 0, n); + inf.setInput(buf, 0, n); + while (inf.inflate(tmp, 0, tmp.length) > 0) + continue; + pos += n; + cnt -= n; + } + } + if (!inf.finished() || inf.getBytesRead() != dataLength) { + setCorrupt(src.offset); + throw new EOFException(MessageFormat.format( + JGitText.get().shortCompressedStreamAt, + Long.valueOf(src.offset))); + } + expectedCRC = crc1.getValue(); + } else { + expectedCRC = -1; + } + } catch (DataFormatException dataFormat) { + setCorrupt(src.offset); + + CorruptObjectException corruptObject = new CorruptObjectException( + MessageFormat.format( + JGitText.get().objectAtHasBadZlibStream, + Long.valueOf(src.offset), getPackName())); + corruptObject.initCause(dataFormat); + + StoredObjectRepresentationNotAvailableException gone; + gone = new StoredObjectRepresentationNotAvailableException(src); + gone.initCause(corruptObject); + throw gone; + + } catch (IOException ioError) { + StoredObjectRepresentationNotAvailableException gone; + gone = new StoredObjectRepresentationNotAvailableException(src); + gone.initCause(ioError); + throw gone; + } + + if (quickCopy != null) { + // The entire object fits into a single byte array window slice, + // and we have it pinned. Write this out without copying. + // + out.writeHeader(src, inflatedLength); + quickCopy.write(out, dataOffset, (int) dataLength, null); + + } else if (dataLength <= buf.length) { + // Tiny optimization: Lots of objects are very small deltas or + // deflated commits that are likely to fit in the copy buffer. + // + if (!validate) { + long pos = dataOffset; + long cnt = dataLength; + while (cnt > 0) { + final int n = (int) Math.min(cnt, buf.length); + readFully(pos, buf, 0, n, ctx); + pos += n; + cnt -= n; + } + } + out.writeHeader(src, inflatedLength); + out.write(buf, 0, (int) dataLength); + } else { + // Now we are committed to sending the object. As we spool it out, + // check its CRC32 code to make sure there wasn't corruption between + // the verification we did above, and us actually outputting it. 
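+ // The blocks may have been evicted and fetched again since the
+ // validation pass above, so crc2 checksums the bytes actually
+ // written and is compared against expectedCRC after the copy
+ // loop completes.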
+ // + out.writeHeader(src, inflatedLength); + long pos = dataOffset; + long cnt = dataLength; + while (cnt > 0) { + final int n = (int) Math.min(cnt, buf.length); + readFully(pos, buf, 0, n, ctx); + if (validate) + crc2.update(buf, 0, n); + out.write(buf, 0, n); + pos += n; + cnt -= n; + } + if (validate && crc2.getValue() != expectedCRC) { + throw new CorruptObjectException(MessageFormat.format( + JGitText.get().objectAtHasBadZlibStream, + Long.valueOf(src.offset), getPackName())); + } + } + } + + boolean invalid() { + return invalid; + } + + void setInvalid() { + invalid = true; + } + + private void readFully(long position, byte[] dstbuf, int dstoff, int cnt, + DfsReader ctx) throws IOException { + if (ctx.copy(this, position, dstbuf, dstoff, cnt) != cnt) + throw new EOFException(); + } + + long alignToBlock(long pos) { + int size = blockSize; + if (size == 0) + size = cache.getBlockSize(); + return (pos / size) * size; + } + + DfsBlock getOrLoadBlock(long pos, DfsReader ctx) throws IOException { + return cache.getOrLoad(this, pos, ctx); + } + + DfsBlock readOneBlock(long pos, DfsReader ctx) + throws IOException { + if (invalid) + throw new PackInvalidException(getPackName()); + + boolean close = true; + ReadableChannel rc = ctx.db.openPackFile(packDesc); + try { + // If the block alignment is not yet known, discover it. Prefer the + // larger size from either the cache or the file itself. + int size = blockSize; + if (size == 0) { + size = rc.blockSize(); + if (size <= 0) + size = cache.getBlockSize(); + else if (size < cache.getBlockSize()) + size = (cache.getBlockSize() / size) * size; + blockSize = size; + pos = (pos / size) * size; + } + + // If the size of the file is not yet known, try to discover it. + // Channels may choose to return -1 to indicate they don't + // know the length yet, in this case read up to the size unit + // given by the caller, then recheck the length. + long len = length; + if (len < 0) { + len = rc.size(); + if (0 <= len) + length = len; + } + + if (0 <= len && len < pos + size) + size = (int) (len - pos); + if (size <= 0) + throw new EOFException(MessageFormat.format( + DfsText.get().shortReadOfBlock, Long.valueOf(pos), + getPackName(), Long.valueOf(0), Long.valueOf(0))); + + byte[] buf = new byte[size]; + rc.position(pos); + int cnt = IO.read(rc, buf, 0, size); + if (cnt != size) { + if (0 <= len) { + throw new EOFException(MessageFormat.format( + DfsText.get().shortReadOfBlock, + Long.valueOf(pos), + getPackName(), + Integer.valueOf(size), + Integer.valueOf(cnt))); + } + + // Assume the entire thing was read in a single shot, compact + // the buffer to only the space required. + byte[] n = new byte[cnt]; + System.arraycopy(buf, 0, n, 0, n.length); + buf = n; + } else if (len < 0) { + // With no length at the start of the read, the channel should + // have the length available at the end. 
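+ // Caching the length here lets later reads clamp their block
+ // size and detect EOF without asking the channel again.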
+ length = len = rc.size(); + } + + DfsBlock v = new DfsBlock(key, pos, buf); + if (v.end < len) + close = !cache.readAhead(rc, key, size, v.end, len, ctx); + return v; + } finally { + if (close) + rc.close(); + } + } + + ObjectLoader load(DfsReader ctx, long pos) + throws IOException { + try { + final byte[] ib = ctx.tempId; + Delta delta = null; + byte[] data = null; + int type = Constants.OBJ_BAD; + boolean cached = false; + + SEARCH: for (;;) { + readFully(pos, ib, 0, 20, ctx); + int c = ib[0] & 0xff; + final int typeCode = (c >> 4) & 7; + long sz = c & 15; + int shift = 4; + int p = 1; + while ((c & 0x80) != 0) { + c = ib[p++] & 0xff; + sz += (c & 0x7f) << shift; + shift += 7; + } + + switch (typeCode) { + case Constants.OBJ_COMMIT: + case Constants.OBJ_TREE: + case Constants.OBJ_BLOB: + case Constants.OBJ_TAG: { + if (delta != null) { + data = decompress(pos + p, (int) sz, ctx); + type = typeCode; + break SEARCH; + } + + if (sz < ctx.getStreamFileThreshold()) { + data = decompress(pos + p, (int) sz, ctx); + if (data != null) + return new ObjectLoader.SmallObject(typeCode, data); + } + return new LargePackedWholeObject(typeCode, sz, pos, p, this, ctx.db); + } + + case Constants.OBJ_OFS_DELTA: { + c = ib[p++] & 0xff; + long base = c & 127; + while ((c & 128) != 0) { + base += 1; + c = ib[p++] & 0xff; + base <<= 7; + base += (c & 127); + } + base = pos - base; + delta = new Delta(delta, pos, (int) sz, p, base); + if (sz != delta.deltaSize) + break SEARCH; + + DeltaBaseCache.Entry e = ctx.getDeltaBaseCache().get(key, base); + if (e != null) { + type = e.type; + data = e.data; + cached = true; + break SEARCH; + } + pos = base; + continue SEARCH; + } + + case Constants.OBJ_REF_DELTA: { + readFully(pos + p, ib, 0, 20, ctx); + long base = findDeltaBase(ctx, ObjectId.fromRaw(ib)); + delta = new Delta(delta, pos, (int) sz, p + 20, base); + if (sz != delta.deltaSize) + break SEARCH; + + DeltaBaseCache.Entry e = ctx.getDeltaBaseCache().get(key, base); + if (e != null) { + type = e.type; + data = e.data; + cached = true; + break SEARCH; + } + pos = base; + continue SEARCH; + } + + default: + throw new IOException(MessageFormat.format( + JGitText.get().unknownObjectType, Integer.valueOf(typeCode))); + } + } + + // At this point there is at least one delta to apply to data. + // (Whole objects with no deltas to apply return early above.) + + if (data == null) + throw new LargeObjectException(); + + do { + // Cache only the base immediately before desired object. 
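+ // For a chain base -> d1 -> d2 the loop applies d1, then d2;
+ // only the intermediate result consumed by the final delta
+ // (delta.next == null) is stored, keyed by its pack position,
+ // and the cached flag keeps an entry that was just served from
+ // the cache from being re-inserted.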
+ if (cached) + cached = false; + else if (delta.next == null) + ctx.getDeltaBaseCache().put(key, delta.basePos, type, data); + + pos = delta.deltaPos; + + byte[] cmds = decompress(pos + delta.hdrLen, delta.deltaSize, ctx); + if (cmds == null) { + data = null; // Discard base in case of OutOfMemoryError + throw new LargeObjectException(); + } + + final long sz = BinaryDelta.getResultSize(cmds); + if (Integer.MAX_VALUE <= sz) + throw new LargeObjectException.ExceedsByteArrayLimit(); + + final byte[] result; + try { + result = new byte[(int) sz]; + } catch (OutOfMemoryError tooBig) { + data = null; // Discard base in case of OutOfMemoryError + cmds = null; + throw new LargeObjectException.OutOfMemory(tooBig); + } + + BinaryDelta.apply(data, cmds, result); + data = result; + delta = delta.next; + } while (delta != null); + + return new ObjectLoader.SmallObject(type, data); + + } catch (DataFormatException dfe) { + CorruptObjectException coe = new CorruptObjectException( + MessageFormat.format( + JGitText.get().objectAtHasBadZlibStream, Long.valueOf(pos), + getPackName())); + coe.initCause(dfe); + throw coe; + } + } + + private long findDeltaBase(DfsReader ctx, ObjectId baseId) + throws IOException, MissingObjectException { + long ofs = idx(ctx).findOffset(baseId); + if (ofs < 0) + throw new MissingObjectException(baseId, + JGitText.get().missingDeltaBase); + return ofs; + } + + private static class Delta { + /** Child that applies onto this object. */ + final Delta next; + + /** Offset of the delta object. */ + final long deltaPos; + + /** Size of the inflated delta stream. */ + final int deltaSize; + + /** Total size of the delta's pack entry header (including base). */ + final int hdrLen; + + /** Offset of the base object this delta applies onto. */ + final long basePos; + + Delta(Delta next, long ofs, int sz, int hdrLen, long baseOffset) { + this.next = next; + this.deltaPos = ofs; + this.deltaSize = sz; + this.hdrLen = hdrLen; + this.basePos = baseOffset; + } + } + + byte[] getDeltaHeader(DfsReader wc, long pos) + throws IOException, DataFormatException { + // The delta stream starts as two variable length integers. If we + // assume they are 64 bits each, we need 16 bytes to encode them, + // plus 2 extra bytes for the variable length overhead. So 18 is + // the longest delta instruction header. + // + final byte[] hdr = new byte[32]; + wc.inflate(this, pos, hdr, true /* header only */); + return hdr; + } + + int getObjectType(DfsReader ctx, long pos) throws IOException { + final byte[] ib = ctx.tempId; + for (;;) { + readFully(pos, ib, 0, 20, ctx); + int c = ib[0] & 0xff; + final int type = (c >> 4) & 7; + + switch (type) { + case Constants.OBJ_COMMIT: + case Constants.OBJ_TREE: + case Constants.OBJ_BLOB: + case Constants.OBJ_TAG: + return type; + + case Constants.OBJ_OFS_DELTA: { + int p = 1; + while ((c & 0x80) != 0) + c = ib[p++] & 0xff; + c = ib[p++] & 0xff; + long ofs = c & 127; + while ((c & 128) != 0) { + ofs += 1; + c = ib[p++] & 0xff; + ofs <<= 7; + ofs += (c & 127); + } + pos = pos - ofs; + continue; + } + + case Constants.OBJ_REF_DELTA: { + int p = 1; + while ((c & 0x80) != 0) + c = ib[p++] & 0xff; + readFully(pos + p, ib, 0, 20, ctx); + pos = findDeltaBase(ctx, ObjectId.fromRaw(ib)); + continue; + } + + default: + throw new IOException(MessageFormat.format( + JGitText.get().unknownObjectType, Integer.valueOf(type))); + } + } + } + + long getObjectSize(DfsReader ctx, AnyObjectId id) throws IOException { + final long offset = idx(ctx).findOffset(id); + return 0 < offset ? 
getObjectSize(ctx, offset) : -1; + } + + long getObjectSize(DfsReader ctx, long pos) + throws IOException { + final byte[] ib = ctx.tempId; + readFully(pos, ib, 0, 20, ctx); + int c = ib[0] & 0xff; + final int type = (c >> 4) & 7; + long sz = c & 15; + int shift = 4; + int p = 1; + while ((c & 0x80) != 0) { + c = ib[p++] & 0xff; + sz += (c & 0x7f) << shift; + shift += 7; + } + + long deltaAt; + switch (type) { + case Constants.OBJ_COMMIT: + case Constants.OBJ_TREE: + case Constants.OBJ_BLOB: + case Constants.OBJ_TAG: + return sz; + + case Constants.OBJ_OFS_DELTA: + c = ib[p++] & 0xff; + while ((c & 128) != 0) + c = ib[p++] & 0xff; + deltaAt = pos + p; + break; + + case Constants.OBJ_REF_DELTA: + deltaAt = pos + p + 20; + break; + + default: + throw new IOException(MessageFormat.format( + JGitText.get().unknownObjectType, Integer.valueOf(type))); + } + + try { + return BinaryDelta.getResultSize(getDeltaHeader(ctx, deltaAt)); + } catch (DataFormatException dfe) { + CorruptObjectException coe = new CorruptObjectException( + MessageFormat.format( + JGitText.get().objectAtHasBadZlibStream, Long.valueOf(pos), + getPackName())); + coe.initCause(dfe); + throw coe; + } + } + + void representation(DfsReader ctx, DfsObjectRepresentation r) + throws IOException { + final long pos = r.offset; + final byte[] ib = ctx.tempId; + readFully(pos, ib, 0, 20, ctx); + int c = ib[0] & 0xff; + int p = 1; + final int typeCode = (c >> 4) & 7; + while ((c & 0x80) != 0) + c = ib[p++] & 0xff; + + long len = (getReverseIdx(ctx).findNextOffset(pos, length - 20) - pos); + switch (typeCode) { + case Constants.OBJ_COMMIT: + case Constants.OBJ_TREE: + case Constants.OBJ_BLOB: + case Constants.OBJ_TAG: + r.format = StoredObjectRepresentation.PACK_WHOLE; + r.length = len - p; + return; + + case Constants.OBJ_OFS_DELTA: { + c = ib[p++] & 0xff; + long ofs = c & 127; + while ((c & 128) != 0) { + ofs += 1; + c = ib[p++] & 0xff; + ofs <<= 7; + ofs += (c & 127); + } + ofs = pos - ofs; + r.format = StoredObjectRepresentation.PACK_DELTA; + r.baseId = findObjectForOffset(ctx, ofs); + r.length = len - p; + return; + } + + case Constants.OBJ_REF_DELTA: { + len -= p; + len -= Constants.OBJECT_ID_LENGTH; + readFully(pos + p, ib, 0, 20, ctx); + ObjectId id = ObjectId.fromRaw(ib); + r.format = StoredObjectRepresentation.PACK_DELTA; + r.baseId = id; + r.length = len; + return; + } + + default: + throw new IOException(MessageFormat.format( + JGitText.get().unknownObjectType, Integer.valueOf(typeCode))); + } + } + + private boolean isCorrupt(long offset) { + LongList list = corruptObjects; + if (list == null) + return false; + synchronized (list) { + return list.contains(offset); + } + } + + private void setCorrupt(long offset) { + LongList list = corruptObjects; + if (list == null) { + synchronized (initLock) { + list = corruptObjects; + if (list == null) { + list = new LongList(); + corruptObjects = list; + } + } + } + synchronized (list) { + list.add(offset); + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackKey.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackKey.java new file mode 100644 index 0000000000..8e77dbaa27 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackKey.java @@ -0,0 +1,55 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. 
+ * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +final class DfsPackKey { + final int hash; + + DfsPackKey() { + // Multiply by 31 here so we can more directly combine with another + // value without doing the multiply there. + // + hash = System.identityHashCode(this) * 31; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackParser.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackParser.java new file mode 100644 index 0000000000..6ec01fc47f --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsPackParser.java @@ -0,0 +1,450 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. 
nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.EOFException; +import java.io.IOException; +import java.io.InputStream; +import java.nio.ByteBuffer; +import java.security.MessageDigest; +import java.util.Collections; +import java.util.List; +import java.util.zip.CRC32; +import java.util.zip.Deflater; + +import org.eclipse.jgit.lib.AnyObjectId; +import org.eclipse.jgit.lib.Constants; +import org.eclipse.jgit.lib.ProgressMonitor; +import org.eclipse.jgit.storage.file.PackIndex; +import org.eclipse.jgit.storage.file.PackLock; +import org.eclipse.jgit.transport.PackParser; +import org.eclipse.jgit.transport.PackedObjectInfo; + +/** Parses a pack stream into the DFS, by creating a new pack and index. */ +public class DfsPackParser extends PackParser { + private final DfsObjDatabase objdb; + + private final DfsInserter objins; + + /** CRC-32 computation for objects that are appended onto the pack. */ + private final CRC32 crc; + + /** Running SHA-1 of the entire pack stream. */ + private final MessageDigest packDigest; + + /** Block size to use when caching data for read back. */ + private int blockSize; + + /** Current end of the pack file. */ + private long packEnd; + + /** Checksum of the entire pack file. */ + private byte[] packHash; + + /** Compresses delta bases when completing a thin pack. */ + private Deflater def; + + /** True if the pack is an empty pack. */ + private boolean isEmptyPack; + + /** Name of the pack file, computed in {@link #onPackHeader(long)}. */ + private DfsPackDescription packDsc; + + /** Key used during delta resolution reading delta chains. */ + private DfsPackKey packKey; + + /** If the index was small enough, the entire index after writing. */ + private PackIndex packIndex; + + /** Stream to the DFS storage, opened during {@link #onPackHeader(long)}. */ + private DfsOutputStream out; + + /** Data being written that has not yet been cached. */ + private byte[] currBuf; + private long currPos; // Position of currBuf in the file. + private int currEnd; // Position inside of currBuf to append to next. + + /** Cache the chunks were stored into or get read back from. */ + private DfsBlockCache blockCache; + + /** Cached block that is being read. */ + private long readPos; + private DfsBlock readBlock; + + /** + * Initialize a new pack parser. + * + * @param db + * database the objects will be imported into. + * @param ins + * inserter the parser will use to help it inject the objects. + * @param in + * the stream to parse. 
+ */ + protected DfsPackParser(DfsObjDatabase db, DfsInserter ins, InputStream in) { + super(db, in); + this.objdb = db; + this.objins = ins; + this.crc = new CRC32(); + this.packDigest = Constants.newMessageDigest(); + } + + @Override + public PackLock parse(ProgressMonitor receiving, ProgressMonitor resolving) + throws IOException { + boolean rollback = true; + try { + blockCache = DfsBlockCache.getInstance(); + super.parse(receiving, resolving); + if (isEmptyPack) + return null; + buffer(packHash, 0, packHash.length); + if (currEnd != 0) + flushBlock(); + out.close(); + out = null; + currBuf = null; + readBlock = null; + packDsc.setPackSize(packEnd); + + writePackIndex(); + objdb.commitPack(Collections.singletonList(packDsc), null); + rollback = false; + + DfsPackFile p = blockCache.getOrCreate(packDsc, packKey); + p.setBlockSize(blockSize); + if (packIndex != null) + p.setPackIndex(packIndex); + + objdb.addPack(p); + + return null; + } finally { + blockCache = null; + currBuf = null; + readBlock = null; + + if (def != null) { + def.end(); + def = null; + } + + if (out != null) { + try { + out.close(); + } catch (IOException err) { + // Ignore a close error, rollbackPack is also important. + } + out = null; + } + + if (rollback && packDsc != null) { + try { + objdb.rollbackPack(Collections.singletonList(packDsc)); + } finally { + packDsc = null; + } + } + } + } + + /** @return description of the imported pack, if one was made. */ + public DfsPackDescription getPackDescription() { + return packDsc; + } + + @Override + protected void onPackHeader(long objectCount) throws IOException { + if (objectCount == 0) { + isEmptyPack = true; + currBuf = new byte[256]; + return; + } + + packDsc = objdb.newPack(DfsObjDatabase.PackSource.RECEIVE); + packKey = new DfsPackKey(); + + out = objdb.writePackFile(packDsc); + int size = out.blockSize(); + if (size <= 0) + size = blockCache.getBlockSize(); + else if (size < blockCache.getBlockSize()) + size = (blockCache.getBlockSize() / size) * size; + blockSize = size; + currBuf = new byte[blockSize]; + } + + @Override + protected void onBeginWholeObject(long streamPosition, int type, + long inflatedSize) throws IOException { + crc.reset(); + } + + @Override + protected void onEndWholeObject(PackedObjectInfo info) throws IOException { + info.setCRC((int) crc.getValue()); + } + + @Override + protected void onBeginOfsDelta(long streamPosition, + long baseStreamPosition, long inflatedSize) throws IOException { + crc.reset(); + } + + @Override + protected void onBeginRefDelta(long streamPosition, AnyObjectId baseId, + long inflatedSize) throws IOException { + crc.reset(); + } + + @Override + protected UnresolvedDelta onEndDelta() throws IOException { + UnresolvedDelta delta = new UnresolvedDelta(); + delta.setCRC((int) crc.getValue()); + return delta; + } + + @Override + protected void onInflatedObjectData(PackedObjectInfo obj, int typeCode, + byte[] data) throws IOException { + // DfsPackParser ignores this event. 
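+ // The raw pack bytes stay buffered in DfsBlocks and can be
+ // re-read through readDatabase() during delta resolution, so
+ // the inflated copy offered by this event is not needed.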
+ } + + @Override + protected void onObjectHeader(Source src, byte[] raw, int pos, int len) + throws IOException { + crc.update(raw, pos, len); + } + + @Override + protected void onObjectData(Source src, byte[] raw, int pos, int len) + throws IOException { + crc.update(raw, pos, len); + } + + @Override + protected void onStoreStream(byte[] raw, int pos, int len) + throws IOException { + buffer(raw, pos, len); + packDigest.update(raw, pos, len); + } + + private void buffer(byte[] raw, int pos, int len) throws IOException { + while (0 < len) { + int n = Math.min(len, currBuf.length - currEnd); + if (n == 0) { + DfsBlock v = flushBlock(); + currBuf = new byte[blockSize]; + currEnd = 0; + currPos += v.size(); + continue; + } + + System.arraycopy(raw, pos, currBuf, currEnd, n); + pos += n; + len -= n; + currEnd += n; + packEnd += n; + } + } + + private DfsBlock flushBlock() throws IOException { + if (isEmptyPack) + throw new IOException(DfsText.get().willNotStoreEmptyPack); + + out.write(currBuf, 0, currEnd); + + byte[] buf; + if (currEnd == currBuf.length) { + buf = currBuf; + } else { + buf = new byte[currEnd]; + System.arraycopy(currBuf, 0, buf, 0, currEnd); + } + + DfsBlock v = new DfsBlock(packKey, currPos, buf); + readBlock = v; + blockCache.put(v); + return v; + } + + @Override + protected void onPackFooter(byte[] hash) throws IOException { + // The base class will validate the original hash matches + // what the stream has stored at the end. We are called + // only if the hash was good. Save it in case there are no + // missing bases to append. + packHash = hash; + } + + @Override + protected ObjectTypeAndSize seekDatabase(PackedObjectInfo obj, + ObjectTypeAndSize info) throws IOException { + readPos = obj.getOffset(); + crc.reset(); + return readObjectHeader(info); + } + + @Override + protected ObjectTypeAndSize seekDatabase(UnresolvedDelta delta, + ObjectTypeAndSize info) throws IOException { + readPos = delta.getOffset(); + crc.reset(); + return readObjectHeader(info); + } + + @Override + protected int readDatabase(byte[] dst, int pos, int cnt) throws IOException { + if (cnt == 0) + return 0; + + if (currPos <= readPos) { + // Requested read is still buffered. Copy direct from buffer. + int p = (int) (readPos - currPos); + int n = Math.min(cnt, currEnd - p); + if (n == 0) + throw new EOFException(); + System.arraycopy(currBuf, p, dst, pos, n); + readPos += n; + return n; + } + + if (readBlock == null || !readBlock.contains(packKey, readPos)) { + long start = toBlockStart(readPos); + readBlock = blockCache.get(packKey, start); + if (readBlock == null) { + int size = (int) Math.min(blockSize, packEnd - start); + byte[] buf = new byte[size]; + if (read(start, buf, 0, size) != size) + throw new EOFException(); + readBlock = new DfsBlock(packKey, start, buf); + blockCache.put(readBlock); + } + } + + int n = readBlock.copy(readPos, dst, pos, cnt); + readPos += n; + return n; + } + + private int read(long pos, byte[] dst, int off, int len) throws IOException { + if (len == 0) + return 0; + + int cnt = 0; + while (0 < len) { + int r = out.read(pos, ByteBuffer.wrap(dst, off, len)); + if (r <= 0) + break; + pos += r; + off += r; + len -= r; + cnt += r; + } + return cnt != 0 ? 
cnt : -1; + } + + private long toBlockStart(long pos) { + return (pos / blockSize) * blockSize; + } + + @Override + protected boolean checkCRC(int oldCRC) { + return oldCRC == (int) crc.getValue(); + } + + @Override + protected boolean onAppendBase(final int typeCode, final byte[] data, + final PackedObjectInfo info) throws IOException { + info.setOffset(packEnd); + + final byte[] buf = buffer(); + int sz = data.length; + int len = 0; + buf[len++] = (byte) ((typeCode << 4) | sz & 15); + sz >>>= 4; + while (sz > 0) { + buf[len - 1] |= 0x80; + buf[len++] = (byte) (sz & 0x7f); + sz >>>= 7; + } + + packDigest.update(buf, 0, len); + crc.reset(); + crc.update(buf, 0, len); + buffer(buf, 0, len); + + if (def == null) + def = new Deflater(Deflater.DEFAULT_COMPRESSION, false); + else + def.reset(); + def.setInput(data); + def.finish(); + + while (!def.finished()) { + len = def.deflate(buf); + packDigest.update(buf, 0, len); + crc.update(buf, 0, len); + buffer(buf, 0, len); + } + + info.setCRC((int) crc.getValue()); + return true; + } + + @Override + protected void onEndThinPack() throws IOException { + // Normally when a thin pack is closed the pack header gets + // updated to reflect the actual object count. This is not going + // to be possible on most DFS backends, so instead we allow + // the header to have an incorrect count, but we do change the + // trailing digest to be correct. + packHash = packDigest.digest(); + } + + private void writePackIndex() throws IOException { + List list = getSortedObjectList(null /* by ObjectId */); + packIndex = objins.writePackIndex(packDsc, packHash, list); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReader.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReader.java new file mode 100644 index 0000000000..02077a99ac --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReader.java @@ -0,0 +1,793 @@ +/* + * Copyright (C) 2008-2011, Google Inc. + * Copyright (C) 2006-2008, Shawn O. Pearce + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import static org.eclipse.jgit.lib.Constants.OBJECT_ID_LENGTH; +import static org.eclipse.jgit.lib.Constants.OBJ_BLOB; +import static org.eclipse.jgit.lib.Constants.OBJ_TREE; + +import java.io.IOException; +import java.io.InterruptedIOException; +import java.security.MessageDigest; +import java.text.MessageFormat; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collection; +import java.util.Collections; +import java.util.Comparator; +import java.util.HashSet; +import java.util.Iterator; +import java.util.LinkedList; +import java.util.List; +import java.util.concurrent.ExecutionException; +import java.util.zip.DataFormatException; +import java.util.zip.Inflater; + +import org.eclipse.jgit.JGitText; +import org.eclipse.jgit.errors.IncorrectObjectTypeException; +import org.eclipse.jgit.errors.MissingObjectException; +import org.eclipse.jgit.errors.StoredObjectRepresentationNotAvailableException; +import org.eclipse.jgit.lib.AbbreviatedObjectId; +import org.eclipse.jgit.lib.AnyObjectId; +import org.eclipse.jgit.lib.AsyncObjectLoaderQueue; +import org.eclipse.jgit.lib.AsyncObjectSizeQueue; +import org.eclipse.jgit.lib.Constants; +import org.eclipse.jgit.lib.InflaterCache; +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.lib.ObjectLoader; +import org.eclipse.jgit.lib.ObjectReader; +import org.eclipse.jgit.lib.ProgressMonitor; +import org.eclipse.jgit.revwalk.ObjectWalk; +import org.eclipse.jgit.revwalk.RevCommit; +import org.eclipse.jgit.revwalk.RevObject; +import org.eclipse.jgit.revwalk.RevWalk; +import org.eclipse.jgit.storage.pack.CachedPack; +import org.eclipse.jgit.storage.pack.ObjectReuseAsIs; +import org.eclipse.jgit.storage.pack.ObjectToPack; +import org.eclipse.jgit.storage.pack.PackOutputStream; +import org.eclipse.jgit.storage.pack.PackWriter; +import org.eclipse.jgit.util.BlockList; + +final class DfsReader extends ObjectReader implements ObjectReuseAsIs { + /** Temporary buffer large enough for at least one raw object id. */ + final byte[] tempId = new byte[OBJECT_ID_LENGTH]; + + /** Database this reader loads objects from. 
*/ + final DfsObjDatabase db; + + private Inflater inf; + + private DfsBlock block; + + private DeltaBaseCache baseCache; + + private DfsPackFile last; + + private boolean wantReadAhead; + + private List pendingReadAhead; + + DfsReader(DfsObjDatabase db) { + this.db = db; + } + + DfsReaderOptions getOptions() { + return db.getReaderOptions(); + } + + DeltaBaseCache getDeltaBaseCache() { + if (baseCache == null) + baseCache = new DeltaBaseCache(this); + return baseCache; + } + + int getStreamFileThreshold() { + return getOptions().getStreamFileThreshold(); + } + + @Override + public ObjectReader newReader() { + return new DfsReader(db); + } + + @Override + public Collection resolve(AbbreviatedObjectId id) + throws IOException { + if (id.isComplete()) + return Collections.singleton(id.toObjectId()); + HashSet matches = new HashSet(4); + for (DfsPackFile pack : db.getPacks()) { + pack.resolve(this, matches, id, 256); + if (256 <= matches.size()) + break; + } + return matches; + } + + @Override + public boolean has(AnyObjectId objectId) throws IOException { + if (last != null && last.hasObject(this, objectId)) + return true; + for (DfsPackFile pack : db.getPacks()) { + if (last == pack) + continue; + if (pack.hasObject(this, objectId)) { + last = pack; + return true; + } + } + return false; + } + + @Override + public ObjectLoader open(AnyObjectId objectId, int typeHint) + throws MissingObjectException, IncorrectObjectTypeException, + IOException { + if (last != null) { + ObjectLoader ldr = last.get(this, objectId); + if (ldr != null) + return ldr; + } + + for (DfsPackFile pack : db.getPacks()) { + if (pack == last) + continue; + ObjectLoader ldr = pack.get(this, objectId); + if (ldr != null) { + last = pack; + return ldr; + } + } + + if (typeHint == OBJ_ANY) + throw new MissingObjectException(objectId.copy(), "unknown"); + throw new MissingObjectException(objectId.copy(), typeHint); + } + + private static final Comparator> FOUND_OBJECT_SORT = new Comparator>() { + public int compare(FoundObject a, FoundObject b) { + int cmp = a.packIndex - b.packIndex; + if (cmp == 0) + cmp = Long.signum(a.offset - b.offset); + return cmp; + } + }; + + private static class FoundObject { + final T id; + final DfsPackFile pack; + final long offset; + final int packIndex; + + FoundObject(T objectId, int packIdx, DfsPackFile pack, long offset) { + this.id = objectId; + this.pack = pack; + this.offset = offset; + this.packIndex = packIdx; + } + + FoundObject(T objectId) { + this.id = objectId; + this.pack = null; + this.offset = 0; + this.packIndex = 0; + } + } + + private Iterable> findAll( + Iterable objectIds) throws IOException { + ArrayList> r = new ArrayList>(); + DfsPackFile[] packList = db.getPacks(); + if (packList.length == 0) { + for (T t : objectIds) + r.add(new FoundObject(t)); + return r; + } + + int lastIdx = 0; + DfsPackFile lastPack = packList[lastIdx]; + + OBJECT_SCAN: for (T t : objectIds) { + try { + long p = lastPack.findOffset(this, t); + if (0 < p) { + r.add(new FoundObject(t, lastIdx, lastPack, p)); + continue; + } + } catch (IOException e) { + // Fall though and try to examine other packs. + } + + for (int i = 0; i < packList.length; i++) { + if (i == lastIdx) + continue; + DfsPackFile pack = packList[i]; + try { + long p = pack.findOffset(this, t); + if (0 < p) { + r.add(new FoundObject(t, i, pack, p)); + lastIdx = i; + lastPack = pack; + continue OBJECT_SCAN; + } + } catch (IOException e) { + // Examine other packs. 
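+ // A pack that cannot be searched is treated like a pack that
+ // lacks the object; if every pack misses, the FoundObject
+ // added below carries a null pack and the async queue reports
+ // the object missing instead of failing the whole batch.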
+ } + } + + r.add(new FoundObject(t)); + } + + Collections.sort(r, FOUND_OBJECT_SORT); + last = lastPack; + return r; + } + + @Override + public AsyncObjectLoaderQueue open( + Iterable objectIds, final boolean reportMissing) { + wantReadAhead = true; + + Iterable> order; + IOException error = null; + try { + order = findAll(objectIds); + } catch (IOException e) { + order = Collections.emptyList(); + error = e; + } + + final Iterator> idItr = order.iterator(); + final IOException findAllError = error; + return new AsyncObjectLoaderQueue() { + private FoundObject cur; + + public boolean next() throws MissingObjectException, IOException { + if (idItr.hasNext()) { + cur = idItr.next(); + return true; + } else if (findAllError != null) { + throw findAllError; + } else { + cancelReadAhead(); + return false; + } + } + + public T getCurrent() { + return cur.id; + } + + public ObjectId getObjectId() { + return cur.id; + } + + public ObjectLoader open() throws IOException { + if (cur.pack == null) + throw new MissingObjectException(cur.id, "unknown"); + return cur.pack.load(DfsReader.this, cur.offset); + } + + public boolean cancel(boolean mayInterruptIfRunning) { + cancelReadAhead(); + return true; + } + + public void release() { + cancelReadAhead(); + } + }; + } + + @Override + public AsyncObjectSizeQueue getObjectSize( + Iterable objectIds, final boolean reportMissing) { + wantReadAhead = true; + + Iterable> order; + IOException error = null; + try { + order = findAll(objectIds); + } catch (IOException e) { + order = Collections.emptyList(); + error = e; + } + + final Iterator> idItr = order.iterator(); + final IOException findAllError = error; + return new AsyncObjectSizeQueue() { + private FoundObject cur; + + private long sz; + + public boolean next() throws MissingObjectException, IOException { + if (idItr.hasNext()) { + cur = idItr.next(); + if (cur.pack == null) + throw new MissingObjectException(cur.id, "unknown"); + sz = cur.pack.getObjectSize(DfsReader.this, cur.offset); + return true; + } else if (findAllError != null) { + throw findAllError; + } else { + cancelReadAhead(); + return false; + } + } + + public T getCurrent() { + return cur.id; + } + + public ObjectId getObjectId() { + return cur.id; + } + + public long getSize() { + return sz; + } + + public boolean cancel(boolean mayInterruptIfRunning) { + cancelReadAhead(); + return true; + } + + public void release() { + cancelReadAhead(); + } + }; + } + + @Override + public void walkAdviceBeginCommits(RevWalk walk, Collection roots) { + wantReadAhead = true; + } + + @Override + public void walkAdviceBeginTrees(ObjectWalk ow, RevCommit min, RevCommit max) { + wantReadAhead = true; + } + + @Override + public void walkAdviceEnd() { + cancelReadAhead(); + } + + @Override + public long getObjectSize(AnyObjectId objectId, int typeHint) + throws MissingObjectException, IncorrectObjectTypeException, + IOException { + if (last != null) { + long sz = last.getObjectSize(this, objectId); + if (0 <= sz) + return sz; + } + + for (DfsPackFile pack : db.getPacks()) { + if (pack == last) + continue; + long sz = pack.getObjectSize(this, objectId); + if (0 <= sz) { + last = pack; + return sz; + } + } + + if (typeHint == OBJ_ANY) + throw new MissingObjectException(objectId.copy(), "unknown"); + throw new MissingObjectException(objectId.copy(), typeHint); + } + + public DfsObjectToPack newObjectToPack(RevObject obj) { + return new DfsObjectToPack(obj); + } + + private static final Comparator REPRESENTATION_SORT = new Comparator() { + public int 
compare(DfsObjectRepresentation a, DfsObjectRepresentation b) { + int cmp = a.packIndex - b.packIndex; + if (cmp == 0) + cmp = Long.signum(a.offset - b.offset); + return cmp; + } + }; + + public void selectObjectRepresentation(PackWriter packer, + ProgressMonitor monitor, Iterable objects) + throws IOException, MissingObjectException { + DfsPackFile[] packList = db.getPacks(); + if (packList.length == 0) { + Iterator itr = objects.iterator(); + if (itr.hasNext()) + throw new MissingObjectException(itr.next(), "unknown"); + return; + } + + int packIndex = 0; + DfsPackFile packLast = packList[packIndex]; + + int updated = 0; + int posted = 0; + List all = new BlockList(); + for (ObjectToPack otp : objects) { + long p = packLast.findOffset(this, otp); + if (p < 0) { + int skip = packIndex; + for (packIndex = 0; packIndex < packList.length; packIndex++) { + if (skip == packIndex) + continue; + packLast = packList[packIndex]; + p = packLast.findOffset(this, otp); + if (0 < p) + break; + } + if (packIndex == packList.length) + throw new MissingObjectException(otp, otp.getType()); + } + + DfsObjectRepresentation r = new DfsObjectRepresentation(otp); + r.pack = packLast; + r.packIndex = packIndex; + r.offset = p; + all.add(r); + + if ((++updated & 1) == 1) { + monitor.update(1); // Update by 50%, the other 50% is below. + posted++; + } + } + Collections.sort(all, REPRESENTATION_SORT); + + try { + wantReadAhead = true; + for (DfsObjectRepresentation r : all) { + r.pack.representation(this, r); + packer.select(r.object, r); + if ((++updated & 1) == 1) { + monitor.update(1); + posted++; + } + } + } finally { + cancelReadAhead(); + } + if (posted < all.size()) + monitor.update(all.size() - posted); + } + + public void copyObjectAsIs(PackOutputStream out, ObjectToPack otp, + boolean validate) throws IOException, + StoredObjectRepresentationNotAvailableException { + DfsObjectToPack src = (DfsObjectToPack) otp; + src.pack.copyAsIs(out, src, validate, this); + } + + private static final Comparator WRITE_SORT = new Comparator() { + public int compare(ObjectToPack o1, ObjectToPack o2) { + DfsObjectToPack a = (DfsObjectToPack) o1; + DfsObjectToPack b = (DfsObjectToPack) o2; + int cmp = a.packIndex - b.packIndex; + if (cmp == 0) + cmp = Long.signum(a.offset - b.offset); + return cmp; + } + }; + + public void writeObjects(PackOutputStream out, List list) + throws IOException { + if (list.isEmpty()) + return; + + // Sorting objects by order in the current packs is usually + // worthwhile. Most packs are going to be OFS_DELTA style, + // where the base must appear before the deltas. If both base + // and delta are to be reused, this ensures the base writes in + // the output first without the recursive write-base-first logic + // used by PackWriter to ensure OFS_DELTA can be used. + // + // Sorting by pack also ensures newer objects go first, which + // typically matches the desired order. + // + // Only do this sorting for OBJ_TREE and OBJ_BLOB. Commits + // are very likely to already be sorted in a good order in the + // incoming list, and if they aren't, JGit's PackWriter has fixed + // the order to be more optimal for readers, so honor that. 
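+ // Commits and tags match no case below and skip the sort,
+ // keeping the order PackWriter already produced.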
+ switch (list.get(0).getType()) { + case OBJ_TREE: + case OBJ_BLOB: + Collections.sort(list, WRITE_SORT); + } + + try { + wantReadAhead = true; + for (ObjectToPack otp : list) + out.writeObject(otp); + } finally { + cancelReadAhead(); + } + } + + @SuppressWarnings("unchecked") + public Collection getCachedPacks() throws IOException { + DfsPackFile[] packList = db.getPacks(); + List cached = new ArrayList(packList.length); + for (DfsPackFile pack : packList) { + DfsPackDescription desc = pack.getPackDescription(); + if (desc.getTips() == null || desc.getTips().isEmpty()) + continue; + cached.add(new DfsCachedPack(pack)); + } + return cached; + } + + public void copyPackAsIs(PackOutputStream out, CachedPack pack, + boolean validate) throws IOException { + try { + wantReadAhead = true; + ((DfsCachedPack) pack).copyAsIs(out, validate, this); + } finally { + cancelReadAhead(); + } + } + + /** + * Copy bytes from the window to a caller supplied buffer. + * + * @param pack + * the file the desired window is stored within. + * @param position + * position within the file to read from. + * @param dstbuf + * destination buffer to copy into. + * @param dstoff + * offset within dstbuf to start copying into. + * @param cnt + * number of bytes to copy. This value may exceed the number of + * bytes remaining in the window starting at offset + * pos. + * @return number of bytes actually copied; this may be less than + * cnt if cnt exceeded the number of bytes + * available. + * @throws IOException + * this cursor does not match the provider or id and the proper + * window could not be acquired through the provider's cache. + */ + int copy(DfsPackFile pack, long position, byte[] dstbuf, int dstoff, int cnt) + throws IOException { + if (cnt == 0) + return 0; + + long length = pack.length; + if (0 <= length && length <= position) + return 0; + + int need = cnt; + do { + pin(pack, position); + int r = block.copy(position, dstbuf, dstoff, need); + position += r; + dstoff += r; + need -= r; + if (length < 0) + length = pack.length; + } while (0 < need && position < length); + return cnt - need; + } + + void copyPackAsIs(DfsPackFile pack, long length, boolean validate, + PackOutputStream out) throws IOException { + MessageDigest md = null; + if (validate) { + md = Constants.newMessageDigest(); + byte[] buf = out.getCopyBuffer(); + pin(pack, 0); + if (block.copy(0, buf, 0, 12) != 12) { + pack.setInvalid(); + throw new IOException(JGitText.get().packfileIsTruncated); + } + md.update(buf, 0, 12); + } + + long position = 12; + long remaining = length - (12 + 20); + while (0 < remaining) { + pin(pack, position); + + int ptr = (int) (position - block.start); + int n = (int) Math.min(block.size() - ptr, remaining); + block.write(out, position, n, md); + position += n; + remaining -= n; + } + + if (md != null) { + byte[] buf = new byte[20]; + byte[] actHash = md.digest(); + + pin(pack, position); + if (block.copy(position, buf, 0, 20) != 20) { + pack.setInvalid(); + throw new IOException(JGitText.get().packfileIsTruncated); + } + if (!Arrays.equals(actHash, buf)) { + pack.setInvalid(); + throw new IOException(MessageFormat.format( + JGitText.get().packfileCorruptionDetected, + pack.getPackDescription().getPackName())); + } + } + } + + /** + * Inflate a region of the pack starting at {@code position}. + * + * @param pack + * the file the desired window is stored within. + * @param position + * position within the file to read from. 
+ * @param dstbuf + * destination buffer the inflater should output decompressed + * data to. + * @param headerOnly + * if true the caller wants only {@code dstbuf.length} bytes. + * @return updated dstoff based on the number of bytes + * successfully inflated into dstbuf. + * @throws IOException + * this cursor does not match the provider or id and the proper + * window could not be acquired through the provider's cache. + * @throws DataFormatException + * the inflater encountered an invalid chunk of data. Data + * stream corruption is likely. + */ + int inflate(DfsPackFile pack, long position, byte[] dstbuf, + boolean headerOnly) throws IOException, DataFormatException { + prepareInflater(); + pin(pack, position); + int dstoff = 0; + for (;;) { + dstoff = block.inflate(inf, position, dstbuf, dstoff); + + if (headerOnly & dstoff == dstbuf.length) + return dstoff; + if (inf.needsInput()) { + position += block.remaining(position); + pin(pack, position); + } else if (inf.finished()) + return dstoff; + else + throw new DataFormatException(); + } + } + + DfsBlock quickCopy(DfsPackFile p, long pos, long cnt) + throws IOException { + pin(p, pos); + if (block.contains(p.key, pos + (cnt - 1))) + return block; + return null; + } + + Inflater inflater() { + prepareInflater(); + return inf; + } + + private void prepareInflater() { + if (inf == null) + inf = InflaterCache.get(); + else + inf.reset(); + } + + void pin(DfsPackFile pack, long position) throws IOException { + DfsBlock b = block; + if (b == null || !b.contains(pack.key, position)) { + // If memory is low, we may need what is in our window field to + // be cleaned up by the GC during the get for the next window. + // So we always clear it, even though we are just going to set + // it again. + // + block = null; + + if (pendingReadAhead != null) + waitForBlock(pack.key, position); + block = pack.getOrLoadBlock(position, this); + } + } + + boolean wantReadAhead() { + return wantReadAhead; + } + + void startedReadAhead(List blocks) { + if (pendingReadAhead == null) + pendingReadAhead = new LinkedList(); + pendingReadAhead.addAll(blocks); + } + + private void cancelReadAhead() { + if (pendingReadAhead != null) { + for (ReadAheadTask.BlockFuture f : pendingReadAhead) + f.cancel(true); + pendingReadAhead = null; + } + wantReadAhead = false; + } + + private void waitForBlock(DfsPackKey key, long position) + throws InterruptedIOException { + Iterator itr = pendingReadAhead.iterator(); + while (itr.hasNext()) { + ReadAheadTask.BlockFuture f = itr.next(); + if (f.contains(key, position)) { + try { + f.get(); + } catch (InterruptedException e) { + throw new InterruptedIOException(); + } catch (ExecutionException e) { + // Exceptions should never be thrown by get(). Ignore + // this and let the normal load paths identify any error. + } + itr.remove(); + if (pendingReadAhead.isEmpty()) + pendingReadAhead = null; + break; + } + } + } + + /** Release the current window cursor. */ + @Override + public void release() { + cancelReadAhead(); + last = null; + block = null; + baseCache = null; + try { + InflaterCache.release(inf); + } finally { + inf = null; + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReaderOptions.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReaderOptions.java new file mode 100644 index 0000000000..2cb83e181e --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsReaderOptions.java @@ -0,0 +1,135 @@ +/* + * Copyright (C) 2011, Google Inc. 
+ * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_CORE_SECTION; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_DFS_SECTION; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_KEY_DELTA_BASE_CACHE_LIMIT; +import static org.eclipse.jgit.lib.ConfigConstants.CONFIG_KEY_STREAM_FILE_TRESHOLD; + +import org.eclipse.jgit.lib.Config; +import org.eclipse.jgit.storage.pack.PackConfig; + +/** Options controlling how objects are read from a DFS stored repository. */ +public class DfsReaderOptions { + /** 1024 (number of bytes in one kibibyte/kilobyte) */ + public static final int KiB = 1024; + + /** 1024 {@link #KiB} (number of bytes in one mebibyte/megabyte) */ + public static final int MiB = 1024 * KiB; + + private int deltaBaseCacheLimit; + + private int streamFileThreshold; + + /** Create a default reader configuration. */ + public DfsReaderOptions() { + setDeltaBaseCacheLimit(10 * MiB); + setStreamFileThreshold(PackConfig.DEFAULT_BIG_FILE_THRESHOLD); + } + + /** @return maximum number of bytes to hold in per-reader DeltaBaseCache. */ + public int getDeltaBaseCacheLimit() { + return deltaBaseCacheLimit; + } + + /** + * Set the maximum number of bytes in the DeltaBaseCache. + * + * @param maxBytes + * the new limit. + * @return {@code this} + */ + public DfsReaderOptions setDeltaBaseCacheLimit(int maxBytes) { + deltaBaseCacheLimit = Math.max(0, maxBytes); + return this; + } + + /** @return the size threshold beyond which objects must be streamed.
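Objects bigger than this size must be accessed as a stream through {@link org.eclipse.jgit.lib.ObjectStream}, never as a single contiguous byte array.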
*/ + public int getStreamFileThreshold() { + return streamFileThreshold; + } + + /** + * @param newLimit + * new byte limit for objects that must be streamed. Objects + * smaller than this size can be obtained as a contiguous byte + * array, while objects bigger than this size require using an + * {@link org.eclipse.jgit.lib.ObjectStream}. + * @return {@code this} + */ + public DfsReaderOptions setStreamFileThreshold(final int newLimit) { + streamFileThreshold = Math.max(0, newLimit); + return this; + } + + /** + * Update properties by setting fields from the configuration. + *
<p>
+ * If a property is not defined in the configuration, then it is left + * unmodified. + * + * @param rc + * configuration to read properties from. + * @return {@code this} + */ + public DfsReaderOptions fromConfig(Config rc) { + setDeltaBaseCacheLimit(rc.getInt( + CONFIG_CORE_SECTION, + CONFIG_DFS_SECTION, + CONFIG_KEY_DELTA_BASE_CACHE_LIMIT, + getDeltaBaseCacheLimit())); + + long maxMem = Runtime.getRuntime().maxMemory(); + long sft = rc.getLong( + CONFIG_CORE_SECTION, + CONFIG_DFS_SECTION, + CONFIG_KEY_STREAM_FILE_TRESHOLD, + getStreamFileThreshold()); + sft = Math.min(sft, maxMem / 4); // don't use more than 1/4 of the heap + sft = Math.min(sft, Integer.MAX_VALUE); // cannot exceed array length + setStreamFileThreshold((int) sft); + return this; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefDatabase.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefDatabase.java new file mode 100644 index 0000000000..383431bdf6 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefDatabase.java @@ -0,0 +1,450 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +package org.eclipse.jgit.storage.dfs; + +import static org.eclipse.jgit.lib.Ref.Storage.NEW; + +import java.io.IOException; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.concurrent.atomic.AtomicReference; + +import org.eclipse.jgit.errors.MissingObjectException; +import org.eclipse.jgit.lib.ObjectIdRef; +import org.eclipse.jgit.lib.Ref; +import org.eclipse.jgit.lib.RefDatabase; +import org.eclipse.jgit.lib.RefRename; +import org.eclipse.jgit.lib.SymbolicRef; +import org.eclipse.jgit.revwalk.RevObject; +import org.eclipse.jgit.revwalk.RevTag; +import org.eclipse.jgit.revwalk.RevWalk; +import org.eclipse.jgit.util.RefList; +import org.eclipse.jgit.util.RefMap; + +/** */ +public abstract class DfsRefDatabase extends RefDatabase { + private final DfsRepository repository; + + private final AtomicReference cache; + + /** + * Initialize the reference database for a repository. + * + * @param repository + * the repository this database instance manages references for. + */ + protected DfsRefDatabase(DfsRepository repository) { + this.repository = repository; + this.cache = new AtomicReference(); + } + + /** @return the repository the database holds the references of. */ + protected DfsRepository getRepository() { + return repository; + } + + boolean exists() throws IOException { + return 0 < read().size(); + } + + @Override + public Ref getRef(String needle) throws IOException { + RefCache curr = read(); + for (String prefix : SEARCH_PATH) { + Ref ref = curr.ids.get(prefix + needle); + if (ref != null) { + ref = resolve(ref, 0, curr.ids); + return ref; + } + } + return null; + } + + private Ref getOneRef(String refName) throws IOException { + RefCache curr = read(); + Ref ref = curr.ids.get(refName); + if (ref != null) + return resolve(ref, 0, curr.ids); + return ref; + } + + @Override + public List getAdditionalRefs() { + return Collections.emptyList(); + } + + @Override + public Map getRefs(String prefix) throws IOException { + RefCache curr = read(); + RefList packed = RefList.emptyList(); + RefList loose = curr.ids; + RefList.Builder sym = new RefList.Builder(curr.sym.size()); + + for (int idx = 0; idx < curr.sym.size(); idx++) { + Ref ref = curr.sym.get(idx); + String name = ref.getName(); + ref = resolve(ref, 0, loose); + if (ref != null && ref.getObjectId() != null) { + sym.add(ref); + } else { + // A broken symbolic reference, we have to drop it from the + // collections the client is about to receive. Should be a + // rare occurrence so pay a copy penalty. 
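+ // (For example, HEAD pointing at refs/heads/master while master is unborn.)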
+ int toRemove = loose.find(name); + if (0 <= toRemove) + loose = loose.remove(toRemove); + } + } + + return new RefMap(prefix, packed, loose, sym.toRefList()); + } + + private Ref resolve(Ref ref, int depth, RefList loose) + throws IOException { + if (!ref.isSymbolic()) + return ref; + + Ref dst = ref.getTarget(); + + if (MAX_SYMBOLIC_REF_DEPTH <= depth) + return null; // claim it doesn't exist + + dst = loose.get(dst.getName()); + if (dst == null) + return ref; + + dst = resolve(dst, depth + 1, loose); + if (dst == null) + return null; + return new SymbolicRef(ref.getName(), dst); + } + + @Override + public Ref peel(Ref ref) throws IOException { + final Ref oldLeaf = ref.getLeaf(); + if (oldLeaf.isPeeled() || oldLeaf.getObjectId() == null) + return ref; + + Ref newLeaf = doPeel(oldLeaf); + + RefCache cur = read(); + int idx = cur.ids.find(oldLeaf.getName()); + if (0 <= idx && cur.ids.get(idx) == oldLeaf) { + RefList newList = cur.ids.set(idx, newLeaf); + cache.compareAndSet(cur, new RefCache(newList, cur)); + cachePeeledState(oldLeaf, newLeaf); + } + + return recreate(ref, newLeaf); + } + + private Ref doPeel(final Ref leaf) throws MissingObjectException, + IOException { + RevWalk rw = new RevWalk(repository); + try { + RevObject obj = rw.parseAny(leaf.getObjectId()); + if (obj instanceof RevTag) { + return new ObjectIdRef.PeeledTag( + leaf.getStorage(), + leaf.getName(), + leaf.getObjectId(), + rw.peel(obj).copy()); + } else { + return new ObjectIdRef.PeeledNonTag( + leaf.getStorage(), + leaf.getName(), + leaf.getObjectId()); + } + } finally { + rw.release(); + } + } + + private static Ref recreate(Ref old, Ref leaf) { + if (old.isSymbolic()) { + Ref dst = recreate(old.getTarget(), leaf); + return new SymbolicRef(old.getName(), dst); + } + return leaf; + } + + @Override + public DfsRefUpdate newUpdate(String refName, boolean detach) + throws IOException { + boolean detachingSymbolicRef = false; + Ref ref = getOneRef(refName); + if (ref == null) + ref = new ObjectIdRef.Unpeeled(NEW, refName, null); + else + detachingSymbolicRef = detach && ref.isSymbolic(); + + if (detachingSymbolicRef) { + ref = new ObjectIdRef.Unpeeled(NEW, refName, ref.getObjectId()); + } + + DfsRefUpdate update = new DfsRefUpdate(this, ref); + if (detachingSymbolicRef) + update.setDetachingSymbolicRef(); + return update; + } + + @Override + public RefRename newRename(String fromName, String toName) + throws IOException { + DfsRefUpdate src = newUpdate(fromName, true); + DfsRefUpdate dst = newUpdate(toName, true); + return new DfsRefRename(src, dst); + } + + @Override + public boolean isNameConflicting(String refName) throws IOException { + RefList all = read().ids; + + // Cannot be nested within an existing reference. + int lastSlash = refName.lastIndexOf('/'); + while (0 < lastSlash) { + String needle = refName.substring(0, lastSlash); + if (all.contains(needle)) + return true; + lastSlash = refName.lastIndexOf('/', lastSlash - 1); + } + + // Cannot be the container of an existing reference. + String prefix = refName + '/'; + int idx = -(all.find(prefix) + 1); + if (idx < all.size() && all.get(idx).getName().startsWith(prefix)) + return true; + return false; + } + + @Override + public void create() { + // Nothing to do. 
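+ // The reference space exists implicitly; exists() starts reporting
+ // true as soon as the first reference is stored.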
+ } + + @Override + public void close() { + clearCache(); + } + + void clearCache() { + cache.set(null); + } + + void stored(Ref ref) { + RefCache oldCache, newCache; + do { + oldCache = cache.get(); + if (oldCache == null) + return; + newCache = oldCache.put(ref); + } while (!cache.compareAndSet(oldCache, newCache)); + } + + void removed(String refName) { + RefCache oldCache, newCache; + do { + oldCache = cache.get(); + if (oldCache == null) + return; + newCache = oldCache.remove(refName); + } while (!cache.compareAndSet(oldCache, newCache)); + } + + private RefCache read() throws IOException { + RefCache c = cache.get(); + if (c == null) { + c = scanAllRefs(); + cache.set(c); + } + return c; + } + + /** + * Read all known references in the repository. + * + * @return all current references of the repository. + * @throws IOException + * references cannot be accessed. + */ + protected abstract RefCache scanAllRefs() throws IOException; + + /** + * Compare a reference, and put if it matches. + * + * @param oldRef + * old value to compare to. If the reference is expected to not + * exist the old value has a storage of + * {@link org.eclipse.jgit.lib.Ref.Storage#NEW} and an ObjectId + * value of {@code null}. + * @param newRef + * new reference to store. + * @return true if the put was successful; false otherwise. + * @throws IOException + * the reference cannot be put due to a system error. + */ + protected abstract boolean compareAndPut(Ref oldRef, Ref newRef) + throws IOException; + + /** + * Compare a reference, and delete if it matches. + * + * @param oldRef + * the old reference information that was previously read. + * @return true if the remove was successful; false otherwise. + * @throws IOException + * the reference could not be removed due to a system error. + */ + protected abstract boolean compareAndRemove(Ref oldRef) throws IOException; + + /** + * Update the cached peeled state of a reference + *
<p>
+ * The ref database invokes this method after it peels a reference that had + * not been peeled before. This allows the storage to cache the peel state + * of the reference, and if it is actually peelable, the target that it + * peels to, so that on-the-fly peeling doesn't have to happen on the next + * reference read. + * + * @param oldLeaf + * the old reference. + * @param newLeaf + * the new reference, with peel information. + */ + protected void cachePeeledState(Ref oldLeaf, Ref newLeaf) { + try { + compareAndPut(oldLeaf, newLeaf); + } catch (IOException e) { + // Ignore an exception during caching. + } + } + + /** Collection of references managed by this database. */ + public static class RefCache { + final RefList ids; + + final RefList sym; + + /** + * Initialize a new reference cache. + *
<p>
+ * The two reference lists supplied must be sorted in correct order + * (string compare order) by name. + * + * @param ids + * references that carry an ObjectId, and all of {@code sym}. + * @param sym + * references that are symbolic references to others. + */ + public RefCache(RefList ids, RefList sym) { + this.ids = ids; + this.sym = sym; + } + + RefCache(RefList ids, RefCache old) { + this(ids, old.sym); + } + + /** @return number of references in this cache. */ + public int size() { + return ids.size(); + } + + /** + * Find a reference by name. + * + * @param name + * full name of the reference. + * @return the reference, if it exists, otherwise null. + */ + public Ref get(String name) { + return ids.get(name); + } + + /** + * Obtain a modified copy of the cache with a ref stored. + *
<p>
+ * This cache instance is not modified by this method. + * + * @param ref + * reference to add or replace. + * @return a copy of this cache, with the reference added or replaced. + */ + public RefCache put(Ref ref) { + RefList newIds = this.ids.put(ref); + RefList newSym = this.sym; + if (ref.isSymbolic()) { + newSym = newSym.put(ref); + } else { + int p = newSym.find(ref.getName()); + if (0 <= p) + newSym = newSym.remove(p); + } + return new RefCache(newIds, newSym); + } + + /** + * Obtain a modified copy of the cache with the ref removed. + *
<p>
+ * This cache instance is not modified by this method. + * + * @param refName + * reference to remove, if it exists. + * @return a copy of this cache, with the reference removed. + */ + public RefCache remove(String refName) { + RefList newIds = this.ids; + int p = newIds.find(refName); + if (0 <= p) + newIds = newIds.remove(p); + + RefList newSym = this.sym; + p = newSym.find(refName); + if (0 <= p) + newSym = newSym.remove(p); + return new RefCache(newIds, newSym); + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefRename.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefRename.java new file mode 100644 index 0000000000..dfef797b00 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefRename.java @@ -0,0 +1,73 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; + +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.lib.RefRename; +import org.eclipse.jgit.lib.RefUpdate.Result; + +final class DfsRefRename extends RefRename { + DfsRefRename(DfsRefUpdate src, DfsRefUpdate dst) { + super(src, dst); + } + + @Override + protected Result doRename() throws IOException { + // TODO Correctly handle renaming foo/bar to foo. + // TODO Batch these together into one log update. 
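+ //
+ // The rename is two compare-and-swap updates: create the destination
+ // only if it does not already exist (expected old id is zeroId), then
+ // delete the source once the destination update reports NEW.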
+ + destination.setExpectedOldObjectId(ObjectId.zeroId()); + destination.setNewObjectId(source.getRef().getObjectId()); + switch (destination.update()) { + case NEW: + source.delete(); + return Result.RENAMED; + + default: + return destination.getResult(); + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefUpdate.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefUpdate.java new file mode 100644 index 0000000000..148329d872 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRefUpdate.java @@ -0,0 +1,157 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; + +import org.eclipse.jgit.lib.ObjectIdRef; +import org.eclipse.jgit.lib.Ref; +import org.eclipse.jgit.lib.RefUpdate; +import org.eclipse.jgit.lib.SymbolicRef; +import org.eclipse.jgit.lib.Ref.Storage; +import org.eclipse.jgit.revwalk.RevObject; +import org.eclipse.jgit.revwalk.RevTag; +import org.eclipse.jgit.revwalk.RevWalk; + +final class DfsRefUpdate extends RefUpdate { + private final DfsRefDatabase refdb; + + private Ref dstRef; + + private RevWalk rw; + + DfsRefUpdate(DfsRefDatabase refdb, Ref ref) { + super(ref); + this.refdb = refdb; + } + + @Override + protected DfsRefDatabase getRefDatabase() { + return refdb; + } + + @Override + protected DfsRepository getRepository() { + return refdb.getRepository(); + } + + @Override + protected boolean tryLock(boolean deref) throws IOException { + dstRef = getRef(); + if (deref) + dstRef = dstRef.getLeaf(); + + if (dstRef.isSymbolic()) + setOldObjectId(null); + else + setOldObjectId(dstRef.getObjectId()); + + return true; + } + + @Override + protected void unlock() { + // No state is held while "locked". + } + + @Override + public Result update(RevWalk walk) throws IOException { + try { + rw = walk; + return super.update(walk); + } finally { + rw = null; + } + } + + @Override + protected Result doUpdate(Result desiredResult) throws IOException { + ObjectIdRef newRef; + RevObject obj = rw.parseAny(getNewObjectId()); + if (obj instanceof RevTag) { + newRef = new ObjectIdRef.PeeledTag( + Storage.PACKED, + dstRef.getName(), + getNewObjectId(), + rw.peel(obj).copy()); + } else { + newRef = new ObjectIdRef.PeeledNonTag( + Storage.PACKED, + dstRef.getName(), + getNewObjectId()); + } + + if (getRefDatabase().compareAndPut(dstRef, newRef)) { + getRefDatabase().stored(newRef); + return desiredResult; + } + return Result.LOCK_FAILURE; + } + + @Override + protected Result doDelete(Result desiredResult) throws IOException { + if (getRefDatabase().compareAndRemove(dstRef)) { + getRefDatabase().removed(dstRef.getName()); + return desiredResult; + } + return Result.LOCK_FAILURE; + } + + @Override + protected Result doLink(String target) throws IOException { + final SymbolicRef newRef = new SymbolicRef( + dstRef.getName(), + new ObjectIdRef.Unpeeled( + Storage.NEW, + target, + null)); + if (getRefDatabase().compareAndPut(dstRef, newRef)) { + getRefDatabase().stored(newRef); + if (dstRef.getStorage() == Ref.Storage.NEW) + return Result.NEW; + return Result.FORCED; + } + return Result.LOCK_FAILURE; + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepository.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepository.java new file mode 100644 index 0000000000..df54f2fca5 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepository.java @@ -0,0 +1,121 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; +import java.text.MessageFormat; + +import org.eclipse.jgit.JGitText; +import org.eclipse.jgit.lib.Constants; +import org.eclipse.jgit.lib.RefUpdate; +import org.eclipse.jgit.lib.Repository; +import org.eclipse.jgit.lib.StoredConfig; +import org.eclipse.jgit.storage.file.ReflogReader; + +/** A Git repository on a DFS. */ +public abstract class DfsRepository extends Repository { + private final DfsConfig config; + + /** + * Initialize a DFS repository. + * + * @param builder + * description of the repository. + */ + protected DfsRepository(DfsRepositoryBuilder builder) { + super(builder); + this.config = new DfsConfig(); + } + + @Override + public abstract DfsObjDatabase getObjectDatabase(); + + @Override + public abstract DfsRefDatabase getRefDatabase(); + + /** + * Check if the repository already exists. + * + * @return true if the repository exists; false if it is new. + * @throws IOException + * the repository cannot be checked. + */ + public boolean exists() throws IOException { + return getRefDatabase().exists(); + } + + @Override + public void create(boolean bare) throws IOException { + if (exists()) + throw new IOException(MessageFormat.format( + JGitText.get().repositoryAlreadyExists, "")); + + String master = Constants.R_HEADS + Constants.MASTER; + RefUpdate.Result result = updateRef(Constants.HEAD, true).link(master); + if (result != RefUpdate.Result.NEW) + throw new IOException(result.name()); + } + + @Override + public StoredConfig getConfig() { + return config; + } + + @Override + public void scanForRepoChanges() throws IOException { + getRefDatabase().clearCache(); + getObjectDatabase().clearCache(); + } + + @Override + public void notifyIndexChanged() { + // Do not send notifications. + // There is no index, as there is no working tree. 
+ } + + @Override + public ReflogReader getReflogReader(String refName) throws IOException { + throw new UnsupportedOperationException(); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepositoryBuilder.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepositoryBuilder.java new file mode 100644 index 0000000000..6fd3ccaebb --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsRepositoryBuilder.java @@ -0,0 +1,140 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.File; +import java.io.IOException; + +import org.eclipse.jgit.lib.BaseRepositoryBuilder; + +/** + * Constructs a {@link DfsRepository}. + * + * @param + * type of the builder class. + * @param + * type of the repository class. + */ +public abstract class DfsRepositoryBuilder + extends BaseRepositoryBuilder { + private DfsReaderOptions readerOptions; + + /** @return options used by readers accessing the repository. */ + public DfsReaderOptions getReaderOptions() { + return readerOptions; + } + + /** + * Set the reader options. + * + * @param opt + * new reader options object. + * @return {@code this} + */ + public B setReaderOptions(DfsReaderOptions opt) { + readerOptions = opt; + return self(); + } + + @Override + public B setup() throws IllegalArgumentException, IOException { + super.setup(); + if (getReaderOptions() == null) + setReaderOptions(new DfsReaderOptions()); + return self(); + } + + /** + * Create a repository matching the configuration in this builder. + *
<p>
+ * If an option was not set, the build method will try to default the option + * based on other options. If insufficient information is available, an + * exception is thrown to the caller. + * + * @return a repository matching this configuration. + * @throws IllegalArgumentException + * insufficient parameters were set. + * @throws IOException + * the repository could not be accessed to configure the rest of + * the builder's parameters. + */ + @SuppressWarnings("unchecked") + @Override + public abstract R build() throws IOException; + + // We don't support local file IO and thus shouldn't permit these to be set. + + @Override + public B setGitDir(File gitDir) { + if (gitDir != null) + throw new IllegalArgumentException(); + return self(); + } + + @Override + public B setObjectDirectory(File objectDirectory) { + if (objectDirectory != null) + throw new IllegalArgumentException(); + return self(); + } + + @Override + public B addAlternateObjectDirectory(File other) { + throw new UnsupportedOperationException("Alternates not supported"); + } + + @Override + public B setWorkTree(File workTree) { + if (workTree != null) + throw new IllegalArgumentException(); + return self(); + } + + @Override + public B setIndexFile(File indexFile) { + if (indexFile != null) + throw new IllegalArgumentException(); + return self(); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsText.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsText.java new file mode 100644 index 0000000000..499b75a34d --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/DfsText.java @@ -0,0 +1,60 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import org.eclipse.jgit.nls.NLS; +import org.eclipse.jgit.nls.TranslationBundle; + +/** Translation bundle for the DFS storage implementation. */ +public class DfsText extends TranslationBundle { + /** @return instance of this translation bundle */ + public static DfsText get() { + return NLS.getBundleFor(DfsText.class); + } + + /***/ public String cannotReadIndex; + /***/ public String shortReadOfBlock; + /***/ public String shortReadOfIndex; + /***/ public String willNotStoreEmptyPack; +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/InMemoryRepository.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/InMemoryRepository.java new file mode 100644 index 0000000000..51e3af1343 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/InMemoryRepository.java @@ -0,0 +1,279 @@ +package org.eclipse.jgit.storage.dfs; + +import java.io.ByteArrayOutputStream; +import java.io.FileNotFoundException; +import java.io.IOException; +import java.nio.ByteBuffer; +import java.util.ArrayList; +import java.util.Collection; +import java.util.List; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ConcurrentMap; +import java.util.concurrent.atomic.AtomicInteger; + +import org.eclipse.jgit.lib.Ref; +import org.eclipse.jgit.lib.Ref.Storage; +import org.eclipse.jgit.util.RefList; + +/** + * Git repository stored entirely in the local process memory. + *
<p>
+ * This implementation builds on the DFS repository by storing all reference and + * object data in the local process. It is not very efficient and exists only + * for unit testing and small experiments. + *
<p>
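+ * For example, a unit test might do (a minimal sketch using only the
+ * API in this patch):
+ * <pre>
+ * DfsRepository repo = new InMemoryRepository();
+ * repo.create(true);
+ * </pre>
+ * <p>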
+ * The repository is thread-safe. Memory used is released only when this object + * is garbage collected. Closing the repository has no impact on its memory. + */ +public class InMemoryRepository extends DfsRepository { + private final DfsObjDatabase objdb; + + private final DfsRefDatabase refdb; + + /** Initialize a new in-memory repository. */ + public InMemoryRepository() { + super(new DfsRepositoryBuilder() { + @Override + public InMemoryRepository build() throws IOException { + throw new UnsupportedOperationException(); + } + }); + + objdb = new MemObjDatabase(); + refdb = new MemRefDatabase(); + } + + @Override + public DfsObjDatabase getObjectDatabase() { + return objdb; + } + + @Override + public DfsRefDatabase getRefDatabase() { + return refdb; + } + + private class MemObjDatabase extends DfsObjDatabase { + private final AtomicInteger packId = new AtomicInteger(); + private List packs = new ArrayList(); + + MemObjDatabase() { + super(new DfsReaderOptions()); + } + + @Override + protected synchronized List listPacks() { + return packs; + } + + @Override + protected DfsPackDescription newPack(PackSource source) { + int id = packId.incrementAndGet(); + return new MemPack("pack-" + id + "-" + source.name()); + } + + @Override + protected synchronized void commitPack( + Collection desc, + Collection replace) { + List n; + n = new ArrayList(desc.size() + packs.size()); + n.addAll(desc); + n.addAll(packs); + if (replace != null) + n.removeAll(replace); + packs = n; + } + + @Override + protected void rollbackPack(Collection desc) { + // Do nothing. Pack is not recorded until commitPack. + } + + @Override + protected ReadableChannel openPackFile(DfsPackDescription desc) + throws FileNotFoundException { + MemPack memPack = (MemPack) desc; + if (memPack.packFile == null) + throw new FileNotFoundException(desc.getPackName()); + return new ByteArrayReadableChannel(memPack.packFile); + } + + @Override + protected ReadableChannel openPackIndex(DfsPackDescription desc) + throws FileNotFoundException { + MemPack memPack = (MemPack) desc; + if (memPack.packIndex == null) + throw new FileNotFoundException(desc.getIndexName()); + return new ByteArrayReadableChannel(memPack.packIndex); + } + + @Override + protected DfsOutputStream writePackFile(DfsPackDescription desc) { + final MemPack memPack = (MemPack) desc; + return new Out() { + @Override + public void flush() { + memPack.packFile = getData(); + } + }; + } + + @Override + protected DfsOutputStream writePackIndex(DfsPackDescription desc) { + final MemPack memPack = (MemPack) desc; + return new Out() { + @Override + public void flush() { + memPack.packIndex = getData(); + } + }; + } + } + + private static class MemPack extends DfsPackDescription { + private byte[] packFile; + + private byte[] packIndex; + + MemPack(String name) { + super(name); + } + } + + private abstract static class Out extends DfsOutputStream { + private final ByteArrayOutputStream dst = new ByteArrayOutputStream(); + + private byte[] data; + + @Override + public void write(byte[] buf, int off, int len) { + data = null; + dst.write(buf, off, len); + } + + @Override + public int read(long position, ByteBuffer buf) { + byte[] d = getData(); + int n = Math.min(buf.remaining(), d.length - (int) position); + if (n == 0) + return -1; + buf.put(d, (int) position, n); + return n; + } + + byte[] getData() { + if (data == null) + data = dst.toByteArray(); + return data; + } + + @Override + public abstract void flush(); + + @Override + public void close() { + flush(); + } + + } + + 
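// A trivial ReadableChannel over a byte array; blockSize() of 0 means
+ // it reports no preferred block size and the reader uses its own default.
+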
private static class ByteArrayReadableChannel implements ReadableChannel { + private final byte[] data; + + private int position; + + private boolean open = true; + + ByteArrayReadableChannel(byte[] buf) { + data = buf; + } + + public int read(ByteBuffer dst) { + int n = Math.min(dst.remaining(), data.length - position); + if (n == 0) + return -1; + dst.put(data, position, n); + position += n; + return n; + } + + public void close() { + open = false; + } + + public boolean isOpen() { + return open; + } + + public long position() { + return position; + } + + public void position(long newPosition) { + position = (int) newPosition; + } + + public long size() { + return data.length; + } + + public int blockSize() { + return 0; + } + } + + private class MemRefDatabase extends DfsRefDatabase { + private final ConcurrentMap refs = new ConcurrentHashMap(); + + MemRefDatabase() { + super(InMemoryRepository.this); + } + + @Override + protected RefCache scanAllRefs() throws IOException { + RefList.Builder ids = new RefList.Builder(); + RefList.Builder sym = new RefList.Builder(); + for (Ref ref : refs.values()) { + if (ref.isSymbolic()) + sym.add(ref); + ids.add(ref); + } + ids.sort(); + sym.sort(); + return new RefCache(ids.toRefList(), sym.toRefList()); + } + + @Override + protected boolean compareAndPut(Ref oldRef, Ref newRef) + throws IOException { + String name = newRef.getName(); + if (oldRef == null || oldRef.getStorage() == Storage.NEW) + return refs.putIfAbsent(name, newRef) == null; + Ref cur = refs.get(name); + if (cur != null && eq(cur, oldRef)) + return refs.replace(name, cur, newRef); + else + return false; + + } + + @Override + protected boolean compareAndRemove(Ref oldRef) throws IOException { + String name = oldRef.getName(); + Ref cur = refs.get(name); + if (cur != null && eq(cur, oldRef)) + return refs.remove(name, cur); + else + return false; + } + + private boolean eq(Ref a, Ref b) { + if (a.getObjectId() == null && b.getObjectId() == null) + return true; + if (a.getObjectId() != null) + return a.getObjectId().equals(b.getObjectId()); + return false; + } + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/LargePackedWholeObject.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/LargePackedWholeObject.java new file mode 100644 index 0000000000..46e7d58223 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/LargePackedWholeObject.java @@ -0,0 +1,128 @@ +/* + * Copyright (C) 2010, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. 
nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.BufferedInputStream; +import java.io.IOException; +import java.io.InputStream; +import java.util.zip.InflaterInputStream; + +import org.eclipse.jgit.errors.LargeObjectException; +import org.eclipse.jgit.errors.MissingObjectException; +import org.eclipse.jgit.lib.ObjectId; +import org.eclipse.jgit.lib.ObjectLoader; +import org.eclipse.jgit.lib.ObjectStream; + +final class LargePackedWholeObject extends ObjectLoader { + private final int type; + + private final long size; + + private final long objectOffset; + + private final int headerLength; + + private final DfsPackFile pack; + + private final DfsObjDatabase db; + + LargePackedWholeObject(int type, long size, long objectOffset, + int headerLength, DfsPackFile pack, DfsObjDatabase db) { + this.type = type; + this.size = size; + this.objectOffset = objectOffset; + this.headerLength = headerLength; + this.pack = pack; + this.db = db; + } + + @Override + public int getType() { + return type; + } + + @Override + public long getSize() { + return size; + } + + @Override + public boolean isLarge() { + return true; + } + + @Override + public byte[] getCachedBytes() throws LargeObjectException { + throw new LargeObjectException(); + } + + @Override + public ObjectStream openStream() throws MissingObjectException, IOException { + DfsReader ctx = new DfsReader(db); + InputStream in; + try { + in = new PackInputStream(pack, objectOffset + headerLength, ctx); + } catch (IOException packGone) { + // If the pack file cannot be pinned into the cursor, it + // probably was repacked recently. Go find the object + // again and open the stream from that location instead. + // + try { + ObjectId obj = pack.findObjectForOffset(ctx, objectOffset); + return ctx.open(obj, type).openStream(); + } finally { + ctx.release(); + } + } + + // Align buffer to inflater size, at a larger than default block. + // This reduces the number of context switches from the + // caller down into the pack stream inflation. + int bufsz = 8192; + in = new BufferedInputStream( + new InflaterInputStream(in, ctx.inflater(), bufsz), + bufsz); + return new ObjectStream.Filter(type, size, in); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/PackInputStream.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/PackInputStream.java new file mode 100644 index 0000000000..b15b2a92bf --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/PackInputStream.java @@ -0,0 +1,85 @@ +/* + * Copyright (C) 2010, Google Inc. 
+ * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.io.IOException; +import java.io.InputStream; + +final class PackInputStream extends InputStream { + private final DfsReader ctx; + + private final DfsPackFile pack; + + private long pos; + + PackInputStream(DfsPackFile pack, long pos, DfsReader ctx) + throws IOException { + this.pack = pack; + this.pos = pos; + this.ctx = ctx; + + // Pin the first window, to ensure the pack is open and valid. + // + ctx.pin(pack, pos); + } + + @Override + public int read(byte[] b, int off, int len) throws IOException { + int n = ctx.copy(pack, pos, b, off, len); + pos += n; + return n; + } + + @Override + public int read() throws IOException { + byte[] buf = new byte[1]; + int n = read(buf, 0, 1); + return n == 1 ? buf[0] & 0xff : -1; + } + + @Override + public void close() { + ctx.release(); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadRejectedExecutionHandler.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadRejectedExecutionHandler.java new file mode 100644 index 0000000000..78c2cae330 --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadRejectedExecutionHandler.java @@ -0,0 +1,61 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. 
+ * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.storage.dfs; + +import java.util.concurrent.RejectedExecutionHandler; +import java.util.concurrent.ThreadPoolExecutor; + +/** This handler aborts a {@link ReadAheadTask} when the queue is full. */ +final class ReadAheadRejectedExecutionHandler implements + RejectedExecutionHandler { + static final ReadAheadRejectedExecutionHandler INSTANCE = new ReadAheadRejectedExecutionHandler(); + + private ReadAheadRejectedExecutionHandler() { + // Singleton, do not create more instances. + } + + public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) { + ((ReadAheadTask.TaskFuture) r).task.abort(); + } +} diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadTask.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadTask.java new file mode 100644 index 0000000000..35b76773af --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadAheadTask.java @@ -0,0 +1,241 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. 
nor the
+ *   names of its contributors may be used to endorse or promote
+ *   products derived from this software without specific prior
+ *   written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.eclipse.jgit.storage.dfs;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+
+import org.eclipse.jgit.util.IO;
+
+final class ReadAheadTask implements Callable<Void> {
+    private final DfsBlockCache cache;
+
+    private final ReadableChannel channel;
+
+    private final List<BlockFuture> futures;
+
+    private boolean running;
+
+    ReadAheadTask(DfsBlockCache cache, ReadableChannel channel,
+            List<BlockFuture> futures) {
+        this.cache = cache;
+        this.channel = channel;
+        this.futures = futures;
+    }
+
+    public Void call() {
+        int idx = 0;
+        try {
+            synchronized (this) {
+                if (channel.isOpen())
+                    running = true;
+                else
+                    return null;
+            }
+
+            long position = channel.position();
+            for (; idx < futures.size() && !Thread.interrupted(); idx++) {
+                BlockFuture f = futures.get(idx);
+                if (cache.contains(f.pack, f.start)) {
+                    f.done();
+                    continue;
+                }
+
+                if (position != f.start)
+                    channel.position(f.start);
+
+                int size = (int) (f.end - f.start);
+                byte[] buf = new byte[size];
+                if (IO.read(channel, buf, 0, size) != size)
+                    throw new EOFException();
+
+                cache.put(new DfsBlock(f.pack, f.start, buf));
+                f.done();
+                position = f.end;
+            }
+        } catch (IOException err) {
+            // Ignore read-ahead errors. These will be caught later on.
+        } finally {
+            for (; idx < futures.size(); idx++)
+                futures.get(idx).abort();
+            close();
+        }
+        return null;
+    }
+
+    void abort() {
+        for (BlockFuture f : futures)
+            f.abort();
+
+        synchronized (this) {
+            if (!running)
+                close();
+        }
+    }
+
+    private synchronized void close() {
+        try {
+            if (channel.isOpen())
+                channel.close();
+        } catch (IOException err) {
+            // Ignore close errors on a read-only channel.
+        }
+    }
+
+    static final class TaskFuture extends java.util.concurrent.FutureTask<Void> {
+        final ReadAheadTask task;
+
+        TaskFuture(ReadAheadTask task) {
+            super(task);
+            this.task = task;
+        }
+
+        @Override
+        public boolean cancel(boolean mayInterruptIfRunning) {
+            if (super.cancel(mayInterruptIfRunning)) {
+                task.abort();
+                return true;
+            }
+            return false;
+        }
+    }
+
+    /** A scheduled read-ahead block load. */
+    static final class BlockFuture implements Future<Void> {
+        private static enum State {
+            PENDING, DONE, CANCELLED;
+        }
+
+        private volatile State state;
+
+        private volatile Future<?> task;
+
+        private final CountDownLatch latch;
+
+        final DfsPackKey pack;
+
+        final long start;
+
+        final long end;
+
+        BlockFuture(DfsPackKey key, long start, long end) {
+            this.state = State.PENDING;
+            this.latch = new CountDownLatch(1);
+            this.pack = key;
+            this.start = start;
+            this.end = end;
+        }
+
+        synchronized void setTask(Future<?> task) {
+            if (state == State.PENDING)
+                this.task = task;
+        }
+
+        boolean contains(DfsPackKey want, long pos) {
+            return pack == want && start <= pos && pos < end;
+        }
+
+        synchronized void done() {
+            if (state == State.PENDING) {
+                latch.countDown();
+                state = State.DONE;
+                task = null;
+            }
+        }
+
+        synchronized void abort() {
+            if (state == State.PENDING) {
+                latch.countDown();
+                state = State.CANCELLED;
+                task = null;
+            }
+        }
+
+        public boolean cancel(boolean mayInterruptIfRunning) {
+            Future<?> t = task;
+            if (t == null)
+                return false;
+
+            boolean r = t.cancel(mayInterruptIfRunning);
+            abort();
+            return r;
+        }
+
+        public Void get() throws InterruptedException, ExecutionException {
+            latch.await();
+            return null;
+        }
+
+        public Void get(long timeout, TimeUnit unit)
+                throws InterruptedException, ExecutionException,
+                TimeoutException {
+            if (latch.await(timeout, unit))
+                return null;
+            else
+                throw new TimeoutException();
+        }
+
+        public boolean isCancelled() {
+            State s = state;
+            if (s == State.DONE)
+                return false;
+            if (s == State.CANCELLED)
+                return true;
+
+            Future<?> t = task;
+            return t != null ? t.isCancelled() : true;
+        }
+
+        public boolean isDone() {
+            return state == State.DONE;
+        }
+    }
+}
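How these classes are wired together is only implied by this patch, so the following sketch is an illustration rather than code from the change: the block cache builds BlockFuture placeholders for the ranges it wants, wraps a ReadAheadTask in its TaskFuture, and submits it to a ThreadPoolExecutor whose rejection handler is ReadAheadRejectedExecutionHandler.INSTANCE, so a task that overflows the queue aborts its futures instead of blocking. The locals `cache`, `channel`, `packKey` and `blockSize` are hypothetical, and the snippet assumes same-package access plus the usual java.util and java.util.concurrent imports.

    // Executor whose overflow policy aborts the rejected read-ahead task.
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
            1, 1, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<Runnable>(4),
            ReadAheadRejectedExecutionHandler.INSTANCE);

    // One placeholder per block range the cache wants prefetched.
    List<BlockFuture> futures = new ArrayList<BlockFuture>();
    futures.add(new BlockFuture(packKey, 0, blockSize));
    futures.add(new BlockFuture(packKey, blockSize, 2 * blockSize));

    ReadAheadTask task = new ReadAheadTask(cache, channel, futures);
    ReadAheadTask.TaskFuture future = new ReadAheadTask.TaskFuture(task);
    for (BlockFuture f : futures)
        f.setTask(future); // cancel() on a block can now reach the task
    executor.execute(future); // on queue overflow the handler calls task.abort()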
diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadableChannel.java b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadableChannel.java
new file mode 100644
index 0000000000..3680931de4
--- /dev/null
+++ b/org.eclipse.jgit/src/org/eclipse/jgit/storage/dfs/ReadableChannel.java
@@ -0,0 +1,103 @@
+/*
+ * Copyright (C) 2011, Google Inc.
+ * and other copyright owners as documented in the project's IP log.
+ *
+ * This program and the accompanying materials are made available
+ * under the terms of the Eclipse Distribution License v1.0 which
+ * accompanies this distribution, is reproduced below, and is
+ * available at http://www.eclipse.org/org/documents/edl-v10.php
+ *
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ *   copyright notice, this list of conditions and the following
+ *   disclaimer in the documentation and/or other materials provided
+ *   with the distribution.
+ *
+ * - Neither the name of the Eclipse Foundation, Inc. nor the
+ *   names of its contributors may be used to endorse or promote
+ *   products derived from this software without specific prior
+ *   written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
+ * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
+ * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
+ * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+package org.eclipse.jgit.storage.dfs;
+
+import java.io.IOException;
+import java.nio.channels.ReadableByteChannel;
+
+/** Readable random access byte channel from a file. */
+public interface ReadableChannel extends ReadableByteChannel {
+    /**
+     * Get the current position of the channel.
+     *
+     * @return current offset.
+     * @throws IOException
+     *             the channel's current position cannot be obtained.
+     */
+    public long position() throws IOException;
+
+    /**
+     * Seek the current position of the channel to a new offset.
+     *
+     * @param newPosition
+     *            position to move the channel to. The next read will start
+     *            from here. This should be a multiple of the
+     *            {@link #blockSize()}.
+     * @throws IOException
+     *             the position cannot be updated. This may be because the
+     *             channel only supports block aligned IO and the current
+     *             position is not block aligned.
+     */
+    public void position(long newPosition) throws IOException;
+
+    /**
+     * Get the total size of the channel.
+     * <p>
+     * Prior to reading from a channel the size might not yet be known.
+     * Implementors may return -1 until after the first read method call.
+     * Once a read has been completed, the underlying file size should be
+     * available.
+     *
+     * @return total size of the channel; -1 if not yet available.
+     * @throws IOException
+     *             the size cannot be determined.
+     */
+    public long size() throws IOException;
+
+    /**
+     * Get the recommended alignment for reads.
+     * <p>
+     * Starting a read at multiples of the blockSize is more efficient than
+     * starting a read at any other position. A return value of 0 or -1
+     * means the channel has no specific block size recommendation.
+     * <p>
+     * Channels should not recommend large block sizes. Sizes up to 1-4 MiB
+     * may be reasonable, but sizes above that may be horribly inefficient.
+     * The {@link DfsBlockCache} favors the alignment suggested by the
+     * channel rather than the configured size under the assumption that
+     * reads are very expensive and the channel knows what size is best to
+     * access it with.
+     *
+     * @return recommended alignment size for randomly positioned reads. Does
+     *         not need to be a power of 2.
+     */
+    public int blockSize();
+}
diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/util/IO.java b/org.eclipse.jgit/src/org/eclipse/jgit/util/IO.java
index 4b9c572013..439fe09b5e 100644
--- a/org.eclipse.jgit/src/org/eclipse/jgit/util/IO.java
+++ b/org.eclipse.jgit/src/org/eclipse/jgit/util/IO.java
@@ -52,6 +52,7 @@ import java.io.FileNotFoundException;
 import java.io.IOException;
 import java.io.InputStream;
 import java.nio.ByteBuffer;
+import java.nio.channels.ReadableByteChannel;
 import java.text.MessageFormat;
 
 import org.eclipse.jgit.JGitText;
@@ -221,6 +222,37 @@ public class IO {
         }
     }
 
+    /**
+     * Read as much of the array as possible from a channel.
+     *
+     * @param channel
+     *            channel to read data from.
+     * @param dst
+     *            buffer that must be fully populated, [off, off+len).
+     * @param off
+     *            position within the buffer to start writing to.
+     * @param len
+     *            number of bytes that should be read.
+     * @return number of bytes actually read; -1 if no bytes were read.
+     * @throws IOException
+     *             there was an error reading from the channel.
+     */
+    public static int read(ReadableByteChannel channel, byte[] dst, int off,
+            int len) throws IOException {
+        if (len == 0)
+            return 0;
+        int cnt = 0;
+        while (0 < len) {
+            int r = channel.read(ByteBuffer.wrap(dst, off, len));
+            if (r <= 0)
+                break;
+            off += r;
+            len -= r;
+            cnt += r;
+        }
+        return cnt != 0 ? cnt : -1;
+    }
+
     /**
      * Skip an entire region of an input stream.
      * <p>
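To make the new interface and helper concrete, here is a minimal in-memory ReadableChannel together with a read through the new IO.read; the class below is an illustration (e.g. for tests), not part of the patch, and assumes imports of java.nio.ByteBuffer and org.eclipse.jgit.util.IO.

    // Illustration only: an in-memory channel satisfying the contract above.
    final class ByteArrayReadableChannel implements ReadableChannel {
        private final byte[] data;
        private int position;
        private boolean open = true;

        ByteArrayReadableChannel(byte[] data) {
            this.data = data;
        }

        public int read(ByteBuffer dst) {
            int n = Math.min(dst.remaining(), data.length - position);
            if (n == 0)
                return -1; // channel exhausted
            dst.put(data, position, n);
            position += n;
            return n;
        }

        public long position() {
            return position;
        }

        public void position(long newPosition) {
            position = (int) newPosition;
        }

        public long size() {
            return data.length; // known up front for an in-memory file
        }

        public int blockSize() {
            return 0; // no preferred alignment
        }

        public boolean isOpen() {
            return open;
        }

        public void close() {
            open = false;
        }
    }

    // Given some byte[] data: a short count (or -1) from IO.read means
    // the channel ran out of bytes before the buffer was filled.
    byte[] buf = new byte[64];
    int n = IO.read(new ByteArrayReadableChannel(data), buf, 0, buf.length);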
diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/util/io/CountingOutputStream.java b/org.eclipse.jgit/src/org/eclipse/jgit/util/io/CountingOutputStream.java new file mode 100644 index 0000000000..ad6f1deceb --- /dev/null +++ b/org.eclipse.jgit/src/org/eclipse/jgit/util/io/CountingOutputStream.java @@ -0,0 +1,90 @@ +/* + * Copyright (C) 2011, Google Inc. + * and other copyright owners as documented in the project's IP log. + * + * This program and the accompanying materials are made available + * under the terms of the Eclipse Distribution License v1.0 which + * accompanies this distribution, is reproduced below, and is + * available at http://www.eclipse.org/org/documents/edl-v10.php + * + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials provided + * with the distribution. + * + * - Neither the name of the Eclipse Foundation, Inc. nor the + * names of its contributors may be used to endorse or promote + * products derived from this software without specific prior + * written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND + * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, + * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER + * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, + * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF + * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +package org.eclipse.jgit.util.io; + +import java.io.IOException; +import java.io.OutputStream; + +/** Counts the number of bytes written. */ +public class CountingOutputStream extends OutputStream { + private final OutputStream out; + private long cnt; + + /** + * Initialize a new counting stream. + * + * @param out + * stream to output all writes to. + */ + public CountingOutputStream(OutputStream out) { + this.out = out; + } + + /** @return current number of bytes written. */ + public long getCount() { + return cnt; + } + + @Override + public void write(int val) throws IOException { + out.write(val); + cnt++; + } + + @Override + public void write(byte[] buf, int off, int len) throws IOException { + out.write(buf, off, len); + cnt += len; + } + + @Override + public void flush() throws IOException { + out.flush(); + } + + @Override + public void close() throws IOException { + out.close(); + } +} -- 2.39.5
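A typical use of the new CountingOutputStream is to learn how many bytes a writer produced, for example when recording the final size of a streamed pack. The snippet below is illustrative, not from the patch; ByteArrayOutputStream (java.io) stands in for any real destination.

    ByteArrayOutputStream dst = new ByteArrayOutputStream();
    CountingOutputStream out = new CountingOutputStream(dst);
    out.write(new byte[] { 1, 2, 3 }); // counted in bulk
    out.write(4);                      // counted per byte
    out.flush();
    long written = out.getCount();     // 4 bytes counted so far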