
Added read/write support for pack bitmap index.

A pack bitmap index is an additional index of compressed bitmaps of the object graph. A logical API for the index functionality is also included, as it is expected to be used by the PackWriter.

Compressed bitmaps are created using the javaewah library, a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values, so the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE.

Every ObjectId is given an integer mapping: its position in the pack file's complete ObjectId list, sorted by offset. That integer is what the bitmaps use to reference the ObjectId.

Currently, the new index format can only be used with pack files that contain a complete closure of the object graph, e.g. the result of a garbage collection.

The index file includes four bitmaps for the Git object types, i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by ObjectId is included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmap against prior bitmaps in the index and selecting the smallest representation. The XOR'd bitmap, together with the offset from the current entry to the position of the bitmap it was XOR'd against, is the actual representation of the entry in the index file. Each entry also contains one flag byte, currently used to note whether the bitmap should be blindly reused.

Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
11 years ago
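The XOR trick described in the commit message above can be illustrated with the same javaewah library. This is only a minimal sketch of the idea, not JGit's actual index writer; the class name and bit positions are invented for the example:

    import com.googlecode.javaewah.EWAHCompressedBitmap;

    public class XorSketch {
        public static void main(String[] args) {
            // Bit positions stand for ObjectId positions in the
            // offset-sorted ObjectId list of the pack.
            EWAHCompressedBitmap prior = EWAHCompressedBitmap.bitmapOf(0, 1, 2, 5);
            EWAHCompressedBitmap current = EWAHCompressedBitmap.bitmapOf(0, 1, 2, 5, 8);

            // Closures of nearby commits overlap heavily, so XORing
            // against a prior bitmap usually yields a much sparser set.
            EWAHCompressedBitmap xored = current.xor(prior);

            // Keep whichever representation is smaller; the index entry
            // records the XOR'd form plus the offset to its base bitmap.
            EWAHCompressedBitmap stored =
                    xored.sizeInBytes() < current.sizeInBytes() ? xored : current;
            System.out.println(stored); // {8} in XOR'd form
        }
    }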
Support creating pack bitmap indexes in PackWriter.

Update the PackWriter to support writing out pack bitmap indexes, a parallel ".bitmap" file to the ".pack" file.

Bitmaps are selected at commits, at intervals ranging from 1 to 5,000 commits along each unique path from the start. The most recent 100 commits are all bitmapped. The next 19,000 commits have a bitmap every 100 commits. The remaining commits have a bitmap every 5,000 commits. Commits with more than 1 parent are preferred over ones with 1 or fewer. Furthermore, previously computed bitmaps are reused if the previous entry had the reuse flag set, which is set when the bitmap was placed at the max allowed distance.

Bitmaps are used to speed up the counting phase when packing, for requests that are not shallow. The PackWriterBitmapWalker uses a RevFilter to proactively mark commits with RevFlag.SEEN when they appear in a bitmap. The walker produces the full closure of reachable ObjectIds, given the collection of starting ObjectIds.

For fetch requests, two ObjectWalks are executed to compute the ObjectIds reachable from the haves and from the wants. The ObjectIds that need to be written are determined by taking all the resulting wants AND NOT the haves (see the sketch below).

For clone requests, we get cached pack support for "free", since it is possible to determine whether all of the ObjectIds in a pack file are included in the resulting list of ObjectIds to write.

On my machine, the best times for clones and fetches of the linux kernel repository (with about 2.6M objects and 300K commits) are tabulated below:

    Operation                   Index V2               Index VE003
    Clone                       37530ms (524.06 MiB)      82ms (524.06 MiB)
    Fetch (1 commit back)          75ms                  107ms
    Fetch (10 commits back)       456ms (269.51 KiB)     341ms (265.19 KiB)
    Fetch (100 commits back)      449ms (269.91 KiB)     337ms (267.28 KiB)
    Fetch (1000 commits back)    2229ms ( 14.75 MiB)     189ms ( 14.42 MiB)
    Fetch (10000 commits back)   2177ms ( 16.30 MiB)     254ms ( 15.88 MiB)
    Fetch (100000 commits back) 14340ms (185.83 MiB)    1655ms (189.39 MiB)

Change-Id: Icdb0cdd66ff168917fb9ef17b96093990cc6a98d
11 years ago
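The "wants AND NOT haves" set arithmetic from the commit above reduces to one bitmap operation. A minimal sketch with javaewah, with illustrative bit positions rather than real pack data:

    import com.googlecode.javaewah.EWAHCompressedBitmap;

    public class FetchSetSketch {
        public static void main(String[] args) {
            // Closures reachable from the client's wants and haves, as
            // positions in the offset-sorted ObjectId list of the pack.
            EWAHCompressedBitmap wants = EWAHCompressedBitmap.bitmapOf(0, 1, 2, 3, 7);
            EWAHCompressedBitmap haves = EWAHCompressedBitmap.bitmapOf(0, 1, 2);

            // Objects that must be written: wants AND NOT haves.
            EWAHCompressedBitmap toSend = wants.andNot(haves);
            System.out.println(toSend); // {3,7}
        }
    }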
Extract PackFile specific code to ObjectToPack subclass

The ObjectReader class is dual-purposed into being a factory for the ObjectToPack, permitting specific ObjectDatabase implementations to override the method and offer their own custom subclass of the generic ObjectToPack class. By allowing them to directly extend the type, each implementation can add custom fields to support tracking where an object is stored, without incurring any additional penalties like a parallel Map<ObjectId, Object> would cost.

The reader was chosen to act as a factory rather than the database, as the reader will eventually be tied more tightly to the ObjectWalk and TreeWalk. During object enumeration the reader would have had to load the object for the RevWalk, and may choose to cache object position data internally so it can later be reused and fed into the ObjectToPack instance supplied to the PackWriter. Since a reader is not thread-safe and is scoped to this PackWriter and its internal ObjectWalk, it's a great place for the database to perform caching, if any.

Right now this change goes a bit backwards by changing what should be generic ObjectToPack references inside of PackWriter to the very PackFile specific LocalObjectToPack subclass. We will correct these in a later commit as we start to refine what the ObjectToPack API will eventually look like in order to better support the PackWriter.

Change-Id: I9f047d26b97e46dee3bc0ccb4060bbebedbe8ea9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
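The factory pattern this message describes can be sketched as below. BackendObjectToPack and its offset field are hypothetical stand-ins for a storage implementation; the real instance of the pattern is WindowCursor.newObjectToPack() in the file further down, which returns the PackFile specific LocalObjectToPack:

    import org.eclipse.jgit.internal.storage.pack.ObjectToPack;
    import org.eclipse.jgit.lib.AnyObjectId;

    // Hypothetical backend-specific subclass: extra fields ride along on
    // the object itself, avoiding a parallel Map<ObjectId, Object>.
    class BackendObjectToPack extends ObjectToPack {
        long offset; // where this backend stored the object

        BackendObjectToPack(AnyObjectId src, int type) {
            super(src, type);
        }
    }

A reader for that backend would then override newObjectToPack(AnyObjectId, int) to hand these instances to the PackWriter.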
PackWriter: Support reuse of entire packs

The most expensive part of packing a repository for transport to another system is enumerating all of the objects in the repository. Once this gets to the size of the linux-2.6 repository (1.8 million objects), enumeration can take several CPU minutes and costs a lot of temporary working set memory.

Teach PackWriter to efficiently reuse an existing "cached pack" by answering a clone request with a thin pack followed by a larger cached pack appended to the end. This requires the repository owner to first construct the cached pack by hand, and record the tip commits inside of $GIT_DIR/objects/info/cached-packs:

    cd $GIT_DIR
    root=$(git rev-parse master)
    tmp=objects/.tmp-$$
    names=$(echo $root | git pack-objects --keep-true-parents --revs $tmp)
    for n in $names; do
        chmod a-w $tmp-$n.pack $tmp-$n.idx
        touch objects/pack/pack-$n.keep
        mv $tmp-$n.pack objects/pack/pack-$n.pack
        mv $tmp-$n.idx objects/pack/pack-$n.idx
    done
    (echo "+ $root"; for n in $names; do echo "P $n"; done; echo) >>objects/info/cached-packs
    git repack -a -d

When a clone request needs to include $root, the corresponding cached pack will be copied as-is, rather than enumerating all of the objects that are reachable from $root.

For a linux-2.6 kernel repository that should be about 376 MiB, the above process creates two packs of 368 MiB and 38 MiB[1]. This is a local disk usage increase of ~26 MiB, due to reduced delta compression between the large cached pack and the smaller recent activity pack. The overhead is similar to 1 full copy of the compressed project sources.

With this cached pack in hand, JGit daemon completes a clone request in 1m17s less time, but a slightly larger data transfer (+2.39 MiB):

Before:

    remote: Counting objects: 1861830, done
    remote: Finding sources: 100% (1861830/1861830)
    remote: Getting sizes: 100% (88243/88243)
    remote: Compressing objects: 100% (88184/88184)
    Receiving objects: 100% (1861830/1861830), 376.01 MiB | 19.01 MiB/s, done.
    remote: Total 1861830 (delta 4706), reused 1851053 (delta 1553844)
    Resolving deltas: 100% (1564621/1564621), done.

    real 3m19.005s

After:

    remote: Counting objects: 1601, done
    remote: Counting objects: 1828460, done
    remote: Finding sources: 100% (50475/50475)
    remote: Getting sizes: 100% (18843/18843)
    remote: Compressing objects: 100% (7585/7585)
    remote: Total 1861830 (delta 2407), reused 1856197 (delta 37510)
    Receiving objects: 100% (1861830/1861830), 378.40 MiB | 31.31 MiB/s, done.
    Resolving deltas: 100% (1559477/1559477), done.

    real 2m2.938s

Repository owners can periodically refresh their cached packs by repacking their repository, folding all newer objects into a larger cached pack. Since repacking is already considered to be a normal Git maintenance activity, this isn't a very big burden.

[1] In this test $root was set back about two weeks.

Change-Id: Ib87131d5c4b5e8c5cacb0f4fe16ff4ece554734b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago
/*
 * Copyright (C) 2008-2009, Google Inc.
 * Copyright (C) 2006-2008, Shawn O. Pearce <spearce@spearce.org> and others
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Distribution License v. 1.0 which is available at
 * https://www.eclipse.org/org/documents/edl-v10.php.
 *
 * SPDX-License-Identifier: BSD-3-Clause
 */

package org.eclipse.jgit.internal.storage.file;

import java.io.IOException;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

import org.eclipse.jgit.annotations.Nullable;
import org.eclipse.jgit.errors.IncorrectObjectTypeException;
import org.eclipse.jgit.errors.MissingObjectException;
import org.eclipse.jgit.errors.StoredObjectRepresentationNotAvailableException;
import org.eclipse.jgit.errors.StoredPackRepresentationNotAvailableException;
import org.eclipse.jgit.internal.JGitText;
import org.eclipse.jgit.internal.storage.pack.CachedPack;
import org.eclipse.jgit.internal.storage.pack.ObjectReuseAsIs;
import org.eclipse.jgit.internal.storage.pack.ObjectToPack;
import org.eclipse.jgit.internal.storage.pack.PackOutputStream;
import org.eclipse.jgit.internal.storage.pack.PackWriter;
import org.eclipse.jgit.lib.AbbreviatedObjectId;
import org.eclipse.jgit.lib.AnyObjectId;
import org.eclipse.jgit.lib.BitmapIndex;
import org.eclipse.jgit.lib.BitmapIndex.BitmapBuilder;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.InflaterCache;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectInserter;
import org.eclipse.jgit.lib.ObjectLoader;
import org.eclipse.jgit.lib.ObjectReader;
import org.eclipse.jgit.lib.ProgressMonitor;
/** Active handle to a ByteWindow. */
final class WindowCursor extends ObjectReader implements ObjectReuseAsIs {
	/** Temporary buffer large enough for at least one raw object id. */
	final byte[] tempId = new byte[Constants.OBJECT_ID_LENGTH];

	private Inflater inf;

	private ByteWindow window;

	private DeltaBaseCache baseCache;

	@Nullable
	private final ObjectInserter createdFromInserter;

	final FileObjectDatabase db;

	WindowCursor(FileObjectDatabase db) {
		this.db = db;
		this.createdFromInserter = null;
		this.streamFileThreshold = WindowCache.getStreamFileThreshold();
	}

	WindowCursor(FileObjectDatabase db,
			@Nullable ObjectDirectoryInserter createdFromInserter) {
		this.db = db;
		this.createdFromInserter = createdFromInserter;
		this.streamFileThreshold = WindowCache.getStreamFileThreshold();
	}

	DeltaBaseCache getDeltaBaseCache() {
		if (baseCache == null)
			baseCache = new DeltaBaseCache();
		return baseCache;
	}

	/** {@inheritDoc} */
	@Override
	public ObjectReader newReader() {
		return new WindowCursor(db);
	}

	/** {@inheritDoc} */
	@Override
	public BitmapIndex getBitmapIndex() throws IOException {
		for (Pack pack : db.getPacks()) {
			PackBitmapIndex index = pack.getBitmapIndex();
			if (index != null)
				return new BitmapIndexImpl(index);
		}
		return null;
	}

	/** {@inheritDoc} */
	@Override
	public Collection<CachedPack> getCachedPacksAndUpdate(
			BitmapBuilder needBitmap) throws IOException {
		for (Pack pack : db.getPacks()) {
			PackBitmapIndex index = pack.getBitmapIndex();
			if (needBitmap.removeAllOrNone(index))
				return Collections.<CachedPack> singletonList(
						new LocalCachedPack(Collections.singletonList(pack)));
		}
		return Collections.emptyList();
	}

	/** {@inheritDoc} */
	@Override
	public Collection<ObjectId> resolve(AbbreviatedObjectId id)
			throws IOException {
		if (id.isComplete())
			return Collections.singleton(id.toObjectId());
		HashSet<ObjectId> matches = new HashSet<>(4);
		db.resolve(matches, id);
		return matches;
	}

	/** {@inheritDoc} */
	@Override
	public boolean has(AnyObjectId objectId) throws IOException {
		return db.has(objectId);
	}

	/** {@inheritDoc} */
	@Override
	public ObjectLoader open(AnyObjectId objectId, int typeHint)
			throws MissingObjectException, IncorrectObjectTypeException,
			IOException {
		final ObjectLoader ldr = db.openObject(this, objectId);
		if (ldr == null) {
			if (typeHint == OBJ_ANY)
				throw new MissingObjectException(objectId.copy(),
						JGitText.get().unknownObjectType2);
			throw new MissingObjectException(objectId.copy(), typeHint);
		}
		if (typeHint != OBJ_ANY && ldr.getType() != typeHint)
			throw new IncorrectObjectTypeException(objectId.copy(), typeHint);
		return ldr;
	}

	/** {@inheritDoc} */
	@Override
	public Set<ObjectId> getShallowCommits() throws IOException {
		return db.getShallowCommits();
	}

	/** {@inheritDoc} */
	@Override
	public long getObjectSize(AnyObjectId objectId, int typeHint)
			throws MissingObjectException, IncorrectObjectTypeException,
			IOException {
		long sz = db.getObjectSize(this, objectId);
		if (sz < 0) {
			if (typeHint == OBJ_ANY)
				throw new MissingObjectException(objectId.copy(),
						JGitText.get().unknownObjectType2);
			throw new MissingObjectException(objectId.copy(), typeHint);
		}
		return sz;
	}

	/** {@inheritDoc} */
	@Override
	public LocalObjectToPack newObjectToPack(AnyObjectId objectId, int type) {
		return new LocalObjectToPack(objectId, type);
	}

	/** {@inheritDoc} */
	@Override
	public void selectObjectRepresentation(PackWriter packer,
			ProgressMonitor monitor, Iterable<ObjectToPack> objects)
			throws IOException, MissingObjectException {
		for (ObjectToPack otp : objects) {
			db.selectObjectRepresentation(packer, otp, this);
			monitor.update(1);
		}
	}

	/** {@inheritDoc} */
	@Override
	public void copyObjectAsIs(PackOutputStream out, ObjectToPack otp,
			boolean validate) throws IOException,
			StoredObjectRepresentationNotAvailableException {
		LocalObjectToPack src = (LocalObjectToPack) otp;
		src.pack.copyAsIs(out, src, validate, this);
	}

	/** {@inheritDoc} */
	@Override
	public void writeObjects(PackOutputStream out, List<ObjectToPack> list)
			throws IOException {
		for (ObjectToPack otp : list)
			out.writeObject(otp);
	}
	/**
	 * Copy bytes from the window to a caller supplied buffer.
	 *
	 * @param pack
	 *            the file the desired window is stored within.
	 * @param position
	 *            position within the file to read from.
	 * @param dstbuf
	 *            destination buffer to copy into.
	 * @param dstoff
	 *            offset within <code>dstbuf</code> to start copying into.
	 * @param cnt
	 *            number of bytes to copy. This value may exceed the number of
	 *            bytes remaining in the window starting at offset
	 *            <code>position</code>.
	 * @return number of bytes actually copied; this may be less than
	 *         <code>cnt</code> if <code>cnt</code> exceeded the number of
	 *         bytes available.
	 * @throws IOException
	 *             this cursor does not match the provider or id and the proper
	 *             window could not be acquired through the provider's cache.
	 */
	int copy(final Pack pack, long position, final byte[] dstbuf,
			int dstoff, final int cnt) throws IOException {
		final long length = pack.length;
		int need = cnt;
		while (need > 0 && position < length) {
			pin(pack, position);
			final int r = window.copy(position, dstbuf, dstoff, need);
			position += r;
			dstoff += r;
			need -= r;
		}
		return cnt - need;
	}
	/** {@inheritDoc} */
	@Override
	public void copyPackAsIs(PackOutputStream out, CachedPack pack)
			throws IOException, StoredPackRepresentationNotAvailableException {
		((LocalCachedPack) pack).copyAsIs(out, this);
	}

	void copyPackAsIs(final Pack pack, final long length,
			final PackOutputStream out)
			throws IOException, StoredPackRepresentationNotAvailableException {
		// Skip the 12-byte pack header; the 20-byte SHA-1 trailer at the
		// end of the file is also excluded from the copy.
		long position = 12;
		long remaining = length - (12 + 20);
		while (0 < remaining) {
			boolean reloadedPacks = false;
			COPYPACK: for (;;) {
				try {
					pin(pack, position);
					int ptr = (int) (position - window.start);
					int n = (int) Math.min(window.size() - ptr, remaining);
					window.write(out, position, n);
					position += n;
					remaining -= n;
				} catch (IOException e) {
					if (reloadedPacks) {
						throw new StoredPackRepresentationNotAvailableException(
								pack, e);
					} else {
						// Retry once after dropping cached windows, in case
						// the pack file was replaced underneath us.
						reloadedPacks = true;
						WindowCache.purge(pack);
						continue COPYPACK;
					}
				}
				break COPYPACK;
			}
		}
	}
	/**
	 * Inflate a region of the pack starting at {@code position}.
	 *
	 * @param pack
	 *            the file the desired window is stored within.
	 * @param position
	 *            position within the file to read from.
	 * @param dstbuf
	 *            destination buffer the inflater should output decompressed
	 *            data to. Must be large enough to store the entire stream,
	 *            unless headerOnly is true.
	 * @param headerOnly
	 *            if true the caller wants only {@code dstbuf.length} bytes.
	 * @return number of bytes inflated into <code>dstbuf</code>.
	 * @throws IOException
	 *             this cursor does not match the provider or id and the proper
	 *             window could not be acquired through the provider's cache.
	 * @throws DataFormatException
	 *             the inflater encountered an invalid chunk of data. Data
	 *             stream corruption is likely.
	 */
	int inflate(final Pack pack, long position, final byte[] dstbuf,
			boolean headerOnly) throws IOException, DataFormatException {
		prepareInflater();
		pin(pack, position);
		position += window.setInput(position, inf);
		for (int dstoff = 0;;) {
			int n = inf.inflate(dstbuf, dstoff, dstbuf.length - dstoff);
			dstoff += n;
			if (inf.finished() || (headerOnly && dstoff == dstbuf.length))
				return dstoff;
			if (inf.needsInput()) {
				pin(pack, position);
				position += window.setInput(position, inf);
			} else if (n == 0)
				throw new DataFormatException();
		}
	}

	ByteArrayWindow quickCopy(Pack p, long pos, long cnt)
			throws IOException {
		pin(p, pos);
		if (window instanceof ByteArrayWindow
				&& window.contains(p, pos + (cnt - 1)))
			return (ByteArrayWindow) window;
		return null;
	}

	Inflater inflater() {
		prepareInflater();
		return inf;
	}

	private void prepareInflater() {
		if (inf == null)
			inf = InflaterCache.get();
		else
			inf.reset();
	}

	void pin(Pack pack, long position)
			throws IOException {
		final ByteWindow w = window;
		if (w == null || !w.contains(pack, position)) {
			// If memory is low, we may need what is in our window field to
			// be cleaned up by the GC during the get for the next window.
			// So we always clear it, even though we are just going to set
			// it again.
			//
			window = null;
			window = WindowCache.get(pack, position);
		}
	}

	/** {@inheritDoc} */
	@Override
	@Nullable
	public ObjectInserter getCreatedFromInserter() {
		return createdFromInserter;
	}

	/**
	 * {@inheritDoc}
	 * <p>
	 * Release the current window cursor.
	 */
	@Override
	public void close() {
		window = null;
		baseCache = null;
		try {
			InflaterCache.release(inf);
		} finally {
			inf = null;
		}
	}
}
  333. }