
DfsPackFile.java 29KB

DFS: A storage layer for JGit

In practice the DHT storage layer has not been performing as well as large scale server environments want to see from a Git server. The performance of the DHT schema degrades rapidly as small changes are pushed into the repository, due to the chunk size being less than 1/3 of the pushed pack size. Small chunks cause poor prefetch performance during reading, and require significantly longer prefetch lists inside of the chunk meta field to work around the small size.

The DHT code is very complex (>17,000 lines of code) and is very sensitive to the underlying database round-trip time, as well as the way objects were written into the pack stream that was chunked and stored in the database. A poor pack layout (from any version of C Git prior to Junio reworking it) can cause the DHT code to be unable to enumerate the objects of the linux-2.6 repository in a completable time scale.

Performing a clone from a DHT-stored repository of 2 million objects takes 2 million row lookups in the DHT to locate the OBJECT_INDEX row for each object being cloned. This is very difficult for some DHTs to scale to; even at 5000 rows/second the lookup stage alone takes 6 minutes (on a local filesystem, this is almost too fast to bother measuring). Some servers like Apache Cassandra just fall over and cannot complete the 2 million lookups in rapid fire.

On a ~400 MiB repository, the DHT schema has an extra 25 MiB of redundant data that gets downloaded to the JGit process, and that is before you consider the cost of the OBJECT_INDEX table also being fully loaded, which is at least 223 MiB of data for the linux kernel repository. In the DHT schema, answering a `git clone` of the ~400 MiB linux kernel needs to load 248 MiB of "index" data from the DHT, in addition to the ~400 MiB of pack data that gets sent to the client. This is 193 MiB more data to be accessed than the native filesystem format, but it needs to come over a much smaller pipe (local Ethernet typically) than the local SATA disk drive.

I also never got around to writing the "repack" support for the DHT schema, as it turns out to be fairly complex to safely repack data in the repository while also trying to minimize the amount of changes made to the database, due to very common limitations on database mutation rates.

This new DFS storage layer fixes a lot of those issues by taking the simple approach of storing relatively standard Git pack and index files on an abstract filesystem. Packs are accessed by an in-process buffer cache, similar to the WindowCache used by the local filesystem storage layer. Unlike the local file IO, there are some assumptions that the storage system has relatively high latency and no concept of "file handles". Instead it looks at the file more like HTTP byte range requests, where a read channel is simply a thunk to trigger a read request over the network.

The DFS code in this change is still abstract; it does not store on any particular filesystem, but is fairly well suited to Amazon S3 or Apache Hadoop HDFS. Storing packs directly on HDFS rather than HBase removes a layer of abstraction, as most HBase row reads turn into an HDFS read.

Most of the DFS code in this change was blatantly copied from the local filesystem code. Most parts should be refactored to be shared between the two storage systems, but right now I am hesitant to do this due to how well tuned the local filesystem code currently is.

Change-Id: Iec524abdf172e9ec5485d6c88ca6512cd8a6eafb
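To make the "read channel as a thunk over a byte range request" idea concrete, here is a minimal sketch of what a network-backed channel for this layer could look like. It is illustrative only: the class name HttpRangeReadableChannel and the use of java.net.HttpURLConnection are assumptions, not part of this change, and the exact shape of the ReadableChannel interface is not reproduced here. The sketch implements java.nio.channels.ReadableByteChannel (which is what Channels.newInputStream and IO.read in DfsPackFile ultimately rely on) plus the position(), size() and blockSize() calls that DfsPackFile makes on the channels returned by openPackIndex() and openPackFile(). Each read() maps to a single HTTP byte range request, and no I/O happens until a read is issued.

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical channel that turns each read into an HTTP byte range request. */
class HttpRangeReadableChannel implements ReadableByteChannel {
	private final URL url; // location of the pack or pack index file
	private long position; // next byte offset to request
	private boolean open = true;

	HttpRangeReadableChannel(URL url) {
		this.url = url;
	}

	/** Seeking only records the offset; no I/O happens until read() is called. */
	public void position(long newPosition) {
		position = newPosition;
	}

	/** One read() is one "Range: bytes=..." request over the network. */
	public int read(ByteBuffer dst) throws IOException {
		int want = dst.remaining();
		if (want == 0)
			return 0;
		HttpURLConnection c = (HttpURLConnection) url.openConnection();
		c.setRequestProperty("Range",
				"bytes=" + position + "-" + (position + want - 1));
		InputStream in = c.getInputStream();
		try {
			byte[] tmp = new byte[8192];
			int total = 0;
			int n;
			while (total < want
					&& (n = in.read(tmp, 0, Math.min(tmp.length, want - total))) > 0) {
				dst.put(tmp, 0, n);
				total += n;
			}
			position += total;
			return total == 0 ? -1 : total;
		} finally {
			in.close();
		}
	}

	/** Size from a HEAD request; -1 lets the caller discover it after a read. */
	public long size() throws IOException {
		HttpURLConnection c = (HttpURLConnection) url.openConnection();
		c.setRequestMethod("HEAD");
		String len = c.getHeaderField("Content-Length");
		return len != null ? Long.parseLong(len) : -1;
	}

	/** Preferred read alignment; 0 tells the caller to use the cache default. */
	public int blockSize() {
		return 0;
	}

	public boolean isOpen() {
		return open;
	}

	public void close() {
		open = false;
	}
}

A real backend would also have to cope with servers that ignore the Range header and return the whole file, and would report a non-zero blockSize() when the underlying store has a natural read alignment, which DfsPackFile then uses to align block loads.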
/*
 * Copyright (C) 2008-2011, Google Inc.
 * Copyright (C) 2007, Robin Rosenberg <robin.rosenberg@dewire.com>
 * Copyright (C) 2006-2008, Shawn O. Pearce <spearce@spearce.org>
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
package org.eclipse.jgit.storage.dfs;

import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.text.MessageFormat;
import java.util.Set;
import java.util.zip.CRC32;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

import org.eclipse.jgit.errors.CorruptObjectException;
import org.eclipse.jgit.errors.LargeObjectException;
import org.eclipse.jgit.errors.MissingObjectException;
import org.eclipse.jgit.errors.PackInvalidException;
import org.eclipse.jgit.errors.StoredObjectRepresentationNotAvailableException;
import org.eclipse.jgit.internal.JGitText;
import org.eclipse.jgit.lib.AbbreviatedObjectId;
import org.eclipse.jgit.lib.AnyObjectId;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectLoader;
import org.eclipse.jgit.storage.file.PackIndex;
import org.eclipse.jgit.storage.file.PackReverseIndex;
import org.eclipse.jgit.storage.pack.BinaryDelta;
import org.eclipse.jgit.storage.pack.PackOutputStream;
import org.eclipse.jgit.storage.pack.StoredObjectRepresentation;
import org.eclipse.jgit.util.IO;
import org.eclipse.jgit.util.LongList;
/**
 * A Git version 2 pack file representation. A pack file contains Git objects
 * in delta packed format yielding high compression of lots of objects where
 * some objects are similar.
 */
public final class DfsPackFile {
	/**
	 * File offset used to cache {@link #index} in {@link DfsBlockCache}.
	 * <p>
	 * To better manage memory, the forward index is stored as a single block in
	 * the block cache under this file position. A negative value is used
	 * because it cannot occur in a normal pack file, and it is less likely to
	 * collide with a valid data block from the file as the high bits will all
	 * be set when treated as an unsigned long by the cache code.
	 */
	private static final long POS_INDEX = -1;

	/** Offset used to cache {@link #reverseIndex}. See {@link #POS_INDEX}. */
	private static final long POS_REVERSE_INDEX = -2;

	/** Cache that owns this pack file and its data. */
	private final DfsBlockCache cache;

	/** Description of the pack file's storage. */
	private final DfsPackDescription packDesc;

	/** Unique identity of this pack while in-memory. */
	final DfsPackKey key;

	/**
	 * Total number of bytes in this pack file.
	 * <p>
	 * This field initializes to -1 and gets populated when a block is loaded.
	 */
	volatile long length;

	/**
	 * Preferred alignment for loading blocks from the backing file.
	 * <p>
	 * It is initialized to 0 and filled in on the first read made from the
	 * file. Block sizes may be odd, e.g. 4091, caused by the underlying DFS
	 * storing 4091 user bytes and 5 bytes block metadata into a lower level
	 * 4096 byte block on disk.
	 */
	private volatile int blockSize;

	/** True once corruption has been detected that cannot be worked around. */
	private volatile boolean invalid;

	/**
	 * Lock for initialization of {@link #index} and {@link #corruptObjects}.
	 * <p>
	 * This lock ensures only one thread can perform the initialization work.
	 */
	private final Object initLock = new Object();

	/** Index mapping {@link ObjectId} to position within the pack stream. */
	private volatile DfsBlockCache.Ref<PackIndex> index;

	/** Reverse version of {@link #index} mapping position to {@link ObjectId}. */
	private volatile DfsBlockCache.Ref<PackReverseIndex> reverseIndex;

	/**
	 * Objects we have tried to read, and discovered to be corrupt.
	 * <p>
	 * The list is allocated after the first corruption is found, and filled in
	 * as more entries are discovered. Typically this list is never used, as
	 * pack files do not usually contain corrupt objects.
	 */
	private volatile LongList corruptObjects;

	/**
	 * Construct a reader for an existing packfile.
	 *
	 * @param cache
	 *            cache that owns the pack data.
	 * @param desc
	 *            description of the pack within the DFS.
	 * @param key
	 *            interned key used to identify blocks in the block cache.
	 */
	DfsPackFile(DfsBlockCache cache, DfsPackDescription desc, DfsPackKey key) {
		this.cache = cache;
		this.packDesc = desc;
		this.key = key;

		length = desc.getPackSize();
		if (length <= 0)
			length = -1;
	}
	/** @return description that was originally used to configure this pack file. */
	public DfsPackDescription getPackDescription() {
		return packDesc;
	}

	/** @return bytes cached in memory for this pack, excluding the index. */
	public long getCachedSize() {
		return key.cachedSize.get();
	}

	private String getPackName() {
		return packDesc.getPackName();
	}

	void setBlockSize(int newSize) {
		blockSize = newSize;
	}

	void setPackIndex(PackIndex idx) {
		long objCnt = idx.getObjectCount();
		int recSize = Constants.OBJECT_ID_LENGTH + 8;
		int sz = (int) Math.min(objCnt * recSize, Integer.MAX_VALUE);
		index = cache.put(key, POS_INDEX, sz, idx);
	}

	PackIndex getPackIndex(DfsReader ctx) throws IOException {
		return idx(ctx);
	}

	private PackIndex idx(DfsReader ctx) throws IOException {
		DfsBlockCache.Ref<PackIndex> idxref = index;
		if (idxref != null) {
			PackIndex idx = idxref.get();
			if (idx != null)
				return idx;
		}

		if (invalid)
			throw new PackInvalidException(getPackName());

		synchronized (initLock) {
			idxref = index;
			if (idxref != null) {
				PackIndex idx = idxref.get();
				if (idx != null)
					return idx;
			}

			PackIndex idx;
			try {
				ReadableChannel rc = ctx.db.openPackIndex(packDesc);
				try {
					InputStream in = Channels.newInputStream(rc);
					int wantSize = 8192;
					int bs = rc.blockSize();
					if (0 < bs && bs < wantSize)
						bs = (wantSize / bs) * bs;
					else if (bs <= 0)
						bs = wantSize;
					in = new BufferedInputStream(in, bs);
					idx = PackIndex.read(in);
				} finally {
					rc.close();
				}
			} catch (EOFException e) {
				invalid = true;
				IOException e2 = new IOException(MessageFormat.format(
						DfsText.get().shortReadOfIndex, packDesc.getIndexName()));
				e2.initCause(e);
				throw e2;
			} catch (IOException e) {
				invalid = true;
				IOException e2 = new IOException(MessageFormat.format(
						DfsText.get().cannotReadIndex, packDesc.getIndexName()));
				e2.initCause(e);
				throw e2;
			}

			setPackIndex(idx);
			return idx;
		}
	}

	private PackReverseIndex getReverseIdx(DfsReader ctx) throws IOException {
		DfsBlockCache.Ref<PackReverseIndex> revref = reverseIndex;
		if (revref != null) {
			PackReverseIndex revidx = revref.get();
			if (revidx != null)
				return revidx;
		}

		synchronized (initLock) {
			revref = reverseIndex;
			if (revref != null) {
				PackReverseIndex revidx = revref.get();
				if (revidx != null)
					return revidx;
			}

			PackReverseIndex revidx = new PackReverseIndex(idx(ctx));
			reverseIndex = cache.put(key, POS_REVERSE_INDEX,
					packDesc.getReverseIndexSize(), revidx);
			return revidx;
		}
	}
	boolean hasObject(DfsReader ctx, AnyObjectId id) throws IOException {
		final long offset = idx(ctx).findOffset(id);
		return 0 < offset && !isCorrupt(offset);
	}

	/**
	 * Get an object from this pack.
	 *
	 * @param ctx
	 *            temporary working space associated with the calling thread.
	 * @param id
	 *            the object to obtain from the pack. Must not be null.
	 * @return the object loader for the requested object if it is contained in
	 *         this pack; null if the object was not found.
	 * @throws IOException
	 *             the pack file or the index could not be read.
	 */
	ObjectLoader get(DfsReader ctx, AnyObjectId id)
			throws IOException {
		long offset = idx(ctx).findOffset(id);
		return 0 < offset && !isCorrupt(offset) ? load(ctx, offset) : null;
	}

	long findOffset(DfsReader ctx, AnyObjectId id) throws IOException {
		return idx(ctx).findOffset(id);
	}

	void resolve(DfsReader ctx, Set<ObjectId> matches, AbbreviatedObjectId id,
			int matchLimit) throws IOException {
		idx(ctx).resolve(matches, id, matchLimit);
	}

	/** Release all memory used by this DfsPackFile instance. */
	public void close() {
		cache.remove(this);
		index = null;
		reverseIndex = null;
	}

	/**
	 * Obtain the total number of objects available in this pack. This method
	 * relies on pack index, giving number of effectively available objects.
	 *
	 * @param ctx
	 *            current reader for the calling thread.
	 * @return number of objects in index of this pack, likewise in this pack
	 * @throws IOException
	 *             the index file cannot be loaded into memory.
	 */
	long getObjectCount(DfsReader ctx) throws IOException {
		return idx(ctx).getObjectCount();
	}

	/**
	 * Search for object id with the specified start offset in associated pack
	 * (reverse) index.
	 *
	 * @param ctx
	 *            current reader for the calling thread.
	 * @param offset
	 *            start offset of object to find
	 * @return object id for this offset, or null if no object was found
	 * @throws IOException
	 *             the index file cannot be loaded into memory.
	 */
	ObjectId findObjectForOffset(DfsReader ctx, long offset) throws IOException {
		return getReverseIdx(ctx).findObject(offset);
	}

	private byte[] decompress(long position, int sz, DfsReader ctx)
			throws IOException, DataFormatException {
		byte[] dstbuf;
		try {
			dstbuf = new byte[sz];
		} catch (OutOfMemoryError noMemory) {
			// The size may be larger than our heap allows, return null to
			// let the caller know allocation isn't possible and it should
			// use the large object streaming approach instead.
			//
			// For example, this can occur when sz is 640 MB, and JRE
			// maximum heap size is only 256 MB. Even if the JRE has
			// 200 MB free, it cannot allocate a 640 MB byte array.
			return null;
		}

		if (ctx.inflate(this, position, dstbuf, false) != sz)
			throw new EOFException(MessageFormat.format(
					JGitText.get().shortCompressedStreamAt,
					Long.valueOf(position)));
		return dstbuf;
	}
	void copyPackAsIs(PackOutputStream out, boolean validate, DfsReader ctx)
			throws IOException {
		// Pin the first window, this ensures the length is accurate.
		ctx.pin(this, 0);
		ctx.copyPackAsIs(this, length, validate, out);
	}

	void copyAsIs(PackOutputStream out, DfsObjectToPack src,
			boolean validate, DfsReader ctx) throws IOException,
			StoredObjectRepresentationNotAvailableException {
		final CRC32 crc1 = validate ? new CRC32() : null;
		final CRC32 crc2 = validate ? new CRC32() : null;
		final byte[] buf = out.getCopyBuffer();

		// Rip apart the header so we can discover the size.
		//
		try {
			readFully(src.offset, buf, 0, 20, ctx);
		} catch (IOException ioError) {
			StoredObjectRepresentationNotAvailableException gone;
			gone = new StoredObjectRepresentationNotAvailableException(src);
			gone.initCause(ioError);
			throw gone;
		}
		int c = buf[0] & 0xff;
		final int typeCode = (c >> 4) & 7;
		long inflatedLength = c & 15;
		int shift = 4;
		int headerCnt = 1;
		while ((c & 0x80) != 0) {
			c = buf[headerCnt++] & 0xff;
			inflatedLength += ((long) (c & 0x7f)) << shift;
			shift += 7;
		}

		if (typeCode == Constants.OBJ_OFS_DELTA) {
			do {
				c = buf[headerCnt++] & 0xff;
			} while ((c & 128) != 0);
			if (validate) {
				crc1.update(buf, 0, headerCnt);
				crc2.update(buf, 0, headerCnt);
			}
		} else if (typeCode == Constants.OBJ_REF_DELTA) {
			if (validate) {
				crc1.update(buf, 0, headerCnt);
				crc2.update(buf, 0, headerCnt);
			}

			readFully(src.offset + headerCnt, buf, 0, 20, ctx);
			if (validate) {
				crc1.update(buf, 0, 20);
				crc2.update(buf, 0, 20);
			}
			headerCnt += 20;
		} else if (validate) {
			crc1.update(buf, 0, headerCnt);
			crc2.update(buf, 0, headerCnt);
		}

		final long dataOffset = src.offset + headerCnt;
		final long dataLength = src.length;
		final long expectedCRC;
		final DfsBlock quickCopy;

		// Verify the object isn't corrupt before sending. If it is,
		// we report it missing instead.
		//
		try {
			quickCopy = ctx.quickCopy(this, dataOffset, dataLength);

			if (validate && idx(ctx).hasCRC32Support()) {
				// Index has the CRC32 code cached, validate the object.
				//
				expectedCRC = idx(ctx).findCRC32(src);
				if (quickCopy != null) {
					quickCopy.crc32(crc1, dataOffset, (int) dataLength);
				} else {
					long pos = dataOffset;
					long cnt = dataLength;
					while (cnt > 0) {
						final int n = (int) Math.min(cnt, buf.length);
						readFully(pos, buf, 0, n, ctx);
						crc1.update(buf, 0, n);
						pos += n;
						cnt -= n;
					}
				}
				if (crc1.getValue() != expectedCRC) {
					setCorrupt(src.offset);
					throw new CorruptObjectException(MessageFormat.format(
							JGitText.get().objectAtHasBadZlibStream,
							Long.valueOf(src.offset), getPackName()));
				}
			} else if (validate) {
				// We don't have a CRC32 code in the index, so compute it
				// now while inflating the raw data to get zlib to tell us
				// whether or not the data is safe.
				//
				Inflater inf = ctx.inflater();
				byte[] tmp = new byte[1024];
				if (quickCopy != null) {
					quickCopy.check(inf, tmp, dataOffset, (int) dataLength);
				} else {
					long pos = dataOffset;
					long cnt = dataLength;
					while (cnt > 0) {
						final int n = (int) Math.min(cnt, buf.length);
						readFully(pos, buf, 0, n, ctx);
						crc1.update(buf, 0, n);
						inf.setInput(buf, 0, n);
						while (inf.inflate(tmp, 0, tmp.length) > 0)
							continue;
						pos += n;
						cnt -= n;
					}
				}
				if (!inf.finished() || inf.getBytesRead() != dataLength) {
					setCorrupt(src.offset);
					throw new EOFException(MessageFormat.format(
							JGitText.get().shortCompressedStreamAt,
							Long.valueOf(src.offset)));
				}
				expectedCRC = crc1.getValue();
			} else {
				expectedCRC = -1;
			}
		} catch (DataFormatException dataFormat) {
			setCorrupt(src.offset);

			CorruptObjectException corruptObject = new CorruptObjectException(
					MessageFormat.format(
							JGitText.get().objectAtHasBadZlibStream,
							Long.valueOf(src.offset), getPackName()));
			corruptObject.initCause(dataFormat);

			StoredObjectRepresentationNotAvailableException gone;
			gone = new StoredObjectRepresentationNotAvailableException(src);
			gone.initCause(corruptObject);
			throw gone;
		} catch (IOException ioError) {
			StoredObjectRepresentationNotAvailableException gone;
			gone = new StoredObjectRepresentationNotAvailableException(src);
			gone.initCause(ioError);
			throw gone;
		}

		if (quickCopy != null) {
			// The entire object fits into a single byte array window slice,
			// and we have it pinned. Write this out without copying.
			//
			out.writeHeader(src, inflatedLength);
			quickCopy.write(out, dataOffset, (int) dataLength, null);

		} else if (dataLength <= buf.length) {
			// Tiny optimization: Lots of objects are very small deltas or
			// deflated commits that are likely to fit in the copy buffer.
			//
			if (!validate) {
				long pos = dataOffset;
				long cnt = dataLength;
				while (cnt > 0) {
					final int n = (int) Math.min(cnt, buf.length);
					readFully(pos, buf, 0, n, ctx);
					pos += n;
					cnt -= n;
				}
			}
			out.writeHeader(src, inflatedLength);
			out.write(buf, 0, (int) dataLength);
		} else {
			// Now we are committed to sending the object. As we spool it out,
			// check its CRC32 code to make sure there wasn't corruption between
			// the verification we did above, and us actually outputting it.
			//
			out.writeHeader(src, inflatedLength);
			long pos = dataOffset;
			long cnt = dataLength;
			while (cnt > 0) {
				final int n = (int) Math.min(cnt, buf.length);
				readFully(pos, buf, 0, n, ctx);
				if (validate)
					crc2.update(buf, 0, n);
				out.write(buf, 0, n);
				pos += n;
				cnt -= n;
			}
			if (validate && crc2.getValue() != expectedCRC) {
				throw new CorruptObjectException(MessageFormat.format(
						JGitText.get().objectAtHasBadZlibStream,
						Long.valueOf(src.offset), getPackName()));
			}
		}
	}

	boolean invalid() {
		return invalid;
	}

	void setInvalid() {
		invalid = true;
	}

	private void readFully(long position, byte[] dstbuf, int dstoff, int cnt,
			DfsReader ctx) throws IOException {
		if (ctx.copy(this, position, dstbuf, dstoff, cnt) != cnt)
			throw new EOFException();
	}
	long alignToBlock(long pos) {
		int size = blockSize;
		if (size == 0)
			size = cache.getBlockSize();
		return (pos / size) * size;
	}

	DfsBlock getOrLoadBlock(long pos, DfsReader ctx) throws IOException {
		return cache.getOrLoad(this, pos, ctx);
	}

	DfsBlock readOneBlock(long pos, DfsReader ctx)
			throws IOException {
		if (invalid)
			throw new PackInvalidException(getPackName());

		boolean close = true;
		ReadableChannel rc = ctx.db.openPackFile(packDesc);
		try {
			// If the block alignment is not yet known, discover it. Prefer the
			// larger size from either the cache or the file itself.
			int size = blockSize;
			if (size == 0) {
				size = rc.blockSize();
				if (size <= 0)
					size = cache.getBlockSize();
				else if (size < cache.getBlockSize())
					size = (cache.getBlockSize() / size) * size;
				blockSize = size;
				pos = (pos / size) * size;
			}

			// If the size of the file is not yet known, try to discover it.
			// Channels may choose to return -1 to indicate they don't
			// know the length yet, in this case read up to the size unit
			// given by the caller, then recheck the length.
			long len = length;
			if (len < 0) {
				len = rc.size();
				if (0 <= len)
					length = len;
			}

			if (0 <= len && len < pos + size)
				size = (int) (len - pos);
			if (size <= 0)
				throw new EOFException(MessageFormat.format(
						DfsText.get().shortReadOfBlock, Long.valueOf(pos),
						getPackName(), Long.valueOf(0), Long.valueOf(0)));

			byte[] buf = new byte[size];
			rc.position(pos);
			int cnt = IO.read(rc, buf, 0, size);
			if (cnt != size) {
				if (0 <= len) {
					throw new EOFException(MessageFormat.format(
							DfsText.get().shortReadOfBlock,
							Long.valueOf(pos),
							getPackName(),
							Integer.valueOf(size),
							Integer.valueOf(cnt)));
				}

				// Assume the entire thing was read in a single shot, compact
				// the buffer to only the space required.
				byte[] n = new byte[cnt];
				System.arraycopy(buf, 0, n, 0, n.length);
				buf = n;
			} else if (len < 0) {
				// With no length at the start of the read, the channel should
				// have the length available at the end.
				length = len = rc.size();
			}

			DfsBlock v = new DfsBlock(key, pos, buf);
			if (v.end < len)
				close = !cache.readAhead(rc, key, size, v.end, len, ctx);
			return v;
		} finally {
			if (close)
				rc.close();
		}
	}
	ObjectLoader load(DfsReader ctx, long pos)
			throws IOException {
		try {
			final byte[] ib = ctx.tempId;
			Delta delta = null;
			byte[] data = null;
			int type = Constants.OBJ_BAD;
			boolean cached = false;

			SEARCH: for (;;) {
				readFully(pos, ib, 0, 20, ctx);
				int c = ib[0] & 0xff;
				final int typeCode = (c >> 4) & 7;
				long sz = c & 15;
				int shift = 4;
				int p = 1;
				while ((c & 0x80) != 0) {
					c = ib[p++] & 0xff;
					sz += ((long) (c & 0x7f)) << shift;
					shift += 7;
				}

				switch (typeCode) {
				case Constants.OBJ_COMMIT:
				case Constants.OBJ_TREE:
				case Constants.OBJ_BLOB:
				case Constants.OBJ_TAG: {
					if (delta != null) {
						data = decompress(pos + p, (int) sz, ctx);
						type = typeCode;
						break SEARCH;
					}

					if (sz < ctx.getStreamFileThreshold()) {
						data = decompress(pos + p, (int) sz, ctx);
						if (data != null)
							return new ObjectLoader.SmallObject(typeCode, data);
					}
					return new LargePackedWholeObject(typeCode, sz, pos, p, this, ctx.db);
				}

				case Constants.OBJ_OFS_DELTA: {
					c = ib[p++] & 0xff;
					long base = c & 127;
					while ((c & 128) != 0) {
						base += 1;
						c = ib[p++] & 0xff;
						base <<= 7;
						base += (c & 127);
					}
					base = pos - base;
					delta = new Delta(delta, pos, (int) sz, p, base);
					if (sz != delta.deltaSize)
						break SEARCH;

					DeltaBaseCache.Entry e = ctx.getDeltaBaseCache().get(key, base);
					if (e != null) {
						type = e.type;
						data = e.data;
						cached = true;
						break SEARCH;
					}
					pos = base;
					continue SEARCH;
				}

				case Constants.OBJ_REF_DELTA: {
					readFully(pos + p, ib, 0, 20, ctx);
					long base = findDeltaBase(ctx, ObjectId.fromRaw(ib));
					delta = new Delta(delta, pos, (int) sz, p + 20, base);
					if (sz != delta.deltaSize)
						break SEARCH;

					DeltaBaseCache.Entry e = ctx.getDeltaBaseCache().get(key, base);
					if (e != null) {
						type = e.type;
						data = e.data;
						cached = true;
						break SEARCH;
					}
					pos = base;
					continue SEARCH;
				}

				default:
					throw new IOException(MessageFormat.format(
							JGitText.get().unknownObjectType, Integer.valueOf(typeCode)));
				}
			}

			// At this point there is at least one delta to apply to data.
			// (Whole objects with no deltas to apply return early above.)
			if (data == null)
				throw new LargeObjectException();

			do {
				// Cache only the base immediately before desired object.
				if (cached)
					cached = false;
				else if (delta.next == null)
					ctx.getDeltaBaseCache().put(key, delta.basePos, type, data);

				pos = delta.deltaPos;

				byte[] cmds = decompress(pos + delta.hdrLen, delta.deltaSize, ctx);
				if (cmds == null) {
					data = null; // Discard base in case of OutOfMemoryError
					throw new LargeObjectException();
				}

				final long sz = BinaryDelta.getResultSize(cmds);
				if (Integer.MAX_VALUE <= sz)
					throw new LargeObjectException.ExceedsByteArrayLimit();

				final byte[] result;
				try {
					result = new byte[(int) sz];
				} catch (OutOfMemoryError tooBig) {
					data = null; // Discard base in case of OutOfMemoryError
					cmds = null;
					throw new LargeObjectException.OutOfMemory(tooBig);
				}

				BinaryDelta.apply(data, cmds, result);
				data = result;
				delta = delta.next;
			} while (delta != null);

			return new ObjectLoader.SmallObject(type, data);

		} catch (DataFormatException dfe) {
			CorruptObjectException coe = new CorruptObjectException(
					MessageFormat.format(
							JGitText.get().objectAtHasBadZlibStream, Long.valueOf(pos),
							getPackName()));
			coe.initCause(dfe);
			throw coe;
		}
	}

	private long findDeltaBase(DfsReader ctx, ObjectId baseId)
			throws IOException, MissingObjectException {
		long ofs = idx(ctx).findOffset(baseId);
		if (ofs < 0)
			throw new MissingObjectException(baseId,
					JGitText.get().missingDeltaBase);
		return ofs;
	}
	private static class Delta {
		/** Child that applies onto this object. */
		final Delta next;

		/** Offset of the delta object. */
		final long deltaPos;

		/** Size of the inflated delta stream. */
		final int deltaSize;

		/** Total size of the delta's pack entry header (including base). */
		final int hdrLen;

		/** Offset of the base object this delta applies onto. */
		final long basePos;

		Delta(Delta next, long ofs, int sz, int hdrLen, long baseOffset) {
			this.next = next;
			this.deltaPos = ofs;
			this.deltaSize = sz;
			this.hdrLen = hdrLen;
			this.basePos = baseOffset;
		}
	}

	byte[] getDeltaHeader(DfsReader wc, long pos)
			throws IOException, DataFormatException {
		// The delta stream starts as two variable length integers. If we
		// assume they are 64 bits each, we need 16 bytes to encode them,
		// plus 2 extra bytes for the variable length overhead. So 18 is
		// the longest delta instruction header.
		//
		final byte[] hdr = new byte[32];
		wc.inflate(this, pos, hdr, true /* header only */);
		return hdr;
	}
	int getObjectType(DfsReader ctx, long pos) throws IOException {
		final byte[] ib = ctx.tempId;
		for (;;) {
			readFully(pos, ib, 0, 20, ctx);
			int c = ib[0] & 0xff;
			final int type = (c >> 4) & 7;

			switch (type) {
			case Constants.OBJ_COMMIT:
			case Constants.OBJ_TREE:
			case Constants.OBJ_BLOB:
			case Constants.OBJ_TAG:
				return type;

			case Constants.OBJ_OFS_DELTA: {
				int p = 1;
				while ((c & 0x80) != 0)
					c = ib[p++] & 0xff;
				c = ib[p++] & 0xff;
				long ofs = c & 127;
				while ((c & 128) != 0) {
					ofs += 1;
					c = ib[p++] & 0xff;
					ofs <<= 7;
					ofs += (c & 127);
				}
				pos = pos - ofs;
				continue;
			}

			case Constants.OBJ_REF_DELTA: {
				int p = 1;
				while ((c & 0x80) != 0)
					c = ib[p++] & 0xff;
				readFully(pos + p, ib, 0, 20, ctx);
				pos = findDeltaBase(ctx, ObjectId.fromRaw(ib));
				continue;
			}

			default:
				throw new IOException(MessageFormat.format(
						JGitText.get().unknownObjectType, Integer.valueOf(type)));
			}
		}
	}

	long getObjectSize(DfsReader ctx, AnyObjectId id) throws IOException {
		final long offset = idx(ctx).findOffset(id);
		return 0 < offset ? getObjectSize(ctx, offset) : -1;
	}

	long getObjectSize(DfsReader ctx, long pos)
			throws IOException {
		final byte[] ib = ctx.tempId;
		readFully(pos, ib, 0, 20, ctx);
		int c = ib[0] & 0xff;
		final int type = (c >> 4) & 7;
		long sz = c & 15;
		int shift = 4;
		int p = 1;
		while ((c & 0x80) != 0) {
			c = ib[p++] & 0xff;
			sz += ((long) (c & 0x7f)) << shift;
			shift += 7;
		}

		long deltaAt;
		switch (type) {
		case Constants.OBJ_COMMIT:
		case Constants.OBJ_TREE:
		case Constants.OBJ_BLOB:
		case Constants.OBJ_TAG:
			return sz;

		case Constants.OBJ_OFS_DELTA:
			c = ib[p++] & 0xff;
			while ((c & 128) != 0)
				c = ib[p++] & 0xff;
			deltaAt = pos + p;
			break;

		case Constants.OBJ_REF_DELTA:
			deltaAt = pos + p + 20;
			break;

		default:
			throw new IOException(MessageFormat.format(
					JGitText.get().unknownObjectType, Integer.valueOf(type)));
		}

		try {
			return BinaryDelta.getResultSize(getDeltaHeader(ctx, deltaAt));
		} catch (DataFormatException dfe) {
			CorruptObjectException coe = new CorruptObjectException(
					MessageFormat.format(
							JGitText.get().objectAtHasBadZlibStream, Long.valueOf(pos),
							getPackName()));
			coe.initCause(dfe);
			throw coe;
		}
	}
	void representation(DfsReader ctx, DfsObjectRepresentation r)
			throws IOException {
		final long pos = r.offset;
		final byte[] ib = ctx.tempId;
		readFully(pos, ib, 0, 20, ctx);
		int c = ib[0] & 0xff;
		int p = 1;
		final int typeCode = (c >> 4) & 7;
		while ((c & 0x80) != 0)
			c = ib[p++] & 0xff;

		long len = (getReverseIdx(ctx).findNextOffset(pos, length - 20) - pos);
		switch (typeCode) {
		case Constants.OBJ_COMMIT:
		case Constants.OBJ_TREE:
		case Constants.OBJ_BLOB:
		case Constants.OBJ_TAG:
			r.format = StoredObjectRepresentation.PACK_WHOLE;
			r.length = len - p;
			return;

		case Constants.OBJ_OFS_DELTA: {
			c = ib[p++] & 0xff;
			long ofs = c & 127;
			while ((c & 128) != 0) {
				ofs += 1;
				c = ib[p++] & 0xff;
				ofs <<= 7;
				ofs += (c & 127);
			}
			ofs = pos - ofs;
			r.format = StoredObjectRepresentation.PACK_DELTA;
			r.baseId = findObjectForOffset(ctx, ofs);
			r.length = len - p;
			return;
		}

		case Constants.OBJ_REF_DELTA: {
			len -= p;
			len -= Constants.OBJECT_ID_LENGTH;
			readFully(pos + p, ib, 0, 20, ctx);
			ObjectId id = ObjectId.fromRaw(ib);
			r.format = StoredObjectRepresentation.PACK_DELTA;
			r.baseId = id;
			r.length = len;
			return;
		}

		default:
			throw new IOException(MessageFormat.format(
					JGitText.get().unknownObjectType, Integer.valueOf(typeCode)));
		}
	}

	private boolean isCorrupt(long offset) {
		LongList list = corruptObjects;
		if (list == null)
			return false;
		synchronized (list) {
			return list.contains(offset);
		}
	}

	private void setCorrupt(long offset) {
		LongList list = corruptObjects;
		if (list == null) {
			synchronized (initLock) {
				list = corruptObjects;
				if (list == null) {
					list = new LongList();
					corruptObjects = list;
				}
			}
		}
		synchronized (list) {
			list.add(offset);
		}
	}
}