
ObjectReuseAsIs.java 8.5KB

/*
 * Copyright (C) 2010, Google Inc. and others
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Distribution License v. 1.0 which is available at
 * https://www.eclipse.org/org/documents/edl-v10.php.
 *
 * SPDX-License-Identifier: BSD-3-Clause
 */

package org.eclipse.jgit.internal.storage.pack;

import java.io.IOException;
import java.util.Collection;
import java.util.List;

import org.eclipse.jgit.errors.MissingObjectException;
import org.eclipse.jgit.errors.StoredObjectRepresentationNotAvailableException;
import org.eclipse.jgit.errors.StoredPackRepresentationNotAvailableException;
import org.eclipse.jgit.lib.AnyObjectId;
import org.eclipse.jgit.lib.BitmapIndex.BitmapBuilder;
import org.eclipse.jgit.lib.ProgressMonitor;

/**
 * Extension of {@link org.eclipse.jgit.lib.ObjectReader} that supports reusing
 * objects in packs.
 * <p>
 * {@code ObjectReader} implementations may also optionally implement this
 * interface to support
 * {@link org.eclipse.jgit.internal.storage.pack.PackWriter} with a means of
 * copying an object that is already in pack encoding format directly into the
 * output stream, without incurring decompression and recompression overheads.
 */
public interface ObjectReuseAsIs {
	/**
	 * Allocate a new {@code PackWriter} state structure for an object.
	 * <p>
	 * {@link org.eclipse.jgit.internal.storage.pack.PackWriter} allocates these
	 * objects to keep track of the per-object state, and how to load the
	 * objects efficiently into the generated stream. Implementers may subclass
	 * this type with additional object state, such as to remember what file and
	 * offset contains the object's pack encoded data.
	 *
	 * @param objectId
	 *            the id of the object that will be packed.
	 * @param type
	 *            the Git type of the object that will be packed.
	 * @return a new instance for this object.
	 */
	ObjectToPack newObjectToPack(AnyObjectId objectId, int type);

	/**
	 * Select the best object representation for a packer.
	 * <p>
	 * Implementations should iterate through all available representations of
	 * an object, and pass them in turn to the PackWriter through
	 * {@link org.eclipse.jgit.internal.storage.pack.PackWriter#select(ObjectToPack, StoredObjectRepresentation)}
	 * so the writer can select the most suitable representation to reuse into
	 * the output stream.
	 * <p>
	 * If the implementation returns a CachedPack from
	 * {@link #getCachedPacksAndUpdate(BitmapBuilder)} it must consider the
	 * representation of any object that is stored in any of the offered
	 * CachedPacks. PackWriter relies on this behavior to prune duplicate
	 * objects out of the pack stream when it selects a CachedPack and the
	 * object was also reached through the thin-pack enumeration.
	 * <p>
	 * The implementation may choose to consider multiple objects at once on
	 * concurrent threads, but must evaluate all representations of an object
	 * within the same thread.
	 *
	 * @param packer
	 *            the packer that will write the object in the near future.
	 * @param monitor
	 *            progress monitor; the implementation should update the
	 *            monitor once for each item in the iteration when selection
	 *            is done.
	 * @param objects
	 *            the objects that are being packed.
	 * @throws org.eclipse.jgit.errors.MissingObjectException
	 *             there is no representation available for the object, as it is
	 *             no longer in the repository. Packing will abort.
	 * @throws java.io.IOException
	 *             the repository cannot be accessed. Packing will abort.
	 */
	void selectObjectRepresentation(PackWriter packer,
			ProgressMonitor monitor, Iterable<ObjectToPack> objects)
			throws IOException, MissingObjectException;

	/**
	 * Write objects to the pack stream in roughly the order given.
	 * <p>
	 * {@code PackWriter} invokes this method to write out one or more objects,
	 * in approximately the order specified by the iteration over the list. A
	 * simple implementation of this method would just iterate the list and
	 * output each object:
	 *
	 * <pre>
	 * for (ObjectToPack obj : list)
	 *   out.writeObject(obj);
	 * </pre>
	 * <p>
	 * However, more sophisticated implementors may try to perform some (small)
	 * reordering to access objects that are stored close to each other at
	 * roughly the same time. Implementations may choose to write objects out of
	 * order, but this may increase pack file size due to using a larger header
	 * format to reach a delta base that is later in the stream. It may also
	 * reduce data locality for the reader, slowing down data access.
	 * <p>
	 * Invoking
	 * {@link org.eclipse.jgit.internal.storage.pack.PackOutputStream#writeObject(ObjectToPack)}
	 * will cause
	 * {@link #copyObjectAsIs(PackOutputStream, ObjectToPack, boolean)} to be
	 * invoked recursively on {@code this} if the current object is scheduled
	 * for reuse.
	 *
	 * @param out
	 *            the stream to write each object to.
	 * @param list
	 *            the list of objects to write. Objects should be written in
	 *            approximately this order. Implementors may re-sort the list
	 *            elements in-place during writing if desired.
	 * @throws java.io.IOException
	 *             the stream cannot be written to, or one or more required
	 *             objects cannot be accessed from the object database.
	 */
	void writeObjects(PackOutputStream out, List<ObjectToPack> list)
			throws IOException;

	/**
	 * Output a previously selected representation.
	 * <p>
	 * {@code PackWriter} invokes this method only if a representation
	 * previously given to it by {@code selectObjectRepresentation} was chosen
	 * for reuse into the output stream. The {@code otp} argument is an instance
	 * created by this reader's own {@code newObjectToPack}, and the
	 * representation data saved within it also originated from this reader.
	 * <p>
	 * Implementors must write the object header before copying the raw data to
	 * the output stream. The typical implementation is like:
	 *
	 * <pre>
	 * MyToPack mtp = (MyToPack) otp;
	 * byte[] raw;
	 * if (validate)
	 *   // throw StoredObjectRepresentationNotAvailableException here, if at all
	 *   raw = validate(mtp);
	 * else
	 *   raw = readFast(mtp);
	 * out.writeHeader(mtp, mtp.inflatedSize);
	 * out.write(raw);
	 * </pre>
	 *
	 * @param out
	 *            stream the object should be written to.
	 * @param otp
	 *            the object's saved representation information.
	 * @param validate
	 *            if true the representation must be validated and not be
	 *            corrupt before being reused. If false, validation may be
	 *            skipped as it will be performed elsewhere in the processing
	 *            pipeline.
	 * @throws org.eclipse.jgit.errors.StoredObjectRepresentationNotAvailableException
	 *             the previously selected representation is no longer
	 *             available. If thrown before {@code out.writeHeader} the pack
	 *             writer will try to find another representation, and write
	 *             that one instead. If thrown after {@code out.writeHeader},
	 *             packing will abort.
	 * @throws java.io.IOException
	 *             the stream's write method threw an exception. Packing will
	 *             abort.
	 */
	void copyObjectAsIs(PackOutputStream out, ObjectToPack otp,
			boolean validate) throws IOException,
			StoredObjectRepresentationNotAvailableException;

	/**
	 * Append an entire pack's contents onto the output stream.
	 * <p>
	 * The entire pack, excluding its header and trailing footer, is sent.
	 *
	 * @param out
	 *            stream to append the pack onto.
	 * @param pack
	 *            the cached pack to send.
	 * @throws java.io.IOException
	 *             the pack cannot be read, or stream did not accept a write.
	 * @throws org.eclipse.jgit.errors.StoredPackRepresentationNotAvailableException
	 *             the stored representation of the cached pack is not
	 *             available.
	 */
	void copyPackAsIs(PackOutputStream out, CachedPack pack)
			throws IOException, StoredPackRepresentationNotAvailableException;

	/**
	 * Obtain the available cached packs that match the bitmap and update
	 * the bitmap by removing the items that are in the CachedPack.
	 * <p>
	 * A cached pack has known starting points and may be sent entirely as-is,
	 * with almost no effort on the sender's part.
	 *
	 * @param needBitmap
	 *            the bitmap that contains all of the objects the client wants.
	 * @return the available cached packs.
	 * @throws java.io.IOException
	 *             the cached packs cannot be listed from the repository.
	 *             Callers may choose to ignore this and continue as if there
	 *             were no cached packs.
	 */
	Collection<CachedPack> getCachedPacksAndUpdate(
			BitmapBuilder needBitmap) throws IOException;
}
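
The Javadoc for `copyObjectAsIs` distinguishes between a representation that disappears before `out.writeHeader` (the writer may pick another representation) and one that disappears afterwards (packing aborts). The sketch below illustrates only that rule. It does not use JGit: `SketchOut`, `Reuser`, `RepresentationGoneException`, and the `fallback` byte array are hypothetical stand-ins invented for this example, and how the real `PackWriter` recovers is not shown in this file.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class ReuseFallbackSketch {
    /** Hypothetical stand-in for StoredObjectRepresentationNotAvailableException. */
    static class RepresentationGoneException extends Exception {
    }

    /** Hypothetical stand-in for PackOutputStream; remembers whether a header was written. */
    static class SketchOut {
        final ByteArrayOutputStream buf = new ByteArrayOutputStream();
        boolean headerWritten;

        void writeHeader(String id, int size) {
            headerWritten = true;
            write(("[" + id + ":" + size + "]").getBytes(StandardCharsets.UTF_8));
        }

        void write(byte[] raw) {
            buf.write(raw, 0, raw.length);
        }
    }

    /** Hypothetical stand-in for the copyObjectAsIs method of a reader. */
    interface Reuser {
        void copyObjectAsIs(SketchOut out, String id, boolean validate)
                throws RepresentationGoneException;
    }

    /**
     * Writer side of the contract: try the reused representation first. A
     * failure before the header leaves the stream untouched, so another
     * representation (here simply the assumed fallback bytes) can be written.
     * A failure after the header cannot be undone, so the sketch aborts.
     */
    static String writeObject(Reuser reader, String id, byte[] fallback) {
        SketchOut out = new SketchOut();
        try {
            reader.copyObjectAsIs(out, id, true);
        } catch (RepresentationGoneException gone) {
            if (out.headerWritten)
                throw new IllegalStateException("reuse failed after header for " + id, gone);
            out.writeHeader(id, fallback.length);
            out.write(fallback);
        }
        return new String(out.buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = "data".getBytes(StandardCharsets.UTF_8);
        Reuser ok = (out, id, validate) -> {
            out.writeHeader(id, data.length);
            out.write(data);
        };
        Reuser goneEarly = (out, id, validate) -> {
            throw new RepresentationGoneException();
        };
        Reuser goneLate = (out, id, validate) -> {
            out.writeHeader(id, data.length);
            throw new RepresentationGoneException();
        };

        System.out.println(writeObject(ok, "a", data));        // [a:4]data
        System.out.println(writeObject(goneEarly, "b", data)); // [b:4]data
        try {
            writeObject(goneLate, "c", data);
        } catch (IllegalStateException e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

This is also why the typical implementation shown in the Javadoc runs `validate(mtp)` before `out.writeHeader`: any check that can fail should happen while the output stream is still untouched.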