
LocalCachedPack.java 4.0KB

PackWriter: Support reuse of entire packs

The most expensive part of packing a repository for transport to another system is enumerating all of the objects in the repository. Once this gets to the size of the linux-2.6 repository (1.8 million objects), enumeration can take several CPU minutes and costs a lot of temporary working set memory.

Teach PackWriter to efficiently reuse an existing "cached pack" by answering a clone request with a thin pack followed by a larger cached pack appended to the end. This requires the repository owner to first construct the cached pack by hand, and record the tip commits inside of $GIT_DIR/objects/info/cached-packs:

  cd $GIT_DIR
  root=$(git rev-parse master)
  tmp=objects/.tmp-$$
  names=$(echo $root | git pack-objects --keep-true-parents --revs $tmp)
  for n in $names; do
    chmod a-w $tmp-$n.pack $tmp-$n.idx
    touch objects/pack/pack-$n.keep
    mv $tmp-$n.pack objects/pack/pack-$n.pack
    mv $tmp-$n.idx objects/pack/pack-$n.idx
  done

  (echo "+ $root";
   for n in $names; do echo "P $n"; done;
   echo) >>objects/info/cached-packs

  git repack -a -d

When a clone request needs to include $root, the corresponding cached pack will be copied as-is, rather than enumerating all of the objects that are reachable from $root.

For a linux-2.6 kernel repository that should be about 376 MiB, the above process creates two packs of 368 MiB and 38 MiB[1]. This is a local disk usage increase of ~26 MiB, due to reduced delta compression between the large cached pack and the smaller recent activity pack. The overhead is similar to 1 full copy of the compressed project sources.

With this cached pack in hand, JGit daemon completes a clone request in 1m17s less time, but a slightly larger data transfer (+2.39 MiB):

Before:
  remote: Counting objects: 1861830, done
  remote: Finding sources: 100% (1861830/1861830)
  remote: Getting sizes: 100% (88243/88243)
  remote: Compressing objects: 100% (88184/88184)
  Receiving objects: 100% (1861830/1861830), 376.01 MiB | 19.01 MiB/s, done.
  remote: Total 1861830 (delta 4706), reused 1851053 (delta 1553844)
  Resolving deltas: 100% (1564621/1564621), done.

  real 3m19.005s

After:
  remote: Counting objects: 1601, done
  remote: Counting objects: 1828460, done
  remote: Finding sources: 100% (50475/50475)
  remote: Getting sizes: 100% (18843/18843)
  remote: Compressing objects: 100% (7585/7585)
  remote: Total 1861830 (delta 2407), reused 1856197 (delta 37510)
  Receiving objects: 100% (1861830/1861830), 378.40 MiB | 31.31 MiB/s, done.
  Resolving deltas: 100% (1559477/1559477), done.

  real 2m2.938s

Repository owners can periodically refresh their cached packs by repacking their repository, folding all newer objects into a larger cached pack. Since repacking is already considered to be a normal Git maintenance activity, this isn't a very big burden.

[1] In this test $root was set back about two weeks.

Change-Id: Ib87131d5c4b5e8c5cacb0f4fe16ff4ece554734b
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
13 years ago
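
The shell script in the commit message appends one record per cached pack to objects/info/cached-packs: a "+ <tip>" line, one "P <pack name>" line per pack, and a terminating blank line. As a rough sketch of how such a file might be read back, the example below parses that layout. It assumes the three line kinds written by the script are the whole format (the commit message does not give a grammar), and the names CachedPackListSketch, Entry, and parse are hypothetical, not part of JGit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CachedPackListSketch {
    /** One record: tip commit ids plus the names of the packs that hold them. */
    static final class Entry {
        final Set<String> tips = new LinkedHashSet<String>();
        final List<String> packNames = new ArrayList<String>();
    }

    /**
     * Parses records of the form written by the script:
     * "+ <tip>" lines, "P <name>" lines, then a blank line.
     * Assumption: any other line is ignored.
     */
    static List<Entry> parse(List<String> lines) {
        List<Entry> entries = new ArrayList<Entry>();
        Entry cur = null;
        for (String line : lines) {
            if (line.isEmpty()) {
                // A blank line closes the current record, if any.
                if (cur != null)
                    entries.add(cur);
                cur = null;
            } else if (line.startsWith("+ ")) {
                if (cur == null)
                    cur = new Entry();
                cur.tips.add(line.substring(2));
            } else if (line.startsWith("P ")) {
                if (cur == null)
                    cur = new Entry();
                cur.packNames.add(line.substring(2));
            }
        }
        // Tolerate a final record that lacks its trailing blank line.
        if (cur != null)
            entries.add(cur);
        return entries;
    }

    public static void main(String[] args) {
        // Made-up tip id and pack names, for illustration only.
        String tip = "0123456789abcdef0123456789abcdef01234567";
        List<Entry> entries = parse(Arrays.asList(
                "+ " + tip, "P aaaa", "P bbbb", ""));
        for (Entry e : entries)
            System.out.println("tips=" + e.tips + " packs=" + e.packNames);
    }
}
```

Each parsed record corresponds to the two inputs the LocalCachedPack constructor takes besides the object directory: a set of tips and a list of pack names.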
/*
 * Copyright (C) 2011, Google Inc.
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

package org.eclipse.jgit.storage.file;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.storage.pack.CachedPack;
import org.eclipse.jgit.storage.pack.PackOutputStream;

/** A cached pack made of one or more pack files in a local object directory. */
class LocalCachedPack extends CachedPack {
    private final ObjectDirectory odb;

    private final Set<ObjectId> tips;

    private final String[] packNames;

    LocalCachedPack(ObjectDirectory odb, Set<ObjectId> tips,
            List<String> packNames) {
        this.odb = odb;

        // A single tip is held in an immutable singleton set; otherwise the
        // caller's set is wrapped in an unmodifiable view (it is not copied).
        if (tips.size() == 1)
            this.tips = Collections.singleton(tips.iterator().next());
        else
            this.tips = Collections.unmodifiableSet(tips);

        this.packNames = packNames.toArray(new String[packNames.size()]);
    }

    @Override
    public Set<ObjectId> getTips() {
        return tips;
    }

    /** Sums the object counts of every pack file named by this cached pack. */
    @Override
    public long getObjectCount() throws IOException {
        long cnt = 0;
        for (String packName : packNames)
            cnt += getPackFile(packName).getObjectCount();
        return cnt;
    }

    /** Delegates to PackFile.copyPackAsIs for each pack, in the listed order. */
    void copyAsIs(PackOutputStream out, WindowCursor wc) throws IOException {
        for (String packName : packNames)
            getPackFile(packName).copyPackAsIs(out, wc);
    }

    /** Returns the ids from toFind that at least one of the packs contains. */
    @Override
    public <T extends ObjectId> Set<ObjectId> hasObject(Iterable<T> toFind)
            throws IOException {
        // Resolve every pack once, up front, instead of once per object.
        PackFile[] packs = new PackFile[packNames.length];
        for (int i = 0; i < packNames.length; i++)
            packs[i] = getPackFile(packNames[i]);

        Set<ObjectId> have = new HashSet<ObjectId>();
        for (ObjectId id : toFind) {
            for (PackFile pack : packs) {
                if (pack.hasObject(id)) {
                    have.add(id);
                    break; // one hit is enough; skip the remaining packs
                }
            }
        }
        return have;
    }

    /**
     * Finds the pack with the given name among the packs the object
     * directory reports; fails with the expected file path if none matches.
     */
    private PackFile getPackFile(String packName) throws FileNotFoundException {
        for (PackFile pack : odb.getPacks()) {
            if (packName.equals(pack.getPackName()))
                return pack;
        }
        throw new FileNotFoundException(getPackFilePath(packName));
    }

    /** Builds the path "<objects>/pack/pack-<name>.pack", used for the error message. */
    private String getPackFilePath(String packName) {
        final File packDir = new File(odb.getDirectory(), "pack");
        return new File(packDir, "pack-" + packName + ".pack").getPath();
    }
}
  110. }