
Limit the range of commits for which bitmaps are created.

A bitmap index contains bitmaps for a set of commits in a pack file. Creating a bitmap for every commit is too expensive, so heuristics select the most "important" commits. The most recent commits are the most valuable: to clone a repository, only the bitmaps for the branch tips are needed, and when fetching, only commits since the last fetch are needed.

The commit selection heuristics generally work, but for some repositories the number of selected commits is prohibitively high. One example is the MSM 3.10 Linux kernel. With over 1 million commits on 2820 branches, the current heuristics resulted in 36k+ selected commits. Each uncompressed bitmap for that repository is ~413k, making it difficult to complete a GC operation in the available memory.

The benefit of creating bitmaps over the entire history of a repository like the MSM 3.10 Linux kernel isn't clear. For that repository, most of the last year's history appears to be in the last 100k commits. Limiting bitmap commit selection to just those commits reduces the count of selected commits from ~36k to ~10.5k. Dropping bitmaps for older commits does not affect object counting times for clones, or for fetches on clients that are reasonably up to date.

This patch defines a new "bitmapCommitRange" PackConfig parameter to limit the commit selection process when building bitmaps. The range starts with the most recent commit and walks backwards. A range of 10k considers only the 10000 most recent commits. A range of zero creates bitmaps only for the branch tips. A range of -1 (the default) does not limit the range: all commits in the pack are used in the commit selection process.

Change-Id: Ied92c70cfa0778facc670e0f14a0980bed5e3bfb
Signed-off-by: Terry Parker <tparker@google.com>

8 years ago
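As a rough illustration of the new parameter, here is a minimal configuration sketch. It assumes the "bitmapCommitRange" value is exposed on PackConfig through a setter named setBitmapCommitRange; that setter name and the helper class below are assumptions based on the commit message, not a verified JGit API.

// Illustrative sketch only: setBitmapCommitRange is an assumed setter name for
// the "bitmapCommitRange" parameter described in the commit message above.
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.storage.pack.PackConfig;

class BitmapRangeExample {
    static PackConfig limitedBitmapConfig(Repository repository) {
        PackConfig cfg = new PackConfig(repository);
        cfg.setBitmapCommitRange(100000); // consider only the 100k most recent commits
        // cfg.setBitmapCommitRange(0);   // bitmaps for branch tips only
        // cfg.setBitmapCommitRange(-1);  // default: no limit on commit selection
        return cfg;
    }
}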
  1. /*
  2. * Copyright (C) 2012, Christian Halstrick <christian.halstrick@sap.com>
  3. * Copyright (C) 2011, Shawn O. Pearce <spearce@spearce.org>
  4. * and other copyright owners as documented in the project's IP log.
  5. *
  6. * This program and the accompanying materials are made available
  7. * under the terms of the Eclipse Distribution License v1.0 which
  8. * accompanies this distribution, is reproduced below, and is
  9. * available at http://www.eclipse.org/org/documents/edl-v10.php
  10. *
  11. * All rights reserved.
  12. *
  13. * Redistribution and use in source and binary forms, with or
  14. * without modification, are permitted provided that the following
  15. * conditions are met:
  16. *
  17. * - Redistributions of source code must retain the above copyright
  18. * notice, this list of conditions and the following disclaimer.
  19. *
  20. * - Redistributions in binary form must reproduce the above
  21. * copyright notice, this list of conditions and the following
  22. * disclaimer in the documentation and/or other materials provided
  23. * with the distribution.
  24. *
  25. * - Neither the name of the Eclipse Foundation, Inc. nor the
  26. * names of its contributors may be used to endorse or promote
  27. * products derived from this software without specific prior
  28. * written permission.
  29. *
  30. * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
  31. * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
  32. * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
  33. * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  34. * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
  35. * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  36. * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  37. * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
  38. * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
  39. * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
  40. * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
  41. * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
  42. * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  43. */
  44. package org.eclipse.jgit.internal.storage.file;
  45. import static org.eclipse.jgit.internal.storage.pack.PackExt.BITMAP_INDEX;
  46. import static org.eclipse.jgit.internal.storage.pack.PackExt.INDEX;
  47. import java.io.File;
  48. import java.io.FileOutputStream;
  49. import java.io.IOException;
  50. import java.io.OutputStream;
  51. import java.io.PrintWriter;
  52. import java.io.StringWriter;
  53. import java.nio.channels.Channels;
  54. import java.nio.channels.FileChannel;
  55. import java.nio.file.DirectoryStream;
  56. import java.nio.file.Files;
  57. import java.nio.file.Path;
  58. import java.nio.file.StandardCopyOption;
  59. import java.text.MessageFormat;
  60. import java.text.ParseException;
  61. import java.time.Instant;
  62. import java.time.temporal.ChronoUnit;
  63. import java.util.ArrayList;
  64. import java.util.Collection;
  65. import java.util.Collections;
  66. import java.util.Comparator;
  67. import java.util.Date;
  68. import java.util.HashMap;
  69. import java.util.HashSet;
  70. import java.util.Iterator;
  71. import java.util.LinkedList;
  72. import java.util.List;
  73. import java.util.Map;
  74. import java.util.Objects;
  75. import java.util.Set;
  76. import java.util.TreeMap;
  77. import java.util.concurrent.Callable;
  78. import java.util.concurrent.ExecutionException;
  79. import java.util.concurrent.ExecutorService;
  80. import java.util.concurrent.Future;
  81. import java.util.regex.Pattern;
  82. import java.util.stream.Collectors;
  83. import java.util.stream.Stream;
  84. import org.eclipse.jgit.annotations.NonNull;
  85. import org.eclipse.jgit.api.errors.JGitInternalException;
  86. import org.eclipse.jgit.dircache.DirCacheIterator;
  87. import org.eclipse.jgit.errors.CancelledException;
  88. import org.eclipse.jgit.errors.CorruptObjectException;
  89. import org.eclipse.jgit.errors.IncorrectObjectTypeException;
  90. import org.eclipse.jgit.errors.MissingObjectException;
  91. import org.eclipse.jgit.errors.NoWorkTreeException;
  92. import org.eclipse.jgit.internal.JGitText;
  93. import org.eclipse.jgit.internal.storage.pack.PackExt;
  94. import org.eclipse.jgit.internal.storage.pack.PackWriter;
  95. import org.eclipse.jgit.internal.storage.reftree.RefTreeNames;
  96. import org.eclipse.jgit.lib.ConfigConstants;
  97. import org.eclipse.jgit.lib.Constants;
  98. import org.eclipse.jgit.lib.FileMode;
  99. import org.eclipse.jgit.lib.NullProgressMonitor;
  100. import org.eclipse.jgit.lib.ObjectId;
  101. import org.eclipse.jgit.lib.ObjectIdSet;
  102. import org.eclipse.jgit.lib.ObjectLoader;
  103. import org.eclipse.jgit.lib.ObjectReader;
  104. import org.eclipse.jgit.lib.ProgressMonitor;
  105. import org.eclipse.jgit.lib.Ref;
  106. import org.eclipse.jgit.lib.Ref.Storage;
  107. import org.eclipse.jgit.lib.RefDatabase;
  108. import org.eclipse.jgit.lib.ReflogEntry;
  109. import org.eclipse.jgit.lib.ReflogReader;
  110. import org.eclipse.jgit.lib.internal.WorkQueue;
  111. import org.eclipse.jgit.revwalk.ObjectWalk;
  112. import org.eclipse.jgit.revwalk.RevObject;
  113. import org.eclipse.jgit.revwalk.RevWalk;
  114. import org.eclipse.jgit.storage.pack.PackConfig;
  115. import org.eclipse.jgit.treewalk.TreeWalk;
  116. import org.eclipse.jgit.treewalk.filter.TreeFilter;
  117. import org.eclipse.jgit.util.FileUtils;
  118. import org.eclipse.jgit.util.GitDateParser;
  119. import org.eclipse.jgit.util.SystemReader;
  120. import org.slf4j.Logger;
  121. import org.slf4j.LoggerFactory;
  122. /**
  123. * A garbage collector for git
  124. * {@link org.eclipse.jgit.internal.storage.file.FileRepository}. Instances of
  125. * this class are not thread-safe. Don't use the same instance from multiple
  126. * threads.
  127. *
  128. * This class started as a copy of DfsGarbageCollector from Shawn O. Pearce
  129. * adapted to FileRepositories.
  130. */
  131. public class GC {
  132. private final static Logger LOG = LoggerFactory
  133. .getLogger(GC.class);
  134. private static final String PRUNE_EXPIRE_DEFAULT = "2.weeks.ago"; //$NON-NLS-1$
  135. private static final String PRUNE_PACK_EXPIRE_DEFAULT = "1.hour.ago"; //$NON-NLS-1$
  136. private static final Pattern PATTERN_LOOSE_OBJECT = Pattern
  137. .compile("[0-9a-fA-F]{38}"); //$NON-NLS-1$
  138. private static final String PACK_EXT = "." + PackExt.PACK.getExtension();//$NON-NLS-1$
  139. private static final String BITMAP_EXT = "." //$NON-NLS-1$
  140. + PackExt.BITMAP_INDEX.getExtension();
  141. private static final String INDEX_EXT = "." + PackExt.INDEX.getExtension(); //$NON-NLS-1$
  142. private static final int DEFAULT_AUTOPACKLIMIT = 50;
  143. private static final int DEFAULT_AUTOLIMIT = 6700;
  144. private static volatile ExecutorService executor;
  145. /**
  146. * Set the executor for running auto-gc in the background. If no executor is
  147. * set JGit's own WorkQueue will be used.
  148. *
  149. * @param e
  150. * the executor to be used for running auto-gc
  151. * @since 4.8
  152. */
  153. public static void setExecutor(ExecutorService e) {
  154. executor = e;
  155. }
  156. private final FileRepository repo;
  157. private ProgressMonitor pm;
  158. private long expireAgeMillis = -1;
  159. private Date expire;
  160. private long packExpireAgeMillis = -1;
  161. private Date packExpire;
  162. private PackConfig pconfig = null;
  163. /**
  164. * the refs which existed during the last call to {@link #repack()}. This is
  165. * needed during {@link #prune(Set)} where we can optimize by looking at the
  166. * difference between the current refs and the refs which existed during
  167. * last {@link #repack()}.
  168. */
  169. private Collection<Ref> lastPackedRefs;
  170. /**
  171. * Holds the starting time of the last repack() execution. This is needed in
  172. * prune() to inspect only those reflog entries which have been added since
  173. * last repack().
  174. */
  175. private long lastRepackTime;
  176. /**
  177. * Whether gc should do automatic housekeeping
  178. */
  179. private boolean automatic;
  180. /**
  181. * Whether to run gc in a background thread
  182. */
  183. private boolean background;
  184. /**
  185. * Creates a new garbage collector with default values. An expirationTime of
  186. * two weeks and <code>null</code> as progress monitor will be used.
  187. *
  188. * @param repo
  189. * the repo to work on
  190. */
  191. public GC(FileRepository repo) {
  192. this.repo = repo;
  193. this.pm = NullProgressMonitor.INSTANCE;
  194. }
  195. /**
  196. * Runs a garbage collector on a
  197. * {@link org.eclipse.jgit.internal.storage.file.FileRepository}. It will
  198. * <ul>
  199. * <li>pack loose references into packed-refs</li>
  200. * <li>repack all reachable objects into new pack files and delete the old
  201. * pack files</li>
  202. * <li>prune all loose objects which are now reachable by packs</li>
  203. * </ul>
  204. *
  205. * If {@link #setAuto(boolean)} was set to {@code true} {@code gc} will
  206. * first check whether any housekeeping is required; if not, it exits
  207. * without performing any work.
  208. *
  209. * If {@link #setBackground(boolean)} was set to {@code true}
  210. * {@code collectGarbage} will start the gc in the background, and then
  211. * return immediately. In this case, errors will not be reported except in
  212. * gc.log.
  213. *
  214. * @return the collection of
  215. * {@link org.eclipse.jgit.internal.storage.file.PackFile}'s which
  216. * are newly created
  217. * @throws java.io.IOException
  218. * @throws java.text.ParseException
  219. * If the configuration parameter "gc.pruneexpire" couldn't be
  220. * parsed
  221. */
  222. // TODO(ms): in 5.0 change signature and return Future<Collection<PackFile>>
  223. public Collection<PackFile> gc() throws IOException, ParseException {
  224. final GcLog gcLog = background ? new GcLog(repo) : null;
  225. if (gcLog != null && !gcLog.lock(background)) {
  226. // there is already a background gc running
  227. return Collections.emptyList();
  228. }
  229. Callable<Collection<PackFile>> gcTask = () -> {
  230. try {
  231. Collection<PackFile> newPacks = doGc();
  232. if (automatic && tooManyLooseObjects() && gcLog != null) {
  233. String message = JGitText.get().gcTooManyUnpruned;
  234. gcLog.write(message);
  235. gcLog.commit();
  236. }
  237. return newPacks;
  238. } catch (IOException | ParseException e) {
  239. if (background) {
  240. if (gcLog == null) {
  241. // Lacking a log, there's no way to report this.
  242. return Collections.emptyList();
  243. }
  244. try {
  245. gcLog.write(e.getMessage());
  246. StringWriter sw = new StringWriter();
  247. e.printStackTrace(new PrintWriter(sw));
  248. gcLog.write(sw.toString());
  249. gcLog.commit();
  250. } catch (IOException e2) {
  251. e2.addSuppressed(e);
  252. LOG.error(e2.getMessage(), e2);
  253. }
  254. } else {
  255. throw new JGitInternalException(e.getMessage(), e);
  256. }
  257. } finally {
  258. if (gcLog != null) {
  259. gcLog.unlock();
  260. }
  261. }
  262. return Collections.emptyList();
  263. };
  264. Future<Collection<PackFile>> result = executor().submit(gcTask);
  265. if (background) {
  266. // TODO(ms): in 5.0 change signature and return the Future
  267. return Collections.emptyList();
  268. }
  269. try {
  270. return result.get();
  271. } catch (InterruptedException | ExecutionException e) {
  272. throw new IOException(e);
  273. }
  274. }
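// Usage sketch (illustrative addition, not part of the original file): a caller
// typically constructs a GC for a FileRepository, chooses foreground or background
// execution via setBackground(boolean) and optionally automatic housekeeping via
// setAuto(boolean) as described in the Javadoc above, and then invokes gc():
//
//   GC gc = new GC(fileRepository);
//   gc.setBackground(false);                 // run in the foreground, report errors
//   Collection<PackFile> newPacks = gc.gc(); // newly created pack files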
  275. private ExecutorService executor() {
  276. return (executor != null) ? executor : WorkQueue.getExecutor();
  277. }
  278. private Collection<PackFile> doGc() throws IOException, ParseException {
  279. if (automatic && !needGc()) {
  280. return Collections.emptyList();
  281. }
  282. pm.start(6 /* tasks */);
  283. packRefs();
  284. // TODO: implement reflog_expire(pm, repo);
  285. Collection<PackFile> newPacks = repack();
  286. prune(Collections.emptySet());
  287. // TODO: implement rerere_gc(pm);
  288. return newPacks;
  289. }
  290. /**
  291. * Loosen objects in a pack file which are not also in the newly-created
  292. * pack files.
  293. *
  294. * @param inserter
  295. * @param reader
  296. * @param pack
  297. * @param existing
  298. * @throws IOException
  299. */
  300. private void loosen(ObjectDirectoryInserter inserter, ObjectReader reader, PackFile pack, HashSet<ObjectId> existing)
  301. throws IOException {
  302. for (PackIndex.MutableEntry entry : pack) {
  303. ObjectId oid = entry.toObjectId();
  304. if (existing.contains(oid)) {
  305. continue;
  306. }
  307. existing.add(oid);
  308. ObjectLoader loader = reader.open(oid);
  309. inserter.insert(loader.getType(),
  310. loader.getSize(),
  311. loader.openStream(),
  312. true /* create this object even though it's a duplicate */);
  313. }
  314. }
  315. /**
  316. * Delete old pack files. What is 'old' is defined by specifying a set of
  317. * old pack files and a set of new pack files. Each pack file contained in
  318. * old pack files but not contained in new pack files will be deleted. If
  319. * preserveOldPacks is set, keep a copy of the pack file in the preserve
  320. * directory. If an expirationDate is set then pack files which are younger
  321. * than the expirationDate will not be deleted nor preserved.
  322. * <p>
  323. * If we're not immediately expiring loose objects, loosen any objects
  324. * in the old pack files which aren't in the new pack files.
  325. *
  326. * @param oldPacks
  327. * @param newPacks
  328. * @throws ParseException
  329. * @throws IOException
  330. */
  331. private void deleteOldPacks(Collection<PackFile> oldPacks,
  332. Collection<PackFile> newPacks) throws ParseException, IOException {
  333. HashSet<ObjectId> ids = new HashSet<>();
  334. for (PackFile pack : newPacks) {
  335. for (PackIndex.MutableEntry entry : pack) {
  336. ids.add(entry.toObjectId());
  337. }
  338. }
  339. ObjectReader reader = repo.newObjectReader();
  340. ObjectDirectory dir = repo.getObjectDatabase();
  341. ObjectDirectoryInserter inserter = dir.newInserter();
  342. boolean shouldLoosen = !"now".equals(getPruneExpireStr()) && //$NON-NLS-1$
  343. getExpireDate() < Long.MAX_VALUE;
  344. prunePreserved();
  345. long packExpireDate = getPackExpireDate();
  346. oldPackLoop: for (PackFile oldPack : oldPacks) {
  347. checkCancelled();
  348. String oldName = oldPack.getPackName();
  349. // check whether an old pack file is also among the list of new
  350. // pack files. Then we must not delete it.
  351. for (PackFile newPack : newPacks)
  352. if (oldName.equals(newPack.getPackName()))
  353. continue oldPackLoop;
  354. if (!oldPack.shouldBeKept()
  355. && repo.getFS().lastModified(
  356. oldPack.getPackFile()) < packExpireDate) {
  357. oldPack.close();
  358. if (shouldLoosen) {
  359. loosen(inserter, reader, oldPack, ids);
  360. }
  361. prunePack(oldName);
  362. }
  363. }
  364. // close the complete object database. That's the only chance to force
  365. // rescanning and to detect that certain pack files are now deleted.
  366. repo.getObjectDatabase().close();
  367. }
  368. /**
  369. * Deletes the old pack file, unless 'preserve-oldpacks' is set, in which case it
  370. * moves the pack file to the preserved directory
  371. *
  372. * @param packFile
  373. * @param packName
  374. * @param ext
  375. * @param deleteOptions
  376. * @throws IOException
  377. */
  378. private void removeOldPack(File packFile, String packName, PackExt ext,
  379. int deleteOptions) throws IOException {
  380. if (pconfig != null && pconfig.isPreserveOldPacks()) {
  381. File oldPackDir = repo.getObjectDatabase().getPreservedDirectory();
  382. FileUtils.mkdir(oldPackDir, true);
  383. String oldPackName = "pack-" + packName + ".old-" + ext.getExtension(); //$NON-NLS-1$ //$NON-NLS-2$
  384. File oldPackFile = new File(oldPackDir, oldPackName);
  385. FileUtils.rename(packFile, oldPackFile);
  386. } else {
  387. FileUtils.delete(packFile, deleteOptions);
  388. }
  389. }
  390. /**
  391. * Delete the preserved directory including all pack files within
  392. */
  393. private void prunePreserved() {
  394. if (pconfig != null && pconfig.isPrunePreserved()) {
  395. try {
  396. FileUtils.delete(repo.getObjectDatabase().getPreservedDirectory(),
  397. FileUtils.RECURSIVE | FileUtils.RETRY | FileUtils.SKIP_MISSING);
  398. } catch (IOException e) {
  399. // Deletion of the preserved pack files failed. Silently return.
  400. }
  401. }
  402. }
  403. /**
  404. * Delete files associated with a single pack file. First try to delete the
  405. * ".pack" file because on some platforms the ".pack" file may be locked and
  406. * can't be deleted. In such a case it is better to detect this early and
  407. * give up on deleting files for this packfile. Otherwise we may delete the
  408. * ".index" file and when failing to delete the ".pack" file we are left
  409. * with a ".pack" file without a ".index" file.
  410. *
  411. * @param packName
  412. */
  413. private void prunePack(String packName) {
  414. PackExt[] extensions = PackExt.values();
  415. try {
  416. // Delete the .pack file first and if this fails give up on deleting
  417. // the other files
  418. int deleteOptions = FileUtils.RETRY | FileUtils.SKIP_MISSING;
  419. for (PackExt ext : extensions)
  420. if (PackExt.PACK.equals(ext)) {
  421. File f = nameFor(packName, "." + ext.getExtension()); //$NON-NLS-1$
  422. removeOldPack(f, packName, ext, deleteOptions);
  423. break;
  424. }
  425. // The .pack file has been deleted. Delete as many of the other
  426. // files as you can.
  427. deleteOptions |= FileUtils.IGNORE_ERRORS;
  428. for (PackExt ext : extensions) {
  429. if (!PackExt.PACK.equals(ext)) {
  430. File f = nameFor(packName, "." + ext.getExtension()); //$NON-NLS-1$
  431. removeOldPack(f, packName, ext, deleteOptions);
  432. }
  433. }
  434. } catch (IOException e) {
  435. // Deletion of the .pack file failed. Silently return.
  436. }
  437. }
  438. /**
  439. * Like "git prune-packed" this method tries to prune all loose objects
  440. * which can be found in packs. If certain objects can't be pruned (e.g.
  441. * because the filesystem delete operation fails) this is silently ignored.
  442. *
  443. * @throws java.io.IOException
  444. */
  445. public void prunePacked() throws IOException {
  446. ObjectDirectory objdb = repo.getObjectDatabase();
  447. Collection<PackFile> packs = objdb.getPacks();
  448. File objects = repo.getObjectsDirectory();
  449. String[] fanout = objects.list();
  450. if (fanout != null && fanout.length > 0) {
  451. pm.beginTask(JGitText.get().pruneLoosePackedObjects, fanout.length);
  452. try {
  453. for (String d : fanout) {
  454. checkCancelled();
  455. pm.update(1);
  456. if (d.length() != 2)
  457. continue;
  458. String[] entries = new File(objects, d).list();
  459. if (entries == null)
  460. continue;
  461. for (String e : entries) {
  462. checkCancelled();
  463. if (e.length() != Constants.OBJECT_ID_STRING_LENGTH - 2)
  464. continue;
  465. ObjectId id;
  466. try {
  467. id = ObjectId.fromString(d + e);
  468. } catch (IllegalArgumentException notAnObject) {
  469. // ignoring the file that does not represent loose
  470. // object
  471. continue;
  472. }
  473. boolean found = false;
  474. for (PackFile p : packs) {
  475. checkCancelled();
  476. if (p.hasObject(id)) {
  477. found = true;
  478. break;
  479. }
  480. }
  481. if (found)
  482. FileUtils.delete(objdb.fileFor(id), FileUtils.RETRY
  483. | FileUtils.SKIP_MISSING
  484. | FileUtils.IGNORE_ERRORS);
  485. }
  486. }
  487. } finally {
  488. pm.endTask();
  489. }
  490. }
  491. }
  492. /**
  493. * Like "git prune" this method tries to prune all loose objects which are
  494. * unreferenced. If certain objects can't be pruned (e.g. because the
  495. * filesystem delete operation fails) this is silently ignored.
  496. *
  497. * @param objectsToKeep
  498. * a set of objects which should explicitly not be pruned
  499. * @throws java.io.IOException
  500. * @throws java.text.ParseException
  501. * If the configuration parameter "gc.pruneexpire" couldn't be
  502. * parsed
  503. */
  504. public void prune(Set<ObjectId> objectsToKeep) throws IOException,
  505. ParseException {
  506. long expireDate = getExpireDate();
  507. // Collect all loose objects which are old enough, not referenced from
  508. // the index and not in objectsToKeep
  509. Map<ObjectId, File> deletionCandidates = new HashMap<>();
  510. Set<ObjectId> indexObjects = null;
  511. File objects = repo.getObjectsDirectory();
  512. String[] fanout = objects.list();
  513. if (fanout == null || fanout.length == 0) {
  514. return;
  515. }
  516. pm.beginTask(JGitText.get().pruneLooseUnreferencedObjects,
  517. fanout.length);
  518. try {
  519. for (String d : fanout) {
  520. checkCancelled();
  521. pm.update(1);
  522. if (d.length() != 2)
  523. continue;
  524. File[] entries = new File(objects, d).listFiles();
  525. if (entries == null)
  526. continue;
  527. for (File f : entries) {
  528. checkCancelled();
  529. String fName = f.getName();
  530. if (fName.length() != Constants.OBJECT_ID_STRING_LENGTH - 2)
  531. continue;
  532. if (repo.getFS().lastModified(f) >= expireDate)
  533. continue;
  534. try {
  535. ObjectId id = ObjectId.fromString(d + fName);
  536. if (objectsToKeep.contains(id))
  537. continue;
  538. if (indexObjects == null)
  539. indexObjects = listNonHEADIndexObjects();
  540. if (indexObjects.contains(id))
  541. continue;
  542. deletionCandidates.put(id, f);
  543. } catch (IllegalArgumentException notAnObject) {
  544. // ignoring the file that does not represent loose
  545. // object
  546. }
  547. }
  548. }
  549. } finally {
  550. pm.endTask();
  551. }
  552. if (deletionCandidates.isEmpty()) {
  553. return;
  554. }
  555. checkCancelled();
  556. // From the set of current refs remove all those which have been handled
  557. // during last repack(). Only those refs will survive which have been
  558. // added or modified since the last repack. Only these can save existing
  559. // loose objects from being pruned.
  560. Collection<Ref> newRefs;
  561. if (lastPackedRefs == null || lastPackedRefs.isEmpty())
  562. newRefs = getAllRefs();
  563. else {
  564. Map<String, Ref> last = new HashMap<>();
  565. for (Ref r : lastPackedRefs) {
  566. last.put(r.getName(), r);
  567. }
  568. newRefs = new ArrayList<>();
  569. for (Ref r : getAllRefs()) {
  570. Ref old = last.get(r.getName());
  571. if (!equals(r, old)) {
  572. newRefs.add(r);
  573. }
  574. }
  575. }
  576. if (!newRefs.isEmpty()) {
  577. // There are new/modified refs! Check which loose objects are now
  578. // referenced by these modified refs (or their reflogentries).
  579. // Remove these loose objects
  580. // from the deletionCandidates. When the last candidate is removed
  581. // leave this method.
  582. ObjectWalk w = new ObjectWalk(repo);
  583. try {
  584. for (Ref cr : newRefs) {
  585. checkCancelled();
  586. w.markStart(w.parseAny(cr.getObjectId()));
  587. }
  588. if (lastPackedRefs != null)
  589. for (Ref lpr : lastPackedRefs) {
  590. w.markUninteresting(w.parseAny(lpr.getObjectId()));
  591. }
  592. removeReferenced(deletionCandidates, w);
  593. } finally {
  594. w.dispose();
  595. }
  596. }
  597. if (deletionCandidates.isEmpty())
  598. return;
  599. // Since we have not left the method yet there are still
  600. // deletionCandidates. Last chance for these objects not to be pruned is
  601. // that they are referenced by reflog entries. Even refs which currently
  602. // point to the same object as during last repack() may have
  603. // additional reflog entries not handled during last repack()
  604. ObjectWalk w = new ObjectWalk(repo);
  605. try {
  606. for (Ref ar : getAllRefs())
  607. for (ObjectId id : listRefLogObjects(ar, lastRepackTime)) {
  608. checkCancelled();
  609. w.markStart(w.parseAny(id));
  610. }
  611. if (lastPackedRefs != null)
  612. for (Ref lpr : lastPackedRefs) {
  613. checkCancelled();
  614. w.markUninteresting(w.parseAny(lpr.getObjectId()));
  615. }
  616. removeReferenced(deletionCandidates, w);
  617. } finally {
  618. w.dispose();
  619. }
  620. if (deletionCandidates.isEmpty())
  621. return;
  622. checkCancelled();
  623. // delete all candidates which have survived: these are unreferenced
  624. // loose objects. Make a last check, though, to avoid deleting objects
  625. // that could have been referenced while the candidates list was being
  626. // built (by an incoming push, for example).
  627. Set<File> touchedFanout = new HashSet<>();
  628. for (File f : deletionCandidates.values()) {
  629. if (f.lastModified() < expireDate) {
  630. f.delete();
  631. touchedFanout.add(f.getParentFile());
  632. }
  633. }
  634. for (File f : touchedFanout) {
  635. FileUtils.delete(f,
  636. FileUtils.EMPTY_DIRECTORIES_ONLY | FileUtils.IGNORE_ERRORS);
  637. }
  638. repo.getObjectDatabase().close();
  639. }
  640. private long getExpireDate() throws ParseException {
  641. long expireDate = Long.MAX_VALUE;
  642. if (expire == null && expireAgeMillis == -1) {
  643. String pruneExpireStr = getPruneExpireStr();
  644. if (pruneExpireStr == null)
  645. pruneExpireStr = PRUNE_EXPIRE_DEFAULT;
  646. expire = GitDateParser.parse(pruneExpireStr, null, SystemReader
  647. .getInstance().getLocale());
  648. expireAgeMillis = -1;
  649. }
  650. if (expire != null)
  651. expireDate = expire.getTime();
  652. if (expireAgeMillis != -1)
  653. expireDate = System.currentTimeMillis() - expireAgeMillis;
  654. return expireDate;
  655. }
  656. private String getPruneExpireStr() {
  657. return repo.getConfig().getString(
  658. ConfigConstants.CONFIG_GC_SECTION, null,
  659. ConfigConstants.CONFIG_KEY_PRUNEEXPIRE);
  660. }
  661. private long getPackExpireDate() throws ParseException {
  662. long packExpireDate = Long.MAX_VALUE;
  663. if (packExpire == null && packExpireAgeMillis == -1) {
  664. String prunePackExpireStr = repo.getConfig().getString(
  665. ConfigConstants.CONFIG_GC_SECTION, null,
  666. ConfigConstants.CONFIG_KEY_PRUNEPACKEXPIRE);
  667. if (prunePackExpireStr == null)
  668. prunePackExpireStr = PRUNE_PACK_EXPIRE_DEFAULT;
  669. packExpire = GitDateParser.parse(prunePackExpireStr, null,
  670. SystemReader.getInstance().getLocale());
  671. packExpireAgeMillis = -1;
  672. }
  673. if (packExpire != null)
  674. packExpireDate = packExpire.getTime();
  675. if (packExpireAgeMillis != -1)
  676. packExpireDate = System.currentTimeMillis() - packExpireAgeMillis;
  677. return packExpireDate;
  678. }
  679. /**
  680. * Remove all entries from a map whose key is the id of an object referenced
  681. * by the given ObjectWalk
  682. *
  683. * @param id2File
  684. * @param w
  685. * @throws MissingObjectException
  686. * @throws IncorrectObjectTypeException
  687. * @throws IOException
  688. */
  689. private void removeReferenced(Map<ObjectId, File> id2File,
  690. ObjectWalk w) throws MissingObjectException,
  691. IncorrectObjectTypeException, IOException {
  692. RevObject ro = w.next();
  693. while (ro != null) {
  694. checkCancelled();
  695. if (id2File.remove(ro.getId()) != null && id2File.isEmpty()) {
  696. return;
  697. }
  698. ro = w.next();
  699. }
  700. ro = w.nextObject();
  701. while (ro != null) {
  702. checkCancelled();
  703. if (id2File.remove(ro.getId()) != null && id2File.isEmpty()) {
  704. return;
  705. }
  706. ro = w.nextObject();
  707. }
  708. }
  709. private static boolean equals(Ref r1, Ref r2) {
  710. if (r1 == null || r2 == null) {
  711. return false;
  712. }
  713. if (r1.isSymbolic()) {
  714. return r2.isSymbolic() && r1.getTarget().getName()
  715. .equals(r2.getTarget().getName());
  716. }
  717. return !r2.isSymbolic()
  718. && Objects.equals(r1.getObjectId(), r2.getObjectId());
  719. }
  720. /**
  721. * Packs all non-symbolic, loose refs into packed-refs.
  722. *
  723. * @throws java.io.IOException
  724. */
  725. public void packRefs() throws IOException {
  726. Collection<Ref> refs = repo.getRefDatabase()
  727. .getRefsByPrefix(Constants.R_REFS);
  728. List<String> refsToBePacked = new ArrayList<>(refs.size());
  729. pm.beginTask(JGitText.get().packRefs, refs.size());
  730. try {
  731. for (Ref ref : refs) {
  732. checkCancelled();
  733. if (!ref.isSymbolic() && ref.getStorage().isLoose())
  734. refsToBePacked.add(ref.getName());
  735. pm.update(1);
  736. }
  737. ((RefDirectory) repo.getRefDatabase()).pack(refsToBePacked);
  738. } finally {
  739. pm.endTask();
  740. }
  741. }
  742. /**
  743. * Packs all objects which are reachable from any of the heads into one pack
  744. * file. Additionally all objects which are not reachable from any head but
  745. * which are reachable from any of the other refs (e.g. tags), special refs
  746. * (e.g. FETCH_HEAD) or index are packed into a separate pack file. Objects
  747. * included in pack files which have a .keep file associated are never
  748. * repacked. All old pack files which existed before are deleted.
  749. *
  750. * @return a collection of the newly created pack files
  751. * @throws java.io.IOException
  752. * when during reading of refs, index, packfiles, objects,
  753. * reflog-entries or during writing to the packfiles
  754. * {@link java.io.IOException} occurs
  755. */
  756. public Collection<PackFile> repack() throws IOException {
  757. Collection<PackFile> toBeDeleted = repo.getObjectDatabase().getPacks();
  758. long time = System.currentTimeMillis();
  759. Collection<Ref> refsBefore = getAllRefs();
  760. Set<ObjectId> allHeadsAndTags = new HashSet<>();
  761. Set<ObjectId> allHeads = new HashSet<>();
  762. Set<ObjectId> allTags = new HashSet<>();
  763. Set<ObjectId> nonHeads = new HashSet<>();
  764. Set<ObjectId> txnHeads = new HashSet<>();
  765. Set<ObjectId> tagTargets = new HashSet<>();
  766. Set<ObjectId> indexObjects = listNonHEADIndexObjects();
  767. RefDatabase refdb = repo.getRefDatabase();
  768. for (Ref ref : refsBefore) {
  769. checkCancelled();
  770. nonHeads.addAll(listRefLogObjects(ref, 0));
  771. if (ref.isSymbolic() || ref.getObjectId() == null) {
  772. continue;
  773. }
  774. if (isHead(ref)) {
  775. allHeads.add(ref.getObjectId());
  776. } else if (isTag(ref)) {
  777. allTags.add(ref.getObjectId());
  778. } else if (RefTreeNames.isRefTree(refdb, ref.getName())) {
  779. txnHeads.add(ref.getObjectId());
  780. } else {
  781. nonHeads.add(ref.getObjectId());
  782. }
  783. if (ref.getPeeledObjectId() != null) {
  784. tagTargets.add(ref.getPeeledObjectId());
  785. }
  786. }
  787. List<ObjectIdSet> excluded = new LinkedList<>();
  788. for (PackFile f : repo.getObjectDatabase().getPacks()) {
  789. checkCancelled();
  790. if (f.shouldBeKept())
  791. excluded.add(f.getIndex());
  792. }
  793. // Don't exclude tags that are also branch tips
  794. allTags.removeAll(allHeads);
  795. allHeadsAndTags.addAll(allHeads);
  796. allHeadsAndTags.addAll(allTags);
  797. // Hoist all branch tips and tags earlier in the pack file
  798. tagTargets.addAll(allHeadsAndTags);
  799. nonHeads.addAll(indexObjects);
  800. // Combine the GC_REST objects into the GC pack if requested
  801. if (pconfig != null && pconfig.getSinglePack()) {
  802. allHeadsAndTags.addAll(nonHeads);
  803. nonHeads.clear();
  804. }
  805. List<PackFile> ret = new ArrayList<>(2);
  806. PackFile heads = null;
  807. if (!allHeadsAndTags.isEmpty()) {
  808. heads = writePack(allHeadsAndTags, PackWriter.NONE, allTags,
  809. tagTargets, excluded);
  810. if (heads != null) {
  811. ret.add(heads);
  812. excluded.add(0, heads.getIndex());
  813. }
  814. }
  815. if (!nonHeads.isEmpty()) {
  816. PackFile rest = writePack(nonHeads, allHeadsAndTags, PackWriter.NONE,
  817. tagTargets, excluded);
  818. if (rest != null)
  819. ret.add(rest);
  820. }
  821. if (!txnHeads.isEmpty()) {
  822. PackFile txn = writePack(txnHeads, PackWriter.NONE, PackWriter.NONE,
  823. null, excluded);
  824. if (txn != null)
  825. ret.add(txn);
  826. }
  827. try {
  828. deleteOldPacks(toBeDeleted, ret);
  829. } catch (ParseException e) {
  830. // TODO: the exception has to be wrapped into an IOException because
  831. // throwing the ParseException directly would break the API, instead
  832. // we should throw a ConfigInvalidException
  833. throw new IOException(e);
  834. }
  835. prunePacked();
  836. deleteEmptyRefsFolders();
  837. deleteOrphans();
  838. deleteTempPacksIdx();
  839. lastPackedRefs = refsBefore;
  840. lastRepackTime = time;
  841. return ret;
  842. }
  843. private static boolean isHead(Ref ref) {
  844. return ref.getName().startsWith(Constants.R_HEADS);
  845. }
  846. private static boolean isTag(Ref ref) {
  847. return ref.getName().startsWith(Constants.R_TAGS);
  848. }
  849. private void deleteEmptyRefsFolders() throws IOException {
  850. Path refs = repo.getDirectory().toPath().resolve("refs"); //$NON-NLS-1$
  851. try (Stream<Path> entries = Files.list(refs)) {
  852. Iterator<Path> iterator = entries.iterator();
  853. while (iterator.hasNext()) {
  854. try (Stream<Path> s = Files.list(iterator.next())) {
  855. s.forEach(this::deleteDir);
  856. }
  857. }
  858. }
  859. }
  860. private void deleteDir(Path dir) {
  861. try (Stream<Path> dirs = Files.walk(dir)) {
  862. dirs.filter(this::isDirectory).sorted(Comparator.reverseOrder())
  863. .forEach(this::delete);
  864. } catch (IOException e) {
  865. LOG.error(e.getMessage(), e);
  866. }
  867. }
  868. private boolean isDirectory(Path p) {
  869. return p.toFile().isDirectory();
  870. }
  871. private boolean delete(Path d) {
  872. try {
  873. // Avoid deleting a folder that was just created so that concurrent
  874. // operations trying to create a reference are not impacted
  875. Instant threshold = Instant.now().minus(30, ChronoUnit.SECONDS);
  876. Instant lastModified = Files.getLastModifiedTime(d).toInstant();
  877. if (lastModified.isBefore(threshold)) {
  878. // If the folder is not empty, the delete operation will fail
  879. // silently. This is a cheaper alternative to filtering the
  880. // stream in the calling method.
  881. return d.toFile().delete();
  882. }
  883. } catch (IOException e) {
  884. LOG.error(e.getMessage(), e);
  885. }
  886. return false;
  887. }
  888. /**
  889. * Deletes orphans
  890. * <p>
  891. * A file is considered an orphan if it is either a "bitmap" or an index
  892. * file, and its corresponding pack file is missing in the list.
  893. * </p>
  894. */
  895. private void deleteOrphans() {
  896. Path packDir = repo.getObjectDatabase().getPackDirectory().toPath();
  897. List<String> fileNames = null;
  898. try (Stream<Path> files = Files.list(packDir)) {
  899. fileNames = files.map(path -> path.getFileName().toString())
  900. .filter(name -> (name.endsWith(PACK_EXT)
  901. || name.endsWith(BITMAP_EXT)
  902. || name.endsWith(INDEX_EXT)))
  903. .sorted(Collections.reverseOrder())
  904. .collect(Collectors.toList());
  905. } catch (IOException e1) {
  906. // ignore
  907. }
  908. if (fileNames == null) {
  909. return;
  910. }
  911. String base = null;
  912. for (String n : fileNames) {
  913. if (n.endsWith(PACK_EXT)) {
  914. base = n.substring(0, n.lastIndexOf('.'));
  915. } else {
  916. if (base == null || !n.startsWith(base)) {
  917. try {
  918. Files.delete(packDir.resolve(n));
  919. } catch (IOException e) {
  920. LOG.error(e.getMessage(), e);
  921. }
  922. }
  923. }
  924. }
  925. }
  926. private void deleteTempPacksIdx() {
  927. Path packDir = repo.getObjectDatabase().getPackDirectory().toPath();
  928. Instant threshold = Instant.now().minus(1, ChronoUnit.DAYS);
  929. try (DirectoryStream<Path> stream =
  930. Files.newDirectoryStream(packDir, "gc_*_tmp")) { //$NON-NLS-1$
  931. stream.forEach(t -> {
  932. try {
  933. Instant lastModified = Files.getLastModifiedTime(t)
  934. .toInstant();
  935. if (lastModified.isBefore(threshold)) {
  936. Files.deleteIfExists(t);
  937. }
  938. } catch (IOException e) {
  939. LOG.error(e.getMessage(), e);
  940. }
  941. });
  942. } catch (IOException e) {
  943. LOG.error(e.getMessage(), e);
  944. }
  945. }
  946. /**
  947. * @param ref
  948. * the ref whose log should be inspected
  949. * @param minTime only reflog entries not older than this time are processed
  950. * @return the {@link ObjectId}s contained in the reflog
  951. * @throws IOException
  952. */
  953. private Set<ObjectId> listRefLogObjects(Ref ref, long minTime) throws IOException {
  954. ReflogReader reflogReader = repo.getReflogReader(ref.getName());
  955. if (reflogReader == null) {
  956. return Collections.emptySet();
  957. }
  958. List<ReflogEntry> rlEntries = reflogReader
  959. .getReverseEntries();
  960. if (rlEntries == null || rlEntries.isEmpty())
  961. return Collections.emptySet();
  962. Set<ObjectId> ret = new HashSet<>();
  963. for (ReflogEntry e : rlEntries) {
  964. if (e.getWho().getWhen().getTime() < minTime)
  965. break;
  966. ObjectId newId = e.getNewId();
  967. if (newId != null && !ObjectId.zeroId().equals(newId))
  968. ret.add(newId);
  969. ObjectId oldId = e.getOldId();
  970. if (oldId != null && !ObjectId.zeroId().equals(oldId))
  971. ret.add(oldId);
  972. }
  973. return ret;
  974. }
  975. /**
  976. * Returns a collection of all refs and additional refs.
  977. *
  978. * Additional refs which don't start with "refs/" are not returned because
  979. * they should not save objects from being garbage collected. Examples for
  980. * such references are ORIG_HEAD, MERGE_HEAD, FETCH_HEAD and
  981. * CHERRY_PICK_HEAD.
  982. *
  983. * @return a collection of refs pointing to live objects.
  984. * @throws IOException
  985. */
  986. private Collection<Ref> getAllRefs() throws IOException {
  987. RefDatabase refdb = repo.getRefDatabase();
  988. Collection<Ref> refs = refdb.getRefs();
  989. List<Ref> addl = refdb.getAdditionalRefs();
  990. if (!addl.isEmpty()) {
  991. List<Ref> all = new ArrayList<>(refs.size() + addl.size());
  992. all.addAll(refs);
  993. // add additional refs which start with refs/
  994. for (Ref r : addl) {
  995. checkCancelled();
  996. if (r.getName().startsWith(Constants.R_REFS)) {
  997. all.add(r);
  998. }
  999. }
  1000. return all;
  1001. }
  1002. return refs;
  1003. }
  1004. /**
  1005. * Return a list of those objects in the index which differ from what's in
  1006. * HEAD
  1007. *
  1008. * @return a set of ObjectIds of changed objects in the index
  1009. * @throws IOException
  1010. * @throws CorruptObjectException
  1011. * @throws NoWorkTreeException
  1012. */
  1013. private Set<ObjectId> listNonHEADIndexObjects()
  1014. throws CorruptObjectException, IOException {
  1015. if (repo.isBare()) {
  1016. return Collections.emptySet();
  1017. }
  1018. try (TreeWalk treeWalk = new TreeWalk(repo)) {
  1019. treeWalk.addTree(new DirCacheIterator(repo.readDirCache()));
  1020. ObjectId headID = repo.resolve(Constants.HEAD);
  1021. if (headID != null) {
  1022. try (RevWalk revWalk = new RevWalk(repo)) {
  1023. treeWalk.addTree(revWalk.parseTree(headID));
  1024. }
  1025. }
  1026. treeWalk.setFilter(TreeFilter.ANY_DIFF);
  1027. treeWalk.setRecursive(true);
  1028. Set<ObjectId> ret = new HashSet<>();
  1029. while (treeWalk.next()) {
  1030. checkCancelled();
  1031. ObjectId objectId = treeWalk.getObjectId(0);
  1032. switch (treeWalk.getRawMode(0) & FileMode.TYPE_MASK) {
  1033. case FileMode.TYPE_MISSING:
  1034. case FileMode.TYPE_GITLINK:
  1035. continue;
  1036. case FileMode.TYPE_TREE:
  1037. case FileMode.TYPE_FILE:
  1038. case FileMode.TYPE_SYMLINK:
  1039. ret.add(objectId);
  1040. continue;
  1041. default:
  1042. throw new IOException(MessageFormat.format(
  1043. JGitText.get().corruptObjectInvalidMode3,
  1044. String.format("%o", //$NON-NLS-1$
  1045. Integer.valueOf(treeWalk.getRawMode(0))),
  1046. (objectId == null) ? "null" : objectId.name(), //$NON-NLS-1$
  1047. treeWalk.getPathString(), //
  1048. repo.getIndexFile()));
  1049. }
  1050. }
  1051. return ret;
  1052. }
  1053. }
  1054. private PackFile writePack(@NonNull Set<? extends ObjectId> want,
  1055. @NonNull Set<? extends ObjectId> have, @NonNull Set<ObjectId> tags,
  1056. Set<ObjectId> tagTargets, List<ObjectIdSet> excludeObjects)
  1057. throws IOException {
  1058. checkCancelled();
  1059. File tmpPack = null;
  1060. Map<PackExt, File> tmpExts = new TreeMap<>((o1, o2) -> {
  1061. // INDEX entries must be returned last, so the pack
  1062. // scanner does not pick up the new pack until all the
  1063. // PackExt entries have been written.
  1064. if (o1 == o2) {
  1065. return 0;
  1066. }
  1067. if (o1 == PackExt.INDEX) {
  1068. return 1;
  1069. }
  1070. if (o2 == PackExt.INDEX) {
  1071. return -1;
  1072. }
  1073. return Integer.signum(o1.hashCode() - o2.hashCode());
  1074. });
  1075. try (PackWriter pw = new PackWriter(
  1076. (pconfig == null) ? new PackConfig(repo) : pconfig,
  1077. repo.newObjectReader())) {
  1078. // prepare the PackWriter
  1079. pw.setDeltaBaseAsOffset(true);
  1080. pw.setReuseDeltaCommits(false);
  1081. if (tagTargets != null) {
  1082. pw.setTagTargets(tagTargets);
  1083. }
  1084. if (excludeObjects != null)
  1085. for (ObjectIdSet idx : excludeObjects)
  1086. pw.excludeObjects(idx);
  1087. pw.preparePack(pm, want, have, PackWriter.NONE, tags);
  1088. if (pw.getObjectCount() == 0)
  1089. return null;
  1090. checkCancelled();
  1091. // create temporary files
  1092. String id = pw.computeName().getName();
  1093. File packdir = repo.getObjectDatabase().getPackDirectory();
  1094. tmpPack = File.createTempFile("gc_", ".pack_tmp", packdir); //$NON-NLS-1$ //$NON-NLS-2$
  1095. final String tmpBase = tmpPack.getName()
  1096. .substring(0, tmpPack.getName().lastIndexOf('.'));
  1097. File tmpIdx = new File(packdir, tmpBase + ".idx_tmp"); //$NON-NLS-1$
  1098. tmpExts.put(INDEX, tmpIdx);
  1099. if (!tmpIdx.createNewFile())
  1100. throw new IOException(MessageFormat.format(
  1101. JGitText.get().cannotCreateIndexfile, tmpIdx.getPath()));
  1102. // write the packfile
  1103. try (FileOutputStream fos = new FileOutputStream(tmpPack);
  1104. FileChannel channel = fos.getChannel();
  1105. OutputStream channelStream = Channels
  1106. .newOutputStream(channel)) {
  1107. pw.writePack(pm, pm, channelStream);
  1108. channel.force(true);
  1109. }
  1110. // write the packindex
  1111. try (FileOutputStream fos = new FileOutputStream(tmpIdx);
  1112. FileChannel idxChannel = fos.getChannel();
  1113. OutputStream idxStream = Channels
  1114. .newOutputStream(idxChannel)) {
  1115. pw.writeIndex(idxStream);
  1116. idxChannel.force(true);
  1117. }
  1118. if (pw.prepareBitmapIndex(pm)) {
  1119. File tmpBitmapIdx = new File(packdir, tmpBase + ".bitmap_tmp"); //$NON-NLS-1$
  1120. tmpExts.put(BITMAP_INDEX, tmpBitmapIdx);
  1121. if (!tmpBitmapIdx.createNewFile())
  1122. throw new IOException(MessageFormat.format(
  1123. JGitText.get().cannotCreateIndexfile,
  1124. tmpBitmapIdx.getPath()));
  1125. try (FileOutputStream fos = new FileOutputStream(tmpBitmapIdx);
  1126. FileChannel idxChannel = fos.getChannel();
  1127. OutputStream idxStream = Channels
  1128. .newOutputStream(idxChannel)) {
  1129. pw.writeBitmapIndex(idxStream);
  1130. idxChannel.force(true);
  1131. }
  1132. }
  1133. // rename the temporary files to real files
  1134. File realPack = nameFor(id, ".pack"); //$NON-NLS-1$
  1135. repo.getObjectDatabase().closeAllPackHandles(realPack);
  1136. tmpPack.setReadOnly();
  1137. FileUtils.rename(tmpPack, realPack, StandardCopyOption.ATOMIC_MOVE);
  1138. for (Map.Entry<PackExt, File> tmpEntry : tmpExts.entrySet()) {
  1139. File tmpExt = tmpEntry.getValue();
  1140. tmpExt.setReadOnly();
  1141. File realExt = nameFor(id,
  1142. "." + tmpEntry.getKey().getExtension()); //$NON-NLS-1$
  1143. try {
  1144. FileUtils.rename(tmpExt, realExt,
  1145. StandardCopyOption.ATOMIC_MOVE);
  1146. } catch (IOException e) {
  1147. File newExt = new File(realExt.getParentFile(),
  1148. realExt.getName() + ".new"); //$NON-NLS-1$
  1149. try {
  1150. FileUtils.rename(tmpExt, newExt,
  1151. StandardCopyOption.ATOMIC_MOVE);
  1152. } catch (IOException e2) {
  1153. newExt = tmpExt;
  1154. e = e2;
  1155. }
  1156. throw new IOException(MessageFormat.format(
  1157. JGitText.get().panicCantRenameIndexFile, newExt,
  1158. realExt), e);
  1159. }
  1160. }
  1161. return repo.getObjectDatabase().openPack(realPack);
  1162. } finally {
  1163. if (tmpPack != null && tmpPack.exists())
  1164. tmpPack.delete();
  1165. for (File tmpExt : tmpExts.values()) {
  1166. if (tmpExt.exists())
  1167. tmpExt.delete();
  1168. }
  1169. }
  1170. }
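	// Editorial note (not part of the original source): after a successful
	// writePack() the pack directory is expected to contain pack-<id>.pack,
	// pack-<id>.bitmap (when a bitmap index was prepared) and pack-<id>.idx.
	// Per the comparator above, the .idx file is renamed into place last, so
	// a concurrent pack scanner should only see the new pack once all of its
	// companion files already exist.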
	private File nameFor(String name, String ext) {
		File packdir = repo.getObjectDatabase().getPackDirectory();
		return new File(packdir, "pack-" + name + ext); //$NON-NLS-1$
	}

	private void checkCancelled() throws CancelledException {
		if (pm.isCancelled()) {
			throw new CancelledException(JGitText.get().operationCanceled);
		}
	}
	/**
	 * A class holding statistical data for a FileRepository regarding how many
	 * objects are stored as loose or packed objects
	 */
	public static class RepoStatistics {
		/**
		 * The number of objects stored in pack files. If the same object is
		 * stored in multiple pack files then it is counted as often as it
		 * occurs in pack files.
		 */
		public long numberOfPackedObjects;

		/**
		 * The number of pack files
		 */
		public long numberOfPackFiles;

		/**
		 * The number of objects stored as loose objects.
		 */
		public long numberOfLooseObjects;

		/**
		 * The sum of the sizes of all files used to persist loose objects.
		 */
		public long sizeOfLooseObjects;

		/**
		 * The sum of the sizes of all pack files.
		 */
		public long sizeOfPackedObjects;

		/**
		 * The number of loose refs.
		 */
		public long numberOfLooseRefs;

		/**
		 * The number of refs stored in pack files.
		 */
		public long numberOfPackedRefs;

		/**
		 * The number of bitmaps in the bitmap indices.
		 */
		public long numberOfBitmaps;

		@Override
		public String toString() {
			final StringBuilder b = new StringBuilder();
			b.append("numberOfPackedObjects=").append(numberOfPackedObjects); //$NON-NLS-1$
			b.append(", numberOfPackFiles=").append(numberOfPackFiles); //$NON-NLS-1$
			b.append(", numberOfLooseObjects=").append(numberOfLooseObjects); //$NON-NLS-1$
			b.append(", numberOfLooseRefs=").append(numberOfLooseRefs); //$NON-NLS-1$
			b.append(", numberOfPackedRefs=").append(numberOfPackedRefs); //$NON-NLS-1$
			b.append(", sizeOfLooseObjects=").append(sizeOfLooseObjects); //$NON-NLS-1$
			b.append(", sizeOfPackedObjects=").append(sizeOfPackedObjects); //$NON-NLS-1$
			b.append(", numberOfBitmaps=").append(numberOfBitmaps); //$NON-NLS-1$
			return b.toString();
		}
	}
	/**
	 * Returns information about objects and pack files for a FileRepository.
	 *
	 * @return information about objects and pack files for a FileRepository
	 * @throws java.io.IOException
	 */
	public RepoStatistics getStatistics() throws IOException {
		RepoStatistics ret = new RepoStatistics();
		Collection<PackFile> packs = repo.getObjectDatabase().getPacks();
		for (PackFile f : packs) {
			ret.numberOfPackedObjects += f.getIndex().getObjectCount();
			ret.numberOfPackFiles++;
			ret.sizeOfPackedObjects += f.getPackFile().length();
			if (f.getBitmapIndex() != null)
				ret.numberOfBitmaps += f.getBitmapIndex().getBitmapCount();
		}

		File objDir = repo.getObjectsDirectory();
		String[] fanout = objDir.list();
		if (fanout != null && fanout.length > 0) {
			for (String d : fanout) {
				if (d.length() != 2)
					continue;
				File[] entries = new File(objDir, d).listFiles();
				if (entries == null)
					continue;
				for (File f : entries) {
					if (f.getName().length() != Constants.OBJECT_ID_STRING_LENGTH - 2)
						continue;
					ret.numberOfLooseObjects++;
					ret.sizeOfLooseObjects += f.length();
				}
			}
		}

		RefDatabase refDb = repo.getRefDatabase();
		for (Ref r : refDb.getRefs()) {
			Storage storage = r.getStorage();
			if (storage == Storage.LOOSE || storage == Storage.LOOSE_PACKED)
				ret.numberOfLooseRefs++;
			if (storage == Storage.PACKED || storage == Storage.LOOSE_PACKED)
				ret.numberOfPackedRefs++;
		}
		return ret;
	}
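	// Illustrative usage (not part of the original source); a caller holding
	// an already opened FileRepository might dump the collected numbers
	// roughly like this:
	//
	//   GC gc = new GC(repository);
	//   GC.RepoStatistics stats = gc.getStatistics();
	//   System.out.println(stats); // uses RepoStatistics.toString() above
	//
	// "repository" is assumed to be an existing FileRepository instance.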
	/**
	 * Set the progress monitor used for garbage collection methods.
	 *
	 * @param pm a {@link org.eclipse.jgit.lib.ProgressMonitor} object.
	 * @return this
	 */
	public GC setProgressMonitor(ProgressMonitor pm) {
		this.pm = (pm == null) ? NullProgressMonitor.INSTANCE : pm;
		return this;
	}
	/**
	 * During gc() or prune() each unreferenced, loose object which has been
	 * created or modified in the last <code>expireAgeMillis</code> milliseconds
	 * will not be pruned. Only older objects may be pruned. If set to 0 then
	 * every object is a candidate for pruning.
	 *
	 * @param expireAgeMillis
	 *            minimal age of objects to be pruned in milliseconds.
	 */
	public void setExpireAgeMillis(long expireAgeMillis) {
		this.expireAgeMillis = expireAgeMillis;
		expire = null;
	}
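	// Illustrative usage (not part of the original source): the grace period
	// is given in milliseconds, so a two-week window could be expressed as
	//
	//   gc.setExpireAgeMillis(TimeUnit.DAYS.toMillis(14));
	//
	// where "gc" is assumed to be an instance of this class and TimeUnit is
	// java.util.concurrent.TimeUnit.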
	/**
	 * During gc() or prune() packfiles which are created or modified in the
	 * last <code>packExpireAgeMillis</code> milliseconds will not be deleted.
	 * Only older packfiles may be deleted. If set to 0 then every packfile is a
	 * candidate for deletion.
	 *
	 * @param packExpireAgeMillis
	 *            minimal age of packfiles to be deleted in milliseconds.
	 */
	public void setPackExpireAgeMillis(long packExpireAgeMillis) {
		this.packExpireAgeMillis = packExpireAgeMillis;
		packExpire = null;
	}
	/**
	 * Set the PackConfig used when (re-)writing packfiles. This allows one to
	 * influence how packs are written and to implement something similar to
	 * "git gc --aggressive".
	 *
	 * @param pconfig
	 *            the {@link org.eclipse.jgit.storage.pack.PackConfig} used when
	 *            writing packs
	 */
	public void setPackConfig(PackConfig pconfig) {
		this.pconfig = pconfig;
	}
	/**
	 * During gc() or prune() each unreferenced, loose object which has been
	 * created or modified after or at <code>expire</code> will not be pruned.
	 * Only older objects may be pruned. If set to null then every object is a
	 * candidate for pruning.
	 *
	 * @param expire
	 *            instant in time which defines object expiration; objects with
	 *            modification time before this instant are expired, objects
	 *            with modification time newer than or equal to this instant
	 *            are not expired
	 */
	public void setExpire(Date expire) {
		this.expire = expire;
		expireAgeMillis = -1;
	}
	/**
	 * During gc() or prune() packfiles which are created or modified after or
	 * at <code>packExpire</code> will not be deleted. Only older packfiles may
	 * be deleted. If set to null then every packfile is a candidate for
	 * deletion.
	 *
	 * @param packExpire
	 *            instant in time which defines packfile expiration
	 */
	public void setPackExpire(Date packExpire) {
		this.packExpire = packExpire;
		packExpireAgeMillis = -1;
	}
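	// Illustrative usage (not part of the original source): the Date based
	// setters are an alternative to the age based setters above, e.g. to
	// expire everything older than a fixed point in time:
	//
	//   Date cutoff = Date.from(Instant.now().minus(Duration.ofDays(14)));
	//   gc.setExpire(cutoff);
	//   gc.setPackExpire(cutoff);
	//
	// "gc" is assumed to be an instance of this class; Instant and Duration
	// come from java.time.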
	/**
	 * Set the {@code gc --auto} option.
	 *
	 * With this option, gc checks whether any housekeeping is required; if not,
	 * it exits without performing any work. Some JGit commands run
	 * {@code gc --auto} after performing operations that could create many
	 * loose objects.
	 * <p>
	 * Housekeeping is required if there are too many loose objects or too many
	 * packs in the repository. If the number of loose objects exceeds the value
	 * of the gc.auto option, JGit GC consolidates all existing packs into a
	 * single pack (equivalent to the {@code -A} option), whereas git-core would
	 * combine all loose objects into a single pack using {@code repack -d -l}.
	 * Setting the value of {@code gc.auto} to 0 disables automatic packing of
	 * loose objects.
	 * <p>
	 * If the number of packs exceeds the value of {@code gc.autoPackLimit},
	 * then existing packs (except those marked with a .keep file) are
	 * consolidated into a single pack by using the {@code -A} option of repack.
	 * Setting {@code gc.autoPackLimit} to 0 disables automatic consolidation of
	 * packs.
	 * <p>
	 * Like git, the following JGit commands run auto gc:
	 * <ul>
	 * <li>fetch</li>
	 * <li>merge</li>
	 * <li>rebase</li>
	 * <li>receive-pack</li>
	 * </ul>
	 * The auto gc for receive-pack can be suppressed by setting the config
	 * option {@code receive.autogc = false}.
	 *
	 * @param auto
	 *            defines whether gc should do automatic housekeeping
	 */
	public void setAuto(boolean auto) {
		this.automatic = auto;
	}
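	// Illustrative usage (not part of the original source): automatic
	// housekeeping honours the gc.auto and gc.autoPackLimit options described
	// above, so an auto gc could be wired up roughly like this:
	//
	//   StoredConfig config = repository.getConfig();
	//   config.setInt("gc", null, "auto", 6700);
	//   config.setInt("gc", null, "autoPackLimit", 50);
	//   config.save();
	//   GC gc = new GC(repository);
	//   gc.setAuto(true);
	//   gc.gc();
	//
	// "repository" is assumed to be an already opened FileRepository.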
	/**
	 * @param background
	 *            whether to run the gc in a background thread.
	 */
	void setBackground(boolean background) {
		this.background = background;
	}
	private boolean needGc() {
		if (tooManyPacks()) {
			addRepackAllOption();
		} else {
			return tooManyLooseObjects();
		}
		// TODO run pre-auto-gc hook, if it fails return false
		return true;
	}

	private void addRepackAllOption() {
		// TODO: if JGit GC is enhanced to support repack's option -l this
		// method needs to be implemented
	}
	/**
	 * @return {@code true} if number of packs > gc.autopacklimit (default 50)
	 */
	boolean tooManyPacks() {
		int autopacklimit = repo.getConfig().getInt(
				ConfigConstants.CONFIG_GC_SECTION,
				ConfigConstants.CONFIG_KEY_AUTOPACKLIMIT,
				DEFAULT_AUTOPACKLIMIT);
		if (autopacklimit <= 0) {
			return false;
		}
		// JGit always creates two packfiles, one for the objects reachable
		// from branches, and another one for the rest
		return repo.getObjectDatabase().getPacks().size() > (autopacklimit + 1);
	}
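	// Editorial note (not part of the original source): with the default
	// gc.autopacklimit of 50, the check above only triggers once more than 51
	// pack files exist, since the two packs JGit itself writes are not meant
	// to count against the limit.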
	/**
	 * Quickly estimate the number of loose objects; since SHA-1 values are
	 * distributed evenly, counting the objects in a single fan-out directory
	 * (bucket "17") is sufficient.
	 *
	 * @return {@code true} if number of loose objects > gc.auto (default 6700)
	 */
	boolean tooManyLooseObjects() {
		int auto = getLooseObjectLimit();
		if (auto <= 0) {
			return false;
		}
		int n = 0;
		int threshold = (auto + 255) / 256;
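		// Editorial note (not part of the original source): loose objects are
		// fanned out over 256 directories ("00".."ff"), so the configured
		// limit is divided by 256, rounding up. With the default gc.auto of
		// 6700 the per-directory threshold is (6700 + 255) / 256 = 27 objects
		// in bucket "17".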
		Path dir = repo.getObjectsDirectory().toPath().resolve("17"); //$NON-NLS-1$
		if (!dir.toFile().exists()) {
			return false;
		}
		try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, file -> {
			Path fileName = file.getFileName();
			return file.toFile().isFile() && fileName != null
					&& PATTERN_LOOSE_OBJECT.matcher(fileName.toString())
							.matches();
		})) {
			for (Iterator<Path> iter = stream.iterator(); iter.hasNext(); iter
					.next()) {
				if (++n > threshold) {
					return true;
				}
			}
		} catch (IOException e) {
			LOG.error(e.getMessage(), e);
		}
		return false;
	}
	private int getLooseObjectLimit() {
		return repo.getConfig().getInt(ConfigConstants.CONFIG_GC_SECTION,
				ConfigConstants.CONFIG_KEY_AUTO, DEFAULT_AUTOLIMIT);
	}
}