You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

BitmappedObjectReachabilityChecker.java 2.5KB

Move reachability checker generation into the ObjectReader object Reachability checkers are retrieved from RevWalk and ObjectWalk objects: * RevWalk.createReachabilityChecker() * ObjectWalk.createObjectReachabilityChecker() Since RevWalks and ObjectWalks are themselves directly instantiated in hundreds of places (e.g. UploadPack...) overriding them in a consistent way requires overloading 100s of methods, which isn't feasible. Moving reachability checker generation to a more central place solves that problem. The ObjectReader object seems a good place from which to get reachability checkers, because reachability checkers return information about relationships between objects. ObjectDatabases delegate many operations to ObjectReaders, and reachability bitmaps are attached to ObjectReaders. The Bitmapped and Pedestrian reachability checker objects were package private in the org.eclipse.jgit.revwalk package. This change makes them public and moves them to the org.eclipse.jgit.internal.revwalk package. Corresponding tests are also moved. Motivation: 1) Reachability checking algorithms need to scale. One of the internal Android repositories has ~2.4 million refs/changes/* references, causing bad long tail performance in reachability checks. 2) Reachability check performance is impacted by repository topography: number of refs, number of objects, amounts of related vs. unrelated history. 3) Reachability check performance is also affected by per-branch access (Gerrit branch permissions) since different users can see different branches. 4) Reachability check performance isn't affected by any state in a RevWalk or ObjectWalk. I don't yet know if a single algorithm will work for all cases in #2 and #3. We may need to evolve the ReachabilityChecker interfaces over time to solve the Gerrit branch permissions case, or use Gerrit-specific identity information to solve that in an efficient way. This change takes the existing public API and moves it to the ObjectReader/whole repository level, which is where we can do consistent customizations for #2 and #3. We intend to upstream the best of whatever works, but anticipate the need for multiple rounds of experimentation. Change-Id: I9185feff43551fb387957c436112d5250486833d Signed-off-by: Terry Parker <tparker@google.com>
3 years ago
Move reachability checker generation into the ObjectReader object Reachability checkers are retrieved from RevWalk and ObjectWalk objects: * RevWalk.createReachabilityChecker() * ObjectWalk.createObjectReachabilityChecker() Since RevWalks and ObjectWalks are themselves directly instantiated in hundreds of places (e.g. UploadPack...) overriding them in a consistent way requires overloading 100s of methods, which isn't feasible. Moving reachability checker generation to a more central place solves that problem. The ObjectReader object seems a good place from which to get reachability checkers, because reachability checkers return information about relationships between objects. ObjectDatabases delegate many operations to ObjectReaders, and reachability bitmaps are attached to ObjectReaders. The Bitmapped and Pedestrian reachability checker objects were package private in the org.eclipse.jgit.revwalk package. This change makes them public and moves them to the org.eclipse.jgit.internal.revwalk package. Corresponding tests are also moved. Motivation: 1) Reachability checking algorithms need to scale. One of the internal Android repositories has ~2.4 million refs/changes/* references, causing bad long tail performance in reachability checks. 2) Reachability check performance is impacted by repository topography: number of refs, number of objects, amounts of related vs. unrelated history. 3) Reachability check performance is also affected by per-branch access (Gerrit branch permissions) since different users can see different branches. 4) Reachability check performance isn't affected by any state in a RevWalk or ObjectWalk. I don't yet know if a single algorithm will work for all cases in #2 and #3. We may need to evolve the ReachabilityChecker interfaces over time to solve the Gerrit branch permissions case, or use Gerrit-specific identity information to solve that in an efficient way. This change takes the existing public API and moves it to the ObjectReader/whole repository level, which is where we can do consistent customizations for #2 and #3. We intend to upstream the best of whatever works, but anticipate the need for multiple rounds of experimentation. Change-Id: I9185feff43551fb387957c436112d5250486833d Signed-off-by: Terry Parker <tparker@google.com>
3 years ago
Move reachability checker generation into the ObjectReader object Reachability checkers are retrieved from RevWalk and ObjectWalk objects: * RevWalk.createReachabilityChecker() * ObjectWalk.createObjectReachabilityChecker() Since RevWalks and ObjectWalks are themselves directly instantiated in hundreds of places (e.g. UploadPack...) overriding them in a consistent way requires overloading 100s of methods, which isn't feasible. Moving reachability checker generation to a more central place solves that problem. The ObjectReader object seems a good place from which to get reachability checkers, because reachability checkers return information about relationships between objects. ObjectDatabases delegate many operations to ObjectReaders, and reachability bitmaps are attached to ObjectReaders. The Bitmapped and Pedestrian reachability checker objects were package private in the org.eclipse.jgit.revwalk package. This change makes them public and moves them to the org.eclipse.jgit.internal.revwalk package. Corresponding tests are also moved. Motivation: 1) Reachability checking algorithms need to scale. One of the internal Android repositories has ~2.4 million refs/changes/* references, causing bad long tail performance in reachability checks. 2) Reachability check performance is impacted by repository topography: number of refs, number of objects, amounts of related vs. unrelated history. 3) Reachability check performance is also affected by per-branch access (Gerrit branch permissions) since different users can see different branches. 4) Reachability check performance isn't affected by any state in a RevWalk or ObjectWalk. I don't yet know if a single algorithm will work for all cases in #2 and #3. We may need to evolve the ReachabilityChecker interfaces over time to solve the Gerrit branch permissions case, or use Gerrit-specific identity information to solve that in an efficient way. This change takes the existing public API and moves it to the ObjectReader/whole repository level, which is where we can do consistent customizations for #2 and #3. We intend to upstream the best of whatever works, but anticipate the need for multiple rounds of experimentation. Change-Id: I9185feff43551fb387957c436112d5250486833d Signed-off-by: Terry Parker <tparker@google.com>
3 years ago
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283
  1. /*
  2. * Copyright (C) 2020, Google LLC and others
  3. *
  4. * This program and the accompanying materials are made available under the
  5. * terms of the Eclipse Distribution License v. 1.0 which is available at
  6. * https://www.eclipse.org/org/documents/edl-v10.php.
  7. *
  8. * SPDX-License-Identifier: BSD-3-Clause
  9. */
  10. package org.eclipse.jgit.internal.revwalk;
  11. import java.io.IOException;
  12. import java.util.ArrayList;
  13. import java.util.Arrays;
  14. import java.util.Collection;
  15. import java.util.Iterator;
  16. import java.util.List;
  17. import java.util.Optional;
  18. import java.util.stream.Stream;
  19. import org.eclipse.jgit.errors.IncorrectObjectTypeException;
  20. import org.eclipse.jgit.errors.MissingObjectException;
  21. import org.eclipse.jgit.lib.BitmapIndex.BitmapBuilder;
  22. import org.eclipse.jgit.revwalk.BitmapWalker;
  23. import org.eclipse.jgit.revwalk.ObjectReachabilityChecker;
  24. import org.eclipse.jgit.revwalk.ObjectWalk;
  25. import org.eclipse.jgit.revwalk.RevObject;
  26. /**
  27. * Checks if all objects are reachable from certain starting points using
  28. * bitmaps.
  29. */
  30. public class BitmappedObjectReachabilityChecker
  31. implements ObjectReachabilityChecker {
  32. private final ObjectWalk walk;
  33. /**
  34. * New instance of the reachability checker using a existing walk.
  35. *
  36. * @param walk
  37. * ObjectWalk instance to reuse. Caller retains ownership.
  38. */
  39. public BitmappedObjectReachabilityChecker(ObjectWalk walk) {
  40. this.walk = walk;
  41. }
  42. /**
  43. * {@inheritDoc}
  44. *
  45. * This implementation tries to shortcut the check adding starters
  46. * incrementally. Ordering the starters by relevance can improve performance
  47. * in the average case.
  48. */
  49. @Override
  50. public Optional<RevObject> areAllReachable(Collection<RevObject> targets,
  51. Stream<RevObject> starters) throws IOException {
  52. try {
  53. List<RevObject> remainingTargets = new ArrayList<>(targets);
  54. BitmapWalker bitmapWalker = new BitmapWalker(walk,
  55. walk.getObjectReader().getBitmapIndex(), null);
  56. Iterator<RevObject> starterIt = starters.iterator();
  57. BitmapBuilder seen = null;
  58. while (starterIt.hasNext()) {
  59. List<RevObject> asList = Arrays.asList(starterIt.next());
  60. BitmapBuilder visited = bitmapWalker.findObjects(asList, seen,
  61. true);
  62. seen = seen == null ? visited : seen.or(visited);
  63. remainingTargets.removeIf(seen::contains);
  64. if (remainingTargets.isEmpty()) {
  65. return Optional.empty();
  66. }
  67. }
  68. return Optional.of(remainingTargets.get(0));
  69. } catch (MissingObjectException | IncorrectObjectTypeException e) {
  70. throw new IllegalStateException(e);
  71. }
  72. }
  73. }