diff options
author | Dariusz Luksza <dariusz.luksza@gmail.com> | 2023-11-03 11:52:03 +0000 |
---|---|---|
committer | Matthias Sohn <matthias.sohn@sap.com> | 2023-11-07 01:29:39 +0100 |
commit | 3937300f3eb4dd557ec2d195f21793f737d6cb4e (patch) | |
tree | 9fe2d9492e028061cab1b1c38c109a51ed3dc4c7 | |
parent | 1c320d0d41a944fe7a28158103d0318a01785e46 (diff) | |
download | jgit-3937300f3eb4dd557ec2d195f21793f737d6cb4e.tar.gz jgit-3937300f3eb4dd557ec2d195f21793f737d6cb4e.zip |
Optimise Git protocol v2 `ref-prefix` scanning
Currenty JGit will go over all refs in the repository for each
`ref-prefix`. This means that refs will be read multiple
times, which leads to subpar performance.
Native git, uses a different approach, where all refs are read once
and then for each ref, all `ref-prefix` filter values are checked in
one pass.
This change implements this approach in JGit. And makes `ref-prefix`
filtering ~28% faster for a repository with fully packed refs
and ~5% when RefTable is used instead of refdir.
Different implementations were tested on a synthetic file repository
with 300k refs. Different implementations were tested for unpacked and
fully packed refs (results are in seconds).
Unpacked refs:
Current Impl: 54.838 57.234 56.138
Nested for loops: 36.094 37.025 36.502
Nested stream's: 36.154 35.989 37.262
Parallel stream + stream: 36.923 37.272 35.362
Nested parallel stream's: 35.512 38.395 36.745
Stream + for loop: 34.950 36.164 37.191
Parallel stream + for loop: 37.695 35.511 35.378
Packed refs:
Current Impl: 39.713 39.954 38.653
Nested for loops: 29.891 29.753 29.377
Nested stream's: 30.340 29.637 30.412
Parallel stream + stream: 28.653 28.254 29.138
Nested parallel stream's: 29.942 28.850 31.030
Stream + for loop: 29.405 29.576 30.539
Parallel stream + for loop: 29.012 29.215 29.380
RefTable:
Current Impl: 0.273 0.294 0.330
Nested for loops: 0.252 0.169 0.215
Nested stream's: 0.252 0.228 0.213
Parallel stream + stream: 0.233 0.259 0.247
Nested parallel stream's: 0.416 0.309 0.340
Stream + for loop: 0.224 0.247 0.242
Parallel stream + for loop: 0.347 0.246 0.346
The elapsed time was measured around `getRefsByPrefix` call in
`UploadPack.getFilteredRefs(Collection<String>)` (around lines 952 and
954).
Based on the above results, the implementation with parallel stream and
stream was selected.
Bug: 578550
Change-Id: Iac3a3aacf897b87b3448c1d528cdac64ad312199
Signed-off-by: Dariusz Luksza <dariusz.luksza@gmail.com>
-rw-r--r-- | org.eclipse.jgit/src/org/eclipse/jgit/lib/RefDatabase.java | 8 |
1 files changed, 3 insertions, 5 deletions
diff --git a/org.eclipse.jgit/src/org/eclipse/jgit/lib/RefDatabase.java b/org.eclipse.jgit/src/org/eclipse/jgit/lib/RefDatabase.java index 9e05a39731..f3726868cb 100644 --- a/org.eclipse.jgit/src/org/eclipse/jgit/lib/RefDatabase.java +++ b/org.eclipse.jgit/src/org/eclipse/jgit/lib/RefDatabase.java @@ -474,11 +474,9 @@ public abstract class RefDatabase { */ @NonNull public List<Ref> getRefsByPrefix(String... prefixes) throws IOException { - List<Ref> result = new ArrayList<>(); - for (String prefix : prefixes) { - result.addAll(getRefsByPrefix(prefix)); - } - return Collections.unmodifiableList(result); + return getRefsByPrefix(ALL).parallelStream().filter( + ref -> Stream.of(prefixes).anyMatch(ref.getName()::startsWith)) + .collect(Collectors.toUnmodifiableList()); } |