path: root/org.eclipse.jgit.test
Commit message | Author | Age | Files | Lines
* PackWriter: offer to write an object-size index for the pack | Ivan Frade | 2023-02-24 | 1 | -0/+37
    PackWriter callers tell the writer what they want to include in the pack and invoke #writePack(). Afterwards, they can invoke #writeIndex() to write the corresponding pack index.

    Mirror this for the object-size index, adding a #writeObjectSizeIndex() method.

    Change-Id: Ic319975c72c239cd6488303f7d4cced797e6fe00
* Merge branch 'stable-6.4' | Matthias Sohn | 2023-02-22 | 1 | -5/+3
|\
    * stable-6.4:
      If tryLock fails to get the lock another gc has it
      Fix GcConcurrentTest#testInterruptGc
      Don't swallow IOException in GC.PidLock#lock
      Check if FileLock is valid before using or releasing it

    Change-Id: Ia2797b44a60342eb9df53f0b3d674cba92a512fc
| * Merge branch 'stable-6.3' into stable-6.4 | Matthias Sohn | 2023-02-22 | 1 | -5/+3
| |\
    * stable-6.3:
      If tryLock fails to get the lock another gc has it
      Fix GcConcurrentTest#testInterruptGc
      Don't swallow IOException in GC.PidLock#lock
      Check if FileLock is valid before using or releasing it

    Change-Id: I5af34c92e423a651db53b4dc45ed844d5f39910d
| | * Merge branch 'stable-6.2' into stable-6.3 | Matthias Sohn | 2023-02-22 | 1 | -5/+3
| | |\
    * stable-6.2:
      If tryLock fails to get the lock another gc has it
      Fix GcConcurrentTest#testInterruptGc
      Don't swallow IOException in GC.PidLock#lock
      Check if FileLock is valid before using or releasing it

    Change-Id: I5b6b10622b61fde3f0f10455a74ae159a0b69082
| | | * Merge branch 'stable-6.1' into stable-6.2 | Matthias Sohn | 2023-02-22 | 1 | -5/+3
| | | |\
    * stable-6.1:
      If tryLock fails to get the lock another gc has it
      Fix GcConcurrentTest#testInterruptGc
      Don't swallow IOException in GC.PidLock#lock
      Check if FileLock is valid before using or releasing it

    Change-Id: I3ffe92566cc145053bb762f612dd96bc6d542c62
| | | | * Merge branch 'stable-6.0' into stable-6.1 | Matthias Sohn | 2023-02-22 | 1 | -5/+3
| | | | |\
    * stable-6.0:
      If tryLock fails to get the lock another gc has it
      Fix GcConcurrentTest#testInterruptGc
      Don't swallow IOException in GC.PidLock#lock
      Check if FileLock is valid before using or releasing it

    Change-Id: Idea23e555c024557d7e39a86efe25f609400b962
| | | | | * Merge branch 'stable-5.13' into stable-6.0 | Matthias Sohn | 2023-02-22 | 1 | -5/+3
| | | | | |\
    * stable-5.13:
      If tryLock fails to get the lock another gc has it
      Fix GcConcurrentTest#testInterruptGc
      Don't swallow IOException in GC.PidLock#lock
      Check if FileLock is valid before using or releasing it

    Change-Id: I708d0936fa86b028e4da4e7e21f332f8b48ad293
| | | | | | * Fix GcConcurrentTest#testInterruptGc | Matthias Sohn | 2023-02-22 | 1 | -5/+3
    With the new GC.PidLock, interrupting a running GC throws a ClosedByInterruptException.

    Change-Id: I7ccea1ae9a43d4edfdab2fcfd1324c64cc22b38f
* | | | | | | UploadPack: use allow-any-sha1-in-want configuration | kylezhao | 2023-02-21 | 1 | -0/+44
    C git 2.11 supports setting the equivalent of RequestPolicy.ANY with uploadpack.allowAnySHA1InWant [1]. Parse this into TransportConfig and use it from UploadPack. Add additional tests for [2] and this change.

    "git clone --filter=blob:none --no-checkout" succeeds when uploadPack.allowFilter is true. On checkout, however, git fetches the other missing objects that the checkout requires, which is why this config is needed. When both uploadPack.allowFilter and uploadPack.allowAnySHA1InWant are true, jgit supports partial clone. This feature can help with an extremely large monorepo: it lets users work on an incomplete repo, which reduces disk usage.

    [1] https://github.com/git/git/commit/f8edeaa05d8623a9f6dad408237496c51101aad8
    [2] change Id39771a6e42d8082099acde11249306828a053c0

    Bug: 573390
    Change-Id: I8fe75f03bf1fea7c11e0d67c8637bd05dd1f9b89
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | PackObjectSizeIndex: interface and impl for the object-size index | Ivan Frade | 2023-02-14 | 1 | -0/+392
    Operations like "clone --filter=blob:limit=N" or the "object-info" command need to read the size of the objects from the storage. An index would provide those sizes at once rather than having to seek in the packfile.

    Introduce an interface for the object-size index. This index returns the inflated size of an object. Not all objects can be indexed (to limit memory usage).

    This implementation indexes only blobs (no trees, nor commits) *above* a certain size threshold (configurable). A lower threshold adds more objects to the index, consumes more memory and provides better performance. 0 means "all blobs" and -1 "disabled".

    If we don't index everything, for the filter use case it is more efficient to index the biggest objects first: the set is small and most objects are filtered by NOT being in the index. For the object-size use case, the more objects in the index the better, regardless of their size. Altogether, it is more helpful to index above the threshold.

    Change-Id: I9ed608ac240677e199b90ca40d420bcad9231489
* | | | | | | UInt24Array: Array of unsigned ints encoded in 3 bytes. | Ivan Frade | 2023-02-14 | 1 | -0/+79
    The object size index stores positions of objects in the main index (when ordered by sha1). These positions are per-pack and usually a pack has <16 million objects (there are exceptions but rather rare).

    It could save some memory storing these positions in three bytes instead of four. Note that these positions are sorted and always positive.

    Implement a wrapper around a byte[] to access and search "ints" while they are stored as unsigned 3 bytes.

    Change-Id: Iaa26ce8e2272e706e35fe4cdb648fb6ca7591972
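The 3-byte encoding described in this commit message can be illustrated with a small stand-alone sketch. It is not JGit's UInt24Array: the class name `PackedUInt24` and its methods are hypothetical, and it only assumes what the message states (values are non-negative, fit in 24 bits, and are stored in ascending order).

```java
/** Hypothetical sketch (not JGit's UInt24Array): unsigned 24-bit ints packed in a byte[]. */
final class PackedUInt24 {
    private final byte[] data;

    private PackedUInt24(byte[] data) {
        this.data = data;
    }

    /** Packs the given values big-endian, three bytes per value. */
    static PackedUInt24 of(int... values) {
        byte[] out = new byte[values.length * 3];
        for (int i = 0; i < values.length; i++) {
            int v = values[i];
            if (v < 0 || v > 0xFFFFFF) {
                throw new IllegalArgumentException("does not fit in 24 bits: " + v);
            }
            out[3 * i] = (byte) (v >>> 16);
            out[3 * i + 1] = (byte) (v >>> 8);
            out[3 * i + 2] = (byte) v;
        }
        return new PackedUInt24(out);
    }

    int size() {
        return data.length / 3;
    }

    /** Reassembles the i-th value; the masks undo Java's sign extension of bytes. */
    int get(int i) {
        int p = 3 * i;
        return (data[p] & 0xFF) << 16 | (data[p + 1] & 0xFF) << 8 | (data[p + 2] & 0xFF);
    }

    /** Binary search over ascending values; returns the index or -1 if absent. */
    int indexOf(int needle) {
        int lo = 0;
        int hi = size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int v = get(mid);
            if (v < needle) {
                lo = mid + 1;
            } else if (v > needle) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }
}
```

Three bytes cover values up to 16,777,215, which matches the "<16 million objects" observation; packs with more objects would need a different representation.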
* | | | | | | PackIndex: expose the position of an object-id in the index | Ivan Frade | 2023-02-14 | 1 | -0/+39
    The primary index returns the offset in the pack for an objectId. Internally it keeps the object-ids in lexicographical order, but doesn't expose an API to find the position of an object-id in that list. This is needed for the object-size index, which we want to store as "position-in-idx, size".

    Add a #findPosition(object-id) method to the PackIndex interface to know where an object-id sits in the ordered list of ids in the pack.

    Note that this index position is over the list of ordered object-ids, while reverse-index position is over the list of objects in packed order.

    Change-Id: I89fa146599e347a26d3012d3477d7f5bbbda7ba4
* | | | | | | DfsPackFile/DfsGC: Write commit graphs and expose in pack | Xing Huang | 2023-02-07 | 2 | -1/+146
    JGit knows how to read/write commit graphs but the DFS stack is not using it yet.

    The DFS garbage collector generates a commit-graph with commits reachable from any ref. The commit-graph is stored as an extra stream in the GC pack.

    DfsPackFile mimics how other indices are loaded, storing the reference in the DFS cache.

    Signed-off-by: Xing Huang <xingkhuang@google.com>
    Change-Id: I3f94997377986d21a56b300d8358dd27be37f5de
* | | | | | | Merge "UploadPack: consume delimiter in object-info command" | Han-Wen Nienhuys | 2023-02-02 | 1 | -1/+3
|\ \ \ \ \ \ \
| * | | | | | | UploadPack: consume delimiter in object-info command | Han-Wen Nienhuys | 2023-02-02 | 1 | -1/+3
    The 'size' packet line is an argument, so it must be preceded by a 0001 delimiter. See also git's t5701-git-serve.sh test, https://github.com/git/git/blob/8b8d9a2/t/t5701-git-serve.sh#L329

    Without this fix, the server will choke on the delimiter line, saying

      PackProtocolException: unexpected <empty string>

    To test, I ran Gerrit locally with this fix

      $ curl -X POST -H 'git-protocol: version=2' -H 'content-type: application/x-git-upload-pack-request' -H 'accept: application/x-git-upload-pack-result' --data $'0018command=object-info\n00010009size\n0031oid d38b1b92bdb2893eb4505667375563f2d6d4086b\n0000' http://localhost:8080/git.git/git-upload-pack
      => 0008size0032d38b1b92bdb2893eb4505667375563f2d6d4086b 268590000

    The same command completes identically on Gitlab (which supports the object-info command)

      $ curl -X POST -H 'git-protocol: version=2' -H 'content-type: application/x-git-upload-pack-request' -H 'accept: application/x-git-upload-pack-result' --data $'0018command=object-info\n00010009size\n0031oid d38b1b92bdb2893eb4505667375563f2d6d4086b\n0000' https://gitlab.com/gitlab-org/git.git/git-upload-pack
      => 0008size0032d38b1b92bdb2893eb4505667375563f2d6d4086b 268590000

    In this case, the blob is for the COPYING file in the Git source tree, which is 26859 bytes long.

    Change-Id: Ief4ce1eb9303a3b2479547d7950ef01c7c28f472
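The request body passed to curl in this commit message is a sequence of pkt-lines: each line carries a four-digit hex length that counts the four length bytes themselves, `0001` is the delimiter between the command and its arguments, and `0000` is the flush packet. The following is a minimal sketch of how that body could be assembled; the class `PktLineSketch` and its methods are hypothetical helpers for illustration, not JGit's API.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not JGit's API) that builds a protocol v2 object-info request body. */
final class PktLineSketch {
    static final String DELIM = "0001"; // separates the command section from its arguments
    static final String FLUSH = "0000"; // ends the request

    /** Prefixes the payload with its length in 4 hex digits; the length includes the prefix. */
    static String pkt(String payload) {
        int len = payload.getBytes(StandardCharsets.UTF_8).length + 4;
        if (len > 65520) {
            throw new IllegalArgumentException("payload too long for one pkt-line");
        }
        return String.format("%04x", len) + payload;
    }

    /** Builds the same body as the curl example for a single object id. */
    static String objectInfoRequest(String oid) {
        return pkt("command=object-info\n")
                + DELIM
                + pkt("size\n")
                + pkt("oid " + oid + "\n")
                + FLUSH;
    }

    public static void main(String[] args) {
        System.out.print(objectInfoRequest("d38b1b92bdb2893eb4505667375563f2d6d4086b"));
    }
}
```

The length prefixes come out as `0018`, `0009` and `0031` (24, 9 and 49 bytes), matching the `--data` string above. The response follows the same framing: `0008size`, then `0032` followed by the object id, a space and the size `26859`, then the `0000` flush.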
* | | | | | | | Merge "PatchApplier fix - init cache with provided tree" | Han-Wen Nienhuys | 2023-02-02 | 4 | -21/+82
|\ \ \ \ \ \ \ \
| * | | | | | | | PatchApplier fix - init cache with provided tree | Nitzan Gur-Furman | 2023-02-02 | 4 | -21/+82
    This change only affects inCore repositories. Before this change, any file that wasn't part of the patch wasn't read, and therefore wasn't part of the output tree.

    Change-Id: I246ef957088f17aaf367143f7a0b3af0f8264ffb
    Bug: Google b/267270348
* | | | | | | | | Avoid error-prone warning | Han-Wen Nienhuys | 2023-02-01 | 3 | -10/+10
| |/ / / / / / /
|/| | | | | | |
    GC.gc() returns a Future, which should not be discarded.

    See also https://errorprone.info/bugpattern/FutureReturnValueIgnored

    Change-Id: I343cc3cfe74a564ad7f8d53f0fe9d96a23aaed00
* | | | | | | | Merge branch 'stable-6.4' | Matthias Sohn | 2023-02-01 | 3 | -0/+149
|\ \ \ \ \ \ \ \
| |/ / / / / / /
|/| / / / / / /
| |/ / / / / /
    * stable-6.4:
      Shortcut during git fetch for avoiding looping through all local refs
      FetchCommand: fix fetchSubmodules to work on a Ref to a blob
      Silence API warnings introduced by I466dcde6
      Allow the exclusions of refs prefixes from bitmap
      PackWriterBitmapPreparer: do not include annotated tags in bitmap
      BatchingProgressMonitor: avoid int overflow when computing percentage
      Speedup GC listing objects referenced from reflogs
      FileSnapshotTest: Add more MISSING_FILE coverage

    Change-Id: Id0ebfbd85eb815716383b9495eb7dd1f54cf4d74
| * | | | | | Merge branch 'stable-6.3' into stable-6.4 | Matthias Sohn | 2023-02-01 | 3 | -0/+149
| |\| | | | |
    * stable-6.3:
      Shortcut during git fetch for avoiding looping through all local refs
      FetchCommand: fix fetchSubmodules to work on a Ref to a blob
      Silence API warnings introduced by I466dcde6
      Allow the exclusions of refs prefixes from bitmap
      PackWriterBitmapPreparer: do not include annotated tags in bitmap
      BatchingProgressMonitor: avoid int overflow when computing percentage
      Speedup GC listing objects referenced from reflogs
      FileSnapshotTest: Add more MISSING_FILE coverage

    Change-Id: Iefcf5d832bd0087c1027876f2200689e1150abce
| | * | | | | Merge branch 'stable-6.2' into stable-6.3 | Matthias Sohn | 2023-02-01 | 3 | -0/+149
| | |\| | | |
    * stable-6.2:
      Shortcut during git fetch for avoiding looping through all local refs
      FetchCommand: fix fetchSubmodules to work on a Ref to a blob
      Silence API warnings introduced by I466dcde6
      Allow the exclusions of refs prefixes from bitmap
      PackWriterBitmapPreparer: do not include annotated tags in bitmap
      BatchingProgressMonitor: avoid int overflow when computing percentage
      Speedup GC listing objects referenced from reflogs
      FileSnapshotTest: Add more MISSING_FILE coverage

    Change-Id: I2ff386d9a096277360e6c7bd5535b49984620fb3
| | | * | | | Merge branch 'stable-6.1' into stable-6.2 | Matthias Sohn | 2023-02-01 | 3 | -0/+149
| | | |\| | |
    * stable-6.1:
      Shortcut during git fetch for avoiding looping through all local refs
      FetchCommand: fix fetchSubmodules to work on a Ref to a blob
      Silence API warnings introduced by I466dcde6
      Allow the exclusions of refs prefixes from bitmap
      PackWriterBitmapPreparer: do not include annotated tags in bitmap
      BatchingProgressMonitor: avoid int overflow when computing percentage
      Speedup GC listing objects referenced from reflogs
      FileSnapshotTest: Add more MISSING_FILE coverage

    Change-Id: Iff2fba026b49463016015b2fae1a42cf76ee2dbb
| | | | * | | Merge branch 'stable-6.0' into stable-6.1 | Matthias Sohn | 2023-02-01 | 3 | -0/+149
| | | | |\| |
    * stable-6.0:
      Shortcut during git fetch for avoiding looping through all local refs
      FetchCommand: fix fetchSubmodules to work on a Ref to a blob
      Silence API warnings introduced by I466dcde6
      Allow the exclusions of refs prefixes from bitmap
      PackWriterBitmapPreparer: do not include annotated tags in bitmap
      BatchingProgressMonitor: avoid int overflow when computing percentage
      Speedup GC listing objects referenced from reflogs
      FileSnapshotTest: Add more MISSING_FILE coverage

    Change-Id: Ib5055f2f3b8a313c178d6f6c7c5630285ad5a726
| | | | | * | Merge branch 'stable-5.13' into stable-6.0 | Matthias Sohn | 2023-02-01 | 3 | -0/+149
| | | | | |\|
    * stable-5.13:
      Shortcut during git fetch for avoiding looping through all local refs
      FetchCommand: fix fetchSubmodules to work on a Ref to a blob
      Silence API warnings introduced by I466dcde6
      Allow the exclusions of refs prefixes from bitmap
      PackWriterBitmapPreparer: do not include annotated tags in bitmap
      BatchingProgressMonitor: avoid int overflow when computing percentage
      Speedup GC listing objects referenced from reflogs
      FileSnapshotTest: Add more MISSING_FILE coverage

    Change-Id: I58ad4c210a5e7e5a1ba6b22315b04211c8909950
| | | | | | * Allow the exclusions of refs prefixes from bitmap | Luca Milanesio | 2023-01-31 | 1 | -0/+18
    When running a GC.repack() against a repository with over a thousand refs/heads and tens of millions of ObjectIds, the calculation of all bitmaps associated with all the refs would result in an unreasonably big file that would take up to several hours to compute.

    Test scenario: repo with 2500 heads / 10M obj, Intel Xeon E5-2680 2.5GHz
    Before this change: 20 mins
    After this change and 2300 heads excluded: 10 mins (90s for bitmap)

    Having such a large bitmap file is also slow in the runtime processing and has negligible or even negative benefits, because the time lost in reading and decompressing the bitmap in memory would not be compensated by the time saved by using it.

    It is key to preserve the bitmaps for those refs that are mostly used in clone/fetch and give the ability to exclude some refs prefixes that are known to be less frequently accessed, even though they may actually be actively written.

    Example: Gerrit sandbox branches may even be actively used and selected automatically because their commits are very recent; however, they may bloat the bitmap, making it ineffective.

    A mono-repo with tens of thousands of developers may have a relatively small number of active branches where the CI/CD jobs are continuously fetching/cloning the code. However, because Gerrit allows the use of sandbox branches, the total number of refs/heads may be even tens to hundreds of thousands.

    Change-Id: I466dcde69fa008e7f7785735c977f6e150e3b644
    Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
| | | | | | * PackWriterBitmapPreparer: do not include annotated tags in bitmap | Luca Milanesio | 2023-01-31 | 1 | -0/+34
    The annotated tags should be excluded from the bitmap associated with the heads-only packfile. However, this was not happening because the exclusion check was made on the peeled object instead of the objectId to be excluded from the bitmap.

    Sample use-case:

      refs/heads/main
         ^
         |
      commit1 <-- commit2 <- annotated-tag1 <- tag1
         ^
         |
      commit0

    When creating a bitmap for the above commit graph, before this change all the commits are included (3 bitmaps), which is incorrect, because all commits reachable from annotated tags should not be included.

    The heads-only bitmap should include only commit0 and commit1, but because PackWriterBitmapPreparer was checking for the peeled pointer of tag1 to be excluded (commit2), which was not found in the list of tags to exclude (annotated-tag1), commit2 was included even though it wasn't reachable from the head.

    Add an additional check for exclusion of the original objectId to allow the exclusion of annotated tags and the commits they point to. Add one specific test associated with an annotated tag to make sure that this use-case is also covered.

    Example repository benchmark for measuring the improvement:
    # refs: 400k (2k heads, 88k tags, 310k changes)
    # objects: 11M (88k of them are annotated tags)
    # packfiles: 2.7G

    Before this change: GC time: 5h, clone --bare time: 7 mins
    After this change: GC time: 20 mins, clone --bare time: 3 mins

    Bug: 581267
    Signed-off-by: Luca Milanesio <luca.milanesio@gmail.com>
    Change-Id: Iff2bfc6587153001837220189a120ead9ac649dc
| | | | | | * BatchingProgressMonitor: avoid int overflow when computing percentage | Matthias Sohn | 2023-01-31 | 1 | -0/+83
    When cloning huge repositories I observed percentage of object counts turning negative. This happened if lastWork * 100 exceeded Integer.MAX_VALUE.

    Change-Id: Ic5f5cf5a911a91338267aace4daba4b873ab3900
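The overflow described here is easy to reproduce in isolation. The sketch below is illustrative only: the class and method names are hypothetical, and widening to `long` is one common remedy, not necessarily the way the actual fix was written.

```java
/** Hypothetical sketch of the int overflow and one way to avoid it. */
final class PercentSketch {
    /** done * 100 is evaluated in int and wraps once it exceeds Integer.MAX_VALUE. */
    static int naivePercent(int done, int total) {
        return done * 100 / total;
    }

    /** 100L forces the multiplication into long, so the intermediate value cannot wrap. */
    static int safePercent(int done, int total) {
        return (int) (done * 100L / total);
    }

    public static void main(String[] args) {
        int done = 30_000_000;
        int total = 40_000_000;
        System.out.println("naive: " + naivePercent(done, total));
        System.out.println("safe:  " + safePercent(done, total));
    }
}
```

With 30 million of 40 million objects counted, the intermediate product is 3,000,000,000, which is above Integer.MAX_VALUE (2,147,483,647), so the naive version reports a negative percentage while the widened version reports 75.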
| | | | | | * FileSnapshotTest: Add more MISSING_FILE coverage | Nasser Grainawi | 2023-01-06 | 1 | -0/+14
    Add a couple of tests that confirm what the docs say about isModified() and equals(MISSING_FILE) behavior.

    Change-Id: I6093040ba3594934c3270331405a44b2634b97c5
    Signed-off-by: Nasser Grainawi <quic_nasserg@quicinc.com>
| * | | | | | Prepare 6.4.1-SNAPSHOT builds | Matthias Sohn | 2022-11-30 | 2 | -58/+58
    Change-Id: I860bfde113c05015c41304c4a77c44c224bd0923
| * | | | | | JGit v6.4.0.202211300538-r (tag: v6.4.0.202211300538-r) | Matthias Sohn | 2022-11-30 | 2 | -2/+2
    Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
    Change-Id: If4001b255a209849b4acabd2083164d0794f00c4
| * | | | | | Fix crashes on rare combination of file names | Dmitrii Filippov | 2022-11-29 | 2 | -0/+197
    The NameConflictTreeWalk class is used in merge for iterating over entries in commits. The class uses a separate iterator for each commit's tree. In rare cases it can incorrectly report the same entry twice. As a result, duplicated entries are added to the merge result and later jgit throws an exception when it tries to process the merge result.

    The problem appears only when there is a directory-file conflict for the last item in trees. Example from the bug:

    Commit 1:
    * subtree - file
    * subtree-0 - file
    Commit 2:
    * subtree - directory
    * subtree-0 - file

    Here the names are ordered like this: "subtree" file < "subtree-0" file < "subtree" directory.

    The NameConflictTreeWalk handles similar cases correctly if there are other files after subtree... in commits - this is processed in the AbstractTreeIterator.min function. Existing code has a special optimization for the case when all trees point to the same entry name: it skips additional checks. However, this optimization incorrectly skips checks if one of the trees has reached the end.

    The fix handles the situation where some trees have reached the end while others still point to an entry.

    bug: 535919
    Change-Id: I62fde3dd89779fac282479c093400448b4ac5c86
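The ordering quoted in this message ("subtree" file, then "subtree-0" file, then "subtree" directory) follows from the way Git sorts tree entries: names are compared bytewise, and a directory sorts as if its name ended in '/'. The sketch below reproduces that ordering under this assumption; `TreeOrderSketch` is a hypothetical helper, not the comparison code used by NameConflictTreeWalk.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of Git's tree entry ordering; not JGit's implementation. */
final class TreeOrderSketch {
    /** Negative if entry a sorts before entry b, positive if after, zero if equal. */
    static int compare(String a, boolean aIsDir, String b, boolean bIsDir) {
        byte[] x = sortKey(a, aIsDir);
        byte[] y = sortKey(b, bIsDir);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Compare as unsigned bytes, as Git does.
            int c = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return x.length - y.length;
    }

    /** Directories are compared as if their name had a trailing '/'. */
    private static byte[] sortKey(String name, boolean isDir) {
        return (isDir ? name + "/" : name).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(compare("subtree", false, "subtree-0", false) < 0);
        System.out.println(compare("subtree-0", false, "subtree", true) < 0);
    }
}
```

Because '-' (0x2d) sorts before '/' (0x2f), the file "subtree-0" lands between the file "subtree" and the directory "subtree". An iterator positioned on the file and another positioned on the directory of the same name are therefore not adjacent in sort order, which is the situation the walk has to detect.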
| * | | | | | Prepare 6.4.0-SNAPSHOT build | Matthias Sohn | 2022-11-23 | 2 | -2/+2
    Change-Id: I41c4f73472bb47d8f9d2d117d17e11bba4802928
| * | | | | | JGit v6.4.0.202211231055-rc1 (tag: v6.4.0.202211231055-rc1) | Matthias Sohn | 2022-11-23 | 2 | -2/+2
    Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
    Change-Id: Ia34696d07568b298544ee2cdc6f4b6746774bb82
* | | | | | | RevWalk: integrate commit-graph with commit parsing | kylezhao | 2023-01-10 | 1 | -0/+319
    RevWalk#createCommit() will inspect the commit-graph file to find the specified object's graph position and then return a new RevCommitCG instance.

    RevCommitCG is a RevCommit with an additional "pointer" (the position) to the commit-graph, so it can load the headers and metadata from there instead of the pack. This saves IO access in walks where the body is not needed (i.e. #isRetainBody is false and #parseBody is not invoked).

    RevWalk automatically uses the commit-graph if available; no action is needed from callers. The commit-graph is fetched on first access from the reader (that internally can keep it loaded and reuse it between walks).

    The startup cost of reading the entire commit graph is small. After testing, reading a commit-graph with 1 million commits takes less than 50ms. If we use RepositoryCache, it will not be initialized until the commit-graph is rewritten.

    Bug: 574368
    Change-Id: I90d0f64af24f3acc3eae6da984eae302d338f5ee
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | GC: disable writing commit-graph for shallow repos | kylezhao | 2023-01-06 | 1 | -0/+19
    In shallow repos, GC writes to the commit-graph that shallow commits do not have parents. This won't be true after a "git fetch --unshallow" (and before another GC).

    Do not write the commit-graph from shallow clones of a repo. The commit-graph must have the real metadata of commits and that is not available in a shallow view of the repo.

    Change-Id: Ic9f2358ddaa607c74f4dbf289c9bf2a2f0af9ce0
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | Add TernarySearchTree | Matthias Sohn | 2023-01-04 | 2 | -0/+331
    A ternary search tree is a type of tree where nodes are arranged in a manner similar to a binary search tree, but with up to three children rather than the binary tree's limit of two.

    Each node of a ternary search tree stores a single character, a reference to a value object and references to its three children named equal kid, lo kid and hi kid. The lo kid pointer must point to a node whose character value is less than the current node. The hi kid pointer must point to a node whose character is greater than the current node.[1] The equal kid points to the next character in the word. Each node in a ternary search tree represents a prefix of the stored strings. All strings in the middle subtree of a node start with that prefix.

    Like other prefix trees, a ternary search tree can be used as an associative map with the ability for incremental string search. Ternary search trees are more space efficient compared to standard prefix trees, at the cost of speed.

    They allow efficient prefix search which is important to implement searching refs by prefix in a RefDatabase.

    Searching by prefix returns all keys if the prefix is an empty string.

    Bug: 576165
    Change-Id: If160df70151a8e1c1bd6716ee4968e4c45b2c7ac
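The node layout described in this message (one character per node, with lo, equal and hi children) can be sketched in a few dozen lines. This is a minimal illustration, not JGit's TernarySearchTree: the class `TstSketch`, its method names, and the choice to treat a null value as "no key ends here" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal ternary search tree; not JGit's TernarySearchTree. */
final class TstSketch<V> {
    private static final class Node<V> {
        final char c;
        Node<V> lo, eq, hi;
        V value; // non-null only if a key ends at this node

        Node(char c) {
            this.c = c;
        }
    }

    private Node<V> root;

    void put(String key, V value) {
        if (key.isEmpty() || value == null) {
            throw new IllegalArgumentException("key must be non-empty, value non-null");
        }
        root = put(root, key, 0, value);
    }

    private Node<V> put(Node<V> n, String key, int i, V value) {
        char c = key.charAt(i);
        if (n == null) {
            n = new Node<>(c);
        }
        if (c < n.c) {
            n.lo = put(n.lo, key, i, value);
        } else if (c > n.c) {
            n.hi = put(n.hi, key, i, value);
        } else if (i < key.length() - 1) {
            n.eq = put(n.eq, key, i + 1, value);
        } else {
            n.value = value;
        }
        return n;
    }

    V get(String key) {
        Node<V> n = key.isEmpty() ? null : find(key);
        return n == null ? null : n.value;
    }

    /** Returns the node matching the last character of s, or null if s is not a stored prefix. */
    private Node<V> find(String s) {
        Node<V> n = root;
        int i = 0;
        while (n != null) {
            char c = s.charAt(i);
            if (c < n.c) {
                n = n.lo;
            } else if (c > n.c) {
                n = n.hi;
            } else if (i < s.length() - 1) {
                n = n.eq;
                i++;
            } else {
                return n;
            }
        }
        return null;
    }

    /** All keys starting with prefix, in character order; an empty prefix returns every key. */
    List<String> keysWithPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        if (prefix.isEmpty()) {
            collect(root, new StringBuilder(), out);
            return out;
        }
        Node<V> n = find(prefix);
        if (n != null) {
            if (n.value != null) {
                out.add(prefix);
            }
            collect(n.eq, new StringBuilder(prefix), out);
        }
        return out;
    }

    private void collect(Node<V> n, StringBuilder path, List<String> out) {
        if (n == null) {
            return;
        }
        collect(n.lo, path, out);
        path.append(n.c);
        if (n.value != null) {
            out.add(path.toString());
        }
        collect(n.eq, path, out);
        path.setLength(path.length() - 1);
        collect(n.hi, path, out);
    }
}
```

For ref names, a prefix lookup such as `keysWithPrefix("refs/heads/")` only descends into the middle subtree below the prefix's last character, so unrelated namespaces such as refs/tags/ are never visited.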
* | | | | | | CommitGraph: teach ObjectReader to get commit-graph | kylezhao | 2023-01-04 | 1 | -0/+20
    FileRepository's ObjectReader#getCommitGraph will return commit-graph when it exists and core.commitGraph is true.

    DfsRepository is not supported currently.

    Change-Id: I992d43d104cf542797e6949470e95e56de025107
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | Merge "CommitGraph: add commit-graph for FileObjectDatabase" | Ivan Frade | 2023-01-03 | 1 | -0/+45
|\ \ \ \ \ \ \
| * | | | | | | CommitGraph: add commit-graph for FileObjectDatabase | kylezhao | 2022-12-23 | 1 | -0/+45
    This change lets JGit read the .git/objects/info/commit-graph file and obtain a CommitGraph from it.

    Loading a new commit-graph into memory requires additional time. After testing, loading a copy of Linux's commit-graph (1039139 commits) takes under 50ms.

    Bug: 574368
    Change-Id: Iadfdd6ed437945d3cdfdbe988cf541198140a8bf
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | | PatchApplier: fix handling of last newline in text patch | Thomas Wolf | 2022-12-26 | 37 | -1/+1333
    If the last line came from the patch, use the patch to determine whether or not there should be a trailing newline. Otherwise use the old text.

    Add test cases for
    - no newline at end, last line not in patch hunk
    - no newline at end, last line in patch hunk
    - patch removing the last newline
    - patch adding a newline at the end of file not having one
    all for core.autocrlf false, true, and input.

    Add a test case where the "no newline" indicator line is not the last line of the last hunk. This can happen if the patch ends with removals at the file end.

    Bug: 581234
    Change-Id: I09d079b51479b89400ad300d0662c1dcb50deab6
    Also-by: Yuriy Mitrofanov <a2terminator@mail.ru>
    Signed-off-by: Thomas Wolf <twolf@apache.org>
* | | | | | | | Reformat PatchApplier and PatchApplierTest | Thomas Wolf | 2022-12-22 | 1 | -28/+38
|/ / / / / / /
    Some lines were too long, there were unnecessary fully qualified class names, and an assertEquals(actual, expected) that should have been assertEquals(expected, actual).

    Change-Id: I3b3c46c963afe2fb82a79c1e93970e73778877e5
    Signed-off-by: Thomas Wolf <twolf@apache.org>
* | | | | | | IO#readFully: provide overload that fills the full array | Anna Papitto | 2022-12-19 | 1 | -2/+2
    IO#readFully is often called with the intent to fill the destination array from beginning to end. The redundant arguments for where to start and stop filling are opportunities for bugs if specified incorrectly or if not changed to match a changed array length.

    Provide an overloaded method for filling the full destination array.

    Change-Id: I964f18f4a061189cce1ca00ff0258669277ff499
    Signed-off-by: Anna Papitto <annapapitto@google.com>
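The idea of the overload can be shown with a stand-alone sketch. The class `ReadFullySketch` below is a hypothetical stand-in written against java.io only; it mirrors the shape described in the message (an offset/length variant plus a whole-array convenience overload) but is not JGit's IO class.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical stand-in for the two readFully variants; not JGit's IO class. */
final class ReadFullySketch {
    /** Fills dst[off .. off+len) or throws EOFException if the stream ends first. */
    static void readFully(InputStream in, byte[] dst, int off, int len) throws IOException {
        while (len > 0) {
            int n = in.read(dst, off, len);
            if (n < 0) {
                throw new EOFException("stream ended with " + len + " bytes still expected");
            }
            off += n;
            len -= n;
        }
    }

    /** Convenience overload: the bounds are derived from the array, so they cannot drift. */
    static void readFully(InputStream in, byte[] dst) throws IOException {
        readFully(in, dst, 0, dst.length);
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[4];
        readFully(new ByteArrayInputStream(new byte[] { 1, 2, 3, 4, 5 }), header);
        System.out.println(Arrays.toString(header));
    }
}
```

If the size of `header` changes later, a call through the one-argument overload needs no update, which is the class of bug the commit message is concerned with.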
* | | | | | | GC: Write commit-graph files when gc | kylezhao | 2022-12-16 | 1 | -0/+154
    If 'core.commitGraph' and 'gc.writeCommitGraph' are both true, then gc will rewrite the commit-graph file when 'git gc' is run. Defaults to false while the commit-graph feature matures.

    Bug: 574368
    Change-Id: Ic94cd69034c524285c938414610f2e152198e06e
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | CommitGraph: add core.commitGraph config | kylezhao | 2022-12-16 | 1 | -0/+14
    Change-Id: I3b5e735ebafba09ca18fd83da479c7950fa3ea8d
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | Merge "Gc#deleteOrphans: avoid dependence on PackExt alphabetical ordering" | Ivan Frade | 2022-12-16 | 1 | -2/+33
|\ \ \ \ \ \ \
| * | | | | | | Gc#deleteOrphans: avoid dependence on PackExt alphabetical ordering | Anna Papitto | 2022-12-15 | 1 | -2/+33
    Deleting orphan files depends on .pack and .keep being reverse-sorted to come before the corresponding index files that could be orphans. The new reverse index file extension (.rev) will break that frail dependency.

    Rewrite Gc#deleteOrphans to avoid that dependence by tracking which pack names have a .pack or .keep file and then deleting any index files that lack a corresponding one. This approach takes linear time instead of the O(n log n) time needed for sorting.

    Change-Id: If83c378ea070b8871d4b01ae008e7bf8270de763
    Signed-off-by: Anna Papitto <annapapitto@google.com>
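The two-pass approach described in this message can be sketched as follows. The class `OrphanSketch`, its method names and the fixed list of index extensions are assumptions for illustration; the sketch works on plain file names and only reports orphans, whereas the real Gc#deleteOrphans operates on the pack directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the linear-time orphan detection; not JGit's Gc#deleteOrphans. */
final class OrphanSketch {
    // Assumed set of index-like extensions that depend on a pack.
    private static final List<String> INDEX_EXTS = Arrays.asList(".idx", ".bitmap", ".rev");

    /** Returns index file names whose base name has neither a .pack nor a .keep file. */
    static List<String> findOrphans(List<String> fileNames) {
        // Pass 1: remember every base name that still has a pack or a keep file.
        Set<String> alive = new HashSet<>();
        for (String f : fileNames) {
            if (f.endsWith(".pack") || f.endsWith(".keep")) {
                alive.add(baseName(f));
            }
        }
        // Pass 2: any index file without such a sibling is an orphan.
        List<String> orphans = new ArrayList<>();
        for (String f : fileNames) {
            if (isIndex(f) && !alive.contains(baseName(f))) {
                orphans.add(f);
            }
        }
        return orphans;
    }

    private static boolean isIndex(String f) {
        for (String ext : INDEX_EXTS) {
            if (f.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    private static String baseName(String f) {
        int dot = f.lastIndexOf('.');
        return dot < 0 ? f : f.substring(0, dot);
    }
}
```

Neither pass depends on the order of the input, so adding a new extension such as .rev cannot break the logic the way the sort-based version could.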
* | | | | | | | CommitGraph: implement commit-graph read | kylezhao | 2022-12-16 | 4 | -0/+412
|/ / / / / / /
    Git introduced a new file storing the topology and some metadata of the commits in the repo (commitGraph). With this data, git can browse commit history without parsing the pack, speeding up e.g. reachability checks.

    This change teaches JGit to read a commit-graph-format file, following the upstream format ([1]).

    JGit can read a commit-graph file from a buffered stream, which means that we can provide this feature for both FileRepository and DfsRepository.

    [1] https://git-scm.com/docs/commit-graph-format/2.21.0

    Bug: 574368
    Change-Id: Ib5c0d6678cb242870a0f5841bd413ad3885e95f6
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | commitgraph package: fix exports/imports, add @since tag for new API | Matthias Sohn | 2022-12-08 | 1 | -0/+1
    Change-Id: I9175b1d796f91f5ba4e21d3418550ae451c054b0
* | | | | | | CommitGraph: implement commit-graph writer | kylezhao | 2022-12-06 | 2 | -0/+232
    Teach JGit to write a commit-graph formatted file by walking commit graph from specified commit objects.

    See: https://git-scm.com/docs/commit-graph-format/2.21.0

    Bug: 574368
    Change-Id: I34f9f28f8729080c275f86215ebf30b2d05af41d
    Signed-off-by: kylezhao <kylezhao@tencent.com>
* | | | | | | Merge "Fix crashes on rare combination of file names" | Han-Wen Nienhuys | 2022-11-28 | 2 | -0/+197
|\ \ \ \ \ \ \