Commit message  Author  Age  Files  Lines
* JGit 3.0: move internal classes into an internal subpackage  Shawn Pearce  2013-03-18  255  -664/+717
|   This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change.
|   Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9
* Merge changes I2645d482,Ic81fefb1,Id64ab38d  Shawn Pearce  2013-03-18  13  -326/+22
|\
| |   * changes:
| |     Remove cached_packs support in favor of bitmaps
| |     Remove objects before optimization from DfsGarbageCollector
| |     Simplify caching of DfsPackDescription from PackWriter.Statistics
| * Remove cached_packs support in favor of bitmaps  Shawn Pearce  2013-03-14  12  -299/+5
| |   The bitmap code in PackWriter knows exactly when to use a pack as a "cached pack". It enables cached pack usage only when the pack has a bitmap and its entire closure of objects needs to be sent. This is a much simpler code path to maintain, and JGit actually has a way to write the necessary index.
| |   Change-Id: I2645d482f8733fdf0c4120cc59ba9aa4d4ba6881
| * Remove objects before optimization from DfsGarbageCollector  Shawn Pearce  2013-03-14  1  -18/+7
| |   Just counting objects is not sufficient. There are some race conditions with receive packs and delta base completion that may confuse such a simple algorithm. Instead always do the larger set computations, and rely on the PackWriter having no objects pending as the way to avoid creating an empty pack file.
| |   Change-Id: Ic81fefb158ed6ef8d6522062f2be0338a49f6bc4
| * Simplify caching of DfsPackDescription from PackWriter.Statistics  Shawn Pearce  2013-03-14  3  -9/+10
| |   Let the pack description copy the relevant stats values. This moves it out of the garbage collector and compactor algorithms, co-locating with something that might care. Remove some unnecessary code from the DfsPackCompactor; the stats track the same information and can supply it.
| |   Change-Id: Id64ab38d507c0ed19ae0d106862d175b7364eba3
* | Use RawParseUtils.prevLF in RebaseCommand  Robin Stocker  2013-03-16  1  -4/+2
| |   As noticed by Robin Rosenberg in review of I4eb87c850078ca187b38b81cc91c92afb1176945.
| |   Change-Id: If96d66b6c025ad8f2f47829c933f3c65ab6cbeef
| |   Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Support aborting non-interactive rebase started from C Git  Robin Stocker  2013-03-16  2  -54/+120
| |   Continuing is trickier, as .git/rebase-apply contains no message file and no git-rebase-todo.
| |   Bug: 336820
| |   Change-Id: I4eb87c850078ca187b38b81cc91c92afb1176945
| |   Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | NameRevCommand: Don't use merge cost for first parent  Dave Borowitz  2013-03-15  2  -18/+22
| |   Treat first parent traversals as 1 and higher parents as MERGE_COST, to match git name-rev. Allow overriding the merge cost during tests to avoid creating 2^16 commits on the fly.
| |   Change-Id: I0175e0c3ab1abe6722e4241abe2f106d1fe92a69
* | Merge "A folder does not constitute a dirty work tree"  Robin Rosenberg  2013-03-15  3  -2/+83
|\ \
| * | A folder does not constitute a dirty work tree  Robin Rosenberg  2013-03-10  3  -2/+83
| | |   This fixes two cases:
| | |   - A folder without tracked content exists both in the workdir and in the merged commit, as long as the names within that folder do not conflict.
| | |   - An empty folder structure exists with the same name as a file in the merged commit.
| | |   Bug: 402834
| | |   Change-Id: I4c5b9f11313dd1665fcbdae2d0755fdb64deb3ef
* | | Merge "Add toString() for PackConfig"  Christian Halstrick  2013-03-15  1  -0/+18
|\ \ \
| * | | Add toString() for PackConfig  Edwin Kempin  2013-03-15  1  -0/+18
| | |/
| |/|
| | |   This is helpful for writing the pack configuration into a log file.
| | |   Change-Id: I5e7f5ff7e01c9538ca12a1860844ba9b467bdf05
| | |   Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
* / | Add toString() for RepoStatistics  Edwin Kempin  2013-03-15  1  -0/+12
|/ /
| |   This is helpful for writing the repository statistics into a log file.
| |   Change-Id: I0e8cd9ad05f123ab3851960890a50213f353a373
| |   Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
* | NameRevCommand: Use ~ notation for first parents of merges  Dave Borowitz  2013-03-14  2  -4/+17
| |   Prefer ~(N+1) to ^1~N. Although both are correct, the former is cleaner and matches "git name-rev".
| |   Change-Id: I772001a219e5eb346f5552c92e6d98c70b2cfa98
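The equivalence of ^1~N and ~(N+1) described in this commit can be illustrated with a small formatter. This is only a sketch in plain Java, not the NameRevCommand code; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of JGit: formats a revision suffix the way
// the commit message describes, preferring ~(N+1) over ^1~N.
final class NameRevSuffixSketch {
    /**
     * Names the commit reached from `base` by stepping to parent number
     * `parent` (1-based) and then following `firstParentSteps` further
     * first-parent links.
     */
    static String describe(String base, int parent, int firstParentSteps) {
        if (parent == 1) {
            // ^1~N and ~(N+1) name the same commit; emit the shorter form.
            return base + "~" + (firstParentSteps + 1);
        }
        String name = base + "^" + parent;
        return firstParentSteps == 0 ? name : name + "~" + firstParentSteps;
    }

    public static void main(String[] args) {
        System.out.println(describe("master", 1, 2)); // master~3
        System.out.println(describe("master", 2, 3)); // master^2~3
    }
}
```

Only the first-parent case is collapsed; a step through a second or later parent still needs the ^ form.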
* | Allow adding single refs or all tags to NameRevCommand  Dave Borowitz  2013-03-13  2  -21/+94
| |   Change-Id: I90e85bc835d11278631afd0e801425a292578bba
* | Merge "Cluster UNREACHABLE_GARBAGE packs at the end of the search list"  Shawn Pearce  2013-03-12  2  -5/+20
|\ \
| * | Cluster UNREACHABLE_GARBAGE packs at the end of the search list  Shawn Pearce  2013-03-08  2  -5/+20
| | |   Garbage is unlikely to be used by a reader. Ensure garbage packs always cluster at the end of the search list, no matter what timestamp was used on the pack files.
| | |   Change-Id: I3bed89e9569ee3363c36bb3f73fcd34057a3883f
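The ordering rule in this commit can be sketched as a comparator that looks at the garbage flag before the timestamp. The Pack class, its fields, and the newest-first tie-break are assumptions for illustration; the commit message does not show the real DFS types or the order used among non-garbage packs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a pack description; not the JGit DFS classes.
final class PackSearchOrderSketch {
    static final class Pack {
        final String name;
        final boolean garbage;   // true for an UNREACHABLE_GARBAGE pack
        final long lastModified; // timestamp recorded on the pack file

        Pack(String name, boolean garbage, long lastModified) {
            this.name = name;
            this.garbage = garbage;
            this.lastModified = lastModified;
        }
    }

    // Garbage packs sort last regardless of timestamp. Among packs of the
    // same kind, newer packs come first (an assumption of this sketch).
    static final Comparator<Pack> SEARCH_ORDER = (a, b) -> {
        if (a.garbage != b.garbage) {
            return a.garbage ? 1 : -1;
        }
        return Long.compare(b.lastModified, a.lastModified);
    };

    /** Returns the pack names in the order a reader would search them. */
    static List<String> searchOrder(List<Pack> packs) {
        List<Pack> sorted = new ArrayList<>(packs);
        sorted.sort(SEARCH_ORDER);
        List<String> names = new ArrayList<>();
        for (Pack p : sorted) {
            names.add(p.name);
        }
        return names;
    }
}
```

With this ordering a garbage pack carrying the newest timestamp is still searched after every other pack.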
* | | Merge "Avoid repacking unreachable garbage in DfsGarbageCollector"  Shawn Pearce  2013-03-12  1  -5/+52
|\ \ \
| * | | Avoid repacking unreachable garbage in DfsGarbageCollector  Shawn Pearce  2013-03-08  1  -5/+52
| |/ /
| | |   If a repository has significant amounts of unreachable garbage, the final phase to coalesce it can take longer than any other part of the garbage collection phase. Provide a setting for applications to tweak the threshold where coalescing ends and files just remain on disk.
| | |   Change-Id: I5f11a998a7185c75ece3271d8bc6181bb83f54c1
* | | Merge changes Icd550359,If7aad533  Shawn Pearce  2013-03-12  4  -87/+99
|\ \ \
| | | |   * changes:
| | | |     Avoid looking at UNREACHABLE_GARBAGE for client have lines
| | | |     Simplify UploadPack by parsing wants separately from haves
| * | | Avoid looking at UNREACHABLE_GARBAGE for client have lines  Shawn Pearce  2013-03-08  4  -4/+39
| | | |   Clients send a bunch of unknown objects to UploadPack on each round of negotiation. Many of these are not known to the server, which leads the implementation to look at indexes for garbage packs. Disable examining the index of a garbage pack, allowing servers to avoid reading them from disk during negotiation.
| | | |   The effect of this change is the server will only ACK a have line if the object was reachable during the last garbage collection, or was recently added to the repository. For most repositories this behavior change has no impact.
| | | |   If a repository rewinds a branch, runs GC, and then resets the branch back to where it was before, the now current tip is going to be skipped by this change. A client that has the commit may wind up getting a slightly larger data transfer from the server as an older common ancestor will be chosen during negotiation. This is fixable on the server side by running GC again to correct the layout of objects in pack files.
| | | |   Change-Id: Icd550359ef70fc7b701980f9b13d923fd13c744b
| * | | Simplify UploadPack by parsing wants separately from haves  Shawn Pearce  2013-03-08  1  -83/+60
| |/ /
| | |   The DHT backend was very slow at parsing objects. To work around that performance limitation I obfuscated UploadPack by folding both the want and have sets together in a single parse queue. Since DHT was removed the complexity is no longer constructive to JGit.
| | |   Doing this refactoring prepares the code for a near-future change where the have lines need to be handled differently from the want lines. Splitting the parsing up into two phases makes such a modification trivial.
| | |   Change-Id: If7aad533b82448bbb688278e21f709282e5ccf4b
* | | Merge "Add a NameRevCommand for describing IDs in terms of refnames"  Shawn Pearce  2013-03-11  3  -0/+511
|\ \ \
| |_|/
|/| |
| * | Add a NameRevCommand for describing IDs in terms of refnames  Dave Borowitz  2013-03-11  3  -0/+511
| | |   The walk logic does not use RevWalk because it needs to walk all paths to each of the requested commits, keeping track of each path along which the commit was found in the RevCommit subclass. From these paths, a single "best" path is chosen based on the total path length, with a penalty applied for paths that traverse merges.
| | |   This functionality parallels "git name-rev".
| | |   Change-Id: I92bfb47dd16c898313d2ee525395609c3bf72ebe
* | | Add isRebasing to RepositoryState  Robin Stocker  2013-03-09  1  -0/+48
| | |   See EGit change Ic69f5c952a49f023c0949f04b3e976be1b267fbe where this could be used.
| | |   Change-Id: I9ec8568fa1100d2e9c8d4ca0e347bf77ec6d8734
* | | Include the number of ms in timeout error message  Robin Stocker  2013-03-08  3  -8/+10
| |/
|/|
| |   Noticed that while analyzing bug 402131.
| |   Change-Id: If3fd40b64d5088c4579946271a67346cbd9e6556
* | Do not cherry-pick merge commits during rebase  Robin Rosenberg  2013-03-08  2  -37/+52
| |   Rebase computes the list of commits that are included in the merges, just like Git does, so do not try to include the merge commits. Recreating merges during rebase is a bit more complicated and might be a useful future extension, but for now just linearize during rebase.
| |   Change-Id: I61239d265f395e5ead580df2528e46393dc6bdbd
| |   Signed-off-by: Robin Stocker <robin@nibor.org>
* | Extend FileUtils.delete with option to delete empty directories only  Robin Rosenberg  2013-03-08  2  -1/+70
|/
|   The new option EMPTY_DIRECTORIES_ONLY will make delete() only delete empty directories. Any attempt to delete files will fail. Can be combined with RECURSIVE to wipe out entire tree structures and IGNORE_ERRORS to silently ignore any files or non-empty directories.
|   Change-Id: Icaa9a30e5302ee5c0ba23daad11c7b93e26b7445
|   Signed-off-by: Robin Stocker <robin@nibor.org>
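The combined effect of EMPTY_DIRECTORIES_ONLY with RECURSIVE and IGNORE_ERRORS, as the message describes it, can be approximated with java.nio.file alone. The sketch below does not call FileUtils.delete and is not its implementation; the class and method names are hypothetical, and it only models "remove empty directory trees, leave files and their parent directories alone".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Stdlib-only approximation of the described behaviour; not JGit code.
final class EmptyDirDeleteSketch {
    /**
     * Removes `dir` and its subdirectories bottom-up, but only where they
     * contain no files. Returns true if `dir` itself was removed.
     */
    static boolean deleteEmptyDirs(Path dir) throws IOException {
        if (!Files.isDirectory(dir, LinkOption.NOFOLLOW_LINKS)) {
            return false; // a file (or symlink) is never deleted here
        }
        List<Path> children = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }
        boolean empty = true;
        for (Path child : children) {
            if (!deleteEmptyDirs(child)) {
                empty = false; // something survived below, keep this level
            }
        }
        if (empty) {
            Files.delete(dir);
        }
        return empty;
    }

    /** Builds a scratch tree and reports whether only empty directories went away. */
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("empty-dirs");
            Files.createDirectories(root.resolve("a/b/c"));
            Path kept = Files.createDirectories(root.resolve("x"));
            Path file = Files.createFile(kept.resolve("tracked.txt"));
            boolean rootRemoved = deleteEmptyDirs(root);
            boolean ok = !rootRemoved
                    && !Files.exists(root.resolve("a"))
                    && Files.exists(file);
            // Clean up what the sketch deliberately left behind.
            Files.delete(file);
            Files.delete(kept);
            Files.delete(root);
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the demo the empty chain a/b/c is removed while x and its file survive, so the root directory is kept as well.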
* Add javaewah bundle to features using it  Matthias Sohn  2013-03-07  2  -0/+16
|   This ensures that OSGi consumers can retrieve this dependency from the JGit or EGit p2 repository.
|   Change-Id: I6f88a4914a19e4e18aa60d59b0cc8a33b61f7fc2
|   Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Do not attempt to read bitmap from invalid pack  Shawn Pearce  2013-03-06  2  -0/+4
|   If a pack file has been marked invalid due to a prior IOException accessing its contents, do not offer its bitmap index to callers. The pack cannot be used so its bitmap should be off limits from any reader trying to work from a bitmap.
|   Change-Id: Ia44e46558abdddee560bb184158b1e0af9437eee
* Rename DfsPackFile getBitmap method to match PackFile  Shawn Pearce  2013-03-06  2  -3/+3
|   There is no reason for these to differ in name. Match the shorter name used by PackFile.
|   Change-Id: I2d3a299069acc5ce276b1b5439ff2258903c6ff3
* Write the bitmap index correctly in DFS GC.  Colby Ranger  2013-03-06  1  -1/+1
|   A bug caused the .bitmap to actually have the .idx contents.
|   Change-Id: I428bb27d419e8b1b69b6f3e2fd07cd29703669ad
* Enable writing bitmaps during GC by default.  Colby Ranger  2013-03-05  1  -1/+1
|   Bitmaps provide a huge performance boost for counting objects and they play nice with the cgit implementation.
|   Change-Id: I33b05a6c8f1ee2df7770f0b9fdc50d0b4bbf1029
* Enable writing pack indexes with bitmaps in the GC.  Colby Ranger  2013-03-05  6  -31/+114
|   Update the dfs and file GC implementations to prepare and write bitmaps on the packs that contain the full closure of the object graph. Update the DfsPackDescription to include the index version.
|   Change-Id: I3f1421e9cd90fe93e7e2ef2b8179ae2f1ba819ed
* Enable serving upload requests using bitmaps.  Colby Ranger  2013-03-05  2  -0/+2
|   If the pack index has bitmaps, allow the PackWriter to use the bitmaps for upload requests.
|   Change-Id: Iefa995fe927a11e4fd78afb34530995614221fc0
* Support creating pack bitmap indexes in PackWriter.  Colby Ranger  2013-03-05  14  -26/+1078
|   Update the PackWriter to support writing out pack bitmap indexes, a parallel ".bitmap" file to the ".pack" file. Bitmaps are selected at commits every 1 to 5,000 commits for each unique path from the start. The most recent 100 commits are all bitmapped. The next 19,000 commits have a bitmap every 100 commits. The remaining commits have a bitmap every 5,000 commits. Commits with more than 1 parent are preferred over ones with 1 or fewer. Furthermore, previously computed bitmaps are reused if the previous entry had the reuse flag set, which is set when the bitmap was placed at the max allowed distance.
|   Bitmaps are used to speed up the counting phase when packing, for requests that are not shallow. The PackWriterBitmapWalker uses a RevFilter to proactively mark commits with RevFlag.SEEN when they appear in a bitmap. The walker produces the full closure of reachable ObjectIds, given the collection of starting ObjectIds.
|   For fetch requests, two ObjectWalks are executed to compute the ObjectIds reachable from the haves and from the wants. The ObjectIds needed to be written are determined by taking all the resulting wants AND NOT the haves.
|   For clone requests, we get cached pack support for "free" since it is possible to determine if all of the ObjectIds in a pack file are included in the resulting list of ObjectIds to write.
|   On my machine, the best times for clones and fetches of the linux kernel repository (with about 2.6M objects and 300K commits) are tabulated below:
|
|   Operation                     Index V2               Index VE003
|   Clone                         37530ms (524.06 MiB)   82ms (524.06 MiB)
|   Fetch (1 commit back)         75ms                   107ms
|   Fetch (10 commits back)       456ms (269.51 KiB)     341ms (265.19 KiB)
|   Fetch (100 commits back)      449ms (269.91 KiB)     337ms (267.28 KiB)
|   Fetch (1000 commits back)     2229ms ( 14.75 MiB)    189ms ( 14.42 MiB)
|   Fetch (10000 commits back)    2177ms ( 16.30 MiB)    254ms ( 15.88 MiB)
|   Fetch (100000 commits back)   14340ms (185.83 MiB)   1655ms (189.39 MiB)
|
|   Change-Id: Icdb0cdd66ff168917fb9ef17b96093990cc6a98d
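The "wants AND NOT haves" step and the cached-pack check from this message can be illustrated with java.util.BitSet. This is a sketch under assumptions: JGit uses compressed javaewah bitmaps rather than BitSet, and the class and method names here are hypothetical. Each bit position stands for one object in the pack.

```java
import java.util.BitSet;

// Illustration only: plain BitSet in place of the compressed bitmaps.
final class FetchBitmapSketch {
    /** Convenience constructor for a bitmap with the given positions set. */
    static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    /** Objects to write = reachable from the wants AND NOT reachable from the haves. */
    static BitSet objectsToSend(BitSet reachableFromWants, BitSet reachableFromHaves) {
        BitSet toSend = (BitSet) reachableFromWants.clone();
        toSend.andNot(reachableFromHaves);
        return toSend;
    }

    /** True when every object of the pack is in the set to be written. */
    static boolean packFullyIncluded(BitSet packObjects, BitSet toSend) {
        BitSet missing = (BitSet) packObjects.clone();
        missing.andNot(toSend);
        return missing.isEmpty();
    }

    public static void main(String[] args) {
        BitSet toSend = objectsToSend(bits(0, 1, 2, 3, 4), bits(0, 1));
        System.out.println(toSend); // {2, 3, 4}
        System.out.println(packFullyIncluded(bits(2, 3), toSend)); // true
    }
}
```

For a clone the haves bitmap is empty, so a pack holding the full closure passes the inclusion check, which is the "free" cached pack case the message mentions.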
* Added read/write support for pack bitmap index.  Colby Ranger  2013-03-05  32  -6/+2790
|   A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter.
|   Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE.
|   Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection.
|   The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused.
|   Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f
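The XOR step described above can be sketched as follows: try the current bitmap as-is and XORed against each prior bitmap, keep the smallest candidate, and record the offset back to the bitmap it was XORed with. The sketch assumes java.util.BitSet and uses the number of set bits as a stand-in for compressed size; the real index measures javaewah bitmaps, and the class, field, and method names below are hypothetical.

```java
import java.util.BitSet;
import java.util.List;

// Illustration of XOR-against-prior encoding; not the JGit index writer.
final class XorBitmapSketch {
    static final class Entry {
        final BitSet stored;  // bitmap as written (possibly XORed)
        final int xorOffset;  // 0 = stored verbatim, k = XORed with the k-th prior entry back

        Entry(BitSet stored, int xorOffset) {
            this.stored = stored;
            this.xorOffset = xorOffset;
        }
    }

    static BitSet bits(int... positions) {
        BitSet set = new BitSet();
        for (int p : positions) {
            set.set(p);
        }
        return set;
    }

    /** Picks the smallest representation; cardinality is an assumed size proxy. */
    static Entry encode(List<BitSet> prior, BitSet current) {
        BitSet best = (BitSet) current.clone();
        int bestOffset = 0;
        for (int offset = 1; offset <= prior.size(); offset++) {
            BitSet candidate = (BitSet) current.clone();
            candidate.xor(prior.get(prior.size() - offset));
            if (candidate.cardinality() < best.cardinality()) {
                best = candidate;
                bestOffset = offset;
            }
        }
        return new Entry(best, bestOffset);
    }

    /** Reverses encode(): XOR is its own inverse. */
    static BitSet decode(List<BitSet> prior, Entry entry) {
        BitSet result = (BitSet) entry.stored.clone();
        if (entry.xorOffset > 0) {
            result.xor(prior.get(prior.size() - entry.xorOffset));
        }
        return result;
    }
}
```

Because a commit's closure usually differs from a nearby commit's closure by only a few objects, the XORed form tends to have far fewer set bits than the full bitmap.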
* Merge "Break the dependency on RevObject when creating a newObjectToPack()."  Shawn Pearce  2013-03-04  8  -35/+25
|\
| * Break the dependency on RevObject when creating a newObjectToPack().  Colby Ranger  2013-03-04  8  -35/+25
| |   Update the ObjectReuseAsIs API to support creating new ObjectToPack with only the AnyObjectId and Git object type. This is needed to support the future pack index bitmaps, which only contain this information and do not want the overhead of creating a temporary object for every ObjectId.
| |   Change-Id: I906360b471412688bf429ecef74fd988f47875dc
* | Merge "Fix RefUpdate performance for existing Refs"  Shawn Pearce  2013-03-04  1  -1/+2
|\ \
| * | Fix RefUpdate performance for existing Refs  Roberto Tyley  2013-03-01  1  -1/+2
| | |   No longer invoke the expensive RefDatabase.isNameConflicting() check on updating existing refs, reducing batch ref update time by ~97%.
| | |   The RefDirectory implementation of isNameConflicting() is quite slow (it has to do an expensive loose-ref scan) but it's only necessary to perform this check on ref update if the ref is being *created* - if the ref already exists, we can already guarantee that it does not conflict with any other refs.
| | |   C-Git seems to use a similar condition before making the is_refname_available() check:
| | |   https://github.com/git/git/blob/v1.8.1.4/refs.c#L1660-L1670
| | |   As an example of the effects on performance, here's a simple timing experiment using The BFG to remove one file from the JGit repo:
| | |   ---
| | |   $ wget http://repo1.maven.org/maven2/com/madgag/bfg-repo-cleaner/1.0.1/bfg-1.0.1.jar
| | |   $ git clone --mirror https://git.eclipse.org/r/p/jgit/jgit.git
| | |   $ java -jar bfg-1.0.1.jar -D make_jgit.sh jgit.git
| | |   ....
| | |   Updating references: 100% (5760/5760)
| | |   ...Ref update completed in 148,949 ms.
| | |   BFG run is complete!
| | |   ---
| | |   The execution time for the run is completely dominated by the batch ref update at the end. Repeating the experiment with BFG v1.0.2 (using JGit patched with this change), the refs update is dramatically reduced:
| | |   ---
| | |   Updating references: 100% (5760/5760)
| | |   ...Ref update completed in 4,327 ms.
| | |   ---
| | |   Change-Id: I9057bc4ee22f9cc269b1cc00c493841c71527cd6
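The condition described in this commit, running the name-conflict scan only when a ref is being created, can be sketched with an in-memory map. This is not the RefDatabase or RefUpdate code; the class, its fields, and the prefix-based conflict rule are assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory ref store; only models when the conflict check runs.
final class RefUpdateSketch {
    private final Map<String, String> refs = new TreeMap<>();
    int conflictChecks; // counts how often the expensive scan was needed

    /** Stand-in for the expensive scan: a ref may not be a directory prefix of another. */
    boolean isNameConflicting(String name) {
        conflictChecks++;
        for (String existing : refs.keySet()) {
            if (existing.startsWith(name + "/") || name.startsWith(existing + "/")) {
                return true;
            }
        }
        return false;
    }

    /** Returns false when the update is rejected because of a name conflict. */
    boolean update(String name, String objectId) {
        boolean creating = !refs.containsKey(name);
        // An existing ref cannot conflict with other refs, so skip the scan.
        if (creating && isNameConflicting(name)) {
            return false;
        }
        refs.put(name, objectId);
        return true;
    }
}
```

In a batch that rewrites thousands of existing refs, as in the BFG run above, the scan is then skipped for every update.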
* | | Merge "Fix corrupted CloneCommand bare-repo fetch-refspec (#402031)"  Shawn Pearce  2013-03-04  2  -5/+31
|\ \ \
| |_|/
|/| |
| * | Fix corrupted CloneCommand bare-repo fetch-refspec (#402031)  Roberto Tyley  2013-03-04  2  -5/+31
| |/
| |   CloneCommand has been creating fetch refspecs like this on bare clones:
| |
| |   [remote "origin"]
| |       url = ssh://example.com/my-repo.git
| |       fetch = +refs/heads/*:refs/heads//*
| |
| |   As you can see, the destination ref pattern has a superfluous slash.
| |   It looks like this behaviour has always been the case for CloneCommand, at least since cc2197ed when code catering to bare-clone fetch refspecs was added. That was released with JGit v1.0 almost 2 years ago, so there will probably be some bare repos in the wild which will have been cloned with JGit and have these corrupted refspecs.
| |   The effect of the corrupted fetch refspec is quite interesting. Up to and including JGit 2.0, the corrupt refspec was tolerated and fetches would work as intended with no indication to the user that anything was amiss. With JGit 2.1, a change was introduced which made JGit less tolerant, and fetches now attempt to update the non-existing ref "refs/heads//master". No exception is raised, but the real ref - "refs/heads/master" - is not updated.
| |   This behaviour was noticed by a user of Agit (which does bare clones by default and recently updated from JGit v2.0 to v2.2), reported here:
| |   https://github.com/rtyley/agit/issues/92
| |   If you run C-Git fetch on a bare-repo cloned by JGit, it flat-out rejects the refspec (checked against v1.7.10.4):
| |
| |   fatal: Invalid refspec '+refs/heads/*:refs/heads//*'
| |
| |   Incidentally, C-Git does not create an explicit fetch refspec at all when performing a bare clone - the full remote config generated by C-Git looks like this:
| |
| |   [remote "origin"]
| |       url = ssh://example.com/my-repo.git
| |
| |   Using JGit on such a repository works fine, so omitting the fetch refspec entirely is also an option.
| |   Change-Id: I14b0d359dc69b8908f68e02cea7a756ac34bf881
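The superfluous slash arises when a prefix that already ends in "/" is joined with "/*". A minimal sketch of building the destination pattern without that doubling is shown below. It is not the CloneCommand code; the class and method names are hypothetical, and it assumes the intended bare-clone value is +refs/heads/*:refs/heads/*.

```java
// Hypothetical helper; illustrates the slash handling only.
final class FetchRefspecSketch {
    /** Builds the fetch refspec for a fresh clone of `remoteName`. */
    static String fetchRefspec(boolean bare, String remoteName) {
        // Both prefixes already end with "/", so only "*" is appended.
        String dstPrefix = bare ? "refs/heads/" : "refs/remotes/" + remoteName + "/";
        return "+refs/heads/*:" + dstPrefix + "*";
    }

    /** Detects the corruption described above: an empty path component. */
    static boolean hasEmptyPathComponent(String refspec) {
        return refspec.contains("//");
    }

    public static void main(String[] args) {
        System.out.println(fetchRefspec(true, "origin"));  // +refs/heads/*:refs/heads/*
        System.out.println(fetchRefspec(false, "origin")); // +refs/heads/*:refs/remotes/origin/*
    }
}
```

A check such as hasEmptyPathComponent could also be used to spot refspecs written by affected JGit versions in existing bare repositories.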
* | Merge "Remove the unused method PackFile.hasExt()."  Colby Ranger  2013-03-04  1  -7/+0
|\ \
| |/
|/|
| * Remove the unused method PackFile.hasExt().  Colby Ranger  2013-03-04  1  -7/+0
| |   It will be used in a future change, so just include it with that change.
| |   Change-Id: I7db28d86f8e8b282a403acd9a4c4defaae828f94
* | Merge "Improve the documentation of the ByteArraySet used by PathFilterGroup"  Shawn Pearce  2013-02-28  1  -4/+12
|\ \
| |/
|/|
| * Improve the documentation of the ByteArraySet used by PathFilterGroup  Robin Rosenberg  2013-02-22  1  -4/+12
| |   Change-Id: I2ba7a67e8e1596aa6c33a9caddee03a6be48f008
* | Include supported extensions in PackFile constructor.  Colby Ranger  2013-02-28  4  -22/+85
| |   Previously a PackFile class was assumed to only support a .pack and .idx file. Update the constructor to enumerate the supported extensions for the pack file. This will allow the bitmap code to only be executed if the bitmap extension file is known to exist.
| |   Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf
* | Fix while boundaries in DateRevQueue.add()  Gustaf Lundh  2013-02-25  1  -1/+1
| |   In add(), "low" will never equal "first". This fact should be reflected in the code.
| |   Change-Id: I5cab51374e67bd2d3301e5d9dac47c4259b5e562
* | Merge "Performance fixes in DateRevQueue"  Shawn Pearce  2013-02-25  1  -2/+62
|\ \