Commit message  Author  Age  Files  Lines
* Always allocate the PackOutputStream copyBuffer  Shawn Pearce  2013-04-10  1  -4/+2
|
| The getCopyBuffer() is almost always used during output. All known implementations of ObjectReuseAsIs rely on the buffer to be present, and the only sane way to get good performance from PackWriter is to reuse objects during packing.
|
| Avoid a branch and test when obtaining this buffer by making sure it is always populated.
|
| Change-Id: I200baa0bde5dcdd11bab7787291ad64535c9f7fb
* Disable CRC32 computation when no PackIndex will be created  Shawn Pearce  2013-04-10  5  -23/+35
|
| If a server is streaming 3GiB worth of pack data to a client there is no reason to compute the CRC32 checksum on the objects. The CRC32 code computed by PackWriter is used only in the new index created by writeIndex(), which is never invoked for the native Git network protocols.
|
| Object reuse may still compute its own CRC32 to verify the data being copied from an existing pack has not been corrupted. This check is done by the ObjectReader that implements ObjectReuseAsIs and has no relationship to the CRC32 being skipped during output.
|
| Change-Id: I05626f2e0d6ce19119b57d8a27193922636d60a7
* Steal work from delta threads to rebalance CPU load  Shawn Pearce  2013-04-10  3  -66/+147
|
| If the configuration wants to run 4 threads the delta search work is initially split somewhat evenly across the 4 threads. During execution some threads will finish early due to the work not being split fairly, as the initial partitions were based on object count and not cost to inflate or size of DeltaIndex.
|
| When a thread finishes early it now tries to take 50% of the work remaining on a sibling thread, and executes that before exiting. This repeats as each thread completes until a thread has only 1 object remaining.
|
| Repacking Blink, Chromium's new fork of WebKit (2.2M objects 3.9G):
|
|   [pack]
|     reuseDeltas = false
|     reuseObjects = false
|     depth = 50
|     threads = 8
|     window = 250
|     windowMemory = 800m
|
|   before: ~105% CPU after 80%
|   after:  >780% CPU to 100%
|
| Change-Id: I65e45422edd96778aba4b6e5a0fd489ea48e8ca3
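The stealing step described in the commit above can be pictured with a small sketch. This is not PackWriter's code: `WorkSlice` and its methods are hypothetical names, and a slice is modeled as nothing more than a range of indexes into the object list.

```java
/** Hypothetical slice of delta-search work: indexes [cur, end) of an object list. */
final class WorkSlice {
    private int cur;
    private int end;

    WorkSlice(int begin, int end) {
        this.cur = begin;
        this.end = end;
    }

    /** Claims the next index to process, or returns -1 when the slice is empty. */
    synchronized int next() {
        return cur < end ? cur++ : -1;
    }

    synchronized int remaining() {
        return end - cur;
    }

    /**
     * Called by an idle thread: hands over the back half of the remaining
     * range. Returns null when at most one object is left, which is the
     * point where stealing stops.
     */
    synchronized WorkSlice stealHalf() {
        int left = end - cur;
        if (left <= 1)
            return null;
        int split = end - left / 2;
        WorkSlice stolen = new WorkSlice(split, end);
        end = split; // the owner keeps the front half
        return stolen;
    }
}
```

A thread that runs out of work would call `stealHalf()` on a sibling's slice and process the returned range before exiting. Taking the back half leaves the owner's current position untouched.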
* Support cutting existing delta chains longer than the max depth  Shawn Pearce  2013-04-05  3  -7/+81
|
| Some packs built by JGit have incredibly long delta chains due to a long standing bug in PackWriter. Google has packs created by JGit's DfsGarbageCollector with chains 6000 objects long, or more.
|
| Inflating objects at the end of this 6000 long chain is impossible to complete within a reasonable time bound. It could take a beefy system hours to perform even using the heavily optimized native C implementation of Git, let alone with JGit.
|
| Enable pack.cutDeltaChains to be set in a configuration file to permit the PackWriter to determine the length of each delta chain and clip the chain at arbitrary points to fit within pack.depth.
|
| Delta chain cycles are still possible, but no attempt is made to detect them. A trivial chain of A->B->A will iterate for the full pack.depth configured limit (e.g. 50) and then pick an object to store as non-delta.
|
| When cutting chains the object list is walked in reverse to try and take advantage of existing chain computations. The assumption here is most deltas are near the end of the list, and their bases are near the front of the list. Going up from the tail attempts to reuse chainLength computations by relying on the memoized value in the delta base.
|
| The chainLength field in ObjectToPack is overloaded into the depth field normally used by DeltaWindow. This is acceptable because the chain cut happens before delta search, and the chainLength is reset to 0 if delta search will follow.
|
| Change-Id: Ida4fde9558f3abbbb77ade398d2af3941de9c812
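A minimal sketch of the reverse walk with a memoized chain length, under stated assumptions: `Node`, `ChainCutter` and `depth` are hypothetical names, and "store as non-delta" is modeled by simply clearing the base pointer. It illustrates the idea in the message above and is not the PackWriter implementation.

```java
import java.util.List;

/** Hypothetical stand-in for an object being packed. */
final class Node {
    Node base;       // delta base, or null when stored whole
    int chainLength; // memoized length of the longest delta chain measured through this object

    Node(Node base) {
        this.base = base;
    }
}

final class ChainCutter {
    /** Walks the list from the tail and clips chains that reach maxDepth. */
    static void cut(List<Node> list, int maxDepth) {
        for (int i = list.size() - 1; i >= 0; i--) {
            int d = 0;
            Node b = list.get(i).base;
            while (b != null) {
                if (d < b.chainLength)
                    break; // a longer chain through b was already measured
                b.chainLength = ++d;
                if (d >= maxDepth && b.base != null) {
                    b.base = null; // store b whole: this is the cut
                    break;
                }
                b = b.base;
            }
        }
    }

    /** Number of bases that must be applied to rebuild n (assumes no cycle remains). */
    static int depth(Node n) {
        int d = 0;
        for (Node b = n.base; b != null; b = b.base)
            d++;
        return d;
    }
}
```

With a cycle such as A->B->A the inner loop keeps counting until `d` reaches `maxDepth` and then clears one base pointer, which matches the behavior the message describes for cycles.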
* Micro-optimize reuseDeltaFor in PackWriter  Shawn Pearce  2013-04-05  1  -10/+6
|
| This switch is called mostly for OBJ_TREE and OBJ_BLOB types, which typically make up 66% of the objects in a repository. Simplify the test for these common types by testing for the one bit they have in common and returning early.
|
| Object type 5 is currently undefined. In the old code it would hit the default and return true. In the new code it will match the early case and also return true. In either implementation 5 should never show up as it is not a valid type known to Git.
|
| Object type 6 OFS_DELTA is not permitted to be supplied here.
| Object type 7 REF_DELTA is not permitted to be supplied here.
|
| Change-Id: I0ede8acee928bb3e73c744450863942064864e9c
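The shared-bit test can be sketched as follows. The type codes are the ones used by the Git pack format. The method shape, the `reuseDeltaCommits` parameter, and the handling of commits and tags are assumptions made for illustration, not a copy of PackWriter.

```java
final class ReuseDeltaSketch {
    // Object type codes from the Git pack format.
    static final int OBJ_COMMIT = 1; // 001
    static final int OBJ_TREE = 2;   // 010
    static final int OBJ_BLOB = 3;   // 011
    static final int OBJ_TAG = 4;    // 100

    /**
     * reuseDeltaCommits is a hypothetical configuration flag. Types 6 and 7
     * (OFS_DELTA, REF_DELTA) must not be passed in.
     */
    static boolean reuseDeltaFor(int type, boolean reuseDeltaCommits) {
        if ((type & 2) != 0) // the bit OBJ_TREE and OBJ_BLOB have in common
            return true;
        if (type == OBJ_COMMIT)
            return reuseDeltaCommits; // assumed policy for commits
        if (type == OBJ_TAG)
            return false;             // assumed policy for tags
        return true; // in this sketch the undefined type 5 ends up here
    }
}
```

One mask test replaces two comparisons on the common path. In this sketch the undefined type 5 also yields true, through the final return.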
* Static import OBJ_* constants into PackWriter  Shawn Pearce  2013-04-05  1  -55/+57
|
| Shortens most of the code that touches the objectLists.
|
| Change-Id: Ib14d366dd311e544e7ba50e9ce07a6f3ce0cf254
* Renumber internal ObjectToPack flags  Shawn Pearce  2013-04-04  1  -14/+4
|
| Now that WANT_WRITE is gone, renumber the flags to move the unused bit next to the type. Recluster AS_IS and DELTA_ATTEMPTED to be next to each other since these bits are tested as a pair.
|
| Change-Id: I42994b5ff1f67435e15c3f06d02e3b82141e8f08
* Move wantWrite flag to be special offset 1  Shawn Pearce  2013-04-04  1  -6/+4
|
| Free up the WANT_WRITE flag in ObjectToPack by switching the test to use the special offset value of 1. The Git pack file format calls for the first 4 bytes to be 'PACK', which means any object must start at an offset >= 4. Current versions require another 8 bytes in the header, placing the first object at offset = 12.
|
| So offset = 1 is an invalid location for an object, and can be used as a marker signal to indicate the writing loop has tried to write the object, but recursed into the base first. When an object is visited with offset == 1 it means there is a cycle in the delta base path, and the cycle must be broken.
|
| Change-Id: I2d05b9017c5f9bd9464b91d43e8d4b4a085e55bc
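A rough sketch of how an impossible offset can double as an "in progress" marker. `PackObject` and `CycleBreakingWriter` are hypothetical names, writing is reduced to assigning offsets, and breaking the cycle is modeled by dropping the base pointer.

```java
/** Hypothetical stand-in for an object queued for writing. */
final class PackObject {
    final int size;
    PackObject base; // delta base, or null when stored whole
    long offset;     // 0 = untouched, 1 = write in progress, >= 12 = position in the pack

    PackObject(int size, PackObject base) {
        this.size = size;
        this.base = base;
    }
}

final class CycleBreakingWriter {
    static final long WANT_WRITE = 1;    // never a valid object offset
    static final long FIRST_OBJECT = 12; // 'PACK' plus 8 more header bytes

    private long position = FIRST_OBJECT;

    void write(PackObject o) {
        if (o.offset >= FIRST_OBJECT)
            return; // already written
        if (o.offset == WANT_WRITE)
            o.base = null; // reached again through its own base path: break the cycle
        o.offset = WANT_WRITE;
        if (o.base != null) {
            write(o.base); // a base is written before the delta that needs it
            if (o.offset >= FIRST_OBJECT)
                return; // written during the recursion because the cycle was broken at o
        }
        o.offset = position;
        position += o.size;
    }

    long position() {
        return position;
    }
}
```

Because no real object can start before offset 12, the value 1 needs no separate flag bit, which is the point the commit message makes.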
* Don't delta compress garbage objects  Shawn Pearce  2013-04-04  2  -6/+11
|
| Garbage is randomly ordered and unlikely to delta compress against other garbage. Disable delta compression allowing objects to switch to whole form when moving to the garbage pack.
|
| Because the garbage is not well compressed assume deltas were not attempted during a normal GC cycle.
|
| Override the reuse settings: garbage that can be reused should be reused as-is into the garbage pack rather than switching something like the compression level during a GC. It is intended that garbage will eventually be removed from the repository so expending CPU time on a compression switch is not worthwhile.
|
| Change-Id: I0e8e58ee99e5011d375d3d89c94f2957de8402b9
* Delete broken DFS read-ahead support  Shawn Pearce  2013-04-04  7  -544/+17
|
| This implementation has been proven to deadlock in production server loads. Google has been running with it disabled for quite a while, as the bugs have been difficult to identify and fix.
|
| Instead of suggesting it works and is useful, drop the code. JGit should not advertise support for functionality that is known to be broken.
|
| In a few of the places where read-ahead was enabled by DfsReader there is more information about what blocks should be loaded when. During object representation selection, or size lookup, or sending object as-is to a PackWriter, or sending an entire pack as-is the reader knows exactly which blocks are required in the cache, and it also can compute when those will be needed. The broken read-ahead code was stupid and just read a fixed amount ahead of the current offset, which can waste IOs if more precise data was available.
|
| DFS systems are usually slow to respond so read-ahead is still a desired feature, but it needs to be rebuilt from scratch and make better use of the offset information.
|
| Change-Id: Ibaed8288ec3340cf93eb269dc0f1f23ab5ab1aea
* Optimize DFS object reuse selection code  Shawn Pearce  2013-04-04  5  -103/+66
|
| Rewrite this complicated logic to examine each pack file exactly once. This reduces thrashing when there are many large pack files present and the reader needs to locate each object's header.
|
| The intermediate temporary list is now smaller; it is bounded to the same length as the input object list. In the prior version of this code the list contained one entry for every representation of every object being packed.
|
| Only one representation object is allocated, reducing the overall memory footprint to be approximately one reference per object found in the current pack file (the pointer in the BlockList). This saves considerable working set memory compared to the prior version that made and held onto a new representation for every ObjectToPack.
|
| Change-Id: I2c1f18cd6755643ac4c2cf1f23b5464ca9d91b22
* Simplify size test in PackWriter  Shawn Pearce  2013-04-04  1  -8/+6
|
| Clip the configured limit to Integer.MAX_VALUE at the top of the loop, saving a compare branch per object considered. This can cut 2M branches out of a repacking of the Linux kernel.
|
| Rewrite the logic so the primary path is to match the conditional; most objects are larger than BLKSZ (16 bytes) and less than limit. This may help branch prediction on CPUs if the CPU tries to assume execution takes the first side of the branch and not the second.
|
| Change-Id: I5133d1651640939afe9fbcfd8cfdb59965c57d5a
* Declare critical exposed methods of ObjectToPack final  Shawn Pearce  2013-04-04  1  -11/+11
|
| There is no reasonable way for a subclass to correctly override and implement these methods. They depend on internal state that cannot otherwise be managed.
|
| Most of these methods are also in critical paths of PackWriter. Declare them final so subclasses do not try to replace them, and so the JIT knows the smaller ones can be safely inlined.
|
| Change-Id: I9026938e5833ac0b94246d21c69a143a9224626c
* Declare internal flag accessors of ObjectToPack final  Shawn Pearce  2013-04-04  1  -22/+22
|
| None of these methods should ever be overridden at runtime by an extension class. Given how small they are the JIT should perform inlining where reasonable. Hint this is possible by marking all methods final so it's clear no replacement can be loaded later on.
|
| Change-Id: Ia75a5d36c6bd25b24169e2bdfa360c8f52b669cd
* Remove unused method isDeltaAttempted()  Shawn Pearce  2013-04-04  1  -4/+0
|
| This flag is never checked on its own. It is only checked as part of a pair through the doNotAttemptDelta() method. Delete the method so there is less confusion about the flag being used on its own.
|
| Change-Id: Id7088caa649599f4f11d633412c2a2af0fd45dd8
* Simplify setDoNotDelta() to always set the flag  Shawn Pearce  2013-04-04  2  -9/+6
|
| This method is only invoked with true as the argument. Remove the unnecessary parameter and branch, making the code easier for the JIT to optimize.
|
| Change-Id: I68a9cd82f197b7d00a524ea3354260a0828083c6
* Add the no-commit option to MergeCommand  Tomasz Zarna  2013-04-04  9  -12/+241
|
| Also added tests and the associated option for the command line Merge command.
|
| Bug: 335091
| Change-Id: Ie321c572284a6f64765a81674089fc408a10d059
| Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
| Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Merge "Fix PathFilterGroup not to throw StopWalkException too early"  Christian Halstrick  2013-04-04  2  -1/+24
|\
| * Fix PathFilterGroup not to throw StopWalkException too early  Robin Rosenberg  2013-04-03  2  -1/+24
| |
| | Due to the Git internal sort order a directory is sorted as if it ended with a '/'. This means that the path filter didn't set the last possible matching entry to the correct value.
| |
| | In the reported issue we had the following filters:
| |   org.eclipse.jgit.console
| |   org.eclipse.jgit
| |
| | As an optimization we throw a StopWalkException when the walked tree passes the last possible filter, which was this:
| |   org.eclipse.jgit.console
| |
| | Due to the git sorting order, the tree was processed in this order:
| |   org.eclipse.jgit.console
| |   org.eclipse.jgit.test
| |   org.eclipse.jgit
| |
| | At org.eclipse.jgit.test we threw the StopWalkException, preventing the walk from completing successfully.
| |
| | A correct last possible match should be:
| |   org.eclipse.jgit/
| |
| | For simplicity we define it as:
| |   org/eclipse/jgit/
| |
| | This filter would be the maximum if we also had e.g. org and org.eclipse in the filter, but that would require more work so we simply replace all characters lower than '/' by a slash. We believe the possible extra walking does not warrant the extra analysis.
| |
| | Bug: 362430
| | Change-Id: I4869019ea57ca07d4dff6bfa8e81725f56596d9f
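The replacement rule in the last paragraph above is small enough to show directly. The class and method names are hypothetical, and the sketch assumes ASCII paths so that Java's string comparison agrees with a byte-wise comparison.

```java
final class LastMatchSketch {
    /**
     * Builds a conservative "last possible match" for a filter path by
     * replacing every character that sorts before '/' with '/' and then
     * appending a trailing '/'.
     */
    static String lastPossibleMatch(String path) {
        StringBuilder sb = new StringBuilder(path.length() + 1);
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            sb.append(c < '/' ? '/' : c);
        }
        return sb.append('/').toString();
    }
}
```

For `org.eclipse.jgit` this yields `org/eclipse/jgit/`. Both `org.eclipse.jgit.console` and `org.eclipse.jgit.test` compare lower than that bound, so a walk that stops only after passing the bound still visits them.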
* | Merge "Indicate initial commit on a branch in the reflog"  Christian Halstrick  2013-04-04  4  -12/+9
|\|
| * Indicate initial commit on a branch in the reflog  Robin Rosenberg  2013-04-02  4  -12/+9
| |
| | Bug: 393463
| | Change-Id: I4733d6f719bc0dc694e7a6a6ad2092de6364898c
* | Speed up clone/fetch with large number of refs  Robin Rosenberg  2013-03-30  3  -19/+223
|/
| Instead of re-reading all refs after each update, execute the deletes first, then read all refs once and perform the check for conflicting ref names in memory.
|
| Change-Id: I17d0b3ccc27f868c8497607d8e57bf7082e65ba3
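One way an in-memory conflict check could look, assuming the usual rule that a ref name may not be a leading directory of another ref name. `RefNameIndex` is a hypothetical helper built on a sorted set, not the class JGit uses.

```java
import java.util.TreeSet;

/** Hypothetical in-memory index of existing ref names, loaded once. */
final class RefNameIndex {
    private final TreeSet<String> names = new TreeSet<>();

    RefNameIndex(Iterable<String> existing) {
        for (String n : existing)
            names.add(n);
    }

    /** True if name cannot coexist with an already known ref. */
    boolean conflictsWith(String name) {
        // An existing ref that is a leading directory of name,
        // e.g. refs/heads/a blocks refs/heads/a/b.
        for (int slash = name.indexOf('/'); slash > 0; slash = name.indexOf('/', slash + 1)) {
            if (names.contains(name.substring(0, slash)))
                return true;
        }
        // An existing ref stored below name,
        // e.g. refs/heads/a/b blocks refs/heads/a.
        String prefix = name + "/";
        String next = names.ceiling(prefix);
        return next != null && next.startsWith(prefix);
    }
}
```

The set is filled once after the deletes have run. Each later check costs a few lookups instead of another scan of the ref storage.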
* Merge "When renaming the lock file succeeds the lock isn't held anymore"  Robin Rosenberg  2013-03-28  1  -3/+8
|\
| * When renaming the lock file succeeds the lock isn't held anymore  Matthias Sohn  2013-03-26  1  -3/+8
| |
| | This wrong book-keeping caused IOExceptions to be thrown because LockFile.unlock() erroneously tried to delete the non-existing lock file. These IOExceptions were hidden since they were silently caught.
| |
| | Change-Id: If42b6192d92c5a2d8f2bf904b16567ef08c32e89
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Fix CommitCommand amend mode to preserve parent order  Shawn Pearce  2013-03-28  1  -3/+2
| |
| | Change-Id: I476921ff8dfa6a357932d42ee59340873502b582
* | Fixed parsing of URI with an IPv6 address  Andreas König  2013-03-27  2  -2/+2
| |
| | Allow an IPv6 address in a URI like: http://[::1]:8080/repo.git
| |
| | Change-Id: Ia00a20f694b2e9314892df77f9b11f551bb1d34e
| | Signed-off-by: Chris Aniszczyk <zx@twitter.com>
* | New functions to facilitate the writing of CLI test cases  François Rey  2013-03-27  2  -14/+90
| |
| | Writing CLI test cases is tedious because of all the formatting and escaping subtleties needed when comparing actual output with what's expected. While creating a test case the two new functions are to be used instead of the existing execute() in order to prepare the correct command and expected output and to generate the corresponding test code that can be pasted into the test case function.
| |
| | Change-Id: Ia66dc449d3f6fb861c300fef8b56fba83a56c94c
| | Signed-off-by: Chris Aniszczyk <zx@twitter.com>
* | Merge "File.renameTo behaves differently on Unix and Windows"  Matthias Sohn  2013-03-27  1  -10/+6
|\ \
| * | File.renameTo behaves differently on Unix and Windows  Robin Rosenberg  2013-03-26  1  -10/+6
| | |
| | | On Windows renameTo will not overwrite a file, so it must be deleted first. The fix for Bug 402834 did not account for that.
| | |
| | | Bug: 403685
| | | Change-Id: I3453342c17e064dcb50906a540172978941a10a6
* | | Merge "Extend FileUtils.rename to common git semantics"  Matthias Sohn  2013-03-27  2  -1/+86
|\| |
| |/
|/|
| * Extend FileUtils.rename to common git semantics  Robin Rosenberg  2013-03-26  2  -1/+86
| |
| | Unlike the OS or Java rename this method will replace (on *nix) or try to replace (on Windows) the target with the source, provided the target does not exist, the target does exist and is a file, or it is a directory which only contains directories. In the latter case the directory hierarchy will be deleted.
| |
| | If the initial rename fails and the target is an existing file, the target file will be deleted first and then the rename is retried.
| |
| | Change-Id: Iae75c49c85445ada7795246a02ce02f7c248d956
| | Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
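The delete-and-retry step can be sketched with plain java.io.File calls. This covers only the case of an existing target file. The handling of directory hierarchies mentioned above is left out, and `RenameSketch` is a hypothetical name rather than the FileUtils implementation.

```java
import java.io.File;
import java.io.IOException;

final class RenameSketch {
    /**
     * Renames src to dst. If the first attempt fails and dst is an existing
     * file (the Windows case), dst is deleted and the rename is retried once.
     */
    static void rename(File src, File dst) throws IOException {
        if (src.renameTo(dst))
            return;
        if (dst.isFile() && dst.delete() && src.renameTo(dst))
            return;
        throw new IOException("Could not rename " + src + " to " + dst);
    }
}
```

On platforms where `File.renameTo` already overwrites an existing file the first call succeeds and the fallback is never reached.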
* | Always add FileExt to DfsPackDescription  Shawn Pearce  2013-03-26  4  -0/+8
| |
| | Instead of forcing the implementation of the DFS backend to handle making sure the extension bits are set correctly, have the common callers in JGit set the extension at the same time they supply the file sizes to the pack description. This simplifies assumptions for an implementation of the DFS backend.
| |
| | Change-Id: I55142ad8ea08a3e2e8349f72b3714578eba9c342
* | Merge "Add tests for FileUtils.delete and EMPTY_DIRECTORIES_ONLY"  Christian Halstrick  2013-03-24  1  -0/+85
|\|
| * Add tests for FileUtils.delete and EMPTY_DIRECTORIES_ONLY  Robin Rosenberg  2013-03-24  1  -0/+85
| |
| | Change-Id: I54a46c29df5eafc7739a6ef29e5dc80fa2f6d9ba
* | Update build to Tycho 0.17  Matthias Sohn  2013-03-24  1  -1/+1
|/
| Change-Id: I92c9757a37644ec48ed1d785f4dacd6c44276632
| Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* SimpleHttpServer API shouldn't expose internals  Matthias Sohn  2013-03-22  1  -4/+3
|
| Change-Id: I5963ae720f33cb148de08b4c64d02c81d6791139
| Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Grant access to jgit internals to junit and http.server bundles  Matthias Sohn  2013-03-22  1  -1/+2
|
| Change-Id: Ib34f9635b4d060f5d17a6c823ec91af1d934a180
| Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Add missing @since tags  Matthias Sohn  2013-03-22  10  -4/+25
|
| Change-Id: I6b20d78e6bd1f245fdca331554c106f8bae44b9c
| Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Fix @since tags in JGit, version 2.4 never existed  Tomasz Zarna  2013-03-21  19  -9/+51
|
| Change-Id: Iaca88ec28b412e6b58e7b39a0762ba54b25f9471
| Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Merge changes If98b0b97,I7c9c09b4  Shawn Pearce  2013-03-21  5  -27/+47
|\
| | * changes:
| |   Add convenience factory method for most used builder pattern
| |   Don't use internal type FileRepository in public API
| * Add convenience factory method for most used builder pattern  Matthias Sohn  2013-03-20  1  -0/+15
| |
| | This will simplify adapting EGit to the removal of FileRepository from jgit's public API in change I2ab1327c202ef2003565e1b0770a583970e432e9.
| |
| | Change-Id: If98b0b97e8f13a94d4ea7ba1be0f90d82b0fba4b
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * Don't use internal type FileRepository in public API  Matthias Sohn  2013-03-20  5  -27/+32
| |
| | Change-Id: I7c9c09b4f190fa7cb830563bcdf2071407ee2ce0
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Allow users to show server messages while pushing  André Dietisheim  2013-03-21  13  -27/+300
| |
| | Allow users to provide their OutputStream (via Transport#push(monitor, refUpdates, out)) so that server messages can be written to it (in SideBandInputStream) while they're coming in.
| |
| | CQ: 7065
| | Bug: 398404
| | Change-Id: I670782784b38702d52bca98203909aca0496d1c0
| | Signed-off-by: Andre Dietisheim <andre.dietisheim@gmail.com>
| | Signed-off-by: Chris Aniszczyk <zx@twitter.com>
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Don't verify host name when sslVerify is false  Matthias Sohn  2013-03-19  1  -0/+11
| |
| | Native git also doesn't verify host names when http.sslVerify=false. See native git's commit a5ccc597.
| |
| | See: http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg02047.html
| |
| | Change-Id: I42f509fea8e4ac89fad646aec3dfbf1753ae7e3d
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Fix line endings and whitespace errors in jgit feature  Matthias Sohn  2013-03-20  3  -27/+27
| |
| | Change-Id: I9fc69fccedf362453f74f1e09d2b50ac705a9cac
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Fix formatting of PackConfig.toString() & GC.RepoStatistics.toString()  Edwin Kempin  2013-03-20  2  -13/+13
| |
| | Change-Id: I7e0c74ecfd0e0615d10fb582b2897d33be23440a
| | Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Allow to get repo statistics from GarbageCollectionCommand before gc  Edwin Kempin  2013-03-20  3  -0/+20
|/
| When running the garbage collection for a repository it is often interesting to compare the repository statistics from before and after the garbage collection to understand the effect of the garbage collection. This is why it makes sense that the GarbageCollectionCommand provides a method to retrieve the repository statistics before running the garbage collection.
|
| So far without running the garbage collection the repository statistics can only be retrieved by using JGit internal classes. This is what EGit and Gerrit do at the moment, but it would be better to have an API for this.
|
| Change-Id: Id7e579157e9fbef5cfd1fc9f97ada45f0ca8c379
| Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>
| Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Merge "Fix GC for FileRepo in case packfile renames fail"  Matthias Sohn  2013-03-19  3  -3/+55
|\
| * Fix GC for FileRepo in case packfile renames fail  Christian Halstrick  2013-03-19  3  -3/+55
| |
| | Only on Windows, the rename operation which renames temporary Packfiles (and index-files and bitmap-files) sometimes fails. This happens only when renaming a temporary Packfile to a Packfile which already exists. Such situations occur if you run GC twice on a repo without modifying the repo in between.
| |
| | In such situations there was a bug in GC which led to a corrupted repo without any packfiles.
| |
| | This commit fixes the problem by introducing a utility method which renames a file and throws an IOException if it fails. This method also takes care to repeat a failing rename if our FS class has found out we are running on a platform with an unreliable File.renameTo() method.
| |
| | I am searching for a better solution because even with this utility method in hand a GC on an already GC'ed repo will fail on Windows. But at least with this fix we will not produce corrupted repos anymore.
| |
| | Bug: 389305
| | Change-Id: Iac1ab3e0b8c419c90404f2e2f3559672eb8f6d28
| | Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
| | Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Fix location of DfsText.properties  Shawn Pearce  2013-03-19  1  -0/+0
|/
| The file was not moved when the package was renamed to internal.
|
| Change-Id: I29a078d6316daa4e4407db9ecedc8b7ed05535cd