summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge "Remove unused dependencies"Robin Rosenberg2013-04-188-24/+4
|\
| * Remove unused dependenciesMatthias Sohn2013-04-098-24/+4
| | | | | | | | Change-Id: I3cd161ac360a2e2635bffe309725a41c9527694e Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Improve test coverage of AutoCRLF(In|Out)putStreamRobin Stocker2013-04-183-19/+37
| | | | | | | | | | | | Bug: 405672 Change-Id: I3894e98617fcee16dc2ac9853c203c62eb30c3ab Signed-off-by: Chris Aniszczyk <zx@twitter.com>
* | Merge changes Id2848c16,I7621c434Shawn Pearce2013-04-173-56/+243
|\ \ | | | | | | | | | | | | | | | * changes: Rescale "Compressing objects" progress meter by size Split delta search buckets by byte weight
| * | Rescale "Compressing objects" progress meter by sizeShawn Pearce2013-04-173-16/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of counting objects processed, count number of bytes added into the window. This should rescale the progress meter so that 30% complete means 30% of the total uncompressed content size has been inflated and fed into the window. In theory the progress meter should be more accurate about its percentage complete/remaining fraction than with objects. When counting objects small objects move the progress meter more rapidly than large objects, but demand a smaller amount of work than large objects being compressed. Change-Id: Id2848c16a2148b5ca51e0ca1e29c5be97eefeb48
| * | Split delta search buckets by byte weightShawn Pearce2013-04-173-42/+187
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of assuming all objects cost the same amount of time to delta compress, aggregate the byte size of objects in the list and partition threads with roughly equal total bytes. Before splitting the list select the N largest paths and assign each one to its own thread. This allows threads to get through the worst cases in parallel before attempting smaller paths that are more likely to be splittable. By running the largest path buckets first on each thread the likely slowest part of compression is done early, while progress is still reporting a low percentage. This gives users a better impression of how fast the phase will run. On very complex inputs the slow part is more likely to happen first, making a user realize its time to go grab lunch, or even run it overnight. If the worst sections are earlier, memory overruns may show up earlier, giving the user a chance to correct the configuration and try again before wasting large amounts of time. It also makes it less likely the delta compression phase reaches 92% in 30 minutes and then crawls for 10 hours through the remaining 8%. Change-Id: I7621c4349b99e40098825c4966b8411079992e5f
* | | Merge "Support excluding objects during DFS compaction"Shawn Pearce2013-04-171-31/+102
|\ \ \
| * | | Support excluding objects during DFS compactionShawn Pearce2013-04-161-31/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By excluding objects the compactor can avoid storing objects that are already well packed in the base GC packs, or any other pack not being replaced by the current compaction operation. For deltas the base object is still included even if the base exists in another exclusion set. This favors keeping deltas for recent history, to support faster fetch operations for clients. Change-Id: Ie822fe075fe5072fe3171450fda2f0ca507796a1
* | | | Make recursive merge strategy the default merge strategyMatthias Sohn2013-04-156-9/+10
|/ / / | | | | | | | | | | | | | | | | | | | | | Use recursive merge as the default strategy since it can successfully merge more cases than the resolve strategy can. This is also the default in native Git. Change-Id: I38fd522edb2791f15d83e99038185edb09fed8e1 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | | Update PackBitmapIndexRemapper to handle mappings not in the new pack.Colby Ranger2013-04-151-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the code assumed all commits in the old pack would also be present in the new pack. This assumption caused an ArrayIndexOutOfBoundsException during remapping of ids. Fix the iterator to only return entries that may be remapped. Furthermore, update getBitmap() to return null if commit does not exist in the new pack. Change-Id: I065babe8cd39a7654c916bd01c7012135733dddf
* | | Fix boundary conditions in AutoCRLFOutputStreamRobin Rosenberg2013-04-142-36/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes some problems with inputs around the size of the internal buffer in AutoCRLFOutputStream (8000). Tests supplied by Robin Stocker. Bug: 405672 Change-Id: I6147897290392b3bfd4040e8006da39c302a3d49
* | | NLS warning cleanupRobin Rosenberg2013-04-142-6/+5
| | | | | | | | | | | | Change-Id: Ia76aa02dd330a1f88096c2b059b363aa38d653e9
* | | Merge "Fix a possible NPE"Robin Rosenberg2013-04-131-1/+1
|\ \ \ | |/ / |/| |
| * | Fix a possible NPERobin Rosenberg2013-04-041-1/+1
| | | | | | | | | | | | | | | | | | | | | String.valueOf is an overloaded and the compiler unfortunately picks the wrong one since null contains no type information. Change-Id: Icd197eaa046421f3cfcc5bf3e7601dc5bc7486b6
* | | Merge changes ↵Shawn Pearce2013-04-125-260/+233
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I845caede,Ie25c6d3a,I5caec313,Ib11ff99f,I9ccf20c3,Ic7826f29,I1bdd8b58,Idb84c1d7,I078841f9 * changes: Always attempt delta compression when reuseDeltas is false Avoid TemporaryBuffer.Heap on very small deltas Correct distribution of allowed delta size along chain length Split remaining delta work on path boundaries Replace DeltaWindow array with circularly linked list Micro-optimize copy instructions in DeltaEncoder Micro-optimize DeltaWindow primary loop Micro-optimize DeltaWindow maxMemory test to be != 0 Mark DeltaWindowEntry methods final
| * | | Always attempt delta compression when reuseDeltas is falseShawn Pearce2013-04-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If reuseObjects=true but reuseDeltas=false the caller wants attempt a delta for every object in the input list. Test for reuseDeltas to ensure every object passes through the searchInWindow() method. If no delta is possible for an object and it will be stored whole (non-delta format), PackWriter may still reuse its content from any source pack. This avoids an inflate()-deflate() cycle to recompress the object contents. Change-Id: I845caeded419ef4551ef1c85787dd5ffd73235d9
| * | | Avoid TemporaryBuffer.Heap on very small deltasShawn Pearce2013-04-121-25/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TemporaryBuffer is great when the output size is not known, but must be bound by a relatively large upper limit that fits in memory, e.g. 64 KiB or 20 MiB. The buffer gracefully supports growing storage by allocating 8 KiB blocks and storing them in an ArrayList. In a Git repository many deltas are less than 8 KiB. Typical tree objects are well below this threshold, and their deltas must be encoded even smaller. For these much smaller cases avoid the 8 KiB minimum allocation used by TemporaryBuffer. Instead allocate a very small OutputStream writing to an array that is sized at the limit. Change-Id: Ie25c6d3a8cf4604e0f8cd9a3b5b701a592d6ffca
| * | | Correct distribution of allowed delta size along chain lengthShawn Pearce2013-04-121-113/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nicolas Pitre discovered a very simple rule for selecting between two different delta base candidates: - if based whole object, must be <= 50% of target - if at end of a chain, must be <= 1/depth * 50% of target The rule penalizes deltas near the end of the chain, requiring them to be very small in order to be kept by the packer. This favors deltas that are based on a shorter chain, where the read-time unpack cost is much lower. Fewer bytes need to be consulted from the source pack file, and less copying is required in memory to rebuild the object. Junio Hamano explained Nico's rule to me today, and this commit fixes DeltaWindow to implement it as described. When no base has been chosen the computation is simply the statements denoted above. However once a base with depth of 9 has been chosen (e.g. when pack.depth is limited to 10), a non-delta source may create a new delta that is up to 10x larger than the already selected base. This reflects the intent of Nico's size distribution rule no matter what order objects are visited in the DeltaWindow. With this patch and my other patches applied, repacking JGit with: [pack] reuseObjects = false reuseDeltas = false depth = 50 window = 250 threads = 4 compression = 9 CGit (all) 5,711,735 bytes; real 0m13.942s user 0m47.722s [1] JGit heads 5,718,295 bytes; real 0m11.880s user 0m38.177s [2] rest 9,809 bytes The improved JGit result for the head pack is only 6.4 KiB larger than CGit's resulting pack. This patch allowed JGit to find an additional 39.7 KiB worth of space savings. JGit now also often runs 2s faster than CGit, despite also creating bitmaps and pruning objects after the head pack creation. [1] time git repack -a -d -F --window=250 --depth=50 [2] time java -Xmx128m -jar jgit debug-gc Change-Id: I5caec31359bf7248cabdd2a3254c84d4ee3cd96b
| * | | Split remaining delta work on path boundariesShawn Pearce2013-04-122-28/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an idle thread tries to steal work from a sibling's remaining toSearch queue, always try to split along a path boundary. This avoids missing delta opportunities in the current window of the thread whose work is being taken. The search order is reversed to walk further down the chain from current position, avoiding the risk of splitting the list within the path the thread is currently processing. When selecting which thread to split from use an accurate estimate of the size to be taken. This avoids selecting a thread that has only one path remaining but may contain more pending entries than another thread with several paths remaining. As there is now a race condition where the straggling thread can start the next path before the split can finish, the stealWork() loop spins until it is able to acquire a split or there is only one path remaining in the siblings. Change-Id: Ib11ff99f90a4d9efab24bf4a85342cc63203dba5
| * | | Replace DeltaWindow array with circularly linked listShawn Pearce2013-04-112-73/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Typical window sizes are 10 and 250 (although others are accepted). In either case the pointer overhead of 1 pointer in an array or 2 pointers for a double linked list is trivial. A doubly linked list as used here for window=250 is only another 1024 bytes on a 32 bit machine, or 2048 bytes on a 64 bit machine. The critical search loops scan through the array in either the previous direction or the next direction until the cycle is finished, or some other scan abort condition is reached. Loading the next object's pointer from a field in the current object avoids the branch required to test for wrapping around the edge of the array. It also saves the array bounds check on each access. When a delta is chosen the window is shuffled to hoist the currently selected base as an earlier candidate for the next object. Moving the window entry is easier in a double-linked list than sliding a group of array entries. Change-Id: I9ccf20c3362a78678aede0f0f2cda165e509adff
| * | | Micro-optimize copy instructions in DeltaEncoderShawn Pearce2013-04-111-18/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The copy instruction formatter should not to compute the shifts and masks twice. Instead compute them once and assume there is a register available to store the temporary "b" for compare with 0. Change-Id: Ic7826f29dca67b16903d8f790bdf785eb478c10d
| * | | Micro-optimize DeltaWindow primary loopShawn Pearce2013-04-111-8/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | javac and the JIT are more likely to understand a boolean being used as a branch conditional than comparing int against 0 and 1. Rewrite NEXT_RES and NEXT_SRC constants to be booleans so the code is clarified for the JIT. Change-Id: I1bdd8b587a69572975a84609c779b9ebf877b85d
| * | | Micro-optimize DeltaWindow maxMemory test to be != 0Shawn Pearce2013-04-111-8/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a compare-with-0 use a does not equal 0. javac bytecode has a special instruction for this, as it is very common in software. We can assume the JIT knows how to efficiently translate the opcode to machine code, and processors can do != 0 very quickly. Change-Id: Idb84c1d744d2874517fd4bfa1db390e2dbf64eac
| * | | Mark DeltaWindowEntry methods finalShawn Pearce2013-04-101-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This class and all of its methods are only package visible. Clarify the methods as final for the benefit of the JIT to inline trivial code. Change-Id: I078841f9900dbf299fbe6abf2599f0208ae96856
* | | | Remove DFS locality ordering during packingShawn Pearce2013-04-123-48/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PackWriter generally chooses the order for objects when it builds the object lists. This ordering already depends on history information to guide placing more recent objects first and historical objects last. Allow PackWriter to make the basic ordering decisions, instead of trying to override them. The old approach of sorting the list caused DfsReader to override any ordering change PackWriter might have tried to make when repacking a repository. This now better matches with WindowCursor's implementation, where PackWriter solely determines the object ordering. Change-Id: Ic17ab5631ec539f0758b962966c3a1823735b814
* | | | Merge "Consider working tree changes when stashing newly added files"Robin Rosenberg2013-04-112-11/+43
|\ \ \ \ | |/ / / |/| | |
| * | | Consider working tree changes when stashing newly added filesRobin Rosenberg2013-04-092-11/+43
| | | | | | | | | | | | | | | | | | | | Bug: 402396 Change-Id: I50ff707c0c9abcab3f98eea21aaa6e824f7af63a
* | | | Merge changes ↵Shawn Pearce2013-04-109-136/+236
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ideecc472,I2b12788a,I6cb9382d,I12cd3326,I200baa0b,I05626f2e,I65e45422 * changes: Increase PackOutputStream copy buffer to 64 KiB Tighten object header writing in PackOutuptStream Skip main thread test in ThreadSafeProgressMonitor Declare members of PackOutputStream final Always allocate the PackOutputStream copyBuffer Disable CRC32 computation when no PackIndex will be created Steal work from delta threads to rebalance CPU load
| * | | | Increase PackOutputStream copy buffer to 64 KiBShawn Pearce2013-04-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Colby just pointed out to me the buffer was 16 KiB. This may be very small for common objects. Increase to 64 KiB. Change-Id: Ideecc4720655a57673252f7adb8eebdf2fda230d
| * | | | Tighten object header writing in PackOutuptStreamShawn Pearce2013-04-101-31/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most objects are written as OFS_DELTA with the base in the pack, that is why this case comes first in writeHeader(). Rewrite the condition to always examine this first and cache the PackWriter's formatting flag for use of OFS_DELTA headers, in modern Git networks this is true more often then it it is false. Assume the cost of write() is high, especially due to entering the MessageDigest to update the pack footer SHA-1 computation. Combine the OFS_DELTA information as part of the header buffer so that the entire burst is a single write call, rather than two relatively small ones. Most OFS_DELTA headers are <= 6 bytes, so this rewrite tranforms 2 writes of 3 bytes each into 1 write of ~6 bytes. Try to simplify the objectHeader code to reduce branches and use more local registers. This shouldn't really be necessary if the compiler is well optimized, but it isn't very hard to clarify data usage to either javac or the JIT, which may make it easier for the JIT to produce better machine code for this method. Change-Id: I2b12788ad6866076fabbf7fa11f8cce44e963f35
| * | | | Skip main thread test in ThreadSafeProgressMonitorShawn Pearce2013-04-102-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | update(int) is only invoked from a worker thread, in JGit's case this is DeltaTask. The Javadoc of TSPM suggests update should only ever be used by a worker thread. Skip the main thread check, saving some cycles on each run of the progress monitor. Change-Id: I6cb9382d71b4cb3f8e8981c7ac382da25304dfcb
| * | | | Declare members of PackOutputStream finalShawn Pearce2013-04-101-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These methods cannot be sanely overridden anywhere. Most methods are package visible only, or are private. A few public methods do exist but there is no useful way to override them since creation of PackOutputStream is managed by PackWriter and cannot be delegated. Change-Id: I12cd3326b78d497c1f9751014d04d1460b46e0b0
| * | | | Always allocate the PackOutputStream copyBufferShawn Pearce2013-04-101-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The getCopyBuffer() is almost always used during output. All known implementations of ObjectReuseAsIs rely on the buffer to be present, and the only sane way to get good performance from PackWriter is to reuse objects during packing. Avoid a branch and test when obtaining this buffer by making sure it is always populated. Change-Id: I200baa0bde5dcdd11bab7787291ad64535c9f7fb
| * | | | Disable CRC32 computation when no PackIndex will be createdShawn Pearce2013-04-105-23/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a server is streaming 3GiB worth of pack data to a client there is no reason to compute the CRC32 checksum on the objects. The CRC32 code computed by PackWriter is used only in the new index created by writeIndex(), which is never invoked for the native Git network protocols. Object reuse may still compute its own CRC32 to verify the data being copied from an existing pack has not been corrupted. This check is done by the ObjectReader that implements ObjectReuseAsIs and has no relationship to the CRC32 being skipped during output. Change-Id: I05626f2e0d6ce19119b57d8a27193922636d60a7
| * | | | Steal work from delta threads to rebalance CPU loadShawn Pearce2013-04-103-66/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the configuration wants to run 4 threads the delta search work is initially split somewhat evenly across the 4 threads. During execution some threads will finish early due to the work not being split fairly, as the initial partitions were based on object count and not cost to inflate or size of DeltaIndex. When a thread finishes early it now tries to take 50% of the work remaining on a sibling thread, and executes that before exiting. This repeats as each thread completes until a thread has only 1 object remaining. Repacking Blink, Chromium's new fork of WebKit (2.2M objects 3.9G): [pack] reuseDeltas = false reuseObjects = false depth = 50 threads = 8 window = 250 windowMemory = 800m before: ~105% CPU after 80% after: >780% CPU to 100% Change-Id: I65e45422edd96778aba4b6e5a0fd489ea48e8ca3
* | | | | Merge "LogCommand.all(): filter out refs that do not refer to commit objects"Robin Rosenberg2013-04-102-2/+21
|\ \ \ \ \
| * | | | | LogCommand.all(): filter out refs that do not refer to commit objectsArthur Baars2013-03-312-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1. I have authored 100% of the content I'm contributing, 2. I have the rights to donate the content to Eclipse, 3. I contribute the content under the EDL Change-Id: I48b1828e0b1304f76276ec07ebac7ee9f521b194
* | | | | | Merge "LogCommand.all(), peel references before using them"Robin Rosenberg2013-04-102-0/+27
|\| | | | | | |_|_|_|/ |/| | | |
| * | | | LogCommand.all(), peel references before using themArthur Baars2013-03-312-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: LogCommand.all() throws an IncorrectObjectTypeException when there are tag references, and the repository does not contain the file "packed-refs". It seems that the references were not properly peeled before being added to the markStart() method. Solution: Call getRepository().peel() on every Ref that has isPeeled()==false in LogCommand.all() . Added test case for LogCommand.all() on repo with a tag. 1. I have authored 100% of the content I'm contributing, 2. I have the rights to donate the content to Eclipse, 3. I contribute the content under the EDL Bug: 402025 Change-Id: Idb8881eeb6ccce8530f2837b25296e8e83636eb7
* | | | | Merge "clean up merge squash and no-commit messages in pgm"Christian Halstrick2013-04-092-1/+2
|\ \ \ \ \
| * | | | | clean up merge squash and no-commit messages in pgmTomasz Zarna2013-04-082-1/+2
| | |_|/ / | |/| | | | | | | | | | | | | Change-Id: Iffa6e8752fbd94f3ef69f49df772be82e3da5d05
* | | | | Add a constant for info/excludeRobin Rosenberg2013-04-082-2/+5
| | | | | | | | | | | | | | | | | | | | Change-Id: Ifd537ce4e726cb9460ea332f683428689bd3d7f4
* | | | | Merge changes I8445070d,I38f10d62,I2af0bf68Matthias Sohn2013-04-0812-10/+13
|\ \ \ \ \ | |/ / / / |/| | | | | | | | | | | | | | | | | | | | | | | | * changes: Fix plugin provider names to conform with release train requirement Add missing @since tags for new API methods DfsReaderOptions are options for a DFS stored repository
| * | | | Fix plugin provider names to conform with release train requirementMatthias Sohn2013-04-089-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to release train requirements [1] the provider name for all artifacts of Eclipse projects is "Eclipse <project name>". [1] http://wiki.eclipse.org/Development_Resources/HOWTO/Release_Reviews#Checklist Change-Id: I8445070d1d96896d378bfc49ed062a5e7e0f201f Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * | | | Add missing @since tags for new API methodsMatthias Sohn2013-04-072-0/+3
| | | | | | | | | | | | | | | | | | | | Change-Id: I38f10d622c30f19d1154a4901477e844cb411707 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * | | | DfsReaderOptions are options for a DFS stored repositoryMatthias Sohn2013-04-061-1/+1
| | |/ / | |/| | | | | | | | | | Change-Id: I2af0bf686188f1402fb53bf6dbe0ecb228069ace Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* / | | Detect and handle a checkout conflict during merge nicelyRobin Rosenberg2013-04-083-1/+17
|/ / / | | | | | | | | | | | | | | | Report the conflicting files nicely and inform the user. Change-Id: I75d464d4156d10c6cc6c7ce5a321e2c9fb0df375
* | | Support cutting existing delta chains longer than the max depthShawn Pearce2013-04-053-7/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some packs built by JGit have incredibly long delta chains due to a long standing bug in PackWriter. Google has packs created by JGit's DfsGarbageCollector with chains of 6000 objects long, or more. Inflating objects at the end of this 6000 long chain is impossible to complete within a reasonable time bound. It could take a beefy system hours to perform even using the heavily optimized native C implementation of Git, let alone with JGit. Enable pack.cutDeltaChains to be set in a configuration file to permit the PackWriter to determine the length of each delta chain and clip the chain at arbitrary points to fit within pack.depth. Delta chain cycles are still possible, but no attempt is made to detect them. A trivial chain of A->B->A will iterate for the full pack.depth configured limit (e.g. 50) and then pick an object to store as non-delta. When cutting chains the object list is walked in reverse to try and take advantage of existing chain computations. The assumption here is most deltas are near the end of the list, and their bases are near the front of the list. Going up from the tail attempts to reuse chainLength computations by relying on the memoized value in the delta base. The chainLength field in ObjectToPack is overloaded into the depth field normally used by DeltaWindow. This is acceptable because the chain cut happens before delta search, and the chainLength is reset to 0 if delta search will follow. Change-Id: Ida4fde9558f3abbbb77ade398d2af3941de9c812
* | | Micro-optimize reuseDeltaFor in PackWriterShawn Pearce2013-04-051-10/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This switch is called mostly for OBJ_TREE and OBJ_BLOB types, which typically make up 66% of the objects in a repository. Simplify the test for these common types by testing for the one bit they have in common and returning early. Object type 5 is currently undefined. In the old code it would hit the default and return true. In the new code it will match the early case and also return true. In either implementation 5 should never show up as it is not a valid type known to Git. Object type 6 OFS_DELTA is not permitted to be supplied here. Object type 7 REF_DELTA is not permitted to be supplied here. Change-Id: I0ede8acee928bb3e73c744450863942064864e9c
* | | Static import OBJ_* constants into PackWriterShawn Pearce2013-04-051-55/+57
| | | | | | | | | | | | | | | | | | Shortens most of the code that touches the objectLists. Change-Id: Ib14d366dd311e544e7ba50e9ce07a6f3ce0cf254