mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Author	SHA1	Message	Date
Carsten Hammer	c0268f899e	Join catch sections using multicatch Change-Id: I1a9112e6a4f938638c599b489cb0858eca27ab91 Signed-off-by: Carsten Hammer <carsten.hammer@t-online.de> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	5 years ago
Han-Wen Nienhuys	6d370d837c	Remove 'final' in parameter lists Change-Id: Id924f79c8b2c720297ebc49bf9c5d4ddd6d52547 Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>	6 years ago
Shawn Pearce	61d4922928	Fix missing deltas near type boundaries Delta search was discarding discovered deltas if an object appeared near a type boundary in the delta search window. This has caused JGit to produce larger pack files than other implementations of the packing algorithm. Delta search works by pushing prior objects into a search window, an ordered list of objects to attempt to delta compress the next object against. (The window size is bounded, avoiding O(N^2) behavior.) For implementation reasons multiple object types can appear in the input list, and the window. PackWriter commonly passes both trees and blobs in the input list handed to the DeltaWindow algorithm. The pack file format requires an object to only delta compress against the same type, so the DeltaWindow algorithm must stop doing comparisons if a blob would be compared to a tree. Because the input list is sorted by object type and the window is recently considered prior objects, once a wrong type is discovered in the window the search algorithm stops and uses the current result. Unfortunately the termination condition was discarding any found delta by setting deltaBase and deltaBuf to null when it was trying to break the window search. When this bug occurs, the state of the DeltaWindow looks like this:

                                current
                                   |
                                  \ /
  input list:  tree0 tree1 blob1 blob2

  window:      blob1 tree1 tree0
                / \
                 |
              res.prev

As the loop iterates to the right across the window, it first finds that blob1 is a suitable delta base for blob2, and temporarily holds this in the bestDelta/deltaBuf fields. It then considers tree1, but tree1 has the wrong type (blob != tree), so the window loop must give up and fall through the remaining code. Moving the condition up and discarding the window contents allows the bestDelta/deltaBuf to be kept, letting the final file delta compress blob2 against blob1. The impact of this bug (and its fix) on real world repositories is likely minimal.
The boundary from blob to tree happens approximately once in the search, as the input list is sorted by type. Only the first window size worth of blobs (e.g. 10 or 250) were failing to produce a delta in the final file. This bug fix does produce significantly different results for small test repositories created in the unit test suite, such as when a pack may contain 6 objects (2 commits, 2 trees, 2 blobs). Packing test cases can now better sample different output pack file sizes depending on delta compression and object reuse flags in PackConfig. Change-Id: Ibec09398d0305d4dbc0c66fce1daaf38eb71148f	7 years ago
Shawn Pearce	5d8a9f6f3f	Rescale "Compressing objects" progress meter by size Instead of counting objects processed, count number of bytes added into the window. This should rescale the progress meter so that 30% complete means 30% of the total uncompressed content size has been inflated and fed into the window. In theory the progress meter should be more accurate about its percentage complete/remaining fraction than with objects. When counting objects small objects move the progress meter more rapidly than large objects, but demand a smaller amount of work than large objects being compressed. Change-Id: Id2848c16a2148b5ca51e0ca1e29c5be97eefeb48	11 years ago
Shawn Pearce	21e4aa2b9e	Split delta search buckets by byte weight Instead of assuming all objects cost the same amount of time to delta compress, aggregate the byte size of objects in the list and partition threads with roughly equal total bytes. Before splitting the list select the N largest paths and assign each one to its own thread. This allows threads to get through the worst cases in parallel before attempting smaller paths that are more likely to be splittable. By running the largest path buckets first on each thread the likely slowest part of compression is done early, while progress is still reporting a low percentage. This gives users a better impression of how fast the phase will run. On very complex inputs the slow part is more likely to happen first, making a user realize it's time to go grab lunch, or even run it overnight. If the worst sections are earlier, memory overruns may show up earlier, giving the user a chance to correct the configuration and try again before wasting large amounts of time. It also makes it less likely the delta compression phase reaches 92% in 30 minutes and then crawls for 10 hours through the remaining 8%. Change-Id: I7621c4349b99e40098825c4966b8411079992e5f	11 years ago
Shawn Pearce	a5c6aac76c	Avoid TemporaryBuffer.Heap on very small deltas TemporaryBuffer is great when the output size is not known, but must be bound by a relatively large upper limit that fits in memory, e.g. 64 KiB or 20 MiB. The buffer gracefully supports growing storage by allocating 8 KiB blocks and storing them in an ArrayList. In a Git repository many deltas are less than 8 KiB. Typical tree objects are well below this threshold, and their deltas must be encoded even smaller. For these much smaller cases avoid the 8 KiB minimum allocation used by TemporaryBuffer. Instead allocate a very small OutputStream writing to an array that is sized at the limit. Change-Id: Ie25c6d3a8cf4604e0f8cd9a3b5b701a592d6ffca	11 years ago
Shawn Pearce	8a7c2f97d0	Correct distribution of allowed delta size along chain length Nicolas Pitre discovered a very simple rule for selecting between two different delta base candidates:
- if based whole object, must be <= 50% of target
- if at end of a chain, must be <= 1/depth * 50% of target
The rule penalizes deltas near the end of the chain, requiring them to be very small in order to be kept by the packer. This favors deltas that are based on a shorter chain, where the read-time unpack cost is much lower. Fewer bytes need to be consulted from the source pack file, and less copying is required in memory to rebuild the object. Junio Hamano explained Nico's rule to me today, and this commit fixes DeltaWindow to implement it as described. When no base has been chosen the computation is simply the statements denoted above. However once a base with depth of 9 has been chosen (e.g. when pack.depth is limited to 10), a non-delta source may create a new delta that is up to 10x larger than the already selected base. This reflects the intent of Nico's size distribution rule no matter what order objects are visited in the DeltaWindow. With this patch and my other patches applied, repacking JGit with:
[pack]
reuseObjects = false
reuseDeltas = false
depth = 50
window = 250
threads = 4
compression = 9
CGit (all)  5,711,735 bytes; real 0m13.942s user 0m47.722s [1]
JGit heads  5,718,295 bytes; real 0m11.880s user 0m38.177s [2]
     rest       9,809 bytes
The improved JGit result for the head pack is only 6.4 KiB larger than CGit's resulting pack. This patch allowed JGit to find an additional 39.7 KiB worth of space savings. JGit now also often runs 2s faster than CGit, despite also creating bitmaps and pruning objects after the head pack creation.
[1] time git repack -a -d -F --window=250 --depth=50
[2] time java -Xmx128m -jar jgit debug-gc
Change-Id: I5caec31359bf7248cabdd2a3254c84d4ee3cd96b	11 years ago
Shawn Pearce	3b7924f403	Split remaining delta work on path boundaries When an idle thread tries to steal work from a sibling's remaining toSearch queue, always try to split along a path boundary. This avoids missing delta opportunities in the current window of the thread whose work is being taken. The search order is reversed to walk further down the chain from current position, avoiding the risk of splitting the list within the path the thread is currently processing. When selecting which thread to split from use an accurate estimate of the size to be taken. This avoids selecting a thread that has only one path remaining but may contain more pending entries than another thread with several paths remaining. As there is now a race condition where the straggling thread can start the next path before the split can finish, the stealWork() loop spins until it is able to acquire a split or there is only one path remaining in the siblings. Change-Id: Ib11ff99f90a4d9efab24bf4a85342cc63203dba5	11 years ago
Shawn Pearce	af33a911d0	Replace DeltaWindow array with circularly linked list Typical window sizes are 10 and 250 (although others are accepted). In either case the pointer overhead of 1 pointer in an array or 2 pointers for a double linked list is trivial. A doubly linked list as used here for window=250 is only another 1024 bytes on a 32 bit machine, or 2048 bytes on a 64 bit machine. The critical search loops scan through the array in either the previous direction or the next direction until the cycle is finished, or some other scan abort condition is reached. Loading the next object's pointer from a field in the current object avoids the branch required to test for wrapping around the edge of the array. It also saves the array bounds check on each access. When a delta is chosen the window is shuffled to hoist the currently selected base as an earlier candidate for the next object. Moving the window entry is easier in a double-linked list than sliding a group of array entries. Change-Id: I9ccf20c3362a78678aede0f0f2cda165e509adff	11 years ago
Shawn Pearce	1db50c9d91	Micro-optimize DeltaWindow primary loop javac and the JIT are more likely to understand a boolean being used as a branch conditional than comparing int against 0 and 1. Rewrite NEXT_RES and NEXT_SRC constants to be booleans so the code is clarified for the JIT. Change-Id: I1bdd8b587a69572975a84609c779b9ebf877b85d	11 years ago
Shawn Pearce	6903fa4a34	Micro-optimize DeltaWindow maxMemory test to be != 0 Instead of using a compare-with-0 use a does not equal 0. javac bytecode has a special instruction for this, as it is very common in software. We can assume the JIT knows how to efficiently translate the opcode to machine code, and processors can do != 0 very quickly. Change-Id: Idb84c1d744d2874517fd4bfa1db390e2dbf64eac	11 years ago
Shawn Pearce	d0a5337625	Steal work from delta threads to rebalance CPU load If the configuration wants to run 4 threads the delta search work is initially split somewhat evenly across the 4 threads. During execution some threads will finish early due to the work not being split fairly, as the initial partitions were based on object count and not cost to inflate or size of DeltaIndex. When a thread finishes early it now tries to take 50% of the work remaining on a sibling thread, and executes that before exiting. This repeats as each thread completes until a thread has only 1 object remaining. Repacking Blink, Chromium's new fork of WebKit (2.2M objects 3.9G):
[pack]
reuseDeltas = false
reuseObjects = false
depth = 50
threads = 8
window = 250
windowMemory = 800m
before: ~105% CPU after 80%
after: >780% CPU to 100%
Change-Id: I65e45422edd96778aba4b6e5a0fd489ea48e8ca3	11 years ago
Shawn Pearce	f32b861243	JGit 3.0: move internal classes into an internal subpackage This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change. Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9	11 years ago
Colby Ranger	154e3c886b	Do not enforce DeltaWindow maxMemory when zero. The maxMemory for a DeltaWindow can be optionally disabled when it is less than or equal to zero. Respect this configuration when enforcing the limits on object load. Change-Id: Ic0f4ffcabf82105f8e690bd0eb5e6be485a313b3	11 years ago
Colby Ranger	51beee5568	Enforce max memory for DeltaWindow. Previously, memory limits were enforced at the start of each iteration of the delta search, based on objects that were currently loaded in memory. However, new objects added to the window may be expanded in a future iteration of the search and thus were not accounted for correctly at the start of the search. To fix this, memory limits are now enforced before each object is loaded. Change-Id: I898ab43e7bf5ee7189831f3a68bb9385ae694b8f	11 years ago
Colby Ranger	b9e485661d	Fix DeltaWindow.clear() to release loaded buffer bytes. It is possible for the buffer to be set but not the index. It occurs when an exception occurs during creating an index, but after the buffer is loaded. Furthermore, the cleared DeltaWindowEntry should have been ent and not res. Change-Id: I2e0d79540316635bf7aa43efd225e4eb38230844	11 years ago
Colby Ranger	b77ba04976	Do not delta compress objects that have already tried to compress. If an object is in a pack file already, delta compression will not attempt to re-compress it. This assumes that the previous packing already performed the optimal compression attempt, however, the subclasses of StoredObjectRepresentation may use other heuristics to determine if the stored format is optimal. Change-Id: I403de522f4b0dd2667d54f6faed621f392c07786	12 years ago
Shawn O. Pearce	37a10e3006	PackWriter: Don't include edges in progress meter When compressing objects, don't include the edges in the progress meter. These cost almost no CPU time as they are simply pushed into and popped out of the delta search window. Change-Id: I7ea19f0263e463c65da34a7e92718c6db1d4a131 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Shawn O. Pearce	e6bd689d2c	Improve LargeObjectException reporting Use 3 different types of LargeObjectException for the 3 major ways that we can fail to load an object. For each of these use a unique string translation which describes the root cause better than just the ObjectId.name() does. Change-Id: I810c98d5691b74af9fc6cbd46fc9879e35a7bdca Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	f048af3fd1	Implement async/batch lookup of object data An ObjectReader implementation may be very slow for a single object, but yet support bulk queries efficiently by batching multiple small requests into a single larger request. This easily happens when the reader is built on top of a database that is stored on another host, as the network round-trip time starts to dominate the operation cost. RevWalk, ObjectWalk, UploadPack and PackWriter are the first major users of this new bulk interface, with the goal being to support an efficient way to pack a repository for a fetch/clone client when the source repository is stored in a high-latency storage system. Processing the want/have lists is now done in bulk, to remove the high costs associated with common ancestor negotiation. PackWriter already performs object reuse selection in bulk, but it now can also do the object size lookup and object counting phases with higher efficiency. Actual object reuse, deltification, and final output are still doing sequential lookups, making them a bit more expensive to perform. Change-Id: I4c966f84917482598012074c370b9831451404ee Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	1a06179ea7	Move PackWriter configuration to PackConfig This refactoring permits applications to configure global per-process settings for all packing and easily pass it through to per-request PackWriters, ensuring that the process configuration overrides the repository specific settings. For example this might help in a daemon environment where the server wants to cap the resources used to serve a dynamic upload pack request, even though the repository's own pack.* settings might be configured to be more aggressive. This allows fast but less bandwidth efficient serving of clients, while still retaining good compression through a cron managed `git gc`. Change-Id: I58cc5e01b48924b1a99f79aa96c8150cdfc50846 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	12fe0f2d1e	Discard the uncompressed delta as soon as it's compressed The DeltaCache will most likely need to copy the compressed delta into a new buffer in order to compact away the wasted space at the end caused by over allocation. Since we don't need the uncompressed format anymore, null out our only reference to it so the GC can reclaim this memory if it needs to perform a collection in order to satisfy the cache's allocation attempt. Change-Id: I50403cfd2e3001b093f93a503cccf7adab43cc9d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	9734194917	Honor pack.windowlimit to cap memory usage during packing The pack.windowlimit configuration parameter places an upper bound on the number of bytes used by the DeltaWindow class as it scans through the object list. If memory usage would exceed the limit the window is temporarily decreased in size to keep memory used within that bound. Change-Id: I09521b8f335475d8aee6125826da8ba2e545060d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	a960d1429e	Cache small deltas during packing PackWriter now caches small deltas, or deltas that are very tiny compared to their source inputs, so that the writing phase goes faster by reusing those cached deltas. The cached data is stored compressed, which usually translates to a bigger footprint due to deltas being very hard to compress, but saves time during writing by avoiding the deflate step. They are held under SoftReferences so that the JVM GC can clear out deltas if memory gets very tight. We would rather continue working and spend a bit more CPU time during writing than crash due to OOME. To avoid OutOfMemoryErrors during the caching phase we also trap OOME and just abort out of the caching. Because deflateBound() always produces something larger than what we need to actually store the deflated data, we copy it over into a new buffer if the actual length doesn't match the buffer length. When packing jgit.git this saves over 111 KiB in the cache, and is thus a worthwhile hit on CPU time. To further save memory we store the inflated size of the delta (which we need for the object header) in the same field as the pathHash, as the pathHash is no longer necessary by this phase of the packing algorithm. Change-Id: I0da0c600d845e8ec962289751f24e65b5afa56d7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	dfad23bf3d	Implement delta generation during packing PackWriter now produces new deltas if there is not a suitable delta available for reuse from an existing pack file. This permits JGit to send less data on the wire by sending a delta relative to an object the other side already has, instead of sending the whole object. The delta searching algorithm is similar in style to what C Git uses, but apparently has some differences (see below for more). Briefly, objects that should be considered for delta compression are pushed onto a list. This list is then sorted by a rough similarity score, which is derived from the path name the object was discovered at in the repository during object counting. The list is then walked in order. At each position in the list, up to $WINDOW objects prior to it are attempted as delta bases. Each object in the window is tried, and the shortest delta instruction sequence selects the base object. Some rough rules are used to prevent pathological behavior during this matching phase, like skipping pairings of objects that are not similar enough in size. PackWriter intentionally excludes commits and annotated tags from this new delta search phase. In the JGit repository only 28 out of 2600+ commits can be delta compressed by C Git. As the commit count tends to be a fair percentage of the total number of objects in the repository, and they generally do not delta compress well, skipping over them can improve performance with little increase in the output pack size. Because this implementation was rebuilt from scratch based on my own memory of how the packing algorithm has evolved over the years in C Git, PackWriter, DeltaWindow, and DeltaEncoder don't use exactly the same rules everywhere, and that leads JGit to produce different (but logically equivalent) pack files. 
Repository | Pack Size (bytes)                | Packing Time
           | JGit     - CGit     = Difference | JGit    / CGit
-----------+----------------------------------+------------------
git        | 25094348 - 24322890 = +771458    | 59.434s / 59.133s
jgit       |  5669515 -  5709046 = -39531     |  6.654s /  6.806s
linux-2.6  |     389M -     386M = +3M        |  20m02s /  18m01s
For the above tests pack.threads was set to 1, window size=10, delta depth=50, and delta and object reuse was disabled for both implementations. Both implementations were reading from an already fully packed repository on local disk. The running time reported is after 1 warm-up run of the tested implementation. PackWriter is writing 771 KiB more data on git.git, 3M more on linux-2.6, but is actually 39.5 KiB smaller on jgit.git. Being larger by less than 0.7% on linux-2.6 isn't bad, nor is taking an extra 2 minutes to pack. On the running time side, JGit is at a major disadvantage because linux-2.6 doesn't fit into the default WindowCache of 20M, while C Git is able to mmap the entire pack and have it available instantly in physical memory (assuming hot cache). CGit also has a feature where it caches deltas that were created during the compression phase, and uses those cached deltas during the writing phase. PackWriter does not implement this (yet), and therefore must create every delta twice. This could easily account for the increased running time we are seeing. Change-Id: I6292edc66c2e95fbe45b519b65fdb3918068889c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
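
The window search described in the dfad23bf3d message can be illustrated with a small sketch. This is not JGit's DeltaWindow; every name here (WindowSearchSketch, Obj, toyDeltaSize, chooseBases) is hypothetical, the "delta size" is a toy stand-in that only counts bytes after the common prefix, and the size filter (skip bases less than half the target's size) is an assumed example of the "rough rules" the message mentions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WindowSearchSketch {
    /** Hypothetical object record: a name, a path-derived hash, and raw content. */
    static final class Obj {
        final String name;
        final int pathHash;
        final byte[] data;

        Obj(String name, int pathHash, String content) {
            this.name = name;
            this.pathHash = pathHash;
            this.data = content.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Toy cost model (assumption): bytes of target not covered by the common prefix. */
    static int toyDeltaSize(byte[] base, byte[] target) {
        int n = Math.min(base.length, target.length);
        int common = 0;
        while (common < n && base[common] == target[common]) {
            common++;
        }
        return target.length - common;
    }

    /** Maps each object name to the name of its chosen base, or null if stored whole. */
    static Map<String, String> chooseBases(List<Obj> input, int window) {
        // Sort by the path-derived score so similar paths sit next to each other.
        List<Obj> list = new ArrayList<>(input);
        list.sort(Comparator.comparingInt((Obj o) -> o.pathHash));

        Map<String, String> chosen = new LinkedHashMap<>();
        for (int i = 0; i < list.size(); i++) {
            Obj target = list.get(i);
            Obj best = null;
            int bestSize = target.data.length; // a delta must beat storing the whole object

            // Try up to `window` prior objects as candidate bases.
            for (int j = i - 1; j >= 0 && j >= i - window; j--) {
                Obj base = list.get(j);
                // Assumed size filter: skip bases much smaller than the target.
                if (base.data.length * 2 < target.data.length) {
                    continue;
                }
                int size = toyDeltaSize(base.data, target.data);
                if (size < bestSize) { // shortest delta selects the base
                    bestSize = size;
                    best = base;
                }
            }
            chosen.put(target.name, best == null ? null : best.name);
        }
        return chosen;
    }
}
```

Because the inner loop is bounded by `window`, the total number of comparisons grows linearly with the list length rather than quadratically, which is the point of the bounded window.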

13 Commits (c0268f899e3e600a45f6c028f4d0088b2fb2fce1)
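
The size rule quoted in commit 8a7c2f97d0 gives two endpoints: a delta against a whole object may be at most 50% of the target, and a delta at the end of a chain at most 1/depth of that. A minimal sketch, assuming the limit shrinks linearly with the candidate base's chain depth between those endpoints (the message states only the endpoints), could look like this. The class and method names are hypothetical, and the handling of an already chosen base described in the message is left out.

```java
final class DeltaSizeLimitSketch {
    /**
     * Largest delta size (in bytes) accepted for a target of targetSize bytes
     * when the candidate base already sits srcDepth levels deep in a chain.
     * Linear scaling between the two stated endpoints is an assumption.
     */
    static long limit(long targetSize, int srcDepth, int maxDepth) {
        if (maxDepth <= 0 || srcDepth < 0 || srcDepth >= maxDepth) {
            return 0; // the base cannot take another delta on top of it
        }
        long half = targetSize / 2;             // whole-object base: <= 50% of target
        int remainingDepth = maxDepth - srcDepth;
        return half * remainingDepth / maxDepth; // last slot: 1/maxDepth * 50%
    }
}
```

With a 1000 byte target and a maximum depth of 10, a whole-object base (depth 0) allows a delta of up to 500 bytes, while a base at depth 9 allows only 50 bytes.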