mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Author	SHA1	Message	Date
Robin Rosenberg	32ff57a2b2	Cleanup javadocs so they pass the java8 doclint checks Bug: 431552 Change-Id: I469316f5645205016e1fa6b0fbd2ff3b509b14bc Signed-off-by: Robin Stocker <robin@nibor.org>	10 years ago
Matthias Sohn	2f93551e18	Add missing @since tags for new API methods Change-Id: I38f10d622c30f19d1154a4901477e844cb411707 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	11 years ago
Shawn Pearce	5d446f410d	Support cutting existing delta chains longer than the max depth Some packs built by JGit have incredibly long delta chains due to a long standing bug in PackWriter. Google has packs created by JGit's DfsGarbageCollector with chains of 6000 objects long, or more. Inflating objects at the end of this 6000 long chain is impossible to complete within a reasonable time bound. It could take a beefy system hours to perform even using the heavily optimized native C implementation of Git, let alone with JGit. Enable pack.cutDeltaChains to be set in a configuration file to permit the PackWriter to determine the length of each delta chain and clip the chain at arbitrary points to fit within pack.depth. Delta chain cycles are still possible, but no attempt is made to detect them. A trivial chain of A->B->A will iterate for the full pack.depth configured limit (e.g. 50) and then pick an object to store as non-delta. When cutting chains the object list is walked in reverse to try and take advantage of existing chain computations. The assumption here is most deltas are near the end of the list, and their bases are near the front of the list. Going up from the tail attempts to reuse chainLength computations by relying on the memoized value in the delta base. The chainLength field in ObjectToPack is overloaded into the depth field normally used by DeltaWindow. This is acceptable because the chain cut happens before delta search, and the chainLength is reset to 0 if delta search will follow. Change-Id: Ida4fde9558f3abbbb77ade398d2af3941de9c812	11 years ago
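The tail-first walk and memoized chain length described in 5d446f410d can be sketched roughly as follows. This is an illustrative sketch, not JGit's actual PackWriter code: the `Node` class, `cutChains`, `depth`, and `linearChain` are hypothetical names, and the sketch assumes `chainLength` counts the deltas stacked above an object.

```java
import java.util.ArrayList;
import java.util.List;

final class ChainCutSketch {
    /** Hypothetical stand-in for an object queued for packing. */
    static final class Node {
        Node base;        // delta base, or null when stored whole
        int chainLength;  // memoized count of deltas stacked above this node
    }

    /** Walk the list tail-first and clip any chain deeper than maxDepth. */
    static void cutChains(List<Node> list, int maxDepth) {
        for (int i = list.size() - 1; i >= 0; i--) {
            int d = 0;
            Node b = list.get(i).base;
            while (b != null) {
                if (d < b.chainLength)
                    break; // a longer chain through b was already handled
                b.chainLength = ++d;
                if (d >= maxDepth && b.base != null) {
                    b.base = null; // store b as a non-delta object
                    break;
                }
                b = b.base;
            }
        }
    }

    /** Number of bases that must be inflated to reconstruct n. */
    static int depth(Node n) {
        int d = 0;
        for (Node b = n.base; b != null; b = b.base)
            d++;
        return d;
    }

    /** Build a linear chain where each node deltas against the previous one. */
    static List<Node> linearChain(int count) {
        List<Node> list = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Node n = new Node();
            if (i > 0)
                n.base = list.get(i - 1);
            list.add(n);
        }
        return list;
    }

    public static void main(String[] args) {
        List<Node> list = linearChain(10);
        cutChains(list, 3);
        for (Node n : list)
            System.out.println(depth(n));
    }
}
```

As in the commit message, a trivial A->B->A cycle is not detected up front in this sketch; the walk simply runs until the depth limit is reached and then stores one of the two objects as a non-delta.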
Tomasz Zarna	48f30b8614	Fix @since tags in JGit, version 2.4 never existed Change-Id: Iaca88ec28b412e6b58e7b39a0762ba54b25f9471 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	11 years ago
Edwin Kempin	e02708a8b3	Fix formatting of PackConfig.toString() & GC.RepoStatistics.toString() Change-Id: I7e0c74ecfd0e0615d10fb582b2897d33be23440a Signed-off-by: Edwin Kempin <edwin.kempin@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	11 years ago
Shawn Pearce	f32b861243	JGit 3.0: move internal classes into an internal subpackage This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change. Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9	11 years ago
Edwin Kempin	eac218b7b4	Add toString() for PackConfig This is helpful for writing the pack configuration into a log file. Change-Id: I5e7f5ff7e01c9538ca12a1860844ba9b467bdf05 Signed-off-by: Edwin Kempin <edwin.kempin@sap.com>	11 years ago
Shawn Pearce	3760e4319b	Remove cached_packs support in favor of bitmaps The bitmap code in PackWriter knows exactly when to use a pack as a "cached pack". It enables cached pack usage only when the pack has a bitmap and its entire closure of objects needs to be sent. This is a much simpler code path to maintain, and JGit actually has a way to write the necessary index. Change-Id: I2645d482f8733fdf0c4120cc59ba9aa4d4ba6881	11 years ago
Colby Ranger	e6883dfe4b	Enable writing bitmaps during GC by default. Bitmaps provide a huge performance boost for counting objects and they play nice with the cgit implementation. Change-Id: I33b05a6c8f1ee2df7770f0b9fdc50d0b4bbf1029	11 years ago
Colby Ranger	f82821728b	Enable writing pack indexes with bitmaps in the GC. Update the dfs and file GC implementations to prepare and write bitmaps on the packs that contain the full closure of the object graph. Update the DfsPackDescription to include the index version. Change-Id: I3f1421e9cd90fe93e7e2ef2b8179ae2f1ba819ed	11 years ago
Colby Ranger	dafcb8f6db	Support creating pack bitmap indexes in PackWriter. Update the PackWriter to support writing out pack bitmap indexes, a parallel ".bitmap" file to the ".pack" file. Bitmaps are selected at commits every 1 to 5,000 commits for each unique path from the start. The most recent 100 commits are all bitmapped. The next 19,000 commits have a bitmap every 100 commits. The remaining commits have a bitmap every 5,000 commits. Commits with more than 1 parent are preferred over ones with 1 or less. Furthermore, previously computed bitmaps are reused, if the previous entry had the reuse flag set, which is set when the bitmap was placed at the max allowed distance. Bitmaps are used to speed up the counting phase when packing, for requests that are not shallow. The PackWriterBitmapWalker uses a RevFilter to proactively mark commits with RevFlag.SEEN, when they appear in a bitmap. The walker produces the full closure of reachable ObjectIds, given the collection of starting ObjectIds. For fetch request, two ObjectWalks are executed to compute the ObjectIds reachable from the haves and from the wants. The ObjectIds needed to be written are determined by taking all the resulting wants AND NOT the haves. For clone requests, we get cached pack support for "free" since it is possible to determine if all of the ObjectIds in a pack file are included in the resulting list of ObjectIds to write. 
On my machine, the best times for clones and fetches of the linux kernel repository (with about 2.6M objects and 300K commits) are tabulated below:

Operation                     Index V2                Index VE003
Clone                         37530ms (524.06 MiB)    82ms (524.06 MiB)
Fetch (1 commit back)         75ms                    107ms
Fetch (10 commits back)       456ms (269.51 KiB)      341ms (265.19 KiB)
Fetch (100 commits back)      449ms (269.91 KiB)      337ms (267.28 KiB)
Fetch (1000 commits back)     2229ms (14.75 MiB)      189ms (14.42 MiB)
Fetch (10000 commits back)    2177ms (16.30 MiB)      254ms (15.88 MiB)
Fetch (100000 commits back)   14340ms (185.83 MiB)    1655ms (189.39 MiB)

Change-Id: Icdb0cdd66ff168917fb9ef17b96093990cc6a98d	12 years ago
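The "wants AND NOT haves" step from dafcb8f6db can be illustrated with plain bit sets. This is a minimal sketch, assuming `java.util.BitSet` as a stand-in for the compressed bitmaps that JGit actually uses; the class and method names are hypothetical, and each bit position is assumed to identify one object in the pack.

```java
import java.util.BitSet;

final class WantsAndNotHaves {
    /** Objects to send = reachable from wants, minus reachable from haves. */
    static BitSet objectsToSend(BitSet wants, BitSet haves) {
        BitSet need = (BitSet) wants.clone(); // leave the inputs untouched
        need.andNot(haves);
        return need;
    }

    public static void main(String[] args) {
        BitSet wants = new BitSet();
        wants.set(0, 6); // positions 0..5 are reachable from the wanted commits
        BitSet haves = new BitSet();
        haves.set(0, 4); // positions 0..3 are already on the client
        System.out.println(objectsToSend(wants, haves)); // prints {4, 5}
    }
}
```

Because both closures are already available as bitmaps, the set difference is a single bitwise pass instead of a second object walk.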
Colby Ranger	3b325917a5	Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f	11 years ago
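The XOR step described in 3b325917a5 might look like the sketch below. It is only an illustration under stated assumptions: `java.util.BitSet` stands in for the javaewah compressed bitmap, the number of set bits is used as a rough proxy for compressed size, and `xor` and `shouldStoreXor` are hypothetical helper names rather than JGit API.

```java
import java.util.BitSet;

final class XorBitmapSketch {
    /** XOR two bitmaps without modifying either input. */
    static BitSet xor(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.xor(b);
        return r;
    }

    /**
     * Decide whether to store current as an XOR against prior.
     * Assumption: fewer set bits means a smaller stored representation.
     */
    static boolean shouldStoreXor(BitSet current, BitSet prior) {
        return xor(current, prior).cardinality() < current.cardinality();
    }

    public static void main(String[] args) {
        BitSet prior = new BitSet();
        prior.set(0, 100);   // closure of an older commit: objects 0..99
        BitSet current = new BitSet();
        current.set(0, 101); // newer commit adds object 100

        BitSet stored = xor(current, prior);
        System.out.println(shouldStoreXor(current, prior)); // prints true
        System.out.println(stored);                         // prints {100}
        // Reading back: XOR the stored bits with the prior bitmap again.
        System.out.println(xor(stored, prior).equals(current)); // prints true
    }
}
```

Closely related commits share most of their reachable objects, which is why the XOR against a nearby bitmap tends to be much sparser than the bitmap itself.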
Colby Ranger	be7a135e94	Break the dependency on RevObject when creating a newObjectToPack(). Update the ObjectReuseAsIs API to support creating new ObjectToPack with only the AnyObjectId and Git object type. This is needed to support the future pack index bitmaps, which only contain this information and do not want the overhead of creating a temporary object for every ObjectId. Change-Id: I906360b471412688bf429ecef74fd988f47875dc	11 years ago
Colby Ranger	4a317a1790	Include supported extensions in PackFile constructor. Previously a PackFile class was assumed to only support a .pack and .idx file. Update the constructor to enumerate the supported extensions for the pack file. This will allow the bitmap code to only be executed if the bitmap extension file is known to exist. Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf	11 years ago
Colby Ranger	7fbd6588be	Reduce memory held and speed up DfsGarbageCollector. getObjectList() returns a list of ObjectToPack. These can hold on to a lot of memory. Furthermore, binary searching for objects in a sorted array can be slow. Improve the speed and reduce the memory by creating a copy of the ObjectId and inserting it into an ObjectIdOwnerMap. Change-Id: Ib5aa5b7447e05938b47fa55812a87b9872c20ea7	11 years ago
Colby Ranger	7c58f6282a	Update DfsGarbageCollector to not read back a pack index. Previously, the Dfs GC excluded objects from packs by passing a previously written index to the PackWriter. Reading back a file on Dfs is slow. Instead, allow the PackWriter to expose the objects included in a pack and forward that to invocations of excludeObjects() . Change-Id: I377cb4ab07f62cf790505e1eeb0b2efe81897c79	11 years ago
Colby Ranger	698705c754	Rename PackConstants to PackExt, a typed pack file extension. PackConstants previously contained string values for the pack and pack index extension. Change PackConstant to be PackExt, a typed wrapper around the string pack file extension. Change-Id: I86ac4db6da8f33aa42d6f37cfcc119e819444318	11 years ago
Colby Ranger	5d3c2b3def	Update DfsObjDatabase API to open/write by pack extension. Previously, the DfsObjDatabase had a hardcoded getPackFile() and getPackIndex() methods which opens a .pack and .idx file, respectively. A future change to add a bitmap index will need to be stored in a parallel .bitmap file. Update the DfsObjDatabase to support opening and writing of files for any pack extension. Change-Id: I7c403b501e242096a2d435f6865d6025a9f86108	11 years ago
Colby Ranger	154e3c886b	Do not enforce DeltaWindow maxMemory when zero. The maxMemory for a DeltaWindow can be optionally disabled when it is less than or equal to zero. Respect this configuration when enforcing the limits on object load. Change-Id: Ic0f4ffcabf82105f8e690bd0eb5e6be485a313b3	11 years ago
Colby Ranger	51beee5568	Enforce max memory for DeltaWindow. Previously, memory limits were enforced at the start of each iteration of the delta search, based on objects that were currently loaded in memory. However, new objects added to the window may be expanded in a future iteration of the search and thus were not accounted for correctly at the start of the search. To fix this, memory limits are now enforced before each object is loaded. Change-Id: I898ab43e7bf5ee7189831f3a68bb9385ae694b8f	11 years ago
Robin Rosenberg	c310fa0c80	Mark non-externalizable strings as such A few classes such as Constants are marked with @SuppressWarnings, as are toString() methods with many literals, but otherwise $NLS-n$ is used for strings containing text that should not be translated. A few literals may fall into the gray zone, but mostly I've tried to only tag the obvious ones. Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590	11 years ago
Colby Ranger	b9e485661d	Fix DeltaWindow.clear() to release loaded buffer bytes. It is possible for the buffer to be set but not the index. It occurs when an exception occurs during creating an index, but after the buffer is loaded. Furthermore, the cleared DeltaWindowEntry should have been ent and not res. Change-Id: I2e0d79540316635bf7aa43efd225e4eb38230844	11 years ago
Robin Stocker	b7f5ed5612	Add Javadoc description for packages These appear as descriptions in the index, see here (currently empty): http://download.eclipse.org/jgit/docs/latest/apidocs/ Change-Id: If7996deef30ae688bade8b3ad6b19547ca3d8b50 Signed-off-by: Chris Aniszczyk <zx@twitter.com>	11 years ago
Colby Ranger	b77ba04976	Do not delta compress objects that have already tried to compress. If an object is in a pack file already, delta compression will not attempt to re-compress it. This assumes that the previous packing already performed the optimal compression attempt, however, the subclasses of StoredObjectRepresentation may use other heuristics to determine if the stored format is optimal. Change-Id: I403de522f4b0dd2667d54f6faed621f392c07786	12 years ago
Christian Halstrick	0f84b86e01	fix PackWriter excluded objects handling PackWriter supports excluding objects from being written to the pack. You may specify a PackIndex which lists all those objects which should not go into the new pack. This feature was broken because not all commits have been checked whether they should be excluded or not. For other object types the exclude algorithm worked. This commit adds the missing check. Change-Id: Id0047098393641ccba784c58b8325175c22fcece Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 years ago
Kevin Sawicki	17fb542e9e	Remove 86 boxing warnings Use Integer, Character, and Long valueOf methods when passing parameters to MessageFormat and other places that expect objects instead of primitives Change-Id: I5942fbdbca6a378136c00d951ce61167f2366ca4	12 years ago
Shawn O. Pearce	6c0d300a54	Fix loading packed objects >2G Parsing the size from a packed object header was incorrectly computing the total inflated length when the length exceeded the range of a Java int. The next 7 bits of size information was shifted left as an int using a shift of 25 bits, placing the higher bits of the size into the sign position. When this size was extended to a long to be added to the current size accumulator the size went negative, resulting in NegativeArraySizeException being thrown. Fix all places where this particular pattern of code is used to read a pack size field, or a binary delta header, as they both use the same variable length encoding scheme. Change-Id: I04008728ed828f18202652c3d5401cf95a441d0a	12 years ago
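The overflow described in 6c0d300a54 can be reproduced with a small decoder for the variable-length size field of a pack object header. This is a sketch, not JGit's code: the class and method names are hypothetical, and the layout assumed is the standard one (low 4 size bits in the first byte, then 7 bits per continuation byte).

```java
import java.util.Arrays;

final class PackSizeSketch {
    /** Encode an object header: 3-bit type, then the size in 4 + 7n bits. */
    static byte[] encodeHeader(int type, long size) {
        byte[] buf = new byte[10];
        int n = 0;
        long rest = size >>> 4;
        int c = (type << 4) | (int) (size & 0x0f);
        while (rest != 0) {
            buf[n++] = (byte) (c | 0x80); // set the continuation bit
            c = (int) (rest & 0x7f);
            rest >>>= 7;
        }
        buf[n++] = (byte) c;
        return Arrays.copyOf(buf, n);
    }

    /** Correct decoding: widen to long before shifting. */
    static long decodeSize(byte[] buf) {
        int p = 0;
        int c = buf[p++] & 0xff;
        long size = c & 0x0f;
        int shift = 4;
        while ((c & 0x80) != 0) {
            c = buf[p++] & 0xff;
            size += ((long) (c & 0x7f)) << shift;
            shift += 7;
        }
        return size;
    }

    /** The broken pattern: the shift happens in int, so bit 31 becomes the sign. */
    static long decodeSizeBuggy(byte[] buf) {
        int p = 0;
        int c = buf[p++] & 0xff;
        long size = c & 0x0f;
        int shift = 4;
        while ((c & 0x80) != 0) {
            c = buf[p++] & 0xff;
            size += (c & 0x7f) << shift; // int shift, then sign-extended to long
            shift += 7;
        }
        return size;
    }

    public static void main(String[] args) {
        byte[] header = encodeHeader(3, 1L << 31); // a 2 GiB blob
        System.out.println(decodeSize(header));      // prints 2147483648
        System.out.println(decodeSizeBuggy(header)); // prints -2147483648
    }
}
```

For a 2 GiB object the fifth header byte is shifted left by 25 bits; doing that shift in `int` lands the top bit in the sign position, which is how the negative size arises.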
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 years ago
Shawn O. Pearce	60e51251db	Do not write edge objects to the pack stream Consider two objects A->B where A uses B as a delta base, and these are in the same source pack file ordered as "A B". If cached packs is enabled and B is also in the cached pack that will be appended onto the end of the thin pack, and both A, B are supposed to be in the thin pack, PackWriter must consider the fact that A's base B is an edge object that claims to be part of the new pack, but is actually "external" and cannot be written first. If the object reuse system considered B candidates first this bug does not arise, as B will be marked as edge due to it existing in the cached pack. When the A candidates are later examined, A sees a valid delta base is available as an edge, and will not later try to "write base first" during the writing phase. However, when the reuse system considers A candidates first they see that B will be in the outgoing pack, as it is still part of the thin pack, and arrange for A to be written first. Later when A switches from being in-pack to being an edge object (as it is part of the cached pack) the pointer in B does not get its type changed from ObjectToPack to ObjectId, so B thinks A is non-edge. We work around this case by also checking that the delta base B is non-edge before writing the object to the pack. Later when A writes its object header, delta base B's ObjectToPack will have an offset == 0, which makes isWritten() = false, and the OBJ_REF delta format will be used for A's header. This will be resolved by the client to the copy of B that appears in the later cached pack. Change-Id: Ifab6bfdf3c0aa93649468f49bcf91d67f90362ca	12 years ago
Shawn O. Pearce	1421106d76	Use long for more object counts in PackWriter Packs can contain up to 2^32-1 objects, which exceeds the range of a Java int. Try harder to accept higher object counts in some cases by using long more often when we are working with the object count value. This is a trivial refactoring, we may have to make even more changes to the object handling code to support more than 2^31-1 objects. Change-Id: I8cd8146e97cd1c738ad5b48fa9e33804982167e7	12 years ago
Shawn O. Pearce	41a18d57bc	Search for annotated tag reuse first Annotated tags are relatively rare and currently are scheduled in a pack file near the commits, decreasing the time it takes to resolve client requests reading tags as part of a history traversal. Putting them first before the commits allows the storage system to page in the tag area, and have it relatively hot in the LRU when the nearby commit area gets examined too. Later looking at the tree and blob data will pollute the cache, making it more likely the tags are not loaded and would require file IO. Change-Id: I425f1f63ef937b8447c396939222ea20fdda290f	12 years ago
Shawn O. Pearce	29997ab084	Correct progress monitor on "Getting sizes:" phase This counter always was running 1 higher, because it incremented after the queue was exhausted (and every object was processed). Move increments to be after the queue has provided a result, to ensure we do not show a higher in-progress count than total count. Change-Id: I97f815a0492c0957300475af409b6c6260008463	12 years ago
Dave Borowitz	2b584b9216	Keep track of a static collection of all PackWriter instances Stored in a weak concurrent hash map, which we clean up while iterating. Usually the weak reference behavior should not be necessary because PackWriters should be released with release(), but we still want to avoid leaks when dealing with broken client code. Change-Id: I337abb952ac6524f7f920fedf04065edf84d01d2	12 years ago
Dave Borowitz	f26b79d044	Estimate the amount of memory used by a PackWriter Memory usage is dominated by three terms: - The maximum memory allocated to each delta window. - The maximum size of a single file held in memory during delta search. - ObjectToPack instances owned by the writer. For the first two terms, rather than doing complex instrumentation of the DeltaWindows, we just overestimate based on the config parameters (though we may underestimate if the maximum size is not set). For the ObjectToPack instances, we do some rough byte accounting of the underlying Java object representation. Change-Id: I23fe3cf9d260a91f1aeb6ea22d75af8ddb9b1939	12 years ago
Dave Borowitz	16b8ebf2d1	Add an object encapsulating the state of a PackWriter Exposes essentially the same state machine to the programmer as is exposed to the client via a ProgressMonitor, using a wrapper around beginTask()/endTask(). Change-Id: Ic3622b4acea65d2b9b3551c668806981fa7293e3	12 years ago
Shawn O. Pearce	1b6a549ff3	PackWriter: Export more statistics Export the shallow pack information, and also a handy function to sum up the total times. Include the time writing out the index file, if it was created. Change-Id: I7f60ae6848455a357b25feedb23743bbf6c153cf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Matt Fischer	9952223e06	Implement server support for shallow clones This implements the server side of shallow clones only (i.e. git-upload-pack), not the client side. CQ: 5517 Bug: 301627 Change-Id: Ied5f501f9c8d1fe90ab2ba44fac5fa67ed0035a4 Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	14 years ago
Shawn O. Pearce	a1a8c6d77e	PackWriter: support excluding objects already in other packs This can be useful when implementing garbage collection and there are packs that should not be copied, such as huge packs that have a sibling ".keep" file alongside of them. Callers driving PackWriter need to initialize the list of packs not to include objects from by passing each index to excludeObjects(). Change-Id: Id7f34df69df97be406bcae184308e92b0e8690fd Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Shawn O. Pearce	74333e63b6	PackWriter: Make want/have actual sets During parsing these are used with contains(). If they are a List type, the contains operation is not efficient. Some callers such as UploadPack often pass a List here, so convert to Set when the type isn't efficient for contains(). Change-Id: If948ae3bf1f46e756bd2d5db14795e12ba7a6207 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
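The conversion described in 74333e63b6 amounts to something like the sketch below; `ensureSet` is a hypothetical helper name, not the method JGit uses. A `List` answers `contains()` by scanning, while a hash-based `Set` answers it in roughly constant time, so the copy pays off when `contains()` is called once per parsed object.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class WantHaveSets {
    /** Reuse the collection if it is already a Set, otherwise copy it into one. */
    static <T> Set<T> ensureSet(Collection<T> objects) {
        if (objects instanceof Set)
            return (Set<T>) objects;
        return new HashSet<>(objects);
    }

    public static void main(String[] args) {
        // Abbreviated ids used purely as sample data.
        List<String> wants = Arrays.asList("32ff57a2b2", "2f93551e18");
        Set<String> wantSet = ensureSet(wants);
        System.out.println(wantSet.contains("2f93551e18")); // prints true
        System.out.println(ensureSet(wantSet) == wantSet);  // prints true
    }
}
```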
Shawn O. Pearce	2610eaf386	Revert "PackWriter: Do not delta compress already packed objects" This reverts commit `67b064fc9f`. The "tiny optimization" introduced by 67b0 turns out to have a big savings on wall-clock time when the object store is very slow (e.g. the DHT support in JGit), but comes with a much bigger penalty in space used by the output stream. CGit packed with 67b0 enabled is 7 MiB larger than it should be (36 MiB rather than 28/29 MiB). The much bigger Linux kernel repository gained over 200 MiB, though some of this may have been caused by a smaller window setting. Revert this patch as PackWriter should be optimizing for space used rather than time spent, since its primary use is network transfer, and that isn't free. Change-Id: I7413a9ef89762208159b4a1adc5a22a4c9245611 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	99e6cfb131	PackWriter: Only search for base objects on thin packs A non-thin pack does not need to worry about preferred bases, the pack will be self-contained and all required delta base objects will appear within the pack itself. Obtaining the path buffer and length from the ObjectWalk to build the preferred base table is "expensive", so avoid the cost unless a thin pack is being constructed. Change-Id: I16e30cd864f4189d4304e7957a7cd5bdb9e84528 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	68cc21b60d	PackWriter: Skip progress messages on fast operations If the "Finding sources" phase will complete in <1 second with no delta compression enabled, don't bother showing the progress meter for this phase. Small repositories on the local filesystem tend to rip through this phase always subsecond and the ProgressMonitor display can actually slow the operation down. If delta compression is enabled, there are two phases that may run very quickly. Set the timer to 500 milliseconds instead, reducing the risk that the user has to wait longer than 1 second before any sort of output from the packer occurs. Change-Id: I58110f17e2a5ffa0134f9768b94804d16bbb8399 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	14680f6e6e	Add copy constructor to PackConfig Some embeddings of UploadPack (e.g. Gerrit Code Review) set their own PackConfig from a server-wide configuration, overriding any JGit defaults or settings that may exist at the local repository level. Make a copy constructor form of PackConfig so this server-wide configuration object can be copied and then merged with repository specific configuration data. Change-Id: I4463c95aeaf7d6536c3ab132dec9c50ee528d9e0 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	de8946c0c2	Store Git on any DHT jgit.storage.dht is a storage provider implementation for JGit that permits storing the Git repository in a distributed hashtable, NoSQL system, or other database. The actual underlying storage system is undefined, and can be plugged in by implementing 7 small interfaces: * Database * RepositoryIndexTable * RepositoryTable * RefTable * ChunkTable * ObjectIndexTable * WriteBuffer The storage provider interface tries to assume very little about the underlying storage system, and requires only three key features: * key -> value lookup (a hashtable is suitable) * atomic updates on single rows * asynchronous operations (Java's ExecutorService is easy to use) Most NoSQL database products offer all 3 of these features in their clients, and so does any decent network based cache system like the open source memcache product. Relying only on key equality for data retrieval makes it simple for the storage engine to distribute across multiple machines. Traditional SQL systems could also be used with a JDBC based spi implementation. Before submitting this change I have implemented six storage systems for the spi layer: * Apache HBase[1] * Apache Cassandra[2] * Google Bigtable[3] * an in-memory implementation for unit testing * a JDBC implementation for SQL * a generic cache provider that can ride on top of memcache All six systems came in with an spi layer around 1000 lines of code to implement the above 7 interfaces. This is a huge reduction in size compared to prior attempts to implement a new JGit storage layer. As this package shows, a complete JGit storage implementation is more than 17,000 lines of fairly complex code. A simple cache is provided in storage.dht.spi.cache. Implementers can use CacheDatabase to wrap any other type of Database and perform fast reads against a network based cache service, such as the open source memcached[4]. 
An implementation of CacheService must be provided to glue this spi onto the network cache. [1] https://github.com/spearce/jgit_hbase [2] https://github.com/spearce/jgit_cassandra [3] http://labs.google.com/papers/bigtable.html [4] http://memcached.org/ Change-Id: I0aa4072781f5ccc019ca421c036adff2c40c4295 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	8ac65d33ed	PackWriter: Fix the way delta chain cycles are prevented Take a very simple approach to avoiding delta chains during object reuse: objects are now always selected from the oldest pack that contains them. This prevents cycles because a pack must not have a cycle in the delta chain. If both objects A and B are chosen out of the same source pack then there cannot be an A->B->A cycle. The oldest pack is also the most likely to have the smallest deltas. It's the biggest pack in the system and probably came from the clone (or last GC) of this repository, where all objects were previously considered and packed tightly together. If an object appears again (for example due to a revert and a push into this repository) the newer copy won't be nearly as small as the older delta version of it, even if the newer one is also itself a delta. ObjectDirectory already enumerates objects during selection in this newest->oldest order, so it already is supplying these assumptions to PackWriter. Taking advantage of this can speed up selection by a tiny amount by avoiding some tests, but can also help to prevent a cycle needing to be broken on the fly during writing. The previous cycle breaking logic wasn't fully correct either. If a different delta base was chosen, the new delta base might not have been written into the output pack before the current object, forcing the use of REF_DELTA when OFS_DELTA is always smaller. This logic has now been reworked to always re-check the delta base and ensure it gets written before the current object. If a cycle occurs, it gets broken the same way as before, by disabling delta reuse and finding an alternative form of the object, which may require inflating/deflating in whole format. Change-Id: I9953ab8be54ceb8b588e1280d6f7edd688887747 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	36a38adf71	PackWriter: Combine small reuse batches together If the total number of objects to look for reuse on is under 4096 this is really close to a reasonable batch size for the DHT storage system to lookup at once. Combine all of the objects into a single temporary list, perform reuse, and then prune the main lists if any duplicate objects were detected from a selected CachedPack. The intention here is to try and avoid 4 tiny sequential lookups on the storage system when the time to wait for each of those to finish is higher than the CPU time required to build (and later GC) this temporary list. Change-Id: I528daf9d2f7744dc4a6281750c2d61d8f9da9f3a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	0be24ebf33	PackWriter: Remove dummy list 0 Instead of looping over the objectsLists array, always set slot 0 to null and explicitly work on the 4 indexes that matter. This kills some loops and increases the length of the code slightly, but I've always really disliked that dummy 0 slot. Change-Id: I5ad938501c1c61f637ffdaff0d0d88e3962d8942 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	9f5bbb5dd4	PackWriter: Speed up pruning of objects from cached packs During object enumeration for the thin pack, very few objects come out that are duplicated with the cached pack. Typically these are only cases where a blob or tree was cherry-picked forward, got a copy or rename, or was reverted... all relatively infrequent events. Speed up pruning of the thin pack object list by combining the phase with the object representation selection. Implementers should already be offering to reuse the object from the cached pack if it is stored there, at which point the implementation can perform a very fast type of containment test using the cached pack's identity rather than yet another index lookup. For the local disk case this is probably not a big improvement, but it does help on the DHT implementation where the two passes combined into one reduces latency. Change-Id: I6a07fc75d9075bf6233e967360b6546f9e9a2b33 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	bb1956e647	PackWriter: Collect stats by object type Frequently enough I'm wondering how much of a pack is commits vs. trees, and the total line doesn't really tell us this because it's a gross total from the pack. Computing the counts per object type is simple during packing, as PackWriter already has everything in memory broken up by object type. It's virtually free to get these values and track them. Change-Id: Id5e6b1902ea909c72f103a0fbca5d8bc316f9ab3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	7a9bf1e2e0	PackWriter: Rename getObjectsNumber to getObjectCount This better matches with PackFile and CachedPack's methods that return the same value. Change-Id: Idb9b7c71d2048dd2344a62c2cde20b4e34529ab7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago


130 Commits (stable-3.6)