mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Author	SHA1	Message	Date
Colby Ranger	b77ba04976	Do not delta compress objects that have already tried to compress. If an object is in a pack file already, delta compression will not attempt to re-compress it. This assumes that the previous packing already performed the optimal compression attempt, however, the subclasses of StoredObjectRepresentation may use other heuristics to determine if the stored format is optimal. Change-Id: I403de522f4b0dd2667d54f6faed621f392c07786	12 years ago
Shawn O. Pearce	28ba4747bc	Allow ObjectReuseAsIs to have more control over write ordering The reuse system used by an object database may be able to benefit from knowing what objects are coming next, and even improve data throughput by delaying (or moving up) objects that are stored near each other in the source database. Pushing the iteration down into the reuse code makes it possible for a smarter implementation to aggregate reuse. But for the standard pack file format on disk we don't bother, it's quite efficient already. Change-Id: I64f0048ca7071a8b44950d6c2a5dfbca3be6bba6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	fe18e52195	Allow ObjectToPack subclasses to use up to 4 bits of flags Some instances may benefit from having access to memory efficient storage for some small values, like single flag bits. Give up a portion of our delta depth field to make 4 bits available to any subclass that wants it. This still gives us room for delta chains of 1,048,576 objects, and that is just insane. Unpacking 1 million objects to get to something is longer than most users are willing to wait for data from Git. Change-Id: If17ea598dc0ddbde63d69a6fcec0668106569125 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	f048af3fd1	Implement async/batch lookup of object data An ObjectReader implementation may be very slow for a single object, but yet support bulk queries efficiently by batching multiple small requests into a single larger request. This easily happens when the reader is built on top of a database that is stored on another host, as the network round-trip time starts to dominate the operation cost. RevWalk, ObjectWalk, UploadPack and PackWriter are the first major users of this new bulk interface, with the goal being to support an efficient way to pack a repository for a fetch/clone client when the source repository is stored in a high-latency storage system. Processing the want/have lists is now done in bulk, to remove the high costs associated with common ancestor negotiation. PackWriter already performs object reuse selection in bulk, but it now can also do the object size lookup and object counting phases with higher efficiency. Actual object reuse, deltification, and final output are still doing sequential lookups, making them a bit more expensive to perform. Change-Id: I4c966f84917482598012074c370b9831451404ee Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	109c695936	Expose getType in ObjectToPack Storage implementations may find this useful when implementing the ObjectReuseAsIs interface on their ObjectReader. Expose it so we don't force them to create a redundant copy of the information. Change-Id: I802ec8113c00884fccde5d0e92b9849716316f62 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	a960d1429e	Cache small deltas during packing PackWriter now caches small deltas, or deltas that are very tiny compared to their source inputs, so that the writing phase goes faster by reusing those cached deltas. The cached data is stored compressed, which usually translates to a bigger footprint due to deltas being very hard to compress, but saves time during writing by avoiding the deflate step. They are held under SoftReferences so that the JVM GC can clear out deltas if memory gets very tight. We would rather continue working and spend a bit more CPU time during writing than crash due to OOME. To avoid OutOfMemoryErrors during the caching phase we also trap OOME and just abort out of the caching. Because deflateBound() always produces something larger than what we need to actually store the deflated data, we copy it over into a new buffer if the actual length doesn't match the buffer length. When packing jgit.git this saves over 111 KiB in the cache, and is thus a worthwhile hit on CPU time. To further save memory we store the inflated size of the delta (which we need for the object header) in the same field as the pathHash, as the pathHash is no longer necessary by this phase of the packing algorithm. Change-Id: I0da0c600d845e8ec962289751f24e65b5afa56d7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	b38426ae8c	Add debugging toString() method to ObjectToPack It's useful to know what the flags are or what the base that was selected is. Dump these out as part of the object's toString. Change-Id: I8810067fb8337b08b4fcafd5f9ea3e1e31ca6726 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	699e4aa7c5	Make ObjectToPack clearReuseAsIs signal available to subclasses A subclass may want to use this method to release handles that are caching reuse information. Make it protected so they can override it and update themselves. Change-Id: I2277a56ad28560d2d2d97961cbc74bc7405a70d4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	85b7a53d52	Refactor ObjectToPack's delta depth setting Long ago when PackWriter was first written we thought that the delta depth could be updated automatically. But it's never used. Instead make this a simple standard setter so the caller can more directly set the delta depth of this object. This permits us to configure a depth that takes into account more than just the depth of another object in this same pack. Change-Id: I1d71b74f2edd7029b8743a2c13b591098ce8cc8f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	823e9a9721	Add doNotDelta flag to ObjectToPack This flag will later control whether or not PackWriter searches for a delta base for this object. Edge objects will never get searched, as the writer won't be outputting them, so they should always have this flag set on. Sometime in the future this flag should also be set for file blobs on file paths that have the "-delta" gitattribute set in the repository's attributes file. Change-Id: I6e518e1a6996c8ce00b523727f1b605e400e82c6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	2f93a09dd1	Save object path hash codes during packing We need to remember these so we can later cluster objects that have similar file paths near each other as we search for deltas between them. Change-Id: I52cb1e4ca15c9c267a2dbf51dd0d795f885f4cf8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	ea21c111cb	Move PackWriter over to storage.pack.PackWriter Similar to what we did with the file code, move the pack writer into its own package so the related classes and their package private methods are hidden from the rest of the library. Change-Id: Ic1b5c7c8c8d266e90c910d8d68dfc8e93586854f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	bf4ffff07f	Redo PackWriter object reuse selection The new selection implementation uses a public API on the ObjectReader, allowing the storage library to enumerate its candidates and select the best one for this packer without needing to build a temporary list of the candidates first. Change-Id: Ie01496434f7d3581d6d3bbb9e33c8f9fa649b6cd Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	e0c9368f3e	Reclaim some bits in ObjectToPack flags field Make the lower bits available for flags that PackWriter can use to keep track of facts about the object. We shouldn't need more than 2^24 delta depths, unpacking that chain is unfathomable anyway. This change gets us 4 bits that are unused in the lower end of the word, which are typically easier to load from Java and most machine instruction sets. We can use these in later changes. Change-Id: Ib9e11221b5bca17c8a531e4ed130ba14c0e3744f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	6fc3ecac84	Extract PackFile specific code to ObjectToPack subclass The ObjectReader class is dual-purposed into being a factory for the ObjectToPack, permitting specific ObjectDatabase implementations to override the method and offer their own custom subclass of the generic ObjectToPack class. By allowing them to directly extend the type, each implementation can add custom fields to support tracking where an object is stored, without incurring any additional penalties like a parallel Map<ObjectId,Object> would cost. The reader was chosen to act as a factory rather than the database, as the reader will eventually be tied more tightly with the ObjectWalk and TreeWalk. During object enumeration the reader would have had to load the object for the RevWalk, and may choose to cache object position data internally so it can later be reused and fed into the ObjectToPack instance supplied to the PackWriter. Since a reader is not thread-safe, and is scoped to this PackWriter and its internal ObjectWalk, it's a great place for the database to perform caching, if any. Right now this change goes a bit backwards by changing what should be generic ObjectToPack references inside of PackWriter to the very PackFile specific LocalObjectToPack subclass. We will correct these in a later commit as we start to refine what the ObjectToPack API will eventually look like in order to better support the PackWriter. Change-Id: I9f047d26b97e46dee3bc0ccb4060bbebedbe8ea9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	a2208be6aa	Extract ObjectToPack to be top-level This shortens the implementation within PackWriter, and starts to open the door for some other refactorings based on changing the ObjectToPack to be a public part of the API. Change-Id: Id849cbffc4de20b903e844a2de7737eeb8b7a3ff Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
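
Commits e0c9368f3e and fe18e52195 describe sharing one integer field between flag bits and the delta depth, leaving 20 bits (1,048,576 values) for the depth and 4 bits for subclasses. The following is only a rough sketch of that idea. The class name `PackedFlags`, the constants, and the exact bit positions are assumptions made for illustration and are not taken from the JGit source.

```java
/** Hypothetical sketch: one int shared by flag bits and a delta depth counter. */
public class PackedFlags {
    // Assumed layout (illustrative only, not JGit's actual layout):
    // bits 0-7 reserved for the writer, bits 8-11 for subclasses, bits 12-31 for depth.
    private static final int EXT_SHIFT = 8;
    private static final int EXT_MASK = 0xf;
    private static final int DELTA_SHIFT = 12;
    private static final int NON_DELTA_MASK = (1 << DELTA_SHIFT) - 1;

    /** Largest depth that fits in the 20 bits above DELTA_SHIFT. */
    public static final int MAX_DEPTH = (1 << 20) - 1;

    private int flags;

    public int getDeltaDepth() {
        // Unsigned shift so a set top bit is not sign-extended.
        return flags >>> DELTA_SHIFT;
    }

    public void setDeltaDepth(int depth) {
        if (depth < 0 || depth > MAX_DEPTH) {
            throw new IllegalArgumentException("depth out of range: " + depth);
        }
        // Replace the depth bits, keep every flag bit below them.
        flags = (depth << DELTA_SHIFT) | (flags & NON_DELTA_MASK);
    }

    /** Returns the 4 subclass-owned bits as a value in 0..15. */
    public int getExtendedFlags() {
        return (flags >>> EXT_SHIFT) & EXT_MASK;
    }

    public void setExtendedFlag(int bits) {
        flags |= (bits & EXT_MASK) << EXT_SHIFT;
    }

    public void clearExtendedFlag(int bits) {
        flags &= ~((bits & EXT_MASK) << EXT_SHIFT);
    }
}
```

Keeping the depth in the high bits means a setter can overwrite it with one shift and one mask, while the flag bits stay untouched in the low end of the word.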

12 commits (b77ba049762e4ea3aadb756dad1d06c859bb3fe3)
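
Commit a960d1429e in the table above describes caching deflated deltas under SoftReferences, trimming the oversized output buffer, and trapping OutOfMemoryError. A minimal sketch of that approach, assuming a hypothetical `SoftDeltaCache` class and a hand-written buffer bound (the JDK has no `deflateBound()`, so the bound below is an assumption chosen to be generous), might look like this. It is not the PackWriter implementation.

```java
import java.lang.ref.SoftReference;
import java.util.Arrays;
import java.util.zip.Deflater;

/** Hypothetical sketch of a single cached, deflated delta. */
public class SoftDeltaCache {
    private SoftReference<byte[]> cached;
    private int inflatedSize;

    /** Deflates raw into an oversized buffer, then trims it to the exact length. */
    public static byte[] deflateTrimmed(byte[] raw) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(raw);
            deflater.finish();
            // Assumed upper bound on the deflated size; deliberately generous.
            int bound = raw.length + (raw.length >> 3) + (raw.length >> 6) + 64;
            byte[] buf = new byte[bound];
            int n = deflater.deflate(buf);
            if (!deflater.finished()) {
                throw new IllegalStateException("buffer bound too small");
            }
            // Copy only when the actual length differs from the buffer length.
            return n == buf.length ? buf : Arrays.copyOf(buf, n);
        } finally {
            deflater.end();
        }
    }

    /** Tries to cache the delta; returns false if memory ran out. */
    public boolean cache(byte[] delta) {
        try {
            cached = new SoftReference<>(deflateTrimmed(delta));
            inflatedSize = delta.length;
            return true;
        } catch (OutOfMemoryError tooBig) {
            // Give up on caching rather than fail the whole operation.
            cached = null;
            return false;
        }
    }

    /** Returns the deflated bytes, or null if the GC cleared them. */
    public byte[] getDeflated() {
        return cached != null ? cached.get() : null;
    }

    public int getInflatedSize() {
        return inflatedSize;
    }
}
```

A caller has to treat a `null` from `getDeflated()` as a cache miss and recompute the delta, since the garbage collector may clear a soft reference at any time under memory pressure.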