jgit.git - JGit, the Java implementation of git: https://github.com/eclipse-jgit/jgit

	Commit message (Collapse)	Author	Age	Files	Lines
...
\| * \|	Use core.streamFileThreshold to set our streaming limit	Shawn O. Pearce	2010-07-02	13	-9/+193
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We default this to 1 MiB for now, but we allow users to modify it through the Repository's configuration file to be a different value. A new repository listener is used to identify when the setting has been updated and trigger a reconfiguration of any active ObjectReaders. To prevent a horrible explosion we cap core.streamFileThreshold at no more than 1/4 of the maximum JVM heap size. We do this because we need at least 2 byte arrays equal in size to the stream threshold for the worst case delta inflation scenario, and our host application probably also needs some amount of the heap for their working set size. Change-Id: I103b3a541dc970bbf1a6d92917a12c5a1ee34d6c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Support large delta packed objects as streams	Shawn O. Pearce	2010-07-02	9	-93/+864
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Very large delta instruction streams, or deltas which use very large base objects, are now streamed through as large objects rather than being inflated into a byte array. This isn't the most efficient way to access delta encoded content, as we may need to rewind and reprocess the base object when there was a block moved within the file, but it will at least prevent the JVM from having its heap explode. When streaming a delta we have an inflater open for each level in the delta chain, to inflate the instruction set of the delta, as well as an inflater for the base level object. The base object is buffered, as is the top level delta requested by the application, but we do not buffer the intermediate delta streams. This keeps memory usage lower, so its closer to 1024 bytes per level in the chain, without having an adverse impact on raw throughput as the top-level buffer gets pushed down to the lowest stream that has the next region. Delta instructions transparently collapse here, if the top level does not copy a region from its base, the base won't materialize that part from its own base, etc. This allows us to avoid copying around a lot of segments which have been deleted from the final version. Change-Id: I724d45245cebb4bad2deeae7b896fc55b2dd49b3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Support large whole packed objects as streams	Shawn O. Pearce	2010-07-01	4	-3/+221
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to the loose object support, whole packed objects can now be streamed back to the caller. The streaming is less efficient as we copy the data from the cached window array into the InflaterInputStream's internal buffer, then inflate it there before returning to the application. Like with unpacked objects, there is plenty of room for some optimization, especially for the copyTo method, where we don't necessarily need so much buffering to exist. Change-Id: Ie23be81289e37e24b91d17b0891e47b9da988008 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Replace PackedObjectLoader with ObjectLoader.SmallObject	Shawn O. Pearce	2010-07-01	3	-105/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The class is identical, but ObjectLoader.SmallObject is part of our public API for storage implementations to build on top of. Change-Id: I381a3953b14870b6d3d74a9c295769ace78869dc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Support large loose objects as streams	Shawn O. Pearce	2010-07-01	7	-235/+468
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Big loose objects can now be streamed if they are over the large object size threshold. This prevents the JVM heap from exploding with a very large byte array to hold the slurped file, and then again with its uncompressed copy. We may have slightly slowed down the simple case for small loose objects, as the loader no longer slurps the entire thing and decompresses in memory. To try and keep good performance for the very common small objects that are below 8 KiB in size, buffers are set to 8 KiB, causing the reader to slurp most of the file anyway. However the data has to be copied at least once, from the BufferedInputStream into the InflaterInputStream. New unit tests are supplied to get nearly 100% code coverage on the unpacked code paths, for both standard and pack style loose objects. We tested a fair chunk of the code elsewhere, but these new tests are better isolated to the specific branches in the code path. Change-Id: I87b764ab1b84225e9b5619a2a55fd8eaa640e1fe Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Permit AnyObjectTo to compareTo AnyObjectId	Shawn O. Pearce	2010-06-30	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Assume that the argument of compareTo won't be mutated while we are doing the compare, and support the wider AnyObjectId type so MutableObjectId is suitable on either side of the compareTo call. Change-Id: I2a63a496c0a7b04f0e5f27d588689c6d5e149d98 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Use copyTo during checkout of files to working tree	Shawn O. Pearce	2010-06-30	1	-7/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This way we can stream a large file through memory, rather than loading the entire thing into a single contiguous byte array. Change-Id: I3ada2856af2bf518f072edec242667a486fb0df1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Stream whole deflated objects in PackWriter	Shawn O. Pearce	2010-06-30	1	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of loading the entire object as a byte array and passing that into the deflater, let the ObjectLoader copy the object onto the DeflaterOutputStream. This has the nice side effect of using some sort of stride hack in the Sun implementation that may improve compression performance. Change-Id: I3f3d681b06af0da93ab96c75468e00e183ff32fe Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Lazily allocate Deflater in PackWriter	Shawn O. Pearce	2010-06-30	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Only allocate the Deflater if we can't reuse everything, but also make sure we release it when we release the PackWriter's resources. Change-Id: I16a32b94647af0778658eda87acbafc9a25b314a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Add openStream to ObjectLoader for big blobs	Shawn O. Pearce	2010-06-30	5	-2/+318
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Blobs that are too large to read as a single byte array should be accessed through an InputStream based interface instead, allowing the application to walk through the data stream incrementally. Define the basic interface to support streaming contents, but don't implement it yet for the file based backend. Change-Id: If9e4442e9ef4ed52c3e0f1af9398199a73145516 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Move DirCache factory methods to Repository	Shawn O. Pearce	2010-06-30	3	-53/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of creating the DirCache from a static factory method, use an instance method on Repository, permitting the implementation to override the method with a completely different type of DirCache reading and writing. This would better support a repository in the cloud strategy, or even just an in-memory unit test environment. Change-Id: I6399894b12d6480c4b3ac84d10775dfd1b8d13e7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Create NoWorkTreeException for bare repositories	Shawn O. Pearce	2010-06-30	3	-24/+80
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using a custom exception type makes it easire for an application developer to understand why an exception was thrown out of a method we declare. To remain compatiable with existing callers, we still extend off IllegalStateException. Change-Id: Ideeef2399b11ca460a2dbb3cd80eb76aa0a025ba Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Ensure RevWalk is released when done	Shawn O. Pearce	2010-06-29	13	-160/+258
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update a number of calling sites of RevWalk to ensure the walker's internal ObjectReader is released after the walk is no longer used. Because the ObjectReader is likely to hold onto a native resource like an Inflater, we don't want to leak them outside of their useful scope. Where possible we also try to share ObjectReaders across several walk pools, or between a walker and a PackWriter. This permits the ObjectReader to actually do some caching if it felt inclined to do so. Not everything was updated, we'll probably need to come back and update even more call sites, but these are some of the biggest offenders. Test cases in particular aren't updated. My plan is to move most storage-agnostic tests onto some purely in-memory storage solution that doesn't do compression. Change-Id: I04087ec79faeea208b19848939898ad7172b6672 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Use ObjectReader in DirCacheBuilder.addTree	Shawn O. Pearce	2010-06-29	2	-21/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than building a custom reader, have the caller supply us one. Change-Id: Ief2b5a6b1b75f05c8a6bc732a60d4d1041dd8254 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Use one ObjectReader for WalkFetchConnection	Shawn O. Pearce	2010-06-28	1	-8/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of creating new ObjectReader for each walker, use one for the entire connection and delegate reads through it. Change-Id: I7f0a2ec8c9fe60b095a7be77dc423a2ff8b443a3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Use ObjectReader in RevWalk, TreeWalk	Shawn O. Pearce	2010-06-28	23	-135/+209
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't actually need a Repository object here, just an ObjectReader that can load content for us. So change the API to depend on that. However, this breaks the asCommit and asTag legacy translation methods on RevCommit and RevTag, so we still have to keep the Repository inside of RevWalk for those two types. Hopefully we can drop those in the future, and then drop the Repository off the RevWalk. Change-Id: Iba983e48b663790061c43ae9ffbb77dfe6f4818e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Fix minor formatting issue in UploadPack	Shawn O. Pearce	2010-06-28	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: Ifc0c3a94dc0e16126af6cf17e9c4a7cb96e8ffab Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Remove volatile keyword from RepositoryEvent	Shawn O. Pearce	2010-06-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't need this field to be volatile. Events are delivered by the same thread that created the RepositoryEvent object, and thus any cross-thread operations would need to be handled by some other type of synchronization in the listener, and that would protect both the repository field and any other per-event data. Change-Id: Iefe345959e1a2d4669709dbf82962bcc1b8913e3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Rename openObject, hasObject to just open, has	Shawn O. Pearce	2010-06-28	12	-27/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to what we did on Repository, the openObject method already implied we wanted to open an object, given its main argument was of type AnyObjectId. Simplify the method name to just the action, has or open. Change-Id: If055e5e0d8de0e2424c18a773f6d2bc2f66054f4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Refactor Repository.openObject to be Repository.open	Shawn O. Pearce	2010-06-28	7	-81/+106
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We drop the "Object" suffix, because its pretty clear here that we want to open an object, given that we pass in AnyObjectId as the main parameter. We also fix the calling convention to throw a MissingObjectException or IncorrectObjectTypeException, so that callers don't have to do this error checking themselves. Change-Id: I72c43353cea8372278b032f5086d52082c1eee39 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Move PackWriter progress monitors onto the operations	Shawn O. Pearce	2010-06-28	6	-82/+76
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than taking the ProgressMonitor objects in our constructor and carrying them around as instance fields, take them as arguments to the actual time consuming operations we need to run. Change-Id: I2b230d07e277de029b1061c807e67de5428cc1c4 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Pass the PackOutputStream down the call stack	Shawn O. Pearce	2010-06-28	1	-16/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than storing this in an instance member, pass it down the calling stack. Its cleaner, we don't have to poke the stream as a temporary field, and then unset it. Change-Id: I0fd323371bc12edb10f0493bf11885d7057aeb13 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Remove Repository.openObject(ObjectReader, AnyObjectId)	Shawn O. Pearce	2010-06-28	6	-49/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Going through ObjectReader.openObject(AnyObjectId) is faster, but also produces cleaner application level code. The error checking is done inside of the openObject method, which means it can be removed from the application code. Change-Id: Ia927b448d128005e1640362281585023582b1a3a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Throw IncorrectObjectTypeException on bad type hints	Shawn O. Pearce	2010-06-28	2	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the type hint isn't OBJ_ANY and it doesn't match the actual type observed from the object store, define the reader to throw back an IncorrectObjectTypeException. This way the caller doesn't have to perform this check itself before it evaluates the object data, and we can simplify quite a few call sites. Change-Id: I9f0dfa033857f439c94245361fcae515bc0a6533 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Ensure ObjectReader used by PackWriter is released	Shawn O. Pearce	2010-06-28	5	-76/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ObjectReader API demands that we release the reader when we are done with it. PackWriter contains a reader, which it uses for the entire packing session. Expose the release of the reader through a release method on the writer. This still doesn't address the RevWalk and TreeWalk users, who don't correctly release their reader. But its a small step in the right direction. Change-Id: I5cb0b5c1b432434a799fceb21b86479e09b84a0a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Ensure PackWriter releases its ObjectReader	Shawn O. Pearce	2010-06-28	1	-13/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change-Id: I3f8af29066cc5a2132dc4a75c9654d97800f2f18 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Release ObjectReader before the cached ObjectDatabase	Shawn O. Pearce	2010-06-28	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I don't want to play games with the order of release here, its probably safer to release the reader before the database, just in case the one depends on the other. Change-Id: I2394c7d2477eaf7a7e1556fc3393c59d3b31e764 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Release ObjectInserter in merge() not mergeImpl()	Shawn O. Pearce	2010-06-28	2	-10/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	By doing the release at the higher level class, we can ensure the release occurs if the inserter was allocated, even if the implementation forgets to do this. Since the higher level class is what allocated it, it makes sense to have it also do the release. Change-Id: Id617b2db864c3208ed68cba4eda80e51612359ad Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Commit: Use Repository.newObjectInserter	Shawn O. Pearce	2010-06-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Everyone else does. This must have been a spot I missed during some sort of squash while developing the series. Change-Id: I62eae50b618f47ee33ad7cf71fc05b724f603201 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Move PackWriter over to storage.pack.PackWriter	Shawn O. Pearce	2010-06-26	21	-30/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar to what we did with the file code, move the pack writer into its own package so the related classes and their package private methods are hidden from the rest of the library. Change-Id: Ic1b5c7c8c8d266e90c910d8d68dfc8e93586854f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Simplify ObjectLoaders coming from PackFile	Shawn O. Pearce	2010-06-26	9	-433/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We no longer need an ObjectLoader to be lazy and try to delay the materialization of the object content. That was done only to support PackWriter searching for a good reuse candidate. Instead, simplify the code base by doing the materialization immediately when the loader asks for it, because any caller asking for the loader is going to need the content. Change-Id: Id867b1004529744f234ab8f9cfab3d2c52ca3bd0 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Remove getRawSize, getRawType from ObjectLoader	Shawn O. Pearce	2010-06-26	8	-93/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These were only used by PackWriter to help it filter object representations. Their only user disappeared when we rewrote the object selection code path to use the new representation type. Change-Id: I9ed676bfe4f87fcf94aa21e53bda43115912e145 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Tighten up local packed object representation during packing	Shawn O. Pearce	2010-06-26	5	-40/+127
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than making a loader, and then using that to fill the object representation, parse the header and set up our data directly. This saves some time, as we don't waste cycles on information we won't use right now. The weight computed for a representation is now its actual stored size in the pack file, rather than its inflated size. This accounts for changes made when the compression level is modified on the repository. It is however more costly to determine the weight of the object, since we have to find its length in the pack. To try and recover that cost we now cache the length as part of our ObjectToPack record, so it doesn't have to be found during the output phase. A LocalObjectToPack now costs us (assuming 32 bit pointers): (32 bit) (64 bit) vm header: 8 bytes 8 bytes ObjectId: 20 bytes 20 bytes PackedObjectInfo: 12 bytes 12 bytes ObjectToPack: 8 bytes 12 bytes LocalOTP: 20 bytes 24 bytes ----------- --------- 68 bytes 74 bytes Change-Id: I923d2736186eb2ac8ab498d3eb137e17930fcb50 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Move FileRepository to storage.file.FileRepository	Shawn O. Pearce	2010-06-26	55	-65/+203
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This move isolates all of the local file specific implementation code into a single package, where their package-private methods and support classes are properly hidden away from the rest of the core library. Because of the sheer number of files impacted, I have limited this change to only the renames and the updated imports. Change-Id: Icca4884e1a418f83f8b617d0c4c78b73d8a4bd17 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Implement zero-copy for single window objects	Shawn O. Pearce	2010-06-26	3	-22/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Objects that fall completely within a single window can be worked with in a zero-copy fashion, provided that the window is backed by a normal byte[] and not by a ByteBuffer. This works for a surprising number of objects. The default window size is 8 KiB, but most deltas are quite a bit smaller than that. Objects smaller than 1/2 of the window size have a very good chance of falling completely within a window's array, which means we can work with them without copying their data around. Larger objects, or objects which are unlucky enough to span over a window boundary, get copied through the temporary buffer. We pay a tiny penalty to realize we can't use the zero-copy code path, but its easier than trying to keep track of two adjacent windows. With this change (as well as everything preceeding it), packing is actually a bit faster. Some crude benchmarks based on cloning linux-2.6.git (~324 MiB, 1,624,785 objects) over localhost using C git client and JGit daemon shows we get better throughput, and slightly better times: Total Time \| Throughput (old) (now) \| (old) (now) --------------+--------------------------- 2m45s 2m37s \| 12.49 MiB/s 21.17 MiB/s 2m42s 2m36s \| 16.29 MiB/s 22.63 MiB/s 2m37s 2m31s \| 16.07 MiB/s 21.92 MiB/s Change-Id: I48b2c8d37f08d7bf5e76c5a8020cde4a16ae3396 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Redo PackWriter object reuse output	Shawn O. Pearce	2010-06-26	12	-294/+512
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Output of selected reuses is refactored to use a new ObjectReuseAsIs interface that extends the ObjectReader. This interface allows the reader to control how it performs the reuse into the output stream, but also allows it to throw an exception to request the writer to find a different candidate representation. The PackFile reuse code was overhauled, cleaning up the APIs so they aren't exposed in the object loader, but instead are now a single method on the PackFile itself. The reuse algorithm was changed to do a data verification pass, followed by the copy pass to the output. This permits us to work around a corrupt object in a pack file by seeking another copy of that object when this one is bad. The reuse code was also optimized for the common case, where the in-pack representation is under 16 KiB. In these smaller cases data is sent to the pack writer more directly, avoiding some copying. Change-Id: I6350c2b444118305e8446ce1dfd049259832bcca Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Redo PackWriter object reuse selection	Shawn O. Pearce	2010-06-26	11	-129/+415
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new selection implementation uses a public API on the ObjectReader, allowing the storage library to enumerate its candidates and select the best one for this packer without needing to build a temporary list of the candidates first. Change-Id: Ie01496434f7d3581d6d3bbb9e33c8f9fa649b6cd Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Reclaim some bits in ObjectToPack flags field	Shawn O. Pearce	2010-06-25	1	-7/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the lower bits available for flags that PackWriter can use to keep track of facts about the object. We shouldn't need more than 2^24 delta depths, unpacking that chain is unfathomable anyway. This change gets us 4 bits that are unused in the lower end of the word, which are typically easier to load from Java and most machine instruction sets. We can use these in later changes. Change-Id: Ib9e11221b5bca17c8a531e4ed130ba14c0e3744f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Extract PackFile specific code to ObjectToPack subclass	Shawn O. Pearce	2010-06-25	5	-52/+148
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ObjectReader class is dual-purposed into being a factory for the ObjectToPack, permitting specific ObjectDatabase implementations to override the method and offer their own custom subclass of the generic ObjectToPack class. By allowing them to directly extend the type, each implementation can add custom fields to support tracking where an object is stored, without incurring any additional penalties like a parallel Map<ObjectId,Object> would cost. The reader was chosen to act as a factory rather than the database, as the reader will eventually be tied more tightly with the ObjectWalk and TreeWalk. During object enumeration the reader would have had to load the object for the RevWalk, and may chose to cache object position data internally so it can later be reused and fed into the ObjectToPack instance supplied to the PackWriter. Since a reader is not thread-safe, and is scoped to this PackWriter and its internal ObjectWalk, its a great place for the database to perform caching, if any. Right now this change goes a bit backwards by changing what should be generic ObjectToPack references inside of PackWriter to the very PackFile specific LocalObjectToPack subclass. We will correct these in a later commit as we start to refine what the ObjectToPack API will eventually look like in order to better support the PackWriter. Change-Id: I9f047d26b97e46dee3bc0ccb4060bbebedbe8ea9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Extract ObjectToPack to be top-level	Shawn O. Pearce	2010-06-25	2	-143/+191
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This shortens the implementation within PackWriter, and starts to open the door for some other refactorings based on changing the ObjectToPack to be a public part of the API. Change-Id: Id849cbffc4de20b903e844a2de7737eeb8b7a3ff Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Allow Repository.getDirectory() to be null	Shawn O. Pearce	2010-06-25	5	-18/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some types of repositories might not be stored on local disk. For these, they will most likely return null for getDirectory() as the java.io.File type cannot describe where their storage is, its not in the host's filesystem. Document that getDirectory() can return null now, and update all current non-test callers in JGit that might run into problems on such repositories. For the most part, just act like its bare. Change-Id: I061236a691372a267fd7d41f0550650e165d2066 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Redo event listeners to be more generic	Shawn O. Pearce	2010-06-25	12	-170/+371
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Rename Repository getWorkDir to getWorkTree	Shawn O. Pearce	2010-06-25	3	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This better matches with the name used in the environment (GIT_WORK_TREE), in the configuration file (core.worktree), and in our builder object. Since we are already breaking a good chunk of other code related to repository access, and this fairly easy to fix in an application's code base, I'm not going to offer the wrapper getWorkDir() method. Change-Id: Ib698ba4bbc213c48114f342378cecfe377e37bb7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Refactor repository construction to builder class	Shawn O. Pearce	2010-06-25	7	-175/+812
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new FileRepositoryBuilder class helps applications to construct a properly configured FileRepository, with properties assumed based upon the standard Git rules for the local filesystem. To better support simple command line applications, environment variable handling and repository searching was moved into this builder class. The change gets rid of the ever-growing FileRepository constructor variants, and the multitude of java.io.File typed parameters, by using simple named setter methods. Change-Id: I17e8e0392ad1dbf6a90a7eb49a6d809388d27e4c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Remove Repository.toFile(ObjectId)	Shawn O. Pearce	2010-06-25	2	-26/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not every type of Repository will be able to map an ObjectId into a local file system path that stores that object's file contents. Heck, its not even true for the FileRepository, as an object can be stored in a pack file and not in its loose format. Remove this from our public API, it was a mistake to publish it. Change-Id: I20d1b8c39104023936e6d46a5b0d7ef39ff118e8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Use ObjectInserter for loose objects in WalkFetchConnection	Shawn O. Pearce	2010-06-25	1	-56/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rather than relying on the repository's ability to give us the local file path for a loose object, just pass its inflated form to the ObjectInserter for the repository. We have to recompress it, which may slow down fetches, but this is the slow dumb protocol. The extra cost to do the compression locally isn't going to be a major bottleneck. This nicely removes the nasty part about computing the object identity by hand, allowing us to instead rely upon the inserter's internal computation. Unfortunately it means we might store a loose object whose SHA-1 doesn't match the expected SHA-1, such as if the remote repository was corrupted. This is fairly harmless, as the incorrectly named object will now be stored under its proper name, and will eventually be garbage collected, as its not referenced by the local repository. We have to flush the inserter after the object is stored because we aren't sure if we need to read the object later, or not. Change-Id: Idb1e2b1af1433a23f8c3fd55aeb20575e6047ef0 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Replace WindowCache with ObjectReader	Shawn O. Pearce	2010-06-25	16	-150/+266
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The WindowCache is an implementation detail of PackFile and how its used by ObjectDirectory. Lets start to hide it and replace the public API with a more generic concept, ObjectReader. Because PackedObjectLoader is also considered a private detail of PackFile, we have to make PackWriter temporarily dependent upon the WindowCursor and thus FileRepository and ObjectDirectory in order to just start the refactoring. In later changes we will clean up the APIs more, exposing sufficient support to PackWriter without needing the file specific implementation details. Change-Id: I676be12b57f3534f1285854ee5de1aa483895398 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Refactor alternate object databases below ObjectDirectory	Shawn O. Pearce	2010-06-25	7	-613/+382
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not every object storage system will have the concept of alternate object databases to search, and even if they do, they may not have the notion of fast-access / slow-access split like we do within the ObjectDirectory code for pack files and loose objects. Push all of that down below the generic API so that it is a hidden detail of the ObjectDirectory and its related supporting classes. Change-Id: I54bc1ca5ff2ac94dfffad1f9a9dad7af202b9523 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Start using ObjectInserter instead of ObjectWriter	Shawn O. Pearce	2010-06-25	7	-61/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some newer style APIs are updated to use the newer ObjectInserter interface instead of the now deprecated ObjectWriter. In many of the unit tests we don't bother to release the inserter, these are typically using the file backend which doesn't need a release, but in the future should use an in-memory HashMap based store, which really wouldn't need it either. Change-Id: I91a15e1dc42da68e6715397814e30fbd87fa2e73 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
\| * \|	Refactor object writing responsiblities to ObjectDatabase	Shawn O. Pearce	2010-06-25	11	-278/+694
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ObjectInserter API permits ObjectDatabase implementations to control their own object insertion behavior, rather than forcing it to always be a new loose file created in the local filesystem. Inserted objects can also be queued and written asynchronously to the main application, such as by appending into a pack file that is later closed and added to the repository. This change also starts to open the door to non-file based object storage, such as an in-memory HashMap for unit testing, or a more complex system built on top of a distributed hash table. To help existing application code port to the newer interface we are keeping ObjectWriter as a delegation wrapper to the new API. Each ObjectWriter instances holds a reference to an ObjectInserter for the Repository's top-level ObjectDatabase, and it flushes and releases that instance on each object processed. Change-Id: I413224fb95563e7330c82748deb0aada4e0d6ace Signed-off-by: Shawn O. Pearce <spearce@spearce.org>