mirrors/jgit - jgit - source @ dussan.org

677 Commits

49 Branches

Tree: 741659fed4

Commit Graph

Author	SHA1	Message	Date
Shawn O. Pearce	741659fed4	DeltaStream: Fix data corruption when reading large copies If the copy instruction was larger than the input buffer given to us, we copied the wrong part of the base stream during the next read(). This occurred on really big binary files where a copy instruction of 64k wasn't unreasonable, but the caller's buffer was only 8192 bytes long. We copied the first 8192 bytes correctly, but then reseeked the base stream back to the start of the copy region on the second read of 8192 bytes. Instead of a sequence like ABCD being read into the caller, we read AAAA. Change-Id: I240a3f722a3eda1ce8ef5db93b380e3bceb1e201 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	ad68553be4	Support large delta packed objects as streams Very large delta instruction streams, or deltas which use very large base objects, are now streamed through as large objects rather than being inflated into a byte array. This isn't the most efficient way to access delta encoded content, as we may need to rewind and reprocess the base object when there was a block moved within the file, but it will at least prevent the JVM from having its heap explode. When streaming a delta we have an inflater open for each level in the delta chain, to inflate the instruction set of the delta, as well as an inflater for the base level object. The base object is buffered, as is the top level delta requested by the application, but we do not buffer the intermediate delta streams. This keeps memory usage lower, so its closer to 1024 bytes per level in the chain, without having an adverse impact on raw throughput as the top-level buffer gets pushed down to the lowest stream that has the next region. Delta instructions transparently collapse here, if the top level does not copy a region from its base, the base won't materialize that part from its own base, etc. This allows us to avoid copying around a lot of segments which have been deleted from the final version. Change-Id: I724d45245cebb4bad2deeae7b896fc55b2dd49b3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago

Author

SHA1

Message

Date

Shawn O. Pearce

741659fed4

DeltaStream: Fix data corruption when reading large copies

If the copy instruction was larger than the input buffer given to us,
we copied the wrong part of the base stream during the next read().

This occurred on really big binary files where a copy instruction
of 64k wasn't unreasonable, but the caller's buffer was only 8192
bytes long.  We copied the first 8192 bytes correctly, but then
reseeked the base stream back to the start of the copy region on
the second read of 8192 bytes.  Instead of a sequence like ABCD
being read into the caller, we read AAAA.

Change-Id: I240a3f722a3eda1ce8ef5db93b380e3bceb1e201
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

13 years ago

Shawn O. Pearce

ad68553be4

Support large delta packed objects as streams

Very large delta instruction streams, or deltas which use very large
base objects, are now streamed through as large objects rather than
being inflated into a byte array.

This isn't the most efficient way to access delta encoded content, as
we may need to rewind and reprocess the base object when there was a
block moved within the file, but it will at least prevent the JVM from
having its heap explode.

When streaming a delta we have an inflater open for each level in the
delta chain, to inflate the instruction set of the delta, as well as
an inflater for the base level object.  The base object is buffered,
as is the top level delta requested by the application, but we do not
buffer the intermediate delta streams.  This keeps memory usage lower,
so its closer to 1024 bytes per level in the chain, without having an
adverse impact on raw throughput as the top-level buffer gets pushed
down to the lowest stream that has the next region.

Delta instructions transparently collapse here, if the top level does
not copy a region from its base, the base won't materialize that part
from its own base, etc.  This allows us to avoid copying around a lot
of segments which have been deleted from the final version.

Change-Id: I724d45245cebb4bad2deeae7b896fc55b2dd49b3
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

14 years ago

2 Commits (741659fed4e73b71693b408e137804b17e65f0cb)