... | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When the add parameter is set all modified and deleted files
are staged prior to commit.
Change-Id: Id23bc25730fcdd151386cd495a7cdc0935cbc00b
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
|
|\| | | |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This change is mainly done for a subsequent commit
which will introduce the "all" parameter to the Commit
command.
Bug: 318439
Change-Id: I85a8a76097d0197ef689a289288ba82addb92fc9
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
It is useful to be able to replace an existing Change-Id
in the message, for example if the user decides not to
amend the previous commit.
Bug: 321188
Change-Id: I594e7f9efd0c57d794d2bd26d55ec45f4e6a47fd
Signed-off-by: Stefan Lay <stefan.lay@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
TreeWalk calls this value "path", while "name" is the stuff after the
last slash. FileHeader should do the same thing to be consistent.
Rename getOldName to getOldPath and getNewName to getNewPath.
Bug: 318526
Change-Id: Ib2e372ad4426402d37939b48d8f233154cc637da
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|\ \ \ \
| | |_|/
| |/| |
| | | |
| | | | |
* js/diff:
Fixed bug in scoring mechanism for rename detection
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
A bug in rename detection would cause file scores to be wrong. The
bug was due to the way rename detection would judge the similarity
between files. If file A has three lines containing 'foo', and file
B has 5 lines containing 'foo', the rename detection phase should
record that A and B have three lines in common (the minimum of the
number of times that line appears in both files). Instead, it would
choose the number of times the line appeared in the destination
file, in this case file B. I fixed the bug by having the
SimilarityIndex instead choose the minimum number, as it should. I
also added a test case to verify that the bug had been fixed.
Change-Id: Ic75272a2d6e512a361f88eec91e1b8a7c2298d6b
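
The corrected scoring rule can be illustrated with a minimal sketch. It is not the SimilarityIndex code itself: the class and method names are hypothetical, and it compares whole lines held in a map rather than whatever hashed representation the real class uses.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not the actual SimilarityIndex implementation.
final class LineOverlapSketch {
    /** Count how many times each distinct line occurs. */
    static Map<String, Integer> countLines(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines)
            counts.merge(line, 1, Integer::sum);
        return counts;
    }

    /** Lines in common: for each line, the minimum of its two counts. */
    static int commonLines(List<String> src, List<String> dst) {
        Map<String, Integer> a = countLines(src);
        Map<String, Integer> b = countLines(dst);
        int common = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int inDst = b.getOrDefault(e.getKey(), 0);
            // The bug described above amounts to adding inDst here
            // instead of the minimum of the two counts.
            common += Math.min(e.getValue(), inDst);
        }
        return common;
    }
}
```

With three 'foo' lines in A and five in B, this counts three lines in common in either direction.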
|
| |/ /
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
IndexDiff was re-implemented and now uses TreeWalk instead
of GitIndex. Additionally, gitignore support and retrieval of
untracked files were added.
Change-Id: Ie6a8e04833c61d44c668c906b161202b200bb509
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The CommitCommand should not use java.io to delete MERGE_HEAD and MERGE_MSG
files since Repository already has utility methods for that.
Change-Id: If66a419349b95510e5b5c2237a91f06c1d5ba0d4
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
|
|\ \ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
By deferring tag sorting until the commit is produced by the walker
we can avoid an infinite loop that was triggered by trying to sort
tags while allocating a commit. This also avoids needing to look
at commits which aren't going to be produced in the result.
Bug: 321103
Change-Id: I25acc739db2ec0221a50b72c2d2aa618a9a75f37
Reviewed-by: Mathias Kinzler <mathias.kinzler@sap.com>
Reviewed-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, a NullPointerException occurs in this case. We should
instead throw a more meaningful Exception with a proper message.
This is a very "stupid" implementation which simply checks for
the existence of a ".gitmodules" file.
Bug: 300731
Bug: 306765
Bug: 308452
Bug: 314853
Change-Id: I155aa340a85cbc5d7d60da31dba199fc30689b67
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
|
|/ / /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The following tests fail under Windows because certain input streams
are not closed and files cannot be deleted because of that. The
main problem I found is UnpackedObject.InflaterInputStream.close().
This method may throw exceptions found by checkValidEndOfStream()
but doesn't call super.close() before leaving. It is not clear to me
which resources a close() method should release before it throws an
exception. But those resources which are not published to the
outside and which therefore cannot be closed by other means have to
be closed in all cases.
I changed the close() method to call super.close() under all
circumstances.
failing tests:
testStandardFormat_LargeObject_TruncatedZLibStream(org.eclipse.jgit.storage.file.UnpackedObjectTest)
testStandardFormat_LargeObject_TrailingGarbage(org.eclipse.jgit.storage.file.UnpackedObjectTest)
testPackFormat_SmallObject(org.eclipse.jgit.storage.file.UnpackedObjectTest)
Change-Id: Id2e609a29e725aad953ff9bd88af6381df38399d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
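
The "call super.close() under all circumstances" pattern can be sketched with try/finally. The class below is a hypothetical stand-in, not UnpackedObject's inner stream; the `valid` flag only simulates what a real end-of-stream check would detect.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a stream that validates itself on close.
final class ValidatingStreamSketch extends FilterInputStream {
    private final boolean valid; // simulates the outcome of the validation

    ValidatingStreamSketch(InputStream in, boolean valid) {
        super(in);
        this.valid = valid;
    }

    private void checkValidEndOfStream() throws IOException {
        if (!valid)
            throw new IOException("corrupt stream");
    }

    @Override
    public void close() throws IOException {
        try {
            checkValidEndOfStream();
        } finally {
            // Runs even when the check throws, so the wrapped stream
            // (and any file handle behind it) is always released.
            super.close();
        }
    }
}
```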
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Add a method isDirectoryFileConflict() to NameConflictTreeWalk which
tells whether the current path is part of a directory/file conflict.
Change-Id: Iffcc7090aaec743dd6f3fd1a333cac96c587ae5d
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| |/
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is caused by a recursion in PlotWalk.getTags().
As a hotfix, the sort was simply removed. The sort
must be re-implemented so that parseAny() is not called
again (currently, this happens in the PlotRefComparator).
Change-Id: I060d26fda8a75ac803acaf89cfb7d3b4317328f3
Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
File pairs that are very dissimilar during a diff were not being
broken apart into their constituent ADD/DELETE pairs. This leads to
sub-optimal rename detection. Take, for example, this situation:
A file exists at src/a.txt containing "foo". A user renames src/a.txt
to src/b.txt, then adds a new src/a.txt containing "bar".
Even though the old a.txt and the new b.txt are identical, the
rename detection algorithm would not detect it as a rename since
it was already paired in a MODIFY. I added code to split all
MODIFYs below a certain score into their constituent ADD/DELETE
pairs. This allows situations like the one I described above to be
more correctly handled.
Change-Id: I22c04b70581f206bbc68c4cd1ee87a1f663b418e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
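
A minimal sketch of the splitting step described above, assuming a simplified entry type with a precomputed similarity score. The names `Entry`, `Type`, and `breakScore` are hypothetical and do not mirror the real DiffEntry or RenameDetector API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the "break MODIFY into ADD/DELETE" step.
final class BreakModifySketch {
    enum Type { ADD, DELETE, MODIFY }

    static final class Entry {
        final Type type;
        final String path;
        final int score; // similarity of old vs. new content, 0..100

        Entry(Type type, String path, int score) {
            this.type = type;
            this.path = path;
            this.score = score;
        }
    }

    /** Split every MODIFY scoring below breakScore into DELETE + ADD. */
    static List<Entry> breakModifies(List<Entry> in, int breakScore) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : in) {
            if (e.type == Type.MODIFY && e.score < breakScore) {
                // Both halves become candidates for rename matching.
                out.add(new Entry(Type.DELETE, e.path, 0));
                out.add(new Entry(Type.ADD, e.path, 0));
            } else {
                out.add(e);
            }
        }
        return out;
    }
}
```

In the src/a.txt example, the dissimilar MODIFY of a.txt becomes a DELETE and an ADD, which leaves the deleted a.txt free to pair with the added b.txt as a rename.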
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add methods to the Repository class which write into MERGE_HEAD
and MERGE_MSG files. Since we have the read methods in the same
class this seems to be the right place.
Change-Id: I5dd65306ceb06e008fcc71b37ca3a649632ba462
Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
LockFile.commit fails if another thread concurrently reads
the base file. The problem is fixed by retrying the rename
operation if it fails.
Change-Id: I6bb76ea7f2e6e90e3ddc45f9dd4d69bd1b6fa1eb
Bug: 308506
Signed-off-by: Jens Baumgart <jens.baumgart@sap.com>
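
The retry idea can be sketched generically. The helper below is hypothetical (LockFile's real loop, attempt count, and pause are not shown in this message); it simply repeats an operation that reports success as a boolean, the way File.renameTo does.

```java
import java.util.function.BooleanSupplier;

// Hypothetical retry helper; attempt count and pause are assumptions.
final class RetrySketch {
    /** Run op until it succeeds or the attempts are used up. */
    static boolean retry(BooleanSupplier op, int attempts, long pauseMillis) {
        for (int i = 0; i < attempts; i++) {
            if (op.getAsBoolean())
                return true;
            if (i + 1 < attempts) {
                try {
                    // Give a concurrent reader time to release the file.
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

A caller might write `RetrySketch.retry(() -> lockFile.renameTo(target), 10, 100)`, where `lockFile` and `target` are `java.io.File` instances.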
|
| |
| |
| |
| |
| |
| |
| |
| | |
There were some broken links, incorrect uses of @value, an invalid
tag and an outdated comment.
Change-Id: I22886bcc869a4b62bd606ebed40669f7b4723664
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This simplifies the logic for those who already have an ObjectReader
on hand and want to reuse it to look up a single path.
Change-Id: Ief17d6b2a0674ddb34bbc9f43121b756eae960fb
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This exposes a load and save method, allowing a Repository to denote
that it has a persistent configuration of some kind which can be
accessed by the application, without needing to know exact details
of how it is stored.
Change-Id: I7c414bc0f975b80f083084ea875eca25c75a07b2
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* delta: (103 commits)
Discard the uncompressed delta as soon as its compressed
Honor pack.windowlimit to cap memory usage during packing
Honor pack.threads and perform delta search in parallel
Cache small deltas during packing
Implement delta generation during packing
debug-show-packdelta: Dump a pack delta to the console
Initial pack format delta generator
Add debugging toString() method to ObjectToPack
Make ObjectToPack clearReuseAsIs signal available to subclasses
Correctly classify the compressing objects phase
Refactor ObjectToPack's delta depth setting
Configure core.bigFileThreshold into PackWriter
Add doNotDelta flag to ObjectToPack
Add more configuration options to PackWriter
Save object path hash codes during packing
Add path hash code to ObjectWalk
Add getObjectSize to ObjectReader
Allow TemporaryBuffer.Heap to allocate smaller than 8 KiB
Define a constant for 127 in DeltaEncoder
Cap delta copy instructions at 64k
...
Conflicts:
org.eclipse.jgit.pgm/src/org/eclipse/jgit/pgm/Diff.java
org.eclipse.jgit/resources/org/eclipse/jgit/JGitText.properties
org.eclipse.jgit/src/org/eclipse/jgit/JGitText.java
org.eclipse.jgit/src/org/eclipse/jgit/revwalk/RewriteTreeFilter.java
Change-Id: I7c7a05e443a48d32c836173a409ee7d340c70796
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The DeltaCache will most likely need to copy the compressed delta
into a new buffer in order to compact away the wasted space at the
end caused by over allocation. Since we don't need the uncompressed
format anymore, null out our only reference to it so the GC can
reclaim this memory if it needs to perform a collection in order
to satisfy the cache's allocation attempt.
Change-Id: I50403cfd2e3001b093f93a503cccf7adab43cc9d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The pack.windowlimit configuration parameter places an upper bound
on the number of bytes used by the DeltaWindow class as it scans
through the object list. If memory usage would exceed the limit
the window is temporarily decreased in size to keep memory used
within that bound.
Change-Id: I09521b8f335475d8aee6125826da8ba2e545060d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If we have multiple CPUs available, packing usually goes faster
when each CPU is assigned a slice of the available search space.
The number of threads to use is guessed from the runtime if it
wasn't set by the caller, or wasn't set in the configuration.
Change-Id: If554fd8973db77632a52a0f45377dd6ec13fc220
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
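
The fallback order described above could look roughly like this. The sketch assumes that a value of 0 means "not set"; the method and parameter names are hypothetical.

```java
// Hypothetical resolution order: caller, then configuration, then runtime.
final class PackThreadsSketch {
    static int resolveThreads(int callerSetting, int configSetting) {
        if (callerSetting > 0)
            return callerSetting;
        if (configSetting > 0)
            return configSetting;
        // Guess from the runtime when nothing was set explicitly.
        return Runtime.getRuntime().availableProcessors();
    }
}
```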
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
PackWriter now caches small deltas, or deltas that are very tiny
compared to their source inputs, so that the writing phase goes
faster by reusing those cached deltas.
The cached data is stored compressed, which usually translates to
a bigger footprint due to deltas being very hard to compress, but
saves time during writing by avoiding the deflate step. They are
held under SoftReferences so that the JVM GC can clear out deltas
if memory gets very tight. We would rather continue working and
spend a bit more CPU time during writing than crash due to OOME.
To avoid OutOfMemoryErrors during the caching phase we also trap
OOME and just abort out of the caching.
Because deflateBound() always produces something larger than what
we need to actually store the deflated data, we copy it over into
a new buffer if the actual length doesn't match the buffer length.
When packing jgit.git this saves over 111 KiB in the cache, and is
thus a worthwhile hit on CPU time.
To further save memory we store the inflated size of the delta
(which we need for the object header) in the same field as the
pathHash, as the pathHash is no longer necessary by this phase
of the packing algorithm.
Change-Id: I0da0c600d845e8ec962289751f24e65b5afa56d7
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
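
The caching steps above (deflate into an over-allocated buffer, copy into an exact-size array, hold it softly, give up on OutOfMemoryError) can be sketched with the standard java.util.zip classes. The class name and the buffer sizing below are assumptions, not the DeltaCache code.

```java
import java.lang.ref.SoftReference;
import java.util.Arrays;
import java.util.zip.Deflater;

// Hypothetical sketch of caching a compressed delta.
final class DeltaCacheSketch {
    /** Deflate the delta and return an array trimmed to the real length. */
    static byte[] deflateTrimmed(byte[] delta) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(delta);
            deflater.finish();
            // Deliberately over-allocate, as a deflateBound()-style
            // estimate would; the slack is trimmed away below.
            byte[] buf = new byte[delta.length + (delta.length >> 3) + 64];
            int len = 0;
            while (!deflater.finished()) {
                if (len == buf.length)
                    buf = Arrays.copyOf(buf, buf.length * 2);
                len += deflater.deflate(buf, len, buf.length - len);
            }
            return len == buf.length ? buf : Arrays.copyOf(buf, len);
        } finally {
            deflater.end();
        }
    }

    /** Returns a soft reference to the compressed delta, or null if out of memory. */
    static SoftReference<byte[]> cache(byte[] delta) {
        try {
            return new SoftReference<>(deflateTrimmed(delta));
        } catch (OutOfMemoryError tooBig) {
            return null; // skip caching; the delta is rebuilt while writing
        }
    }
}
```

Because the entry is only softly reachable, a reader of the cache has to handle `get()` returning null and recompute the delta in that case.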
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
PackWriter now produces new deltas if there is not a suitable delta
available for reuse from an existing pack file. This permits JGit to
send less data on the wire by sending a delta relative to an object
the other side already has, instead of sending the whole object.
The delta searching algorithm is similar in style to what C Git
uses, but apparently has some differences (see below for more on this).
Briefly, objects that should be considered for delta compression are
pushed onto a list. This list is then sorted by a rough similarity
score, which is derived from the path name the object was discovered
at in the repository during object counting. The list is then
walked in order.
At each position in the list, up to $WINDOW objects prior to it
are attempted as delta bases. Each object in the window is tried,
and the shortest delta instruction sequence selects the base object.
Some rough rules are used to prevent pathological behavior during
this matching phase, like skipping pairings of objects that are
not similar enough in size.
PackWriter intentionally excludes commits and annotated tags from
this new delta search phase. In the JGit repository only 28 out
of 2600+ commits can be delta compressed by C Git. As the commit
count tends to be a fair percentage of the total number of objects
in the repository, and they generally do not delta compress well,
skipping over them can improve performance with little increase in
the output pack size.
Because this implementation was rebuilt from scratch based on my own
memory of how the packing algorithm has evolved over the years in
C Git, PackWriter, DeltaWindow, and DeltaEncoder don't use exactly
the same rules everywhere, and that leads JGit to produce different
(but logically equivalent) pack files.
Repository | Pack Size (bytes)                | Packing Time
           | JGit     - CGit     = Difference | JGit    / CGit
-----------+----------------------------------+-------------------
git        | 25094348 - 24322890 = +771458    | 59.434s / 59.133s
jgit       |  5669515 -  5709046 = -39531     |  6.654s /  6.806s
linux-2.6  |     389M -     386M = +3M        | 20m02s  / 18m01s
For the above tests pack.threads was set to 1, window size=10,
delta depth=50, and delta and object reuse was disabled for both
implementations. Both implementations were reading from an already
fully packed repository on local disk. The running time reported
is after 1 warm-up run of the tested implementation.
PackWriter is writing 771 KiB more data on git.git, 3M more on
linux-2.6, but is actually 39.5 KiB smaller on jgit.git. Being
larger by less than 0.7% on linux-2.6 isn't bad, nor is taking an
extra 2 minutes to pack. On the running time side, JGit is at a
major disadvantage because linux-2.6 doesn't fit into the default
WindowCache of 20M, while C Git is able to mmap the entire pack and
have it available instantly in physical memory (assuming hot cache).
CGit also has a feature where it caches deltas that were created
during the compression phase, and uses those cached deltas during
the writing phase. PackWriter does not implement this (yet),
and therefore must create every delta twice. This could easily
account for the increased running time we are seeing.
Change-Id: I6292edc66c2e95fbe45b519b65fdb3918068889c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
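
The window walk described in this message can be sketched as follows. This is a simplified model, not the DeltaWindow code: the list is assumed to be already sorted by similarity, the delta size is supplied by a caller-provided function, `mismatchCost` is a made-up stand-in for a real delta encoder, and the "not similar enough in size" rule is an arbitrary factor of two.

```java
import java.util.List;
import java.util.function.ToIntBiFunction;

// Hypothetical sketch of choosing a delta base from a sliding window.
final class DeltaWindowSketch {
    /** Returns, per object, the index of its chosen base or -1 for none. */
    static int[] chooseBases(List<byte[]> objects, int window,
            ToIntBiFunction<byte[], byte[]> deltaSize) {
        int[] base = new int[objects.size()];
        for (int i = 0; i < objects.size(); i++) {
            byte[] target = objects.get(i);
            int best = -1;
            int bestSize = target.length; // a delta must beat the whole object
            for (int j = Math.max(0, i - window); j < i; j++) {
                byte[] cand = objects.get(j);
                // Rough rule (assumed): skip pairs too different in size.
                if (cand.length * 2L < target.length
                        || target.length * 2L < cand.length)
                    continue;
                int size = deltaSize.applyAsInt(cand, target);
                if (size < bestSize) {
                    bestSize = size;
                    best = j;
                }
            }
            base[i] = best;
        }
        return base;
    }

    /** Stand-in cost: differing positions plus the length difference. */
    static int mismatchCost(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int cost = Math.abs(a.length - b.length);
        for (int i = 0; i < n; i++)
            if (a[i] != b[i])
                cost++;
        return cost;
    }
}
```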
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This is a horribly crude application; it doesn't even verify that
the object it's dumping is delta encoded. Its method of getting the
delta is pretty abusive to the public PackWriter API, because right
now we don't want to expose the real internal low-level methods
actually required to do this.
Change-Id: I437a17ceb98708b5603a2061126eb251e82f4ed4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
DeltaIndex is a simple pack style delta generator. The function works
by creating a compact index of a source buffer's blocks, and then
walking a sliding window along a desired result buffer, searching for
the window in the index. When a match is found, the window is
stretched to the longest possible length that is common with the
source buffer, and a copy instruction is created.
Rabin's polynomial hash function is used to compute the hash for a
block, permitting efficient sliding of the window in single byte
increments. The update function to slide one byte originated from
David Mazieres' work in LBFS, and our implementation of the update
step was certainly inspired by the initial work Geert Bosch proposed
for C Git in http://marc.info/?l=git&m=114565424620771&w=2.
To ensure the encoder runs in linear time with respect to the size of
the two input buffers (source and result), the maximum number of
blocks that can share the same position in the index's hashtable is
capped at a constant number. This prevents bad inputs from causing
the encoder to run in quadratic time, but comes with a penalty of
creating a longer delta due to fewer considered copy positions.
Strange hackery is used to cap the amount of memory used by the index
to be no more than 12 bytes for every 16 bytes of source buffer, no
matter what the JVM per-object overhead is. This permits an index to
always be no larger than 1.75x the source buffer length, which is an
important feature to support large windows of candidates to match
against while packing. Here the strange hackery is nothing more than
a manually managed chained hashtable, where pointers are array indexes
into storage arrays rather than object references.
Computation of the hash function for a single fixed sized block is
done through an unrolled loop, where the first 4 iterations have been
manually reduced down to eliminate unnecessary instructions. The
pattern is derived from ObjectId.equals(byte[], int, byte[], int),
where we have unrolled the loop required to compare two 20 byte
arrays. Hours of testing with the Sun 1.6 JRE concluded that the
non-obvious "foo[idx + 1]" style of reference is faster than
"foo[idx++]", and so that is what we use here during hashing.
Change-Id: If9fb2a1524361bc701405920560d8ae752221768
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
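
The "manually managed chained hashtable" idea can be shown in a reduced sketch. It is not DeltaIndex: it swaps Rabin's polynomial hash for a plain multiplicative hash, does no window sliding or match stretching, and the chain cap of 4 is an arbitrary assumption. What it does show is buckets and chains stored as int arrays whose values are entry indexes rather than object references.

```java
// Hypothetical block index using int arrays instead of node objects.
final class BlockIndexSketch {
    static final int BLOCK = 16;
    static final int MAX_CHAIN = 4; // assumed cap on entries per bucket

    private final byte[] src;
    private final int[] table; // bucket -> entry number + 1, 0 means empty
    private final int[] next;  // entry -> next entry number + 1, 0 ends chain
    private final int mask;

    BlockIndexSketch(byte[] src) {
        this.src = src;
        int blocks = src.length / BLOCK;
        int size = Integer.highestOneBit(Math.max(1, blocks)) << 1;
        this.table = new int[size];
        this.next = new int[blocks];
        this.mask = size - 1;
        for (int b = 0; b < blocks; b++) {
            int bucket = hash(src, b * BLOCK) & mask;
            // Capping the chain bounds the lookup cost per position.
            if (chainLength(bucket) < MAX_CHAIN) {
                next[b] = table[bucket];
                table[bucket] = b + 1;
            }
        }
    }

    private int chainLength(int bucket) {
        int n = 0;
        for (int e = table[bucket]; e != 0; e = next[e - 1])
            n++;
        return n;
    }

    /** Simple stand-in hash over one block (not Rabin's function). */
    static int hash(byte[] buf, int off) {
        int h = 0;
        for (int i = 0; i < BLOCK; i++)
            h = h * 31 + (buf[off + i] & 0xff);
        return h ^ (h >>> 16);
    }

    /** Source offset of a block equal to dst[dstOff..dstOff+BLOCK), or -1. */
    int find(byte[] dst, int dstOff) {
        if (dstOff < 0 || dstOff + BLOCK > dst.length)
            return -1;
        for (int e = table[hash(dst, dstOff) & mask]; e != 0; e = next[e - 1]) {
            int srcOff = (e - 1) * BLOCK;
            if (sameBlock(dst, dstOff, srcOff))
                return srcOff;
        }
        return -1;
    }

    private boolean sameBlock(byte[] dst, int dstOff, int srcOff) {
        for (int i = 0; i < BLOCK; i++)
            if (src[srcOff + i] != dst[dstOff + i])
                return false;
        return true;
    }
}
```

Since an entry number is the block number, the block's source offset is recovered by multiplication and needs no storage of its own.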
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It's useful to know what the flags are or what the base that was
selected is. Dump these out as part of the object's toString.
Change-Id: I8810067fb8337b08b4fcafd5f9ea3e1e31ca6726
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
A subclass may want to use this method to release handles that are
caching reuse information. Make it protected so they can override
it and update themselves.
Change-Id: I2277a56ad28560d2d2d97961cbc74bc7405a70d4
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Searching for reuse candidates should be fast compared to actually
doing delta compression. So pull the progress monitor out of this
phase and rename it back to identify the compressing objects state.
Change-Id: I5eb80919f21c1251e0e3420ff7774126f1f79b27
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Long ago when PackWriter was first written we thought that the delta
depth could be updated automatically. But it's never used. Instead
make this a simple standard setter so the caller can more directly
set the delta depth of this object. This permits us to configure a
depth that takes into account more than just the depth of another
object in this same pack.
Change-Id: I1d71b74f2edd7029b8743a2c13b591098ce8cc8f
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
C Git's fast-import uses this to determine the maximum file size
that it tries to delta compress, anything equal to or above this
setting is stored as a whole object with simple deflate.
Define the configuration so we can use it later.
Change-Id: Iea46e787d019a1b6c51135cc73d7688a02e207f5
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This flag will later control whether or not PackWriter searches for a
delta base for this object. Edge objects will never get searched,
as the writer won't be outputting them, so they should always have
this flag set on. Sometime in the future this flag should also be
set for file blobs on file paths that have the "-delta" gitattribute
set in the repository's attributes file.
Change-Id: I6e518e1a6996c8ce00b523727f1b605e400e82c6
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We now at least import other pack settings like pack.window, which
means we can later use these to control how we search for deltas.
The compression level was fixed to use pack.compression rather than
the loose object core.compression setting.
Change-Id: I72ff6d481c936153ceb6a9e485fa731faf075a9a
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We need to remember these so we can later cluster objects that
have similar file paths near each other as we search for deltas
between them.
Change-Id: I52cb1e4ca15c9c267a2dbf51dd0d795f885f4cf8
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
PackWriter wants to categorize objects that are similar in path name,
so blobs that are probably from the same file (or same sort of file)
can be delta compressed against each other. Avoid converting into
a string by performing the hashing directly against the path buffer
in the tree iterator.
We only hash the last 16 bytes of the path, and we try to avoid any
spaces, as we want the suffix of a file such as ".java" to be more
important than the directory it is in, like "src".
Change-Id: I31770ee711526306769a6f534afb19f937e0ba85
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
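
A sketch of a hash with the properties described above: only the last 16 bytes contribute, spaces are skipped, and later bytes (the file suffix) land in the most significant bits. The exact mixing step is an assumption made for illustration, not a statement of what ObjectWalk computes.

```java
// Hypothetical path hash; the shift-and-add mixing is an assumption.
final class PathHashSketch {
    static int pathHash(byte[] path, int len) {
        int hash = 0;
        for (int p = Math.max(0, len - 16); p < len; p++) {
            byte c = path[p];
            if (c != ' ') {
                // Older bytes drift toward the low bits; the newest byte
                // occupies the top bits, so the suffix dominates ordering.
                hash = (hash >>> 2) + (c << 24);
            }
        }
        return hash;
    }
}
```

Two paths that end in the same 16 bytes hash identically regardless of their leading directories, which is what lets files with the same name and suffix sort near each other.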
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This is an informational function used by PackWriter to help it
better organize objects for delta compression. Storage systems
can implement it to provide more detailed size information,
or they can simply rely on the default behavior that uses the
ObjectLoader obtained from open.
For local file storage, we can obtain this information faster
through specialized routines that parse a pack object header.
Change-Id: I13a09b4effb71ea5151b51547f7d091564531e58
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
If the heap limit was set to something smaller than 8 KiB, we were
still allocating the full 8 KiB block size, and accepting up to
the amount we allocated. Instead actually put a hard cap on
the limit.
Change-Id: Id1da26fde2102e76510b1da4ede8493928a981cc
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The special value 127 here means how many bytes we can put into
a single insert command. Rather than use the magical value 127,
let's name it to better document the code.
Change-Id: I5a326f4380f6ac87987fa833e9477700e984a88e
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
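
For context, an insert command in the pack delta format is one command byte with the high bit clear whose low 7 bits give the number of literal bytes that follow, which is where the limit of 127 comes from. A minimal sketch, with a hypothetical class and constant name rather than the ones in DeltaEncoder:

```java
import java.io.ByteArrayOutputStream;

// Hypothetical encoder for literal data as delta insert commands.
final class InsertEncodeSketch {
    /** Most literal bytes a single insert command can carry. */
    static final int MAX_INSERT_DATA_SIZE = 127;

    static byte[] encodeInsert(byte[] text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (pos < text.length) {
            int n = Math.min(MAX_INSERT_DATA_SIZE, text.length - pos);
            out.write(n);             // command byte: length, high bit clear
            out.write(text, pos, n);  // the literal bytes themselves
            pos += n;
        }
        return out.toByteArray();
    }
}
```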
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Although all modern delta decoders can process copy instructions
with a count as large as 0xffffff (~15.9 MiB), pack version 2 streams
are only supposed to use delta copy instructions up to 64 KiB.
Rewrite our copy instruction encode loop to use the lower 64 KiB
limit, even though modern decoders would support longer copies.
To improve encoding performance we now try to encode up to four full
copy commands in our buffer before we flush it to the stream, but
we don't try to implement full buffering here. We are just trying
to amortize the virtual method call to the destination stream when
we have to do a large copy.
Change-Id: I9410a16e6912faa83180a9788dc05f11e33fabae
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
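
A sketch of splitting one long copy into instructions of at most 64 KiB each, using the pack delta copy layout: a command byte with the high bit set, flag bits 0x01 to 0x08 for the offset bytes that are present, and 0x10 and 0x20 for the size bytes. An instruction with no size bytes is read by decoders as a 0x10000-byte copy. The class and method names are hypothetical, and the four-command buffering mentioned above is left out.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical copy-instruction encoder limited to 64 KiB per command.
final class CopyEncodeSketch {
    static final int MAX_V2_COPY = 0x10000;

    static void encodeCopy(ByteArrayOutputStream out, long offset, int count) {
        while (count > 0) {
            int n = Math.min(MAX_V2_COPY, count);
            writeCopy(out, offset, n);
            offset += n;
            count -= n;
        }
    }

    private static void writeCopy(ByteArrayOutputStream out, long offset, int n) {
        byte[] args = new byte[6];
        int p = 0;
        int cmd = 0x80;
        // Emit only the non-zero offset bytes, flagging each one present.
        for (int i = 0; i < 4; i++) {
            int b = (int) ((offset >>> (8 * i)) & 0xff);
            if (b != 0) {
                cmd |= 1 << i;
                args[p++] = (byte) b;
            }
        }
        // A full 0x10000 copy is written with no size bytes at all.
        if (n != MAX_V2_COPY) {
            for (int i = 0; i < 2; i++) {
                int b = (n >>> (8 * i)) & 0xff;
                if (b != 0) {
                    cmd |= 0x10 << i;
                    args[p++] = (byte) b;
                }
            }
        }
        out.write(cmd);
        out.write(args, 0, p);
    }
}
```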
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The encode loop had the wrong condition: objects that are 128 bytes
in size need to have their length encoded as two bytes, not one.
Change-Id: I3bef85f2b774871ba6104042b341749eb8e7595c
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
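
The boundary case can be seen in a small sketch, assuming the length in question uses the plain variable-length form (7 bits per byte, least significant group first, high bit set on every byte except the last). The class name is hypothetical.

```java
import java.util.Arrays;

// Hypothetical varint encoder; assumes a non-negative size.
final class VarIntSketch {
    static byte[] encode(long size) {
        byte[] buf = new byte[10];
        int n = 0;
        // ">=" matters: 0x80 itself does not fit in 7 bits. A loop
        // written with ">" would emit 128 as one byte (0x80).
        while (size >= 0x80) {
            buf[n++] = (byte) (0x80 | (size & 0x7f));
            size >>>= 7;
        }
        buf[n++] = (byte) size;
        return Arrays.copyOf(buf, n);
    }
}
```

Under this scheme 127 encodes as the single byte 0x7f, while 128 encodes as the two bytes 0x80 0x01.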
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Rename the ByteWindow's inflate() method to setInput. We have
completely refactored the purpose of this method to be feeding part
(or all) of the window as input to the Inflater, and the actual
inflate activity happens in the caller.
Change-Id: Ie93a5bae0e9e637b5e822d56993ce6b562c6ad15
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We need to validate the stream state after the InflaterInputStream
thinks the stream is done. Git expects a higher level of service from
the Inflater than the InflaterInputStream usually gives; we need to
ensure the embedded CRC is valid, and that there isn't trailing
garbage at the end of the file.
Change-Id: I1c9642a82dbd76b69e607dceccf8b85dc869a3c1
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Alex pointed out that my description of a bare repository might be
confusing for some readers. Reword the description of the error,
and make it consistent throughout the Repository class's API.
Change-Id: I87929ddd3005f578a7022f363270952d1f7f8664
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
During code review, Alex raised a few comments about commit
532421d98925 ("Refactor repository construction to builder class").
Due to the size of the related series we aren't going to go back
and rebase in something this minor, so resolve them as a follow-up
commit instead.
Change-Id: Ied52f7a8f7252743353c58d20bfc3ec498933e00
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Now that any large objects are forced through a streaming loader
when they are bigger than getStreamFileThreshold(), and that threshold
is pegged at Integer.MAX_VALUE as its largest size, we will never
be able to reach this code path where we threw OutOfMemoryError.
Robin pointed out that we probably should include a message here,
but the code is effectively unreachable, so there isn't any value
in adding a message at this point.
So remove it.
Change-Id: Ie611d005622e38a75537f1350246df0ab89dd500
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Since we don't know the type of object we are parsing, we don't
know if it's a massive blob, or some small commit or annotated tag.
Avoid pulling the cached bytes until we have checked the type and
decided if we actually need them to continue parsing right now.
This way large blobs which won't fit in memory and would throw
a LargeObjectException don't abort parsing.
Change-Id: Ifb70df5d1c59f616aa20ee88898cb69524541636
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Callers don't necessarily need the getSize() result from a large
delta. They instead should be always using openStream() or copyTo()
for blobs going to local files, or they should be checking the
result of the constant-time isLarge() method to determine the type
of access they can use on the ObjectLoader. Avoid inflating the
delta instruction stream twice by delaying the decoding of the size
until after we have created the DeltaStream and decoded the header.
Likewise with the type, callers don't necessarily always need it
to be present in an ObjectLoader. Delay looking at it as late as
we can, thereby avoiding an ugly O(N^2) loop looking up the type
for every single object in the entire delta chain.
Change-Id: I6487b75b52a5d201d811a8baed2fb4fcd6431320
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
|