Marc Strapetz [Fri, 26 Nov 2010 10:07:04 +0000 (11:07 +0100)]
Fix DiffConfig to understand "copy" resp. "copies" for diff.renames property.
Rename detection should be considered enabled if
diff.renames config property is set to "copy" or "copies", instead of
throwing IllegalArgumentException.
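A minimal sketch of the parsing rule described above, assuming a standalone helper rather than JGit's actual DiffConfig code. The class name RenamesSetting and the set of accepted boolean spellings are illustrative assumptions.

```java
import java.util.Locale;

/** Hypothetical helper; not JGit's DiffConfig. */
final class RenamesSetting {
    /** Returns true when the diff.renames value enables rename detection. */
    static boolean isRenameDetectionEnabled(String value) {
        if (value == null)
            return false; // assumption: an unset property means disabled
        switch (value.trim().toLowerCase(Locale.ROOT)) {
        case "copy":
        case "copies":
            // Copy detection implies rename detection.
            return true;
        case "true":
        case "yes":
        case "on":
        case "1":
            return true;
        case "false":
        case "no":
        case "off":
        case "0":
            return false;
        default:
            throw new IllegalArgumentException(
                    "Bad diff.renames value: " + value);
        }
    }
}
```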
Fix bug regarding handling of non-versioned files during merge
There was a bug introduced by commit 0e815fe. For non-versioned files
the merge algorithm detected an incoming deletion from THEIRS.
Consequently such files were deleted. That's a severe bug which was
fixed by more precisely detecting incoming deletions.
Change-Id: I4385d3c990db11d62e371a385dc8ee89841db84a Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Philipp Thun <philipp.thun@sap.com> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Mathias Kinzler [Mon, 22 Nov 2010 15:26:00 +0000 (16:26 +0100)]
Initial implementation of a Rebase command
This is a first iteration to implement Rebase. At the moment, this
does not implement --continue and --skip, so if the first
conflict is found, the only option is to --abort the command.
Shawn O. Pearce [Fri, 19 Nov 2010 01:04:10 +0000 (17:04 -0800)]
Move WorkingTreeIterator inherited state into an object
Instead of copying up to 4 fields from the parent iterator each time a
child iterator is initialized and used, construct a single state
object that contains the 4 fields, and pass that one state object
through to the child. This makes it easier to add additional state
fields that must be inherited, at the slight expense of an extra
object allocation per TreeWalk, and an extra level of field
indirection whenever the options, nameEncoder, or read buffer is
required by the iterator.
Change-Id: Ic4603c33b772d7a45f9c81140537d51945688fcb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
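The idea can be sketched roughly as follows. The field types and the class names State and Iter are stand-ins chosen for illustration, not the real WorkingTreeIterator members.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of sharing inherited state through one object. */
final class InheritedStateSketch {
    /** State that every child iterator inherits from its parent. */
    static final class State {
        final String options;       // stand-in for the real options type
        final Charset nameEncoder;  // stand-in for the real encoder type
        final byte[] readBuffer = new byte[8192];

        State(String options, Charset nameEncoder) {
            this.options = options;
            this.nameEncoder = nameEncoder;
        }
    }

    static final class Iter {
        final State state;

        /** Root iterator: allocates the single state object. */
        Iter(String options) {
            this.state = new State(options, StandardCharsets.UTF_8);
        }

        /** Child iterator: copies one reference instead of several fields. */
        Iter(Iter parent) {
            this.state = parent.state;
        }

        Iter createChild() {
            return new Iter(this);
        }
    }
}
```

Adding another inherited field then only touches State, at the cost of one extra indirection on each access.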
Shawn O. Pearce [Fri, 19 Nov 2010 00:50:14 +0000 (16:50 -0800)]
Name TreeFilter and MergeFilter implementations
Naming these inner classes ensures that stack traces which contain
them will give us useful information about which filter is involved in
the trace, rather than the generated names $1, $2, etc. This makes it
much easier to understand a stack trace at a glance.
Change-Id: Ia6a75fdb382ff6461e02054d94baf011bdeee5aa Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
In the case where DirCacheCheckout was used to checkout a tree
without taking HEAD into account (e.g. during a clone or hard reset)
we didn't handle conflicts correctly. E.g. if there are conflicts
(entries with stage != 0) in the index and we tried to hard reset
we processed the conflicting paths multiple times (once
for every stage). With this fix we will update the index with the
entry from the "merge" state (the state we want to check out) when we
detect existing conflicts.
Change-Id: Iffbddccaa588cf0d1460a5e44dabaf540d996e26 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Shawn O. Pearce [Sat, 13 Nov 2010 00:15:43 +0000 (16:15 -0800)]
Merge branch 'rename-detection'
* rename-detection:
RenameDetector: Only scan deletes if adds exist
SimilarityRenameDetector: Initialize sizes to 0
SimilarityRenameDetector: Avoid allocating source index
SimilarityRenameDetector: Only attempt to index large files once
SimilarityIndex: Don't overflow internal counter fields
SimilarityIndex: Accept files larger than 8 MB
SimilarityIndex: Correct comment explaining the logic
Shawn O. Pearce [Sat, 13 Nov 2010 00:12:27 +0000 (16:12 -0800)]
Merge branch 'fs-fsync'
* fs-fsync:
Remove unnecessary flush calls from LockFile
Remove unnecessary region locking from LockFile
Support core.fsyncRefFiles option
Support core.fsyncObjectFiles option
Simplify LockFile write(ObjectId) case
Shawn O. Pearce [Fri, 12 Nov 2010 23:11:30 +0000 (15:11 -0800)]
Base64: Reformat to match JGit style
Rewrite the initialization of the encoding tables to be more clear,
but slightly slower to set up. We generally prefer a clear definition
of the data over a slightly slower class load time.
Change-Id: I0c7f89b6ab82dcf71525ffb69a388c312c195913 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 12 Nov 2010 22:51:06 +0000 (14:51 -0800)]
Base64: Strip out code JGit doesn't use
Since we have already modified this class to localize an error
message, we might as well strip it down to contain only the
functionality we need, or might ever use.
To keep this simple to review we don't adjust formatting right
away, so code that was buried inside of an if or else block whose
condition was removed might not have the correct indentation anymore.
We can fix this with a later reformatting change.
Change-Id: I2996aaa704e9d6182e5500c7a63240d5e9d722cc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When DirCacheCheckout should be used to checkout only one
tree (reset --hard, clone) then we had to use the standard
constructor and specify null as value for head. This change
adds explicit constructors not taking HEAD and documents
that.
Bug: 330021 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Shawn O. Pearce [Fri, 5 Nov 2010 02:01:51 +0000 (19:01 -0700)]
Remove unnecessary note fanout when removing notes
Fanout level notes trees are combined back together into a flat leaf
level tree if during a removal of a subtree there are less than 3/4 of
the fanout subtrees still existing, and the size of the combined leaf
is under the 256 split limit noted above.
This rule is used because deletes are less common than insertions, and
SHA-1's relatively uniform distribution suggests that with only 192
subtrees existing in the fanout, there should be approximately 192
names in the combined replacement leaf tree.
Change-Id: Ia9d145ffd5454982509fc40906bc4dbbf2b13952 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 5 Nov 2010 01:56:40 +0000 (18:56 -0700)]
Split note leaf buckets at 256 elements
Leaf level notes trees are split into a new fan-out tree if an
insertion occurs and the tree already contains >= 256 notes in it.
The splitting may occur multiple times if all of the notes have the
same prefix; in the worst case this produces a tree path such as
"00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/be" if all
of the notes begin with zeros.
Change-Id: I2d7d98f35108def9ec49936ddbdc34b13822a3c7 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
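To illustrate the path shape, here is a small sketch that turns a 40 character hex name into a fan-out path with a given number of 2 character levels. The class and method names are hypothetical; this is not the NoteMap implementation.

```java
/** Hypothetical sketch of fan-out path construction for a note name. */
final class NotePathSketch {
    /**
     * Splits the first {@code levels} byte pairs of a 40 character hex name
     * into directory segments, leaving the remainder as the leaf name.
     */
    static String fanoutPath(String hex, int levels) {
        if (hex.length() != 40)
            throw new IllegalArgumentException("expected 40 hex characters");
        if (levels < 0 || levels > 19)
            throw new IllegalArgumentException("levels must be in 0..19");
        StringBuilder path = new StringBuilder(40 + levels);
        for (int i = 0; i < levels; i++)
            path.append(hex, 2 * i, 2 * i + 2).append('/');
        path.append(hex, 2 * levels, hex.length());
        return path.toString();
    }
}
```

With 19 levels the leaf name is the final 2 characters, which matches the worst case path quoted above.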
Shawn O. Pearce [Fri, 5 Nov 2010 01:46:19 +0000 (18:46 -0700)]
Add internal API for note iteration
Some algorithms need to be able to iterate through all notes within a
particular bucket, such as when splitting or combining a bucket.
Exposing an Iterator<Note> makes this traversal possible.
For a LeafBucket the iteration is simple, it's over the sorted array of
elements. For FanoutBucket it's a bit more complex as the iteration
needs to union the iterators of each fanout bucket, lazily loading any
buckets that aren't already in-memory.
Change-Id: I3d5279b11984f44dcf0ddb14a82a4b4e51d4632d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
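A rough sketch of the fanout case, assuming each child bucket can be modelled as a Supplier that loads its contents on first use. The names here are illustrative and do not mirror the internal bucket classes.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Hypothetical sketch: iterate child buckets in order, loading lazily. */
final class FanoutIterationSketch {
    static <T> Iterator<T> concat(final List<Supplier<List<T>>> buckets) {
        return new Iterator<T>() {
            private int nextBucket; // index of the next bucket to load
            private Iterator<T> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Advance past exhausted or empty buckets; a bucket is only
                // loaded when iteration actually reaches it.
                while (!current.hasNext() && nextBucket < buckets.size())
                    current = buckets.get(nextBucket++).get().iterator();
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext())
                    throw new NoSuchElementException();
                return current.next();
            }
        };
    }
}
```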
Shawn O. Pearce [Fri, 5 Nov 2010 00:50:48 +0000 (17:50 -0700)]
Add in-memory updating support to NoteMap
NoteMap now supports editing in-memory, allowing applications to
modify the NoteMap once it has been loaded from the branch. The
ability to write the branch back to tree objects is not yet done,
so the edits are strictly transient.
Change-Id: I63448954abfca2a8e3e95369cd84c0d1176cdb79 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 11 Nov 2010 01:28:14 +0000 (17:28 -0800)]
Remove unnecessary region locking from LockFile
The lock file protocol relies on the atomic creation of a standardized
name in the parent directory of the file being updated. Since the
creation is atomic, at most one thread in any process can succeed on
this creation, and all others will fail. While the lock file exists,
that file is private to the thread that is writing it, and no others
will attempt to read or modify the file.
Consequently the use of the region level locks around the file is
unnecessary, and may actually reduce performance when using NFS, SMB,
or some other sort of remote filesystem that supports locking.
Change-Id: Ice312b6fb4fdf9d36c734c3624c6d0537903913b Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
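The protocol can be sketched with java.nio.file.Files.createFile, which fails if the file already exists. This is an illustrative stand-in, not the LockFile class itself; the ".lock" suffix follows Git's convention and the method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of lock-by-atomic-create, without region locks. */
final class LockFileSketch {
    private static Path lockPath(Path target) {
        return target.resolveSibling(target.getFileName() + ".lock");
    }

    /** Returns true if this caller created the lock file, false if it exists. */
    static boolean tryLock(Path target) {
        try {
            // Creation is atomic: at most one caller can succeed.
            Files.createFile(lockPath(target));
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock by removing the lock file. */
    static void unlock(Path target) {
        try {
            Files.deleteIfExists(lockPath(target));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```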
Shawn O. Pearce [Thu, 11 Nov 2010 01:24:16 +0000 (17:24 -0800)]
Support core.fsyncRefFiles option
If core.fsyncRefFiles is set to true, fsync is used whenever a
reference file is updated, ensuring the file contents are also
written to disk. This can help to prevent empty ref files after
a system crash when using a filesystem such as HFS+ where data
writes may be delayed.
Change-Id: Ie508a974da50f63b0409c38afe68772322dc19f1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 10 Nov 2010 02:58:36 +0000 (18:58 -0800)]
Support core.fsyncObjectFiles option
Some repositories may be on really unstable filesystems, but still
want to have good reliability when objects are written to disk. If
core.fsyncObjectFiles is set to true, request the JVM to ensure the
data is written before returning success to the caller of insert.
The option defaults to false because it should be useless on any
filesystem that orders writes and metadata, such as ext3 mounted with
data=ordered (or data=journal). But it may be useful on some systems
(especially HFS+) where file content may flush to the disk
independently of filesystem structure changes.
Because FileChannel.force(boolean) only claims to ensure data is
written if it was written using the write(ByteBuffer) method of
FileChannel, redirect all writes when using fsyncObjectFiles to go
through the FileChannel interface instead of through the older style
OutputStream interface. This may not be necessary on all JVMs, but
it's more portable to follow the definition than the common behavior.
Change-Id: I57f6b6bb7e403c07fbae989dbf3758eaf5edbc78 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
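A minimal sketch of the write path described above: all bytes go through FileChannel.write(ByteBuffer), and force is called only when the option is on. The class name and the boolean parameter are assumptions for illustration; the real ObjectInserter code is structured differently.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of an fsync-aware file write. */
final class FsyncWriteSketch {
    static void writeAndSync(Path file, byte[] data, boolean fsync) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            // A single write call may be partial, so loop until drained.
            while (buf.hasRemaining())
                ch.write(buf);
            if (fsync)
                ch.force(true); // flush content and metadata before returning
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```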
Shawn O. Pearce [Thu, 11 Nov 2010 22:43:22 +0000 (14:43 -0800)]
SimilarityRenameDetector: Initialize sizes to 0
Setting the array elements to -1 is more expensive than relying on
the allocator to zero the array for us first. Shifting the code to
always add 1 to the size (so an empty file is actually 1 byte long)
allows us to detect an unloaded size by comparing to 0, thus saving
the array fill calls.
Change-Id: Iad859e910655675b53ba70de8e6fceaef7cfcdd1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
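The offset trick can be sketched like this. The class is hypothetical, and the array of actual sizes stands in for whatever expensive lookup loads a size in the real detector.

```java
/** Hypothetical sketch: cache sizes as size + 1 so that 0 means "unloaded". */
final class SizeCacheSketch {
    private final long[] actual; // stand-in for an expensive size lookup
    private final long[] cached; // zero-filled by the allocator
    int loads;                   // counts how often a size was really loaded

    SizeCacheSketch(long[] actualSizes) {
        this.actual = actualSizes;
        this.cached = new long[actualSizes.length];
    }

    long size(int idx) {
        if (cached[idx] == 0) {
            loads++;
            cached[idx] = actual[idx] + 1; // an empty file is stored as 1
        }
        return cached[idx] - 1;
    }
}
```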
Shawn O. Pearce [Thu, 11 Nov 2010 22:29:11 +0000 (14:29 -0800)]
SimilarityRenameDetector: Avoid allocating source index
If the only file added is really small, and all of the deleted
files are really big, none of the permutations will match up due
to the sizes being too far apart to fit the current rename score.
Avoid allocating the really big deleted SimilarityIndex by deferring
its construction until at least one add along that row has a
reasonable chance of matching it.
This avoids expending a lot of CPU time looking at big deleted
binary files when a small modified text file was broken due to a
high percentage of changed lines.
Change-Id: I11ae37edb80a7be1eef8cc01d79412017c2fc075 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Thu, 11 Nov 2010 22:25:01 +0000 (14:25 -0800)]
SimilarityRenameDetector: Only attempt to index large files once
If a file fails to index the first time the loop encounters it, the
file is likely to fail to index again on the next row. Rather than
wasting a huge amount of CPU to index it again and fail, remember
which destination files failed to index and skip over them on each
subsequent row.
Because this condition is very unlikely, avoid allocating the BitSet
until it's actually needed. This keeps the memory usage unaffected
for the common case.
Change-Id: I93509b28b61a9bba8f681a7b4df4c6127bca2a09 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
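A small sketch of the lazily allocated failure set, using java.util.BitSet. The class and method names are illustrative and not taken from SimilarityRenameDetector.

```java
import java.util.BitSet;

/** Hypothetical sketch: remember failed destinations, allocating on demand. */
final class FailedIndexSketch {
    private BitSet failed; // stays null in the common case of no failures

    void markFailed(int dstIdx) {
        if (failed == null)
            failed = new BitSet();
        failed.set(dstIdx);
    }

    boolean hasFailed(int dstIdx) {
        return failed != null && failed.get(dstIdx);
    }

    boolean isAllocated() {
        return failed != null;
    }
}
```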
The counter portion of each pair is only 32 bits wide, but is part
of a larger 64 bit integer. If the file size was larger than 4 GB
the counter could overflow and impact the key, changing the hash,
and later resulting in an incorrect similarity score.
Guard against this overflow condition by capping the count for each
record at 2^32-1. If any record contains more than that many bytes
the table aborts hashing and throws TableFullException.
This permits the index to scan and work on files that exceed 4 GB
in size, but only if the file contains more than one unique block.
The index throws TableFullException on a 4 GB file containing all
zeros, but should succeed on a 6 GB file containing unique lines.
The index now uses a 64 bit accumulator during the common scoring
algorithm, possibly resulting in slower summations. However this
index is already heavily dependent upon 64 bit integer operations
being efficient, so increasing from 32 bits to 64 bits allows us
to correctly handle 6 GB files.
Change-Id: I14e6dbc88d54ead19336a4c0c25eae18e73e6ec2 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
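The overflow guard can be sketched as below, assuming the key occupies the upper 32 bits and the counter the lower 32 bits of each record. That layout, the class name, and the use of IllegalStateException in place of TableFullException are assumptions made to keep the example self-contained.

```java
/** Hypothetical sketch of a capped 32 bit counter packed into a long. */
final class PackedCounterSketch {
    static final long MAX_COUNT = 0xFFFFFFFFL; // 2^32 - 1

    static long pack(int key, long count) {
        return ((long) key << 32) | (count & MAX_COUNT);
    }

    static int keyOf(long record) {
        return (int) (record >>> 32);
    }

    static long countOf(long record) {
        return record & MAX_COUNT;
    }

    /** Adds delta (expected to be a small block length) to the counter. */
    static long add(long record, long delta) {
        long count = countOf(record) + delta;
        if (count > MAX_COUNT) {
            // Without this check the carry would spill into the key bits.
            throw new IllegalStateException("counter overflow");
        }
        return pack(keyOf(record), count);
    }
}
```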
Shawn O. Pearce [Thu, 11 Nov 2010 22:10:32 +0000 (14:10 -0800)]
SimilarityIndex: Accept files larger than 8 MB
Files bigger than 8 MB (2^23 bytes) tended to overflow the internal
hashtable, as the table was capped in size to 2^17 records. If a
file contained 2^17 unique data blocks/lines, the table insertion
got stuck in an infinite loop as the table couldn't grow, and there
was no open slot for the new item.
Remove the artificial 2^17 table limit and instead allow the table
to grow to be as big as 2^30. With a 64 byte block size, this
permits hashing inputs as large as 64 GB.
If the table reaches 2^30 (or cannot be allocated) hashing is
aborted. RenameDetector no longer tries to break a modify file pair,
and it does not try to match the file for rename or copy detection.
Change-Id: Ibb4d756844f4667e181e24a34a468dc3655863ac Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 12 Nov 2010 19:50:38 +0000 (11:50 -0800)]
SimilarityIndex: Correct comment explaining the logic
This comment was wrong, due to a copy-and-paste error. Here the
code is looking at records of dst that do not exist in src, and
is skipping past them to find another match.
Change-Id: I07c1fba7dee093a1eeffcf7e0c7ec85446777ffb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 5 Nov 2010 00:06:48 +0000 (17:06 -0700)]
Remember non-note tree entries when reading
In order to safely edit a notes tree, NoteMap needs to retain any
non-note tree entries it read from the source tree and put them
back out into the modified tree when it commits a new version of
the note branch.
Remember any tree entries that didn't look like a note during
the parsing of the tree, so they can be put into a TreeFormatter
later when the tree writes to the repository.
Change-Id: Ia284af7e7866da35db35374c6c5869f00c857944 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 5 Nov 2010 01:23:40 +0000 (18:23 -0700)]
Lazy load note subtrees from fanout levels
Instead of reading a note tree recursively up front when the NoteMap
is loaded, read only the root tree and load subtrees on demand when
they are accessed by the application. This gives a lower latency
to read a note for the recent commits on a branch, as only the paths
that are needed get read.
Given a 2/38 style fanout, the tree will fully load when 256 objects
have been accessed by the application. But unlike the prior version
of NoteMap, the NoteMap will load faster and answer lookups sooner,
as the loading time for all 256 levels is spread out across each of
the get() requests.
Given a 2/2/36 style fanout, the tree won't need to fully load until
about 65,536 objects are accessed.
To simplify the implementation we only support the flat layout (all
notes in the top level tree), or a 2/38, 2/2/36, 2/2/2/34, through
2/.../2 style fanout. Unlike C Git we don't support reading the old
experimental 4/36 fanout. This is sufficient because C Git won't
create the 4/36 style fanout when creating or updating a notes tree,
and there really aren't any in the wild today.
Change-Id: I6099b35916a8404762f31e9c11f632e43e0c1bfd Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Fri, 5 Nov 2010 01:24:45 +0000 (18:24 -0700)]
Define NoteMap, a simple note tree reader
The NoteMap makes it easy to read a small notes tree as created by
the `git notes` command in C Git. To make the initial implementation
simple a notes tree is read recursively into a map in memory.
This is reasonable if the application will need to access all notes,
or if there are less than 256 notes in the tree, but doesn't behave
well when the number of notes exceeds 256 and the application
doesn't need to access all of them.
We can later add support for lazily loading different subpaths,
thus fixing the large note tree problem described above.
Currently the implementation only supports reading. Writing notes
is more complex because trees need to be expanded or collapsed at
the exact 256 entry cut-off in order to retain the same tree SHA-1
that C Git would use for the same content. It also needs to retain
non-note tree entries such as ".gitignore" or ".gitattribute" files
that might randomly appear within a notes tree. We can also add
writing support later.
Change-Id: I93704bd84ebf650d51de34da3f1577ef0f7a9144 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Shawn O. Pearce [Wed, 10 Nov 2010 22:41:10 +0000 (14:41 -0800)]
Implement command line support for CredentialsProvider
Instead of configuring the JSch session factory, configure a more
generic CredentialsProvider, which will work for other transport
types such as http, in addition to the existing ssh.
Change-Id: I22b13303c17e654ba6720edf4be2ef15fe29537a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 10 Nov 2010 22:18:46 +0000 (14:18 -0800)]
Support CredentialsProvider for SSH connections
When setting up an SSH connection, use the caller supplied
CredentialsProvider, if one has been given to the Transport
or was defined as the default.
The CredentialsProvider is re-wrapped as a JSch UserInfo,
allowing the connection to use this for user interactive
prompts. This gives a unified API for authentication on
any transport type.
Change-Id: Id3b4cf5bfd27a23207cdfb188bae3b78e71e02c0 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 10 Nov 2010 22:15:50 +0000 (14:15 -0800)]
Define a default CredentialsProvider
This permits applications to set their preferred credentials UI
implementation once, rather than needing to define it on every
single Transport instance they open.
Change-Id: I010550de1a6becab27f7aa5a9901df5a1c7e74bd Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Wed, 10 Nov 2010 22:14:35 +0000 (14:14 -0800)]
Enable providing credentials for HTTP authentication
This change is based on http://egit.eclipse.org/r/#change,1652
by David Green. The change adds the concept of a CredentialsProvider
which can be registered for git transports and which is
responsible to return credential-related data like passwords and
usernames. Whenever the transport detects that an authentication
with certain credentials has to be done it will ask the
CredentialsProvider for this data. Foreseen implementations for
such a Provider may be an EGitCredentialsProvider (caching
credential data entered e.g. in the Clone-Wizard) or a NetRcProvider
(gathering data out of ~/.netrc file).
Bug: 296201
Change-Id: Ibe13e546b45eed3e193c09ecb414bbec2971d362 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: Christian Halstrick <christian.halstrick@sap.com> Signed-off-by: Stefan Lay <stefan.lay@sap.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: David Green <dgreen99@gmail.com>
Stefan Lay [Wed, 10 Nov 2010 08:42:51 +0000 (09:42 +0100)]
Fix WWW-Authenticate auth-scheme comparison
The auth-scheme token (like "Basic" or "Digest") is not specified in a
case sensitive way. RFC2617 (http://tools.ietf.org/html/rfc2617) specifies
in section 1.2 the use of a "case-insensitive token to identify the
authentication scheme". Jetty, for example, uses "basic" as token.
Change-Id: I635a94eb0a741abcb3e68195da6913753bdbd889 Signed-off-by: Stefan Lay <stefan.lay@sap.com>
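A minimal sketch of a case-insensitive scheme check on a WWW-Authenticate header value. The helper names are hypothetical and the parsing is deliberately simplified to "first token before a space".

```java
/** Hypothetical sketch of case-insensitive auth-scheme matching. */
final class AuthSchemeSketch {
    /** Extracts the leading auth-scheme token from a header value. */
    static String schemeOf(String headerValue) {
        String v = headerValue.trim();
        int sp = v.indexOf(' ');
        return sp < 0 ? v : v.substring(0, sp);
    }

    /** Compares the scheme token without regard to case, per RFC 2617. */
    static boolean isScheme(String headerValue, String scheme) {
        return schemeOf(headerValue).equalsIgnoreCase(scheme);
    }
}
```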
Shawn O. Pearce [Wed, 10 Nov 2010 03:12:24 +0000 (19:12 -0800)]
Simplify LockFile write(ObjectId) case
The ObjectId (for a ref) can be easily reformatted into a temporary
byte[] and then passed off to write(byte[]), removing the duplicated
code that existed in both write methods.
Change-Id: I09740658e070d5f22682333a2e0d325fd1c4a6cb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
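The delegation can be sketched as follows, assuming the id is available as a 40 character hex string and that the target is an in-memory stream. Both are simplifications for illustration; the trailing newline matches Git's loose ref file format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: one write path shared by both entry points. */
final class RefWriterSketch {
    final ByteArrayOutputStream out = new ByteArrayOutputStream();

    /** The single place where bytes are actually written. */
    void write(byte[] content) {
        out.write(content, 0, content.length);
    }

    /** Formats the id into a temporary buffer and reuses write(byte[]). */
    void write(String hexId) {
        if (hexId.length() != 40)
            throw new IllegalArgumentException("expected 40 hex characters");
        byte[] buf = new byte[41];
        byte[] hex = hexId.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(hex, 0, buf, 0, 40);
        buf[40] = '\n';
        write(buf);
    }
}
```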
Shawn O. Pearce [Tue, 9 Nov 2010 22:36:01 +0000 (14:36 -0800)]
Fix URIish parsing of absolute scp-style URIs
We stopped handling URIs such as "example.com:/some/path", because
this was confused with the Windows absolute path syntax of "c:/path".
Support absolute style scp URIs again, but only when the host name
is more than 2 characters long.
Change-Id: I9ab049bc9aad2d8d42a78c7ab34fa317a28efc1a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
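The disambiguation rule can be sketched with a regular expression. The pattern below is an illustrative assumption and not the expression used by URIish; it ignores user names and relative paths.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: absolute scp-style URI vs. Windows drive path. */
final class ScpUriSketch {
    // Host of at least 3 characters, then ':', then an absolute path.
    private static final Pattern ABSOLUTE_SCP =
            Pattern.compile("^([^:/@]{3,}):(/.*)$");

    /** Returns {host, path}, or null if the input is not treated as scp-style. */
    static String[] parse(String s) {
        Matcher m = ABSOLUTE_SCP.matcher(s);
        if (!m.matches())
            return null;
        return new String[] { m.group(1), m.group(2) };
    }
}
```

Under this rule "c:/path" keeps its meaning as a Windows path because the part before the colon is too short to count as a host name.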
Instead work around the warning by defining our constant by
constructing it through a StringBuilder.
Change-Id: If139509e769d649609c62eff359ebaea5dd286b2 Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Matthias Sohn <matthias.sohn@sap.com> CC: Chris Aniszczyk <caniszczyk@gmail.com>
Jens Baumgart [Mon, 8 Nov 2010 15:18:57 +0000 (16:18 +0100)]
IndexDiff: support state [removed, untracked]
IndexDiff was extended to detect files which are both removed from the
index and untracked. Before this change these files were only added
to the removed collection.
jgit.sh <command> --help was not working for the commands Diff
and ShowCommands because of missing metaVar information. Missing
information is added here.
Change-Id: I0ab7e35006b6aa7d4326a634309dddfcdb78f2a6 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Sasa Zivkov [Fri, 5 Nov 2010 14:18:00 +0000 (15:18 +0100)]
Implemented the git add commandline command.
Implementation delegates all work to the AddCommand class and,
therefore, supports only those options currently supported by the
AddCommand which means: --update and the filepattern... arguments.
Change-Id: I4827d37e08b4c988c2458d9ba60a61b6ad414d10 Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
Matthias Sohn [Sun, 7 Nov 2010 19:16:15 +0000 (20:16 +0100)]
[findBugs] Fix NP_LOAD_OF_KNOWN_NULL_VALUE
The code analyzer can't know that passing a value known to be null is
not a problem. Hence it is better to pass null explicitly instead of
a parameter that is known to be null.
Change-Id: I8db6f8014de6c00dd95974d60f61ecc66191e6d4 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Fixed ResolveMerger regarding handling of deletions
There was a bug in ResolveMerger which is one reason for
bug 328841. If a merge was failing because of conflicts,
deletions were not handled correctly. Files which had
to be deleted (because there was a non-conflicting deletion
coming in from THEIRS) were not deleted. In the
non-conflicting case we also forgot to delete the files, but
in that case we explicitly check out at the end, so these files
get deleted during that checkout.
This is fixed by handling incoming deletions explicitly.
Bug: 328841
Change-Id: I7f4c94ab54138e1b2f3fcdf34fb803d68e209ad0 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
Sasa Zivkov [Fri, 5 Nov 2010 15:01:10 +0000 (16:01 +0100)]
Fixed the git init to properly set bare=true
When --git-dir=X is given JGit creates a bare repository in the
directory X. However, when the --bare option is not explicitly
given, this is not properly reflected in the X/config file i.e.
the bare=true is missing. This change fixes this minor issue.
Shawn O. Pearce [Thu, 4 Nov 2010 02:01:53 +0000 (19:01 -0700)]
Add MutableObjectId setByte to modify a mutable id
This mirrors the getByte() API in ObjectId and allows the caller to
modify a single byte, which is useful when updating it as part of a
loop walking through 0x00..0xff inside of a range of objects.
Change-Id: I57fa8420011fe5ed5fc6bfeb26f87a02b3197dab Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Tue, 2 Nov 2010 23:29:39 +0000 (16:29 -0700)]
Add ObjectId getByte for random access
Processing git notes requires random access to part of the raw data
of each ObjectId... which isn't easy because ObjectIds are stored
with an internal representation of 5 ints. Expose random access
to the individual data bytes through new methods, avoiding the
need to convert first to a byte[20].
Change-Id: I99e64700b27fc0c95aa14ef8ad46a0e8832d4441 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
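Random access into an id held as 5 ints can be sketched with shifts and masks. The big-endian word layout and the class name are assumptions for illustration; setByte is included as the mutable counterpart.

```java
/** Hypothetical sketch of byte access on a 160 bit id stored as 5 ints. */
final class FiveIntIdSketch {
    private final int[] w = new int[5];

    FiveIntIdSketch(int w1, int w2, int w3, int w4, int w5) {
        w[0] = w1; w[1] = w2; w[2] = w3; w[3] = w4; w[4] = w5;
    }

    private static int shiftFor(int index) {
        // Assumption: byte 0 is the most significant byte of the first word.
        return 8 * (3 - (index & 3));
    }

    /** Returns the byte at index (0..19) as an unsigned value 0..255. */
    int getByte(int index) {
        if (index < 0 || index >= 20)
            throw new ArrayIndexOutOfBoundsException(index);
        return (w[index >> 2] >>> shiftFor(index)) & 0xff;
    }

    /** Replaces the byte at index (0..19) with the low 8 bits of value. */
    void setByte(int index, int value) {
        if (index < 0 || index >= 20)
            throw new ArrayIndexOutOfBoundsException(index);
        int shift = shiftFor(index);
        int word = w[index >> 2];
        w[index >> 2] = (word & ~(0xff << shift)) | ((value & 0xff) << shift);
    }
}
```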
Shawn O. Pearce [Tue, 2 Nov 2010 21:12:00 +0000 (14:12 -0700)]
Refactor tree entry formatting into a common class
Instead of hiding this logic inside of DirCacheTree and the legacy
Tree type, pull it into a common place where we can reuse it by
creating tree records in a buffer that can be passed directly into
the ObjectInserter. This allows us to avoid some copying, as the
inserter can be given the internal buffer of the formatter.
Because we trust these two callers to feed us records in the proper
order, without '/' in the names, and without duplicate names in the
same tree, we don't do any validation inside of the formatter itself.
To protect themselves from making ordering errors, developers should
continue to use DirCache to process edits to source code trees.
Change-Id: Idf7f10e736d4a44ccdf8afe060535d7b0554a92f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
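A minimal sketch of building tree records in a buffer, following Git's tree entry layout of octal mode, space, name, NUL, and 20 raw id bytes. The class is a hypothetical stand-in for the formatter and, like it, performs no ordering or name validation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a tree record formatter. */
final class TreeFormatSketch {
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    /** Appends one record; callers must supply entries in correct order. */
    void append(String octalMode, String name, byte[] rawId) {
        if (rawId.length != 20)
            throw new IllegalArgumentException("expected a 20 byte id");
        byte[] mode = octalMode.getBytes(StandardCharsets.US_ASCII);
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        buf.write(mode, 0, mode.length);
        buf.write(' ');
        buf.write(nameBytes, 0, nameBytes.length);
        buf.write(0);
        buf.write(rawId, 0, rawId.length);
    }

    /** Returns the formatted tree content, ready to hand to an inserter. */
    byte[] toByteArray() {
        return buf.toByteArray();
    }
}
```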
The JGit merge algorithm or the Merge Command may not always handle
deletions correctly. Therefore one additional test is added to check
this.
Change-Id: Id6aa49136996b29047c340994fe7faba68858e8c Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
JGit merge algorithm behaved differently from C Git when
we had adjacent modifications. If line 9 was modified by
OURS and line 10 by theirs then C Git will return a
conflict while JGit was seeing this as independent
modifications. This change is not only there to achieve
compatibility, but there were also some really wrong
merge results produced by JGit in the area of adjacent
modifications.
Change-Id: I8d77cb59e82638214e45b3cf9ce3a1f1e9b35c70 Signed-off-by: Christian Halstrick <christian.halstrick@sap.com>
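The changed rule can be illustrated with a simple range check, assuming each side's modification is described as a half-open line range [begin, end) in the common base. This is a hypothetical helper and not the merge algorithm itself.

```java
/** Hypothetical sketch: treat overlapping or touching edits as a conflict. */
final class AdjacentEditSketch {
    /**
     * Ranges are half-open [begin, end) over base lines. Using "<=" rather
     * than "<" makes directly adjacent ranges conflict as well.
     */
    static boolean conflicts(int oursBegin, int oursEnd,
            int theirsBegin, int theirsEnd) {
        return oursBegin <= theirsEnd && theirsBegin <= oursEnd;
    }
}
```

With OURS changing line 9 as [9, 10) and THEIRS changing line 10 as [10, 11), the ranges touch and are reported as a conflict, while an edit to line 11 only would not be.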
Shawn O. Pearce [Sat, 30 Oct 2010 01:35:43 +0000 (18:35 -0700)]
Fix ugly diff showing insertion of new method
When adding a new method near the end of the sequence we want to
show the full method inserted, and not tear the prior method due
to the common trailing curly brace being consumed as part of the
common end region of the sequences.
Bug: 328895
Change-Id: I233bc40445fb5452863f5fb082bc3097433a8da6 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 30 Oct 2010 01:04:10 +0000 (18:04 -0700)]
Delete DiffPerformanceTest
This test isn't that useful. The better way to evaluate diff
algorithm performance is to run `jgit debug-diff-algorithms` over
real-world repositories, such as linux-2.6.git. Whenever we modify
an algorithm we should manually verify that its runtime performance
doesn't get any worse than it already is.
Change-Id: I0beed3a5a8a537c958a5a6438a1283f97fa2097a Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Shawn O. Pearce [Sat, 30 Oct 2010 00:50:26 +0000 (17:50 -0700)]
Fix broken HistogramDiff
HistogramDiff failed on cases where the initial element for the LCS
was actually very common (e.g. has 20 occurrences), and the first
element of the inserted region after the LCS was also common but
had fewer occurrences (e.g. 10), while the LCS also contained a
unique element (1 occurrence).
This happens often in Java source code. The initial element for
the LCS might be the empty line ("\n"), and the inserted but common
element might be "\t/**\n", with the LCS being a large span of
lines that contains unique method declarations. Even though "/**"
occurs less often than the empty line it's not a better LCS if the
LCS we already have contains a unique element.
The logic in HistogramDiff would normally have worked fine, except I
tried to optimize scanning of B by making tryLongestCommonSequence
return the end of the region when there are matching elements
found in A. This allows us to skip over the current LCS region,
as it has already been examined, but caused us to fail to identify
an element that had a lower occurrence count within the region.
The solution used here is to trade space-for-time by keeping a
table of A positions to their occurrence counts. This allows the
matching logic to always use the smallest count for this region,
even if the smallest count doesn't appear on the initial element.
The new unit test testEdit_LcsContainsUnique() verifies this new
behavior works as expected.
Bug: 328895
Change-Id: Id170783b891f645b6a8cf6f133c6682b8de40aaf Signed-off-by: Shawn O. Pearce <spearce@spearce.org>