aboutsummaryrefslogtreecommitdiffstats
path: root/org.eclipse.jgit/src/org/eclipse/jgit/patch
Commit message (Collapse)AuthorAgeFilesLines
* Switch FileHeader.extractFileLines to TemporaryBuffer.HeapShawn Pearce2014-11-252-14/+5
| | | | | | | | File contents are processed into a single byte[] for character conversion. The data must fit entirely in memory, so avoid any file IO. Change-Id: I3fe8be2e5f37d5ae953596dda1ed3fe6d4f6aebc
* Mark non-externalizable strings as suchRobin Rosenberg2012-12-276-31/+32
| | | | | | | | | | A few classes such as Constanrs are marked with @SuppressWarnings, as are toString() methods with many liternal, but otherwise $NLS-n$ is used for string containing text that should not be translated. A few literals may fall into the gray zone, but mostly I've tried to only tag the obvious ones. Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590
* Add Javadoc description for packagesRobin Stocker2012-10-311-0/+4
| | | | | | | | | These appear as descriptions in the index, see here (currently empty): http://download.eclipse.org/jgit/docs/latest/apidocs/ Change-Id: If7996deef30ae688bade8b3ad6b19547ca3d8b50 Signed-off-by: Chris Aniszczyk <zx@twitter.com>
* Remove 86 boxing warningsKevin Sawicki2012-05-083-5/+13
| | | | | | | | Use Integer, Character, and Long valueOf methods when passing parameters to MessageFormat and other places that expect objects instead of primitives Change-Id: I5942fbdbca6a378136c00d951ce61167f2366ca4
* Move JGitText to an internal packageRobin Rosenberg2012-03-124-4/+4
| | | | Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797
* Add toString() to HunkHeaderTomasz Zarna2011-12-081-0/+13
| | | | | | | Since FileHeader provides toString() method (via DiffEntry) we could add a similar method to HunkHeader. Change-Id: I7886e5b8f775fa8e8478ac5af37d90b6ef677d8b
* Remove two "Dead store to local variable" warningsRobin Stocker2010-10-291-1/+1
| | | | Change-Id: I950de82db15c4610dc5a94f304279971daef971e
* Rename getOldName,getNewName to getOldPath,getNewPathShawn O. Pearce2010-08-041-14/+14
| | | | | | | | | | TreeWalk calls this value "path", while "name" is the stuff after the last slash. FileHeader should do the same thing to be consistent. Rename getOldName to getOldPath and getNewName to getNewPath. Bug: 318526 Change-Id: Ib2e372ad4426402d37939b48d8f233154cc637da Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Create FileHeader from DiffEntryJeff Schumacher2010-07-083-40/+78
| | | | | | | | | | | | | | | | | Added support for converting DiffEntrys to FileHeaders. FileHeaders are DiffEntrys with a buffer containing the diff output as well as a list of HunkHeaders. The HunkHeaders contain EditLists. The createFileHeader(DiffEntry) method in DiffFormatter performs a Myers Diff on the files refered to by the DiffEntry, then puts the returned EditList into a single HunkHeader, which is then put into the FileHeader to be returned. It also generates the appropriate diff header an puts it into the FileHeader's buffer. The rest of the diff output, which would normally be parsed to generate the HunkHeaders, is not generated. In fact, the purpose of this method is to avoid the costly diff output generation and parsing normally required to create a FileHeader. Change-Id: I7d8b18c0f6c85e3d02ad58995d3d231e69af5887
* Implement similarity based rename detectionShawn O. Pearce2010-07-031-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Content similarity based rename detection is performed only after a linear time detection is performed using exact content match on the ObjectIds. Any names which were paired up during that exact match phase are excluded from the inexact similarity based rename, which reduces the space that must be considered. During rename detection two entries cannot be marked as a rename if they are different types of files. This prevents a symlink from being renamed to a regular file, even if their blob content appears to be similar, or is identical. Efficiently comparing two files is performed by building up two hash indexes and hashing lines or short blocks from each file, counting the number of bytes that each line or block represents. Instead of using a standard java.util.HashMap, we use a custom open hashing scheme similiar to what we use in ObjecIdSubclassMap. This permits us to have a very light-weight hash, with very little memory overhead per cell stored. As we only need two ints per record in the map (line/block key and number of bytes), we collapse them into a single long inside of a long array, making very efficient use of available memory when we create the index table. We only need object headers for the index structure itself, and the index table, but not per-cell. This offers a massive space savings over using java.util.HashMap. The score calculation is done by approximating how many bytes are the same between the two inputs (which for a delta would be how much is copied from the base into the result). The score is derived by dividing the approximate number of bytes in common into the length of the larger of the two input files. Right now the SimilarityIndex table should average about 1/2 full, which means we waste about 50% of our memory on empty entries after we are done indexing a file and sort the table's contents. If memory becomes an issue we could discard the table and copy all records over to a new array that is properly sized. Building the index requires O(M + N log N) time, where M is the size of the input file in bytes, and N is the number of unique lines/blocks in the file. The N log N time constraint comes from the sort of the index table that is necessary to perform linear time matching against another SimilarityIndex created for a different file. To actually perform the rename detection, a SxD matrix is created, placing the sources (aka deletions) along one dimension and the destinations (aka additions) along the other. A simple O(S x D) loop examines every cell in this matrix. A SimilarityIndex is built along the row and reused for each column compare along that row, avoiding the costly index rebuild at the row level. A future improvement would be to load a smaller square matrix into SimilarityIndexes and process everything in that sub-matrix before discarding the column dimension and moving down to the next sub-matrix block along that same grid of rows. An optional ProgressMonitor is permitted to be passed in, allowing applications to see the progress of the detector as it works through the matrix cells. This provides some indication of current status for very long running renames. The default line/block hash function used by the SimilarityIndex may not be optimal, and may produce too many collisions. It is borrowed from RawText's hash, which is used to quickly skip out of a longer equality test if two lines have different hash functions. We may need to refine this hash in the future, in order to minimize the number of collisions we get on common source files. Based on a handful of test commits in JGit (especially my own recent rename repository refactoring series), this rename detector produces output that is very close to C Git. The content similarity scores are sometimes off by 1%, which is most probably caused by our SimilarityIndex type using a different hash function than C Git uses when it computes the delta size between any two objects in the rename matrix. Bug: 318504 Change-Id: I11dff969e8a2e4cf252636d857d2113053bdd9dc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Refactored code out of FileHeader to facilitate rename detectionJeff Schumacher2010-06-301-123/+2
| | | | | | | | | | Refactored a superclass out of FileHeader called DiffEntry that holds the more general data from FileHeader that is useful in rename detection (old/new Ids, modes, names, as well as changeType and score). FileHeader is now a DiffEntry that adds Hunks, parsing abilities, etc. Change-Id: I8398728cd218f8c6e98f7a4a7f2f342391d865e4
* Externalize strings from JGitSasa Zivkov2010-05-194-18/+22
| | | | | | | | | | | | | | | The strings are externalized into the root resource bundles. The resource bundles are stored under the new "resources" source folder to get proper maven build. Strings from tests are, in general, not externalized. Only in cases where it was necessary to make the test pass the strings were externalized. This was typically necessary in cases where e.getMessage() was used in assert and the exception message was slightly changed due to reuse of the externalized strings. Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>
* Refactor TemporaryBuffer to support reuse in other contextsShawn O. Pearce2010-01-122-3/+3
| | | | | | | | | | | | | | | | Later we are going to add support for smart HTTP, which requires us to buffer at least some of the request created by a client before we ship it to the server. For many requests, we can fit it completely into a 1 MiB buffer, but if it doesn't we can drop back to using the chunked transfer encoding to send an unknown stream length. Rather than recoding the block based memory buffer, we refactor the local file overflow strategy into a subclass, allowing the HTTP client code to replace this portion of the logic with its own approach to start the chunked encoding request. Change-Id: Iac61ea1017b14e0ad3c4425efc3d75718b71bb8e Signed-off-by: Shawn O. Pearce <sop@google.com>
* Initial JGit contribution to eclipse.orgGit Development Community2009-09-297-0/+2313
Per CQ 3448 this is the initial contribution of the JGit project to eclipse.org. It is derived from the historical JGit repository at commit 3a2dd9921c8a08740a9e02c421469e5b1a9e47cb. Signed-off-by: Shawn O. Pearce <spearce@spearce.org>