Commit message (Author, Age, Files, Lines)
* log: Add whitespace ignore options (Shawn O. Pearce, 2010-07-03, 11 files, -58/+181)
| | | | | | | | | | | | | | Similar to what we did with diff, implement whitespace ignore options for log too. This requires us to define some means of creating any RawText object type at will inside of DiffFormatter, so we define a new factory interface to construct RawText instances on demand. Unfortunately we have to copy the entire block of common options. args4j only processes the options/arguments on the one command class and Java doesn't support multiple inheritance. Change-Id: Ia16cd3a11b850fffae9fbe7b721d7e43f1d0e8a5 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Format submodule links during differences (Shawn O. Pearce, 2010-07-03, 1 file, -8/+20)
| | | | | | | | Instead of crashing, output a submodule link with the simple "Subproject commit $fullid\n" syntax used by C Git. Change-Id: Iae8646941683fb19b73fb038217d2e3bf5f77fa9 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Redo DiffFormatter API to be easier to use (Shawn O. Pearce, 2010-07-03, 6 files, -97/+143)
| | | | | | | | | | Passing around the OutputStream and the Repository is crazy. Instead put the stream in the constructor, since this formatter exists only to output to the stream, and put the repository as a member variable that can be optionally set. Change-Id: I2bad012fee7f40dc1346700ebd19f1e048982878 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* log, diff: Add rename detection support (Shawn O. Pearce, 2010-07-03, 5 files, -70/+300)
| | | | | | | | | | | | | Implement rename detection in the command line diff and log commands. Also support --name-status, -p and -U flags, as these can be quite useful to view more detail. All of the Git patch file formatting code is now moved over to the DiffFormatter class. This permits us to reuse it in any context, including inside of IDEs. Change-Id: I687ccba34e18105a07e0a439d2181c323209d96c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Implement similarity based rename detection (Shawn O. Pearce, 2010-07-03, 11 files, -258/+1362)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Content similarity based rename detection is performed only after a linear time detection is performed using exact content match on the ObjectIds. Any names which were paired up during that exact match phase are excluded from the inexact similarity based rename, which reduces the space that must be considered. During rename detection two entries cannot be marked as a rename if they are different types of files. This prevents a symlink from being renamed to a regular file, even if their blob content appears to be similar, or is identical. Efficiently comparing two files is performed by building up two hash indexes and hashing lines or short blocks from each file, counting the number of bytes that each line or block represents. Instead of using a standard java.util.HashMap, we use a custom open hashing scheme similar to what we use in ObjectIdSubclassMap. This permits us to have a very light-weight hash, with very little memory overhead per cell stored. As we only need two ints per record in the map (line/block key and number of bytes), we collapse them into a single long inside of a long array, making very efficient use of available memory when we create the index table. We only need object headers for the index structure itself, and the index table, but not per-cell. This offers a massive space savings over using java.util.HashMap. The score calculation is done by approximating how many bytes are the same between the two inputs (which for a delta would be how much is copied from the base into the result). The score is derived by dividing the approximate number of bytes in common by the length of the larger of the two input files.
Right now the SimilarityIndex table should average about 1/2 full, which means we waste about 50% of our memory on empty entries after we are done indexing a file and sort the table's contents. If memory becomes an issue we could discard the table and copy all records over to a new array that is properly sized. Building the index requires O(M + N log N) time, where M is the size of the input file in bytes, and N is the number of unique lines/blocks in the file. The N log N time constraint comes from the sort of the index table that is necessary to perform linear time matching against another SimilarityIndex created for a different file. To actually perform the rename detection, a SxD matrix is created, placing the sources (aka deletions) along one dimension and the destinations (aka additions) along the other. A simple O(S x D) loop examines every cell in this matrix. A SimilarityIndex is built along the row and reused for each column compare along that row, avoiding the costly index rebuild at the row level. A future improvement would be to load a smaller square matrix into SimilarityIndexes and process everything in that sub-matrix before discarding the column dimension and moving down to the next sub-matrix block along that same grid of rows. An optional ProgressMonitor is permitted to be passed in, allowing applications to see the progress of the detector as it works through the matrix cells. This provides some indication of current status for very long running renames. The default line/block hash function used by the SimilarityIndex may not be optimal, and may produce too many collisions. It is borrowed from RawText's hash, which is used to quickly skip out of a longer equality test if two lines have different hash values. We may need to refine this hash in the future, in order to minimize the number of collisions we get on common source files.
Based on a handful of test commits in JGit (especially my own recent rename repository refactoring series), this rename detector produces output that is very close to C Git. The content similarity scores are sometimes off by 1%, which is most probably caused by our SimilarityIndex type using a different hash function than C Git uses when it computes the delta size between any two objects in the rename matrix. Bug: 318504 Change-Id: I11dff969e8a2e4cf252636d857d2113053bdd9dc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
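The packed-record index and the score calculation described in this commit message can be sketched roughly as follows. This is a minimal illustration, not JGit's SimilarityIndex: the class name `SimilarityIndexSketch`, the linear-probing table, the line-only splitting, and the simple multiplicative hash are all assumptions made for the example. It also copies the non-empty records into a right-sized array before sorting, which the message mentions only as a possible future improvement.

```java
import java.util.Arrays;

// Hypothetical sketch; names and hash function are assumptions, not JGit code.
final class SimilarityIndexSketch {
    // Sorted records, each packed as (lineHash << 32) | byteCount.
    private final long[] records;
    private final int fileSize;

    SimilarityIndexSketch(byte[] content) {
        fileSize = content.length;
        int lines = 1;
        for (byte b : content) {
            if (b == '\n') lines++;
        }
        // Power-of-two table that stays at most half full.
        long[] table = new long[Integer.highestOneBit(lines) << 2];
        int start = 0;
        for (int i = 0; i < content.length; i++) {
            if (content[i] == '\n' || i == content.length - 1) {
                add(table, hash(content, start, i + 1), i + 1 - start);
                start = i + 1;
            }
        }
        // Drop empty cells, then sort so two indexes can be merged linearly.
        records = Arrays.stream(table).filter(r -> r != 0).sorted().toArray();
    }

    private static void add(long[] table, int key, int count) {
        int mask = table.length - 1;
        int slot = key & mask;
        while (true) {
            long r = table[slot];
            if (r == 0) {                       // empty cell: store new record
                table[slot] = pack(key, count);
                return;
            }
            if (keyOf(r) == key) {              // same line seen again: add bytes
                table[slot] = pack(key, countOf(r) + count);
                return;
            }
            slot = (slot + 1) & mask;           // linear probe to the next cell
        }
    }

    private static long pack(int key, int count) {
        return ((long) key << 32) | (count & 0xffffffffL);
    }

    private static int keyOf(long r) { return (int) (r >> 32); }

    private static int countOf(long r) { return (int) r; }

    private static int hash(byte[] buf, int start, int end) {
        int h = 5381;
        for (int i = start; i < end; i++) {
            h = h * 31 + (buf[i] & 0xff);
        }
        return h;
    }

    /** Approximate percentage (0..100) of bytes shared with other. */
    int score(SimilarityIndexSketch other) {
        int max = Math.max(fileSize, other.fileSize);
        if (max == 0) return 100;
        long common = 0;
        int i = 0, j = 0;
        while (i < records.length && j < other.records.length) {
            int ka = keyOf(records[i]);
            int kb = keyOf(other.records[j]);
            if (ka == kb) {
                common += Math.min(countOf(records[i++]), countOf(other.records[j++]));
            } else if (ka < kb) {
                i++;
            } else {
                j++;
            }
        }
        return (int) (common * 100 / max);
    }
}
```

In the S x D loop from the message, one such index would be built per source and reused against every destination in that row.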
* Added a preliminary version of rename detection (Jeff Schumacher, 2010-07-01, 5 files, -0/+479)
| | | | | | | | | | JGit does not currently do rename detection during diffs. I added a class that, given a TreeWalk to iterate over, can output a list of DiffEntry objects for that TreeWalk, taking into account renames. This class only detects renames by SHA-1. More complex rename detection, along the lines of what C Git does, will be added later. Change-Id: I93606ce15da70df6660651ec322ea50718dd7c04
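Detecting renames purely by object id amounts to pairing each added path with a deleted path that carries the same id. The sketch below shows that idea under simplifying assumptions: ids are plain hex strings, inputs are path-to-id maps, and the class name `ExactRenameSketch` and its output format are hypothetical rather than the class added by this commit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of exact-match rename pairing; not the JGit class.
final class ExactRenameSketch {
    /**
     * Pairs added paths with deleted paths that have the same object id.
     * Both maps go from path to hex object id. Returns "old -> new" strings.
     */
    static List<String> detect(Map<String, String> deleted, Map<String, String> added) {
        // Index deletions by id so each addition is matched with one lookup.
        Map<String, String> deletedById = new HashMap<>();
        for (Map.Entry<String, String> e : deleted.entrySet()) {
            deletedById.putIfAbsent(e.getValue(), e.getKey());
        }
        List<String> renames = new ArrayList<>();
        for (Map.Entry<String, String> e : added.entrySet()) {
            // remove() ensures a deleted path is used for at most one rename.
            String oldPath = deletedById.remove(e.getValue());
            if (oldPath != null) {
                renames.add(oldPath + " -> " + e.getKey());
            }
        }
        return renames;
    }
}
```

Additions and deletions left unpaired would be reported as plain adds and deletes.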
* Refactored code out of FileHeader to facilitate rename detection (Jeff Schumacher, 2010-06-30, 2 files, -123/+176)
| | | | | | | | | | Refactored a superclass out of FileHeader called DiffEntry that holds the more general data from FileHeader that is useful in rename detection (old/new Ids, modes, names, as well as changeType and score). FileHeader is now a DiffEntry that adds Hunks, parsing abilities, etc. Change-Id: I8398728cd218f8c6e98f7a4a7f2f342391d865e4
* Fix missing flush in StreamCopyThread (Dmitry Neverov, 2010-06-30, 1 file, -14/+12)
| | | | | | | | | | | | | | | | | | | | It is possible that StreamCopyThread will not flush everything from its src to its dst. In most cases StreamCopyThread works like this: in loop: n = src.read(buf); dst.write(buf, 0, n); and when we want to flush, we interrupt() StreamCopyThread and it flushes everything it wrote to dst. The problem is that our interrupt() could interrupt reading. In this case we will flush everything we wrote to dst, but not everything we wrote to src. Change-Id: Ifaf4d8be87535c7364dd59b217dfc631460018ff Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Added check for binary files while diffing (Jeff Schumacher, 2010-06-29, 2 files, -6/+38)
| | | | | | | | | Added a check in Diff to ensure that files that are most likely not text are not line-by-line diffed. Files are determined to be binary by checking the first 8000 bytes for a null character. This is a similar heuristic to what C Git uses. Change-Id: I2b6f05674c88d89b3f549a5db483f850f7f46c26
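The heuristic described here, scanning the first 8000 bytes for a null character, can be sketched in a few lines. The class and method names below are hypothetical and chosen only for the example.

```java
// Hypothetical sketch of the NUL-byte heuristic; not the code from this commit.
final class BinaryCheckSketch {
    /** Number of leading bytes examined, as stated in the commit message. */
    static final int FIRST_FEW_BYTES = 8000;

    /** Returns true if a NUL byte occurs within the first 8000 bytes. */
    static boolean isBinary(byte[] raw) {
        int limit = Math.min(raw.length, FIRST_FEW_BYTES);
        for (int i = 0; i < limit; i++) {
            if (raw[i] == 0) {
                return true;
            }
        }
        return false;
    }
}
```

A NUL byte that appears only after the first 8000 bytes is not seen, so such a file is still treated as text.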
* Merge "Update build to use Tycho 0.9.0" (Matthias Sohn, 2010-06-29, 1 file, -1/+1)
|\
| * Update build to use Tycho 0.9.0 (Matthias Sohn, 2010-06-29, 1 file, -1/+1)
| | | | | | | | | | Change-Id: I589267e6cfd0514383c2a3da51c9b7a659f77844 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Merge changes Ie56301aa,Ic2f79e85 (Shawn Pearce, 2010-06-28, 11 files, -2/+1103)
|\ \ | |/ |/| | | | | | | * changes: Added further support for whitespace ignoring during diff Added support for whitespace ignoring
| * Added further support for whitespace ignoring during diff (Jeff Schumacher, 2010-06-28, 7 files, -2/+663)
| | | | | | | | | | | | | | | | | | Added code to support ignoring leading, trailing, and changed whitespace when performing a diff operation. I also added command line options to Diff to enable the various whitespace ignoring methods. These match the flags for git diff. Change-Id: Ie56301aafad59ee3f0fe5de62719f5023cd702c8
| * Added support for whitespace ignoring (Jeff Schumacher, 2010-06-28, 4 files, -0/+440)
| | | | | | | | | | | | | | | | | | | | | | | | | | JGit did not have support for skipping whitespace when comparing lines in RawText objects. I added a subclass of RawText that skips whitespace in its equals and hashCode methods. I used a subclass rather than adding functionality into RawText so that performance would not be impacted by extra logic. This class only supports ignoring all whitespace. Others will follow that allow other forms of whitespace ignoring. Change-Id: Ic2f79e85215e48d3fd53ec1b4ad13373dd183a4a
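Ignoring all whitespace means the equality test and the hash must skip the same bytes, so that lines considered equal also hash alike. The sketch below illustrates that on raw byte arrays. The class name, the set of bytes treated as whitespace, and the hash function are assumptions for the example, not the RawText subclass added by this commit.

```java
// Hypothetical sketch; not the RawText subclass from this commit.
final class IgnoreAllWhitespaceSketch {
    private static boolean isWhitespace(byte b) {
        return b == ' ' || b == '\t' || b == '\r' || b == '\n';
    }

    /** Compares two lines while skipping every whitespace byte in both. */
    static boolean equalsIgnoringWhitespace(byte[] a, byte[] b) {
        int i = 0;
        int j = 0;
        while (true) {
            while (i < a.length && isWhitespace(a[i])) i++;
            while (j < b.length && isWhitespace(b[j])) j++;
            if (i == a.length || j == b.length) {
                // Equal only if both lines ran out at the same time.
                return i == a.length && j == b.length;
            }
            if (a[i++] != b[j++]) {
                return false;
            }
        }
    }

    /** Hashes only the non-whitespace bytes, consistent with the equality above. */
    static int hashIgnoringWhitespace(byte[] line) {
        int hash = 5381;
        for (byte c : line) {
            if (!isWhitespace(c)) {
                hash = hash * 31 + (c & 0xff);
            }
        }
        return hash;
    }
}
```

Keeping this logic in a separate class, as the commit does with a subclass, leaves the ordinary comparison path free of the extra whitespace checks.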
* | UploadPack: Avoid unnecessary flush in smart HTTP (Shawn O. Pearce, 2010-06-23, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | Under smart HTTP the biDirectionalPipe flag is false, and we return back immediately at this point in the negotiation process. There is no need to flush the stream to the client, the request is over and it will be automatically flushed out by the higher level servlet that invoked us. Avoiding flush here allows us to only use flush after a progress message is sent during pack generation. Change-Id: Id0c8b7e95e3be6ca4c1b479e096bed6b0283b828 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Add MutableObjectId.copyFrom(AnyObjectId) (Shawn O. Pearce, 2010-06-23, 2 files, -10/+16)
| | | | | | | | | | | | | | | | | | | | | | This simplifies the PackIndex code, which is trying to quickly copy an existing ObjectId into a MutableObjectId. Rather than having the PackIndex violate the ObjectId's internals, expose a copy from function similar to the other ones for copying from raw byte arrays or hex formatted strings. Change-Id: I142635cbece54af2ab83c58477961ce925dc8255 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Expose AnyObjectId compareTo(byte[]) and compareTo(int[]) (Shawn O. Pearce, 2010-06-23, 1 file, -2/+24)
| | | | | | | | | | | | | | | | | | | | Storage systems can use these implementations to compare a passed AnyObjectId with a stored representation of an ObjectId in the canonical network byte order format. This can be useful to do a binary search, or just linear scan, over an encoded storage file. Change-Id: I8c72993c4f4c6e98d599ac2c9867453752f25fd2 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
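The use case mentioned here, a binary search over an encoded storage file, can be sketched with a comparison of raw 20-byte ids in network byte order. The sketch works on plain byte arrays. The class `ObjectIdCompareSketch` and its methods are hypothetical stand-ins for illustration and do not reproduce the AnyObjectId implementation.

```java
// Hypothetical sketch of comparing ids against an encoded table; not JGit code.
final class ObjectIdCompareSketch {
    /** Length of a raw SHA-1 object id in bytes. */
    static final int ID_LENGTH = 20;

    /** Compares id with the 20 bytes stored at buf[pos..pos+19], as unsigned bytes. */
    static int compare(byte[] id, byte[] buf, int pos) {
        for (int i = 0; i < ID_LENGTH; i++) {
            int cmp = (id[i] & 0xff) - (buf[pos + i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }

    /** Binary search over a sorted table of 20-byte records; returns the index or -1. */
    static int find(byte[] id, byte[] table) {
        int low = 0;
        int high = table.length / ID_LENGTH - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = compare(id, table, mid * ID_LENGTH);
            if (cmp == 0) {
                return mid;
            }
            if (cmp < 0) {
                high = mid - 1;
            } else {
                low = mid + 1;
            }
        }
        return -1;
    }
}
```

Masking each byte with 0xff matters because Java bytes are signed, while the stored order treats them as unsigned.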
* | Expose RefWriter constructor taking RefList (Shawn O. Pearce, 2010-06-23, 1 file, -2/+7)
| | | | | | | | | | | | | | | | | | An implementation might prefer to use the RefList type here, and RefList is part of our public API. Expose the constructor so callers who have a RefList can take advantage of the existing sorting. Change-Id: I545867f85aa2c479d2d610024ebbe318144709c8 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Expose RefUpdate constructor to any subclass (Shawn O. Pearce, 2010-06-23, 1 file, -1/+10)
| | | | | | | | | | | | | | | | | | | | | | When we finally move RefDirectory to the new storage.file package, its associated RefDirectoryUpdate will need visibility to this constructor in order to initialize itself. This is true of any other repository implementation, so make it protected rather than package level visible. Change-Id: If838aec9baeb80ee2f12dcbca717657c725a9242 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Expose repository change event constructors (Shawn O. Pearce, 2010-06-23, 2 files, -2/+14)
| | | | | | | | | | | | | | | | | | | | | | Repository implementations outside of .lib need to be able to create these events and deliver them to listening application code. Expose and document the constructors so that they are visible when we move FileRepository into storage.file.FileRepository. Change-Id: I7fb6e8f4f5fdab683c5ebb5267673aa6d5b560bb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | isValidRefName: Inline the forbidden ref suffix of ".lock" (Shawn O. Pearce, 2010-06-23, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A Git reference name must never end with ".lock", as it would confuse any existing C client that tries to obtain a clone of the repository over the network. Even if the repository isn't on a local filesystem, it still should ban that suffix. Because I plan to move LockFile to storage.file and make it a private implementation detail of the local file system storage model, we can't rely on its package level SUFFIX field here. Making it public probably won't work long-term either, as I also plan to pull storage.file into its own separate project that depends on the core library. So, just inline the constant here. It's as forbidden as ":" is. Change-Id: If85076861baeacc183b82696375a13e935ba8836 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Remove pack stream from PackWriterTest (Shawn O. Pearce, 2010-06-23, 1 file, -11/+8)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This stream was used only to determine how many bytes had been written thus far. Except we're always dumping it into a simple ByteArrayOutputStream, which also knows that. Drop the dependency on the pack stream and use ByteArrayOutputStream directly. This lets us later move this test into the new storage.file package without dragging along the pack stream that is an internal implementation detail of PackWriter, which is more general than just the file storage layer. Change-Id: I291689c0b1ed799270c213ee73b710b2637fb238 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Remove pointless setOldObjectId in test (Shawn O. Pearce, 2010-06-23, 1 file, -1/+0)
| | | | | | | | | | | | | | | | | | | | Setting this value is pointless, because it's automatically set by the refs.newUpdate call that created the update operation. The API is protected by default, because application level code, including this test, should not be calling it. Change-Id: I8867a4e8007892e2bd44a05d7dec619081081943 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Remove speed tests based on mapCommit (Shawn O. Pearce, 2010-06-23, 3 files, -301/+0)
|/ | | | | | | | | | The mapCommit API is being deprecated because it doesn't run very fast. Leaving tests around to test how fast it is relative to C Git isn't instructive. Remove them, which should help aid the transition away from the mapCommit API. Change-Id: I27e1c844610d7da5b2c44b33a00602706973c9cc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* Change default target platform for maven build to galileo (Matthias Sohn, 2010-06-19, 1 file, -1/+1)
| | | | | | | | Starting with 0.9 we no longer support ganymede. http://dev.eclipse.org/mhonarc/lists/egit-dev/msg01277.html Change-Id: Ibf40342f67d9706e86336748f15d10ea47278096 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* Merge "Fix line endings" (Shawn Pearce, 2010-06-18, 10 files, -63/+81)
|\
| * Fix line endings (Matthias Sohn, 2010-06-18, 10 files, -63/+81)
| | | | | | | | | | | | | | | | Some sources had dos line endings. Also configure all projects to use unix line endings and UTF-8 text encoding. Change-Id: I8fc9a1dbb219ffa91d1b3011b3b11b7e48e74ca7 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
* | Merge ""Bare" Repository should not return working directory." (Shawn Pearce, 2010-06-16, 5 files, -26/+363)
|\ \
| * | "Bare" Repository should not return working directory. (Mathias Kinzler, 2010-06-16, 5 files, -26/+363)
| |/ | | | | | | | | | | | | | | | | If a repository is "bare", it currently still returns a working directory. This conflicts with the specification of "bare"-ness. Bug: 311902 Change-Id: Ib54b31ddc80b9032e6e7bf013948bb83e12cfd88 Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
* / Make ObjectId, RefSpec, RemoteConfig, URIish serializable (Andrew Bayer, 2010-06-16, 4 files, -6/+37)
|/ | | | | | | Modifications to various classes in order to allow serialization for use of JGit in Hudson's git plugin. Change-Id: If088717d3da7483538c00a927e433a74085ae9e6
* Merge "tools/version.sh: Use backup files on Win32" (Matthias Sohn, 2010-06-15, 1 file, -5/+6)
|\
| * tools/version.sh: Use backup files on Win32 (Shawn O. Pearce, 2010-06-14, 1 file, -5/+6)
| | | | | | | | | | | | | | | | | | | | Windows doesn't permit us to edit a file in-place with Perl. So create backup files when we perform the edit, and remove them when we are done. This is a tad slower on POSIX systems, but is much more portable. Change-Id: I429c7d698924cb32e709363f5da82f7232bbdab2 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Merge "Add missing @Override tags in AlternateRepositoryDatabase" (Chris Aniszczyk, 2010-06-15, 1 file, -0/+3)
|\ \
| * | Add missing @Override tags in AlternateRepositoryDatabase (Shawn O. Pearce, 2010-06-14, 1 file, -0/+3)
| |/ | | | | | | Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
* | Allow to read configured keys (Mathias Kinzler, 2010-06-15, 2 files, -0/+139)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, there is no way to read the content of the Git Configuration in a way that allows listing all configured values generically. This change extends the Config class so that callers can get a list of sections and a list of names for any given section or subsection. This is required in order to implement proper configuration handling in EGit (show all the content of a given configuration similar to "git config -l"). Change-Id: Idd4bc47be18ed0e36b11be8c23c9c707159dc830 Signed-off-by: Mathias Kinzler <mathias.kinzler@sap.com>
* | Merge changes (Shawn Pearce, 2010-06-14, 13 files, -57/+136)
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | I53f71dc0,I3a899a3a,I3e8bd245,Ie7c9db83,If396326e,I6f4cf8da,I3bf96dd0,I3a2a43a1,I292fe88c,Ia1cf40cf * changes: git-servlet: Fix comparing uploadFactory with the wrong DISABLED instance Prefer static inner classes Override equals for SwingLane since super class PlotLane defines it Make sure a Stream is closed upon errors in IpLogGenerator Make constant static in RebuildCommitGraph Make inner classes static in http code Cache filemode in GitIndex Remove unused parent field in PlotLane Removed unused repo field in WorkDirCheckout Extend DiffFormatter API to simplify styling
| * git-servlet: Fix comparing uploadFactory with the wrong DISABLED instance (Robin Rosenberg, 2010-06-14, 1 file, -1/+1)
| | | | | | | | | | Change-Id: I53f71dc0e3c68839da5ff5a2e0f3eeb8340e4793 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Prefer static inner classes (Robin Rosenberg, 2010-06-13, 1 file, -3/+3)
| | | | | | Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Override equals for SwingLane since super class PlotLane defines it (Robin Rosenberg, 2010-06-13, 1 file, -0/+4)
| | | | | | Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Make sure a Stream is closed upon errors in IpLogGenerator (Robin Rosenberg, 2010-06-13, 1 file, -28/+32)
| | | | | | Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Make constant static in RebuildCommitGraph (Robin Rosenberg, 2010-06-13, 1 file, -1/+1)
| | | | | | Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Make inner classes static in http code (Robin Rosenberg, 2010-06-13, 3 files, -3/+3)
| | | | | | | | | | | | Static classes are preferable to keep unwanted dependencies away, and they have one less member field. Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Cache filemode in GitIndex (Robin Rosenberg, 2010-06-13, 1 file, -1/+2)
| | | | | | | | | | Apparently this was the intention, but never happened Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Remove unused parent field in PlotLane (Robin Rosenberg, 2010-06-13, 2 files, -4/+0)
| | | | | | Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Removed unused repo field in WorkDirCheckout (Robin Rosenberg, 2010-06-13, 1 file, -4/+0)
| | | | | | Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
| * Extend DiffFormatter API to simplify styling (Robin Rosenberg, 2010-06-12, 1 file, -12/+90)
| | | | | | | | | | | | | | | | Refactor and extend the internals so users can override and intervene during formatting, e.g. to colorize output. Change-Id: Ia1cf40cfd4a5ed7dfb6503f8dfc617237bee0659 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
* | Merge branch 'stable-0.8' (Shawn O. Pearce, 2010-06-14, 4 files, -40/+306)
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * stable-0.8: Qualify post-0.8.4 builds JGit 0.8.4 JGit 0.8.3 Include about.html in org.eclipse.jgit artifact Fix build.properties of the JGit feature Added the standard SULA for JGit Add "resources/" as a source folder Change-Id: I4ecb0af41184ef84d104345fd1adcc4a240a38f6
| * | Qualify post-0.8.4 builds [stable-0.8] (Shawn O. Pearce, 2010-06-14, 25 files, -147/+147)
| | | | | | | | | | | | | | | Change-Id: I21efed66921eb7e1e4010fccc9fa9af6c4150fc1 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
| * | JGit 0.8.4 [v0.8.4] (Matthias Sohn, 2010-06-14, 25 files, -147/+147)
| | | | | | | | | | | | | | | | | | | | | Created wrong tags for 0.8.3 hence creating another version. Change-Id: I4e00bbcffe1cf872e2d7e3f3d88d068701fb5330 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
| * | JGit 0.8.3 (Matthias Sohn, 2010-06-14, 25 files, -147/+147)
| | | | | | | | | | | | | | | Change-Id: I845da83c74475d74ec25d68f53c0a4738a898550 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>