mirrors/jgit - jgit - source @ dussan.org

Commit Graph

Yazar	SHA1	Mesaj	Tarih
Robin Rosenberg	32ff57a2b2	Cleanup javadocs so they pass the java8 doclint checks Bug: 431552 Change-Id: I469316f5645205016e1fa6b0fbd2ff3b509b14bc Signed-off-by: Robin Stocker <robin@nibor.org>	10 yıl önce
Robin Stocker	ade8af729c	Apply tree filter marks when pairing DiffEntry for renames When using a RenameDetector to generate new DiffEntries after using DiffEntry.scan, the treeFilterMarks of the original entries were lost. Now it combines the marks from src and dst. See EGit bug 335082 where this is used. Change-Id: I72b34b10ca12e3a6bd10ce44f4fa05b193fc52cc	11 yıl önce
Robin Stocker	75ddf2a0f4	Enable marking entries using TreeFilters in DiffEntry This adds a new optional TreeFilter[] argument to DiffEntry.scan. All filters will be checked during the scan to determine if an entry should be "marked" with regard to that filter. After having called scan, the user can then call isMarked(int) on the entries to find out whether they matched the TreeFilter with the passed index. An example use case for this is in the file diff viewer of EGit's History view, where we'd like to highlight entries that are matching the current filter. See EGit change I03da4b38d1591495cb290909f0e4c6e52270e97f. Bug: 393610 Change-Id: Icf911fe6fca131b2567514f54d66636a44561af1 Signed-off-by: Robin Stocker <robin@nibor.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	12 yıl önce
Robin Rosenberg	c310fa0c80	Mark non-externalizable strings as such A few classes such as Constanrs are marked with @SuppressWarnings, as are toString() methods with many liternal, but otherwise $NLS-n$ is used for string containing text that should not be translated. A few literals may fall into the gray zone, but mostly I've tried to only tag the obvious ones. Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590	11 yıl önce
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 yıl önce
Kevin Sawicki	78bc526d9b	Report diff entries for files that only change mode This also updates DiffFormatter to not write path lines for entries that have the same object id Bug: 361570 Change-Id: I830a78e2babf472503630a7aa020ebfd5c7e69c6	12 yıl önce
Dariusz Luksza	679cab9b32	Adds DiffEntry.scan(TreeWalk, boolean) method Adds method into DiffEntry class that allows to specify whether changed trees are included in scanning result list. By default changed trees aren't added, but in some cases having changed tree would be useful. Also adds check for tree count in TreeWalk and when it is different from two it will thrown an IllegalArgumentException. This change is required by egit I7ddb21e7ff54333dd6d7ace3209bbcf83da2b219 Change-Id: I5a680a73e1cffa18ade3402cc86008f46c1da1f1 Signed-off-by: Dariusz Luksza <dariusz@luksza.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 yıl önce
Shawn O. Pearce	59a262d5d2	Support creating the working directory difference If the iterators passed into a diff formatter are working tree iterators, we should enable ignoring files that are ignored, as well as actually pull up the current content from the working tree rather than getting it from the repository. Because we abstract away the working directory access logic, we can now actually support rename detection between the working directory and the local repository when using a DiffFormatter. This means its possible for an application to show an unstaged delete-add pair as a rename if the add path is not ignored. (Because the ignored file wouldn't show up in our difference output.) Even more interesting is we can now do rename detection between any two working trees, if both input iterators are WorkingTreeIterators. Unfortunately we don't (yet) optimize for comparing the working tree with the index involved so we can take advantage of cached stat data to rule out non-dirty paths. Change-Id: I4c0598afe48d8f99257266bf447a0ecd23ca7f5e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 yıl önce
Shawn O. Pearce	60c5939b23	Rename getOldName,getNewName to getOldPath,getNewPath TreeWalk calls this value "path", while "name" is the stuff after the last slash. FileHeader should do the same thing to be consistent. Rename getOldName to getOldPath and getNewName to getNewPath. Bug: 318526 Change-Id: Ib2e372ad4426402d37939b48d8f233154cc637da Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 yıl önce
Jeff Schumacher	396fe6da45	Break dissimilar file pairs during diff File pairs that are very dissimilar during a diff were not being broken apart into their constituent ADD/DELETE pairs. The leads to sub-optimal rename detection. Take, for example, this situation: A file exists at src/a.txt containing "foo". A user renames src/a.txt to src/b.txt, then adds a new src/a.txt containing "bar". Even though the old a.txt and the new b.txt are identical, the rename detection algorithm would not detect it as a rename since it was already paired in a MODIFY. I added code to split all MODIFYs below a certain score into their constituent ADD/DELETE pairs. This allows situations like the one I described above to be more correctly handled. Change-Id: I22c04b70581f206bbc68c4cd1ee87a1f663b418e Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 yıl önce
Jeff Schumacher	31311cacfd	Implemented file path based tie breaking to exact rename detection During the exact rename detection phase in RenameDetector, ties were resolved on a first-found basis. I added support for file path based tie breaking during that phase. Basically, there are four situations that have to be handled: One add matching one delete: In this simple case, we pair them as a rename. One add matching many deletes: Find the delete whos path matches the add the closest, and pair them as a rename. Many adds matching one delete: Similar to the above case, we find the add that matches the delete the closest, and pair them as a rename. The other adds are marked as copies of the delete. Many adds matching many deletes: Build a scoring matrix similar to the one used for content- based matching, scoring instead by file path. Some of the utility functions in SimilarityRenameDetector are used in this case, as we use the same encoding scheme. Once the matrix is built, scan it for the best matches, marking them as renames. The rest are marked as copies. I don't particularly like the idea of using utility functions right out of SimilarityRenameDetector, but it works for the moment. A later commit will likely refactor this into a common utility class, as well as bringing exact rename detection out of RenameDetector and into a separate class, much like SimilarityRenameDetector. Change-Id: I1fb08390aebdcbf20d049aecf402a36506e55611	14 yıl önce
Jeff Schumacher	a8b29afd82	Create FileHeader from DiffEntry Added support for converting DiffEntrys to FileHeaders. FileHeaders are DiffEntrys with a buffer containing the diff output as well as a list of HunkHeaders. The HunkHeaders contain EditLists. The createFileHeader(DiffEntry) method in DiffFormatter performs a Myers Diff on the files refered to by the DiffEntry, then puts the returned EditList into a single HunkHeader, which is then put into the FileHeader to be returned. It also generates the appropriate diff header an puts it into the FileHeader's buffer. The rest of the diff output, which would normally be parsed to generate the HunkHeaders, is not generated. In fact, the purpose of this method is to avoid the costly diff output generation and parsing normally required to create a FileHeader. Change-Id: I7d8b18c0f6c85e3d02ad58995d3d231e69af5887	14 yıl önce
Shawn O. Pearce	978535b090	Implement similarity based rename detection Content similarity based rename detection is performed only after a linear time detection is performed using exact content match on the ObjectIds. Any names which were paired up during that exact match phase are excluded from the inexact similarity based rename, which reduces the space that must be considered. During rename detection two entries cannot be marked as a rename if they are different types of files. This prevents a symlink from being renamed to a regular file, even if their blob content appears to be similar, or is identical. Efficiently comparing two files is performed by building up two hash indexes and hashing lines or short blocks from each file, counting the number of bytes that each line or block represents. Instead of using a standard java.util.HashMap, we use a custom open hashing scheme similiar to what we use in ObjecIdSubclassMap. This permits us to have a very light-weight hash, with very little memory overhead per cell stored. As we only need two ints per record in the map (line/block key and number of bytes), we collapse them into a single long inside of a long array, making very efficient use of available memory when we create the index table. We only need object headers for the index structure itself, and the index table, but not per-cell. This offers a massive space savings over using java.util.HashMap. The score calculation is done by approximating how many bytes are the same between the two inputs (which for a delta would be how much is copied from the base into the result). The score is derived by dividing the approximate number of bytes in common into the length of the larger of the two input files. Right now the SimilarityIndex table should average about 1/2 full, which means we waste about 50% of our memory on empty entries after we are done indexing a file and sort the table's contents. If memory becomes an issue we could discard the table and copy all records over to a new array that is properly sized. Building the index requires O(M + N log N) time, where M is the size of the input file in bytes, and N is the number of unique lines/blocks in the file. The N log N time constraint comes from the sort of the index table that is necessary to perform linear time matching against another SimilarityIndex created for a different file. To actually perform the rename detection, a SxD matrix is created, placing the sources (aka deletions) along one dimension and the destinations (aka additions) along the other. A simple O(S x D) loop examines every cell in this matrix. A SimilarityIndex is built along the row and reused for each column compare along that row, avoiding the costly index rebuild at the row level. A future improvement would be to load a smaller square matrix into SimilarityIndexes and process everything in that sub-matrix before discarding the column dimension and moving down to the next sub-matrix block along that same grid of rows. An optional ProgressMonitor is permitted to be passed in, allowing applications to see the progress of the detector as it works through the matrix cells. This provides some indication of current status for very long running renames. The default line/block hash function used by the SimilarityIndex may not be optimal, and may produce too many collisions. It is borrowed from RawText's hash, which is used to quickly skip out of a longer equality test if two lines have different hash functions. We may need to refine this hash in the future, in order to minimize the number of collisions we get on common source files. Based on a handful of test commits in JGit (especially my own recent rename repository refactoring series), this rename detector produces output that is very close to C Git. The content similarity scores are sometimes off by 1%, which is most probably caused by our SimilarityIndex type using a different hash function than C Git uses when it computes the delta size between any two objects in the rename matrix. Bug: 318504 Change-Id: I11dff969e8a2e4cf252636d857d2113053bdd9dc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 yıl önce
Jeff Schumacher	7b0b4110ed	Refactored code out of FileHeader to facilitate rename detection Refactored a superclass out of FileHeader called DiffEntry that holds the more general data from FileHeader that is useful in rename detection (old/new Ids, modes, names, as well as changeType and score). FileHeader is now a DiffEntry that adds Hunks, parsing abilities, etc. Change-Id: I8398728cd218f8c6e98f7a4a7f2f342391d865e4	14 yıl önce

14 İşlemeler (32ff57a2b2b9480f4d374a2592fada7f720b124f)