Repository.java 64KB

Rewrite reference handling to be abstract and accurate

This commit makes three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break up into smaller units.

Disambiguate symbolic references:
---------------------------------

Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by me.

Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object whose type the application can test and whose target it can examine.

With this change, Ref is now an abstract type with different subclasses for the different types.

In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return:

    Map<String, Ref> all = repository.getAllRefs();
    SymbolicRef HEAD = (SymbolicRef) all.get("HEAD");
    ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master");

    assertSame(master, HEAD.getTarget());
    assertSame(master.getObjectId(), HEAD.getObjectId());

    assertEquals("HEAD", HEAD.getName());
    assertEquals("refs/heads/master", master.getName());

A nice side-effect of this change is that the storage type of the symbolic reference is no longer ambiguous with the storage type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true:

    assertSame(Ref.Storage.LOOSE, HEAD.getStorage());
    assertSame(Ref.Storage.PACKED, master.getStorage());

(Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk).
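The split between symbolic and direct references described above can be sketched with a few stand-in types. This is only an illustration: RefSketch, its nested types, the leaf() helper, and the object id "abc123" are hypothetical and are not JGit's classes or API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not JGit's classes.
final class RefSketch {
    enum Storage { LOOSE, PACKED }

    /** Common view shared by every kind of reference. */
    interface Ref {
        String getName();
        boolean isSymbolic();
        String getObjectId();
        Storage getStorage();
    }

    /** A reference that stores an object id directly. */
    static final class ObjectIdRef implements Ref {
        private final String name;
        private final String objectId;
        private final Storage storage;

        ObjectIdRef(String name, String objectId, Storage storage) {
            this.name = name;
            this.objectId = objectId;
            this.storage = storage;
        }

        public String getName() { return name; }
        public boolean isSymbolic() { return false; }
        public String getObjectId() { return objectId; }
        public Storage getStorage() { return storage; }
    }

    /** A reference that points at another reference and has its own storage. */
    static final class SymbolicRef implements Ref {
        private final String name;
        private final Ref target;
        private final Storage storage;

        SymbolicRef(String name, Ref target, Storage storage) {
            this.name = name;
            this.target = target;
            this.storage = storage;
        }

        Ref getTarget() { return target; }
        public String getName() { return name; }
        public boolean isSymbolic() { return true; }
        // The object id is resolved through the target at call time.
        public String getObjectId() { return target.getObjectId(); }
        public Storage getStorage() { return storage; }
    }

    /** Walks the target chain until a non-symbolic reference is reached. */
    static Ref leaf(Ref ref) {
        while (ref.isSymbolic()) {
            ref = ((SymbolicRef) ref).getTarget();
        }
        return ref;
    }

    /** Builds the HEAD -> refs/heads/master example with a made-up object id. */
    static Map<String, Ref> sampleRefs() {
        Map<String, Ref> all = new LinkedHashMap<>();
        Ref master = new ObjectIdRef("refs/heads/master", "abc123", Storage.PACKED);
        all.put(master.getName(), master);
        all.put("HEAD", new SymbolicRef("HEAD", master, Storage.LOOSE));
        return all;
    }

    public static void main(String[] args) {
        Ref head = sampleRefs().get("HEAD");
        System.out.println(head.getName() + " -> " + leaf(head).getName());
    }
}
```

Because the symbolic reference keeps its own name and storage, a caller can tell HEAD apart from its target by calling isSymbolic() instead of comparing names.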
Another nice side-effect of this change is that all intermediate symbolic references are preserved, and are therefore visible to the application when it walks the target chain. We can now correctly inspect chains of symbolic references.

As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparison between properties.

Abstract the RefDatabase storage:
---------------------------------

RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes.

Optimize RefDirectory:
----------------------

The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible.

The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers.

The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information.

To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improve access on repositories with a large number of packed references.
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs).

Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high.

Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan.

Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file.

The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsibility of the database implementation, and not all implementations will use java.io for access.

Future work still remains to be done to abstract the ReflogReader class away from local disk IO.

Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
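The linear-time merge-join mentioned in the message can be sketched roughly as follows. This is not the RefDirectory code: MergeJoinSketch, Entry, and describe() are hypothetical names, the ids are made up, and the sketch assumes both lists are already sorted by name and that a loose entry takes precedence over a packed entry with the same name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the actual RefDirectory implementation.
final class MergeJoinSketch {
    /** A reference name paired with a made-up object id. */
    static final class Entry {
        final String name;
        final String id;

        Entry(String name, String id) {
            this.name = name;
            this.id = id;
        }
    }

    /**
     * Merges two name-sorted lists in a single pass, O(packed + loose).
     * When both lists contain the same name, the loose entry wins.
     */
    static List<Entry> mergeJoin(List<Entry> packed, List<Entry> loose) {
        List<Entry> out = new ArrayList<>(packed.size() + loose.size());
        int p = 0;
        int l = 0;
        while (p < packed.size() && l < loose.size()) {
            int cmp = packed.get(p).name.compareTo(loose.get(l).name);
            if (cmp < 0) {
                out.add(packed.get(p++));
            } else if (cmp > 0) {
                out.add(loose.get(l++));
            } else {
                out.add(loose.get(l++)); // same name: loose shadows packed
                p++;
            }
        }
        // At most one of these loops runs; it copies the remaining tail.
        while (p < packed.size()) {
            out.add(packed.get(p++));
        }
        while (l < loose.size()) {
            out.add(loose.get(l++));
        }
        return out;
    }

    /** Renders entries as "name=id" strings for display. */
    static List<String> describe(List<Entry> entries) {
        List<String> out = new ArrayList<>(entries.size());
        for (Entry e : entries) {
            out.add(e.name + "=" + e.id);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> packed = Arrays.asList(
                new Entry("refs/heads/a", "1"),
                new Entry("refs/heads/master", "2"),
                new Entry("refs/tags/v1", "3"));
        List<Entry> loose = Arrays.asList(
                new Entry("refs/heads/master", "9"),
                new Entry("refs/heads/topic", "4"));
        System.out.println(describe(mergeJoin(packed, loose)));
    }
}
```

Each index only ever moves forward, so every entry of either list is visited once, which is where the O(PackedRefs + LooseRefs) bound comes from.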
14 years ago
Don't swallow IOException

Swallowing intermittent errors and trying to recover from them makes JGit's behavior hard to predict and difficult to debug. Propagate the errors instead.

This doesn't violate JGit's usual backward compatibility promise for clients because in these contexts an IOException indicates either repository corruption or a true I/O error. Let's consider these cases one at a time.

In the case of repository corruption, falling back e.g. to an empty set of refs or a missing ref will not serve a caller well. The fallback does not indicate the nature of the corruption, so the caller is not in a good position to recover from the error. This is analogous to Git, which attempts to provide sufficient support to recover from corruption (by ensuring commands like "git branch -D" cope with corruption) but little else.

In the case of an I/O error, the best we can do is to propagate the error so that the user sees a dialog and has an opportunity to try again. As in the corruption case, the fallback behavior does not provide enough information for a caller to rely on the current error handling, and callers such as EGit already need to be able to handle runtime exceptions.

To be conservative, keep the existing behavior for the deprecated Repository#peel method. In this example, the fallback behavior is to return an unpeeled ref, which is distinguishable from the ref not existing and should thus at least be possible to debug.

Change-Id: I0eb58eb8c77519df7f50d21d1742016b978e67a3
Signed-off-by: Jonathan Nieder <jrn@google.com>
5 years ago
Make Repository.normalizeBranchName less strict

This operation was added recently with the goal of providing some way to auto-correct invalid user input, or to provide a correction suggestion to the user -- EGit now uses it that way. But the initial implementation was very restrictive; it removed all non-ASCII characters and even slashes.

Understandably end users were not happy with that. Git has no such restriction to ASCII-only; nor does JGit. Branch names should be meaningful to the end user, and if a user-supplied branch name is invalid for technical reasons, a "normalized" name should still be meaningful to the user.

Rewrite to attempt a minimal fix such that the result will pass isValidRefName.

* Replace all Unicode whitespace by underscore.
* Replace troublesome special characters by dash.
* Collapse sequences of underscores, dots, and dashes.
* Remove underscores, dots, and dashes following slashes, and collapse sequences of slashes.
* Strip leading and trailing sequences of slashes, dots, dashes, and underscores.
* Avoid the ".lock" extension.
* Avoid the Windows reserved device names.
* If input name is null return an empty String so callers don't need to check for null.

This still allows branch names with single slashes as separators between components, avoids some pitfalls that isValidRefName() tests for, and leaves other characters untouched and thus allows non-ASCII branch names.

Also move the function from the bottom of the file up to where isValidRefName is implemented.

Bug: 512508
Change-Id: Ia0576d9b2489162208c05e51c6d54e9f0c88c3a7
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
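A simplified sketch of some of the rules listed above might look like the following. It is not the JGit implementation: BranchNameSketch is a hypothetical class, the set of "troublesome" characters is an assumption, collapsing a mixed run to its first character is an assumption, and the ".lock" and Windows device name rules are deliberately left out. The result is not guaranteed to pass isValidRefName.

```java
// Hypothetical, simplified sketch; not JGit's normalizeBranchName.
final class BranchNameSketch {
    static String normalize(String name) {
        if (name == null) {
            return ""; // callers need no null check
        }
        String s = name;
        // Replace each Unicode whitespace character by an underscore.
        s = s.replaceAll("(?U)\\s", "_");
        // Replace an assumed set of troublesome characters by a dash.
        s = s.replaceAll("[~^:?*\\[\\\\\\p{Cntrl}]", "-");
        // Collapse runs of dashes, dots, and underscores to their first character.
        s = s.replaceAll("([-._])[-._]+", "$1");
        // Remove dashes, dots, and underscores that directly follow a slash.
        s = s.replaceAll("/[-._]+", "/");
        // Collapse sequences of slashes into a single separator.
        s = s.replaceAll("/+", "/");
        // Strip leading and trailing slashes, dots, dashes, and underscores.
        s = s.replaceAll("^[-._/]+|[-._/]+$", "");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(normalize("  feature//my  branch?.. "));
    }
}
```

Characters outside the listed sets are left alone, so a non-ASCII name passes through unchanged in this sketch.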
7 years ago
Redo event listeners to be more generic

Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later.

Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method.

Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery.

Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time.

Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods.

New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods.

Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
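The double-dispatch scheme described in the message can be sketched along these lines. All names here (EventSketch, Listener, Event, Handle, ListenerList, RefsUpdatedEvent, RefsUpdatedListener) are hypothetical stand-ins chosen for illustration and should not be read as JGit's actual API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of the pattern; not JGit's event classes.
final class EventSketch {
    /** Marker for listener interfaces. */
    interface Listener {}

    /** An event knows which listener type it targets and delivers itself. */
    abstract static class Event<T extends Listener> {
        abstract Class<T> getAssociatedListenerType();
        abstract void dispatch(T listener);
    }

    /** Registration handle; remove() unregisters the listener. */
    interface Handle {
        void remove();
    }

    /** Listener list indexed by the listener interface. */
    static final class ListenerList {
        private static final class Entry {
            final Class<?> type;
            final Listener listener;

            Entry(Class<?> type, Listener listener) {
                this.type = type;
                this.listener = listener;
            }
        }

        private final List<Entry> entries = new CopyOnWriteArrayList<>();

        <T extends Listener> Handle add(Class<T> type, T listener) {
            final Entry entry = new Entry(type, listener);
            entries.add(entry);
            return () -> entries.remove(entry);
        }

        /** Loops over matching listeners; the event performs the delivery. */
        <T extends Listener> void dispatch(Event<T> event) {
            Class<T> type = event.getAssociatedListenerType();
            for (Entry entry : entries) {
                if (entry.type == type) {
                    event.dispatch(type.cast(entry.listener));
                }
            }
        }
    }

    /** Example listener interface with a single method. */
    interface RefsUpdatedListener extends Listener {
        void onRefsUpdated(RefsUpdatedEvent event);
    }

    /** Example event that dispatches to RefsUpdatedListener. */
    static final class RefsUpdatedEvent extends Event<RefsUpdatedListener> {
        Class<RefsUpdatedListener> getAssociatedListenerType() {
            return RefsUpdatedListener.class;
        }

        void dispatch(RefsUpdatedListener listener) {
            listener.onRefsUpdated(this);
        }
    }

    public static void main(String[] args) {
        ListenerList listeners = new ListenerList();
        Handle handle = listeners.add(RefsUpdatedListener.class,
                event -> System.out.println("refs updated"));
        listeners.dispatch(new RefsUpdatedEvent()); // listener is called
        handle.remove();
        listeners.dispatch(new RefsUpdatedEvent()); // no listener left
    }
}
```

Adding another event type in this sketch only takes a new Event subclass and its listener interface; the list itself does not change, which mirrors the point about not needing fireFoo() methods or adapters.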
14 years ago
Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Make Repository.normalizeBranchName less strict This operation was added recently with the goal to provide some way to auto-correct invalid user input, or to provide a correction suggestion to the user -- EGit uses it now that way. But the initial implementation was very restrictive; it removed all non-ASCII characters and even slashes. Understandably end users were not happy with that. Git has no such restriction to ASCII-only; nor does JGit. Branch names should be meaningful to the end user, and if a user-supplied branch name is invalid for technical reasons, a "normalized" name should still be meaningful to the user. Rewrite to attempt a minimal fix such that the result will pass isValidRefName. * Replace all Unicode whitespace by underscore. * Replace troublesome special characters by dash. * Collapse sequences of underscores, dots, and dashes. * Remove underscores, dots, and dashes following slashes, and collapse sequences of slashes. * Strip leading and trailing sequences of slashes, dots, dashes, and underscores. * Avoid the ".lock" extension. * Avoid the Windows reserved device names. * If input name is null return an empty String so callers don't need to check for null. This still allows branch names with single slashes as separators between components, avoids some pitfalls that isValidRefName() tests for, and leaves other character untouched and thus allows non-ASCII branch names. Also move the function from the bottom of the file up to where isValidRefName is implemented. Bug: 512508 Change-Id: Ia0576d9b2489162208c05e51c6d54e9f0c88c3a7 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
7 år sedan
Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Redo event listeners to be more generic Replace the old crude event listener system with a much more generic implementation, patterned after the event dispatch techniques used in Google Web Toolkit 1.5 and later. Each event delivers to an interface that defines a single method, and the event itself is what performs the delivery in a type-safe way through its own dispatch method. Listeners are registered in a generic listener list, indexed by the interface they implement and wish to receive an event for. Delivery of events is performed by looping through all listeners implementing the event's corresponding listener interface, and using the event's own dispatch method to deliver the event. This is the classical "double dispatch" pattern for event delivery. Listeners can be unregistered by invoking remove() on their registration handle. This change therefore requires application code to track the handle if it wishes to remove the listener at a later point in time. Event delivery is now exposed as a generic public method on the Repository class, making it easier for any type of message to be sent out to any type of listener that has registered, without needing to pre-arrange for type-safe fireFoo() methods. New event types can be added in the future simply by defining a new RepositoryEvent subclass and a corresponding RepositoryListener interface that it dispatches to. By always adding new events through a new interface, we never need to worry about defining an Adapter to provide default no-op implementations of new event methods. Change-Id: I651417b3098b9afc93d91085e9f0b2265df8fc81 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate

This commit makes three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break up into smaller units.

Disambiguate symbolic references:
---------------------------------

Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself.

Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object whose type the application can test and whose target it can examine.

With this change, Ref is now an abstract type with different subclasses for the different types.

In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return:

  Map<String, Ref> all = repository.getAllRefs();
  SymbolicRef HEAD = (SymbolicRef) all.get("HEAD");
  ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master");

  assertSame(master, HEAD.getTarget());
  assertSame(master.getObjectId(), HEAD.getObjectId());

  assertEquals("HEAD", HEAD.getName());
  assertEquals("refs/heads/master", master.getName());

A nice side-effect of this change is that the storage type of the symbolic reference is no longer ambiguous with the storage type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true:

  assertSame(Ref.Storage.LOOSE, HEAD.getStorage());
  assertSame(Ref.Storage.PACKED, master.getStorage());

(Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk).

Another nice side-effect of this change is that all intermediate symbolic references are preserved, and are therefore visible to the application when it walks the target chain. We can now correctly inspect chains of symbolic references.

As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparison between properties.

Abstract the RefDatabase storage:
---------------------------------

RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes.

Optimize RefDirectory:
----------------------

The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible.

The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers.

The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information.

To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improve access on repositories with a large number of packed references.

Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs).

Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there are typically only a handful of reference names to be sorted, so the sorting cost should not be very high.

Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan.

Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file.

The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsibility of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO.

Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
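The merge-join traversal described in the commit message above can be illustrated with a minimal sketch. This is not JGit's RefDirectory code: the class RefMergeSketch, the NamedRef type, and the rule that a loose entry shadows a packed entry of the same name are assumptions made for illustration (the last one follows Git's usual convention, which the commit message does not spell out). Both inputs must already be sorted by name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a merge-join over two sorted ref lists. */
final class RefMergeSketch {
    /** Minimal stand-in for a ref: its name and where it was read from. */
    static final class NamedRef {
        final String name;
        final String storage;

        NamedRef(String name, String storage) {
            this.name = name;
            this.storage = storage;
        }
    }

    /**
     * Merges two lists that are each sorted by name in a single pass,
     * so the cost is O(packed.size() + loose.size()).
     * Assumption: when both lists hold the same name, the loose entry wins.
     */
    static List<NamedRef> merge(List<NamedRef> packed, List<NamedRef> loose) {
        List<NamedRef> out = new ArrayList<>(packed.size() + loose.size());
        int p = 0;
        int l = 0;
        while (p < packed.size() && l < loose.size()) {
            int cmp = packed.get(p).name.compareTo(loose.get(l).name);
            if (cmp < 0) {
                out.add(packed.get(p++));
            } else if (cmp > 0) {
                out.add(loose.get(l++));
            } else {
                out.add(loose.get(l++)); // same name: keep the loose entry
                p++;                     // and skip the shadowed packed one
            }
        }
        // At most one of these loops runs; it copies the remaining tail.
        while (p < packed.size()) {
            out.add(packed.get(p++));
        }
        while (l < loose.size()) {
            out.add(loose.get(l++));
        }
        return out;
    }
}
```

Because each index only moves forward, every element of either list is visited once, which is where the linear bound comes from.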
Don't swallow IOException

Swallowing intermittent errors and trying to recover from them makes JGit's behavior hard to predict and difficult to debug. Propagate the errors instead.

This doesn't violate JGit's usual backward compatibility promise for clients because in these contexts an IOException indicates either repository corruption or a true I/O error. Let's consider these cases one at a time.

In the case of repository corruption, falling back e.g. to an empty set of refs or a missing ref will not serve a caller well. The fallback does not indicate the nature of the corruption, so they are not in a good place to recover from the error. This is analogous to Git, which attempts to provide sufficient support to recover from corruption (by ensuring commands like "git branch -D" cope with corruption) but little else.

In the case of an I/O error, the best we can do is to propagate the error so that the user sees a dialog and has an opportunity to try again. As in the corruption case, the fallback behavior does not provide enough information for a caller to rely on the current error handling, and callers such as EGit already need to be able to handle runtime exceptions.

To be conservative, keep the existing behavior for the deprecated Repository#peel method. In this example, the fallback behavior is to return an unpeeled ref, which is distinguishable from the ref not existing and should thus at least be possible to debug.

Change-Id: I0eb58eb8c77519df7f50d21d1742016b978e67a3
Signed-off-by: Jonathan Nieder <jrn@google.com>
5 years ago
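The difference between swallowing and propagating an IOException, as argued in the commit message above, can be sketched as follows. The names PropagateSketch, RefSource, refsSwallowing, and refsPropagating are hypothetical and do not correspond to JGit methods; the sketch only shows why an empty fallback hides the cause of a failure.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration; these are not JGit APIs. */
final class PropagateSketch {
    /** Stand-in for any reader of ref names that may fail with an I/O error. */
    interface RefSource {
        List<String> readRefNames() throws IOException;
    }

    /**
     * Swallowing style: a failure is turned into an empty list, so the
     * caller cannot tell "no refs" apart from "could not read refs".
     */
    static List<String> refsSwallowing(RefSource src) {
        try {
            return src.readRefNames();
        } catch (IOException e) {
            return Collections.emptyList();
        }
    }

    /**
     * Propagating style: the exception reaches the caller, which can
     * report it to the user or retry.
     */
    static List<String> refsPropagating(RefSource src) throws IOException {
        return src.readRefNames();
    }
}
```

With a source that always fails, refsSwallowing returns an empty list while refsPropagating throws, which is the information the caller needs to react sensibly.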
Make Repository.normalizeBranchName less strict

This operation was added recently with the goal of providing some way to auto-correct invalid user input, or to provide a correction suggestion to the user; EGit now uses it that way. But the initial implementation was very restrictive; it removed all non-ASCII characters and even slashes.

Understandably end users were not happy with that. Git has no such restriction to ASCII-only; nor does JGit. Branch names should be meaningful to the end user, and if a user-supplied branch name is invalid for technical reasons, a "normalized" name should still be meaningful to the user.

Rewrite to attempt a minimal fix such that the result will pass isValidRefName.

* Replace all Unicode whitespace by underscore.
* Replace troublesome special characters by dash.
* Collapse sequences of underscores, dots, and dashes.
* Remove underscores, dots, and dashes following slashes, and collapse sequences of slashes.
* Strip leading and trailing sequences of slashes, dots, dashes, and underscores.
* Avoid the ".lock" extension.
* Avoid the Windows reserved device names.
* If input name is null return an empty String so callers don't need to check for null.

This still allows branch names with single slashes as separators between components, avoids some pitfalls that isValidRefName() tests for, and leaves other characters untouched and thus allows non-ASCII branch names.

Also move the function from the bottom of the file up to where isValidRefName is implemented.

Bug: 512508
Change-Id: Ia0576d9b2489162208c05e51c6d54e9f0c88c3a7
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
7 years ago
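Some of the normalization rules listed in the commit message above can be sketched with regular expressions. This is a simplified illustration and not the code of Repository.normalizeBranchName: the class BranchNameSketch, the exact set of "troublesome" characters, and the choice to keep the first character of a collapsed run are assumptions, and the ".lock" and Windows device name rules are left out entirely.

```java
import java.util.regex.Pattern;

/** Hypothetical, simplified sketch; not JGit's implementation. */
final class BranchNameSketch {
    // Any Unicode whitespace character.
    private static final Pattern WHITESPACE =
            Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS);
    // Assumed set of troublesome characters: ~ ^ : ? * [ \ and control characters.
    private static final Pattern SPECIAL =
            Pattern.compile("[~^:?*\\[\\\\\\p{Cntrl}]");
    // A run of two or more of '_', '.', '-'; group 1 is its first character.
    private static final Pattern PUNCT_RUN = Pattern.compile("([_.\\-])[_.\\-]+");
    // Underscores, dots, and dashes directly after a slash.
    private static final Pattern PUNCT_AFTER_SLASH = Pattern.compile("/[_.\\-]+");
    private static final Pattern SLASH_RUN = Pattern.compile("/{2,}");
    // Leading or trailing slashes, dots, dashes, and underscores.
    private static final Pattern ENDS = Pattern.compile("^[/_.\\-]+|[/_.\\-]+$");

    static String normalize(String name) {
        if (name == null) {
            return ""; // callers do not need a null check
        }
        String s = WHITESPACE.matcher(name).replaceAll("_");
        s = SPECIAL.matcher(s).replaceAll("-");
        s = PUNCT_RUN.matcher(s).replaceAll("$1");
        s = PUNCT_AFTER_SLASH.matcher(s).replaceAll("/");
        // Collapse slashes last, since the previous step can create "//".
        s = SLASH_RUN.matcher(s).replaceAll("/");
        return ENDS.matcher(s).replaceAll("");
    }
}
```

Under these assumptions, "feature/my new branch" becomes "feature/my_new_branch", and non-ASCII letters pass through unchanged because no rule touches them.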
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
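The merge-join traversal mentioned in the commit message above can be sketched as follows. This is a minimal illustration rather than the code in RefDirectory: the class RefMergeJoinSketch, the method merge, and the representation of a ref as a two-element String array {name, id} are hypothetical. The sketch also assumes that a loose entry shadows a packed entry with the same name, which is how Git's on-disk layout resolves the two.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a linear-time merge-join over two name-sorted ref lists. */
final class RefMergeJoinSketch {
    /**
     * Merges two lists that are each sorted by ref name (element [0]).
     * Each input is walked exactly once, so the cost is O(packed + loose).
     * Assumption: when both lists contain the same name, the loose entry wins.
     */
    static List<String[]> merge(List<String[]> packed, List<String[]> loose) {
        List<String[]> out = new ArrayList<>(packed.size() + loose.size());
        int p = 0;
        int l = 0;
        while (p < packed.size() && l < loose.size()) {
            int cmp = packed.get(p)[0].compareTo(loose.get(l)[0]);
            if (cmp < 0) {
                out.add(packed.get(p++));   // name exists only in packed so far
            } else if (cmp > 0) {
                out.add(loose.get(l++));    // name exists only in loose so far
            } else {
                out.add(loose.get(l++));    // same name: keep loose, skip packed
                p++;
            }
        }
        // At most one of the two lists still has entries left.
        while (p < packed.size()) {
            out.add(packed.get(p++));
        }
        while (l < loose.size()) {
            out.add(loose.get(l++));
        }
        return out;
    }
}
```

Because both inputs are already sorted, no hashing or re-sorting is needed, and the result is itself sorted by name.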
Don't swallow IOException

Swallowing intermittent errors and trying to recover from them makes JGit's behavior hard to predict and difficult to debug. Propagate the errors instead.

This doesn't violate JGit's usual backward compatibility promise for clients because in these contexts an IOException indicates either repository corruption or a true I/O error. Let's consider these cases one at a time.

In the case of repository corruption, falling back e.g. to an empty set of refs or a missing ref will not serve a caller well. The fallback does not indicate the nature of the corruption, so the caller is not in a good place to recover from the error. This is analogous to Git, which attempts to provide sufficient support to recover from corruption (by ensuring commands like "git branch -D" cope with corruption) but little else.

In the case of an I/O error, the best we can do is to propagate the error so that the user sees a dialog and has an opportunity to try again. As in the corruption case, the fallback behavior does not provide enough information for a caller to rely on the current error handling, and callers such as EGit already need to be able to handle runtime exceptions.

To be conservative, keep the existing behavior for the deprecated Repository#peel method. In this case, the fallback behavior is to return an unpeeled ref, which is distinguishable from the ref not existing and should thus at least be possible to debug.

Change-Id: I0eb58eb8c77519df7f50d21d1742016b978e67a3
Signed-off-by: Jonathan Nieder <jrn@google.com>
5 years ago
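The difference between swallowing and propagating an IOException, as argued in the commit message above, can be shown with a small standalone sketch. Nothing here is JGit API: RefSource, refsSwallowing, and refsPropagating are hypothetical names, and a ref is modelled as a plain name-to-id map entry.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Map;

/** Hypothetical sketch contrasting swallowed and propagated I/O errors. */
final class PropagateIoSketch {
    /** Stand-in for a ref store whose read may fail (illustrative only). */
    interface RefSource {
        Map<String, String> readAll() throws IOException;
    }

    /**
     * Swallowing style: a failed read is reported as "no refs".
     * The caller cannot tell an empty repository from an unreadable one.
     */
    static Map<String, String> refsSwallowing(RefSource src) {
        try {
            return src.readAll();
        } catch (IOException e) {
            return Collections.emptyMap();
        }
    }

    /**
     * Propagating style: the failure reaches the caller, which can
     * report it or retry with full knowledge of what went wrong.
     */
    static Map<String, String> refsPropagating(RefSource src) throws IOException {
        return src.readAll();
    }
}
```

With the swallowing variant a corrupt store and an empty store look identical to the caller; with the propagating variant the caller receives the original exception and its message.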
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 år sedan
Don't swallow IOException Swallowing intermittent errors and trying to recover from them makes JGit's behavior hard to predict and difficult to debug. Propagate the errors instead. This doesn't violate JGit's usual backward compatibility promise for clients because in these contexts an IOException indicates either repository corruption or a true I/O error. Let's consider these cases one at a time. In the case of repository corruption, falling back e.g. to an empty set of refs or a missing ref will not serve a caller well. The fallback does not indicate the nature of the corruption, so they are not in a good place to recover from the error. This is analogous to Git, which attempts to provide sufficient support to recover from corruption (by ensuring commands like "git branch -D" cope with corruption) but little else. In the case of an I/O error, the best we can do is to propagate the error so that the user sees a dialog and has an opportunity to try again. As in the corruption case, the fallback behavior does not provide enough information for a caller to rely on the current error handling, and callers such as EGit already need to be able to handle runtime exceptions. To be conservative, keep the existing behavior for the deprecated Repository#peel method. In this example, the fallback behavior is to return an unpeeled ref, which is distinguishable from the ref not existing and should thus at least be possible to debug. Change-Id: I0eb58eb8c77519df7f50d21d1742016b978e67a3 Signed-off-by: Jonathan Nieder <jrn@google.com>
5 år sedan
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references.

As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparison between properties.

Abstract the RefDatabase storage:
---------------------------------

RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes.

Optimize RefDirectory:
----------------------

The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible.

The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers.

The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information.

To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references.
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs).

Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high.

Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan.

Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file.

The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsibility of the database implementation, and not all implementations will use java.io for access.

Future work still remains to be done to abstract the ReflogReader class away from local disk IO.

Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
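The merge-join traversal mentioned in the commit message above can be sketched as follows. This is an illustration under stated assumptions, not the RefDirectory code: `Entry` and `merge` are hypothetical names, both input lists are assumed to be sorted by name already, and a loose entry is assumed to shadow a packed entry of the same name (as in Git, where a loose ref takes precedence over packed-refs).

```java
import java.util.ArrayList;
import java.util.List;

class MergeJoinDemo {
    /** Hypothetical (name, id) pair standing in for a cached reference. */
    static final class Entry {
        final String name;
        final String id;

        Entry(String name, String id) {
            this.name = name;
            this.id = id;
        }
    }

    /**
     * Merges two name-sorted lists in one pass, O(packed + loose).
     * When both lists contain the same name, the loose entry is kept.
     */
    static List<Entry> merge(List<Entry> packed, List<Entry> loose) {
        List<Entry> out = new ArrayList<>(packed.size() + loose.size());
        int p = 0;
        int l = 0;
        while (p < packed.size() && l < loose.size()) {
            int cmp = packed.get(p).name.compareTo(loose.get(l).name);
            if (cmp < 0) {
                out.add(packed.get(p++));
            } else if (cmp > 0) {
                out.add(loose.get(l++));
            } else {
                // Same name in both lists: keep loose, skip packed.
                out.add(loose.get(l++));
                p++;
            }
        }
        // At most one of the two lists still has entries left.
        while (p < packed.size()) {
            out.add(packed.get(p++));
        }
        while (l < loose.size()) {
            out.add(loose.get(l++));
        }
        return out;
    }
}
```

Each index only moves forward, so every entry is visited once; this is where the linear bound comes from. The sketch materializes the result as a list for simplicity, whereas an iterator could perform the same walk lazily.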
Make Repository.normalizeBranchName less strict

This operation was added recently with the goal to provide some way to auto-correct invalid user input, or to provide a correction suggestion to the user -- EGit now uses it that way.

But the initial implementation was very restrictive; it removed all non-ASCII characters and even slashes.

Understandably end users were not happy with that. Git has no such restriction to ASCII-only; nor does JGit. Branch names should be meaningful to the end user, and if a user-supplied branch name is invalid for technical reasons, a "normalized" name should still be meaningful to the user.

Rewrite to attempt a minimal fix such that the result will pass isValidRefName.

* Replace all Unicode whitespace by underscore.
* Replace troublesome special characters by dash.
* Collapse sequences of underscores, dots, and dashes.
* Remove underscores, dots, and dashes following slashes, and collapse sequences of slashes.
* Strip leading and trailing sequences of slashes, dots, dashes, and underscores.
* Avoid the ".lock" extension.
* Avoid the Windows reserved device names.
* If input name is null return an empty String so callers don't need to check for null.

This still allows branch names with single slashes as separators between components, avoids some pitfalls that isValidRefName() tests for, and leaves other characters untouched and thus allows non-ASCII branch names.

Also move the function from the bottom of the file up to where isValidRefName is implemented.

Bug: 512508
Change-Id: Ia0576d9b2489162208c05e51c6d54e9f0c88c3a7
Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
7 years ago
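Several of the rules listed in the commit message above can be sketched in a few lines. This is a simplified illustration, not the code of Repository.normalizeBranchName: the class and method names are hypothetical, the set of "troublesome" characters (`~ ^ : ? * [ \` plus control characters) is an assumption, a run of underscores, dots and dashes is collapsed to its first character as an assumption, and the ".lock" and Windows device name rules are left out.

```java
class BranchNameSketch {
    /** Assumed set of special characters that get replaced by a dash. */
    private static final String SPECIAL = "~^:?*[\\";

    /** Applies a subset of the normalization rules; null yields "". */
    static String normalize(String name) {
        if (name == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (Character.isWhitespace(c) || Character.isSpaceChar(c)) {
                sb.append('_'); // any Unicode whitespace becomes an underscore
            } else if (c < 0x20 || c == 0x7f || SPECIAL.indexOf(c) >= 0) {
                sb.append('-'); // control or special character becomes a dash
            } else {
                sb.append(c); // everything else, including non-ASCII, is kept
            }
        }
        String result = sb.toString();
        // Collapse runs of underscores, dots and dashes to their first character.
        result = result.replaceAll("([_.\\-])[_.\\-]+", "$1");
        // Collapse slash runs and drop underscores, dots, dashes after a slash.
        result = result.replaceAll("/[/_.\\-]+", "/");
        // Strip leading and trailing slashes, dots, dashes and underscores.
        result = result.replaceAll("^[/_.\\-]+|[/_.\\-]+$", "");
        return result;
    }
}
```

For example, `normalize("  my  feature//..fix? ")` yields `my_feature/fix`, and a name made only of letters, including non-ASCII ones, passes through unchanged. The result of this sketch is not guaranteed to satisfy isValidRefName.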
/*
 * Copyright (C) 2007, Dave Watson <dwatson@mimvista.com>
 * Copyright (C) 2008-2010, Google Inc.
 * Copyright (C) 2006-2010, Robin Rosenberg <robin.rosenberg@dewire.com>
 * Copyright (C) 2006-2012, Shawn O. Pearce <spearce@spearce.org>
 * Copyright (C) 2012, Daniel Megert <daniel_megert@ch.ibm.com>
 * Copyright (C) 2017, Wim Jongman <wim.jongman@remainsoftware.com> and others
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Distribution License v. 1.0 which is available at
 * https://www.eclipse.org/org/documents/edl-v10.php.
 *
 * SPDX-License-Identifier: BSD-3-Clause
 */

package org.eclipse.jgit.lib;

import static org.eclipse.jgit.lib.Constants.LOCK_SUFFIX;
import static java.nio.charset.StandardCharsets.UTF_8;

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URISyntaxException;
import java.text.MessageFormat;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.regex.Pattern;

import org.eclipse.jgit.annotations.NonNull;
import org.eclipse.jgit.annotations.Nullable;
import org.eclipse.jgit.attributes.AttributesNodeProvider;
import org.eclipse.jgit.dircache.DirCache;
import org.eclipse.jgit.errors.AmbiguousObjectException;
import org.eclipse.jgit.errors.CorruptObjectException;
import org.eclipse.jgit.errors.IncorrectObjectTypeException;
import org.eclipse.jgit.errors.MissingObjectException;
import org.eclipse.jgit.errors.NoWorkTreeException;
import org.eclipse.jgit.errors.RevisionSyntaxException;
import org.eclipse.jgit.events.IndexChangedEvent;
import org.eclipse.jgit.events.IndexChangedListener;
import org.eclipse.jgit.events.ListenerList;
import org.eclipse.jgit.events.RepositoryEvent;
import org.eclipse.jgit.internal.JGitText;
import org.eclipse.jgit.revwalk.RevBlob;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevObject;
import org.eclipse.jgit.revwalk.RevTree;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.transport.RefSpec;
import org.eclipse.jgit.transport.RemoteConfig;
import org.eclipse.jgit.treewalk.TreeWalk;
import org.eclipse.jgit.util.FS;
import org.eclipse.jgit.util.FileUtils;
import org.eclipse.jgit.util.IO;
import org.eclipse.jgit.util.RawParseUtils;
import org.eclipse.jgit.util.SystemReader;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Represents a Git repository.
 * <p>
 * A repository holds all objects and refs used for managing source code (could
 * be any type of file, but source code is what SCM's are typically used for).
 * <p>
 * The thread-safety of a {@link org.eclipse.jgit.lib.Repository} very much
 * depends on the concrete implementation. Applications working with a generic
 * {@code Repository} type must not assume the instance is thread-safe.
 * <ul>
 * <li>{@code FileRepository} is thread-safe.
 * <li>{@code DfsRepository} thread-safety is determined by its subclass.
 * </ul>
 */
public abstract class Repository implements AutoCloseable {
    private static final Logger LOG = LoggerFactory.getLogger(Repository.class);

    private static final ListenerList globalListeners = new ListenerList();

    /**
     * Branch names containing slashes should not have a name component that is
     * one of the reserved device names on Windows.
     *
     * @see #normalizeBranchName(String)
     */
    private static final Pattern FORBIDDEN_BRANCH_NAME_COMPONENTS = Pattern
            .compile(
                    "(^|/)(aux|com[1-9]|con|lpt[1-9]|nul|prn)(\\.[^/]*)?", //$NON-NLS-1$
                    Pattern.CASE_INSENSITIVE);

    /**
     * Get the global listener list observing all events in this JVM.
     *
     * @return the global listener list observing all events in this JVM.
     */
    public static ListenerList getGlobalListenerList() {
        return globalListeners;
    }

    /** Use counter */
    final AtomicInteger useCnt = new AtomicInteger(1);

    final AtomicLong closedAt = new AtomicLong();

    /** Metadata directory holding the repository's critical files. */
    private final File gitDir;

    /** File abstraction used to resolve paths. */
    private final FS fs;

    private final ListenerList myListeners = new ListenerList();

    /** If not bare, the top level directory of the working files. */
    private final File workTree;

    /** If not bare, the index file caching the working file states. */
    private final File indexFile;

    private final String initialBranch;

    /**
     * Initialize a new repository instance.
     *
     * @param options
     *            options to configure the repository.
     */
    protected Repository(BaseRepositoryBuilder options) {
        gitDir = options.getGitDir();
        fs = options.getFS();
        workTree = options.getWorkTree();
        indexFile = options.getIndexFile();
        initialBranch = options.getInitialBranch();
    }

    /**
     * Get listeners observing only events on this repository.
     *
     * @return listeners observing only events on this repository.
     */
    @NonNull
    public ListenerList getListenerList() {
        return myListeners;
    }

    /**
     * Fire an event to all registered listeners.
     * <p>
     * The source repository of the event is automatically set to this
     * repository, before the event is delivered to any listeners.
     *
     * @param event
     *            the event to deliver.
     */
    public void fireEvent(RepositoryEvent<?> event) {
        event.setRepository(this);
        myListeners.dispatch(event);
        globalListeners.dispatch(event);
    }

    /**
     * Create a new Git repository.
     * <p>
     * Repository with working tree is created using this method. This method is
     * the same as {@code create(false)}.
     *
     * @throws java.io.IOException
     * @see #create(boolean)
     */
    public void create() throws IOException {
        create(false);
    }

    /**
     * Create a new Git repository initializing the necessary files and
     * directories.
     *
     * @param bare
     *            if true, a bare repository (a repository without a working
     *            directory) is created.
     * @throws java.io.IOException
     *             in case of IO problem
     */
    public abstract void create(boolean bare) throws IOException;

    /**
     * Get local metadata directory
     *
     * @return local metadata directory; {@code null} if repository isn't local.
     */
    /*
     * TODO This method should be annotated as Nullable, because in some
     * specific configurations metadata is not located in the local file system
     * (for example in memory databases). In "usual" repositories this
     * annotation would only cause compiler errors at places where the actual
     * directory can never be null.
     */
    public File getDirectory() {
        return gitDir;
    }

    /**
     * Get repository identifier.
     *
     * @return repository identifier. The returned identifier has to be unique
     *         within a given Git server.
     * @since 5.4
     */
    public abstract String getIdentifier();

    /**
     * Get the object database which stores this repository's data.
     *
     * @return the object database which stores this repository's data.
     */
    @NonNull
    public abstract ObjectDatabase getObjectDatabase();

    /**
     * Create a new inserter to create objects in {@link #getObjectDatabase()}.
     *
     * @return a new inserter to create objects in {@link #getObjectDatabase()}.
     */
    @NonNull
    public ObjectInserter newObjectInserter() {
        return getObjectDatabase().newInserter();
    }

    /**
     * Create a new reader to read objects from {@link #getObjectDatabase()}.
     *
     * @return a new reader to read objects from {@link #getObjectDatabase()}.
     */
    @NonNull
    public ObjectReader newObjectReader() {
        return getObjectDatabase().newReader();
    }

    /**
     * Get the reference database which stores the reference namespace.
     *
     * @return the reference database which stores the reference namespace.
     */
    @NonNull
    public abstract RefDatabase getRefDatabase();

    /**
     * Get the configuration of this repository.
     *
     * @return the configuration of this repository.
     */
    @NonNull
    public abstract StoredConfig getConfig();

    /**
     * Create a new {@link org.eclipse.jgit.attributes.AttributesNodeProvider}.
     *
     * @return a new {@link org.eclipse.jgit.attributes.AttributesNodeProvider}.
     *         This {@link org.eclipse.jgit.attributes.AttributesNodeProvider}
     *         is lazy loaded only once. It means that it will not be updated
     *         after loading. Prefer creating new instance for each use.
     * @since 4.2
     */
    @NonNull
    public abstract AttributesNodeProvider createAttributesNodeProvider();

    /**
     * Get the used file system abstraction.
     *
     * @return the used file system abstraction, or {@code null} if
     *         repository isn't local.
     */
    /*
     * TODO This method should be annotated as Nullable, because in some
     * specific configurations metadata is not located in the local file system
     * (for example in memory databases). In "usual" repositories this
     * annotation would only cause compiler errors at places where the actual
     * directory can never be null.
     */
    public FS getFS() {
        return fs;
    }

    /**
     * Whether the specified object is stored in this repo or any of the known
     * shared repositories.
     *
     * @param objectId
     *            a {@link org.eclipse.jgit.lib.AnyObjectId} object.
     * @return true if the specified object is stored in this repo or any of the
     *         known shared repositories.
     * @deprecated use {@code getObjectDatabase().has(objectId)}
     */
    @Deprecated
    public boolean hasObject(AnyObjectId objectId) {
        try {
            return getObjectDatabase().has(objectId);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Open an object from this repository.
     * <p>
     * This is a one-shot call interface which may be faster than allocating a
     * {@link #newObjectReader()} to perform the lookup.
     *
     * @param objectId
     *            identity of the object to open.
     * @return a {@link org.eclipse.jgit.lib.ObjectLoader} for accessing the
     *         object.
     * @throws org.eclipse.jgit.errors.MissingObjectException
     *             the object does not exist.
     * @throws java.io.IOException
     *             the object store cannot be accessed.
     */
    @NonNull
    public ObjectLoader open(AnyObjectId objectId)
            throws MissingObjectException, IOException {
        return getObjectDatabase().open(objectId);
    }

    /**
     * Open an object from this repository.
     * <p>
     * This is a one-shot call interface which may be faster than allocating a
     * {@link #newObjectReader()} to perform the lookup.
     *
     * @param objectId
     *            identity of the object to open.
     * @param typeHint
     *            hint about the type of object being requested, e.g.
     *            {@link org.eclipse.jgit.lib.Constants#OBJ_BLOB};
     *            {@link org.eclipse.jgit.lib.ObjectReader#OBJ_ANY} if the
     *            object type is not known, or does not matter to the caller.
     * @return a {@link org.eclipse.jgit.lib.ObjectLoader} for accessing the
     *         object.
     * @throws org.eclipse.jgit.errors.MissingObjectException
     *             the object does not exist.
     * @throws org.eclipse.jgit.errors.IncorrectObjectTypeException
     *             typeHint was not OBJ_ANY, and the object's actual type does
     *             not match typeHint.
     * @throws java.io.IOException
     *             the object store cannot be accessed.
     */
    @NonNull
    public ObjectLoader open(AnyObjectId objectId, int typeHint)
            throws MissingObjectException, IncorrectObjectTypeException,
            IOException {
        return getObjectDatabase().open(objectId, typeHint);
    }

    /**
     * Create a command to update, create or delete a ref in this repository.
     *
     * @param ref
     *            name of the ref the caller wants to modify.
     * @return an update command. The caller must finish populating this command
     *         and then invoke one of the update methods to actually make a
     *         change.
     * @throws java.io.IOException
     *             a symbolic ref was passed in and could not be resolved back
     *             to the base ref, as the symbolic ref could not be read.
     */
    @NonNull
    public RefUpdate updateRef(String ref) throws IOException {
        return updateRef(ref, false);
    }

    /**
     * Create a command to update, create or delete a ref in this repository.
     *
     * @param ref
     *            name of the ref the caller wants to modify.
     * @param detach
     *            true to create a detached head
     * @return an update command. The caller must finish populating this command
     *         and then invoke one of the update methods to actually make a
     *         change.
     * @throws java.io.IOException
     *             a symbolic ref was passed in and could not be resolved back
     *             to the base ref, as the symbolic ref could not be read.
     */
    @NonNull
    public RefUpdate updateRef(String ref, boolean detach) throws IOException {
        return getRefDatabase().newUpdate(ref, detach);
    }

    /**
     * Create a command to rename a ref in this repository
     *
     * @param fromRef
     *            name of ref to rename from
     * @param toRef
     *            name of ref to rename to
     * @return an update command that knows how to rename a branch to another.
     * @throws java.io.IOException
     *             the rename could not be performed.
     */
    @NonNull
    public RefRename renameRef(String fromRef, String toRef) throws IOException {
        return getRefDatabase().newRename(fromRef, toRef);
    }

    /**
     * Parse a git revision string and return an object id.
     *
     * Combinations of these operators are supported:
     * <ul>
     * <li><b>HEAD</b>, <b>MERGE_HEAD</b>, <b>FETCH_HEAD</b></li>
     * <li><b>SHA-1</b>: a complete or abbreviated SHA-1</li>
     * <li><b>refs/...</b>: a complete reference name</li>
     * <li><b>short-name</b>: a short reference name under {@code refs/heads},
     * {@code refs/tags}, or {@code refs/remotes} namespace</li>
     * <li><b>tag-NN-gABBREV</b>: output from describe, parsed by treating
     * {@code ABBREV} as an abbreviated SHA-1.</li>
     * <li><i>id</i><b>^</b>: first parent of commit <i>id</i>, this is the same
     * as {@code id^1}</li>
     * <li><i>id</i><b>^0</b>: ensure <i>id</i> is a commit</li>
     * <li><i>id</i><b>^n</b>: n-th parent of commit <i>id</i></li>
     * <li><i>id</i><b>~n</b>: n-th historical ancestor of <i>id</i>, by first
     * parent. {@code id~3} is equivalent to {@code id^1^1^1} or {@code id^^^}.</li>
     * <li><i>id</i><b>:path</b>: Lookup path under tree named by <i>id</i></li>
     * <li><i>id</i><b>^{commit}</b>: ensure <i>id</i> is a commit</li>
     * <li><i>id</i><b>^{tree}</b>: ensure <i>id</i> is a tree</li>
     * <li><i>id</i><b>^{tag}</b>: ensure <i>id</i> is a tag</li>
     * <li><i>id</i><b>^{blob}</b>: ensure <i>id</i> is a blob</li>
     * </ul>
     *
     * <p>
     * The following operators are specified by Git conventions, but are not
     * supported by this method:
     * <ul>
     * <li><b>ref@{n}</b>: n-th version of ref as given by its reflog</li>
     * <li><b>ref@{time}</b>: value of ref at the designated time</li>
     * </ul>
     *
     * @param revstr
     *            A git object references expression
     * @return an ObjectId or {@code null} if revstr can't be resolved to any
     *         ObjectId
     * @throws org.eclipse.jgit.errors.AmbiguousObjectException
     *             {@code revstr} contains an abbreviated ObjectId and this
     *             repository contains more than one object which match to the
     *             input abbreviation.
     * @throws org.eclipse.jgit.errors.IncorrectObjectTypeException
     *             the id parsed does not meet the type required to finish
     *             applying the operators in the expression.
     * @throws org.eclipse.jgit.errors.RevisionSyntaxException
     *             the expression is not supported by this implementation, or
     *             does not meet the standard syntax.
     * @throws java.io.IOException
     *             on serious errors
     */
    @Nullable
    public ObjectId resolve(String revstr)
            throws AmbiguousObjectException, IncorrectObjectTypeException,
            RevisionSyntaxException, IOException {
        try (RevWalk rw = new RevWalk(this)) {
            rw.setRetainBody(false);
            Object resolved = resolve(rw, revstr);
            if (resolved instanceof String) {
                final Ref ref = findRef((String) resolved);
                return ref != null ? ref.getLeaf().getObjectId() : null;
            }
            return (ObjectId) resolved;
        }
    }

    /**
     * Simplify an expression, but unlike {@link #resolve(String)} it will not
     * resolve a branch passed or resulting from the expression, such as @{-}.
     * Thus this method can be used to process an expression to a method that
     * expects a branch or revision id.
     *
     * @param revstr a {@link java.lang.String} object.
     * @return object id or ref name from resolved expression or {@code null} if
     *         given expression cannot be resolved
     * @throws org.eclipse.jgit.errors.AmbiguousObjectException
     * @throws java.io.IOException
     */
    @Nullable
    public String simplify(String revstr)
            throws AmbiguousObjectException, IOException {
        try (RevWalk rw = new RevWalk(this)) {
            rw.setRetainBody(true);
            Object resolved = resolve(rw, revstr);
            if (resolved != null) {
                if (resolved instanceof String) {
                    return (String) resolved;
                }
                return ((AnyObjectId) resolved).getName();
            }
            return null;
        }
    }

    @Nullable
    private Object resolve(RevWalk rw, String revstr)
            throws IOException {
        char[] revChars = revstr.toCharArray();
        RevObject rev = null;
        String name = null;
        int done = 0;
        for (int i = 0; i < revChars.length; ++i) {
            switch (revChars[i]) {
            case '^':
                if (rev == null) {
                    if (name == null)
                        if (done == 0)
                            name = new String(revChars, done, i);
                        else {
                            done = i + 1;
                            break;
                        }
                    rev = parseSimple(rw, name);
                    name = null;
                    if (rev == null)
                        return null;
                }
                if (i + 1 < revChars.length) {
                    switch (revChars[i + 1]) {
                    case '0':
                    case '1':
                    case '2':
                    case '3':
                    case '4':
                    case '5':
                    case '6':
                    case '7':
                    case '8':
                    case '9':
                        int j;
                        rev = rw.parseCommit(rev);
                        for (j = i + 1; j < revChars.length; ++j) {
                            if (!Character.isDigit(revChars[j]))
                                break;
                        }
                        String parentnum = new String(revChars, i + 1, j - i
                                - 1);
                        int pnum;
                        try {
                            pnum = Integer.parseInt(parentnum);
                        } catch (NumberFormatException e) {
                            RevisionSyntaxException rse = new RevisionSyntaxException(
                                    JGitText.get().invalidCommitParentNumber,
                                    revstr);
                            rse.initCause(e);
                            throw rse;
                        }
                        if (pnum != 0) {
                            RevCommit commit = (RevCommit) rev;
                            if (pnum > commit.getParentCount())
                                rev = null;
                            else
                                rev = commit.getParent(pnum - 1);
                        }
                        i = j - 1;
                        done = j;
                        break;
                    case '{':
                        int k;
                        String item = null;
                        for (k = i + 2; k < revChars.length; ++k) {
                            if (revChars[k] == '}') {
                                item = new String(revChars, i + 2, k - i - 2);
                                break;
                            }
                        }
                        i = k;
                        if (item != null)
                            if (item.equals("tree")) { //$NON-NLS-1$
                                rev = rw.parseTree(rev);
                            } else if (item.equals("commit")) { //$NON-NLS-1$
                                rev = rw.parseCommit(rev);
                            } else if (item.equals("blob")) { //$NON-NLS-1$
                                rev = rw.peel(rev);
                                if (!(rev instanceof RevBlob))
                                    throw new IncorrectObjectTypeException(rev,
                                            Constants.TYPE_BLOB);
                            } else if (item.isEmpty()) {
                                rev = rw.peel(rev);
                            } else
                                throw new RevisionSyntaxException(revstr);
                        else
                            throw new RevisionSyntaxException(revstr);
                        done = k;
                        break;
                    default:
                        rev = rw.peel(rev);
                        if (rev instanceof RevCommit) {
                            RevCommit commit = ((RevCommit) rev);
                            if (commit.getParentCount() == 0)
                                rev = null;
                            else
                                rev = commit.getParent(0);
                        } else
                            throw new IncorrectObjectTypeException(rev,
                                    Constants.TYPE_COMMIT);
                    }
                } else {
                    rev = rw.peel(rev);
                    if (rev instanceof RevCommit) {
                        RevCommit commit = ((RevCommit) rev);
                        if (commit.getParentCount() == 0)
                            rev = null;
                        else
                            rev = commit.getParent(0);
                    } else
                        throw new IncorrectObjectTypeException(rev,
                                Constants.TYPE_COMMIT);
                }
                done = i + 1;
                break;
  588. case '~':
  589. if (rev == null) {
  590. if (name == null)
  591. if (done == 0)
  592. name = new String(revChars, done, i);
  593. else {
  594. done = i + 1;
  595. break;
  596. }
  597. rev = parseSimple(rw, name);
  598. name = null;
  599. if (rev == null)
  600. return null;
  601. }
  602. rev = rw.peel(rev);
  603. if (!(rev instanceof RevCommit))
  604. throw new IncorrectObjectTypeException(rev,
  605. Constants.TYPE_COMMIT);
  606. int l;
  607. for (l = i + 1; l < revChars.length; ++l) {
  608. if (!Character.isDigit(revChars[l]))
  609. break;
  610. }
  611. int dist;
  612. if (l - i > 1) {
  613. String distnum = new String(revChars, i + 1, l - i - 1);
  614. try {
  615. dist = Integer.parseInt(distnum);
  616. } catch (NumberFormatException e) {
  617. RevisionSyntaxException rse = new RevisionSyntaxException(
  618. JGitText.get().invalidAncestryLength, revstr);
  619. rse.initCause(e);
  620. throw rse;
  621. }
  622. } else
  623. dist = 1;
  624. while (dist > 0) {
  625. RevCommit commit = (RevCommit) rev;
  626. if (commit.getParentCount() == 0) {
  627. rev = null;
  628. break;
  629. }
  630. commit = commit.getParent(0);
  631. rw.parseHeaders(commit);
  632. rev = commit;
  633. --dist;
  634. }
  635. i = l - 1;
  636. done = l;
  637. break;
  638. case '@':
  639. if (rev != null)
  640. throw new RevisionSyntaxException(revstr);
  641. if (i + 1 == revChars.length)
  642. continue;
  643. if (i + 1 < revChars.length && revChars[i + 1] != '{')
  644. continue;
  645. int m;
  646. String time = null;
  647. for (m = i + 2; m < revChars.length; ++m) {
  648. if (revChars[m] == '}') {
  649. time = new String(revChars, i + 2, m - i - 2);
  650. break;
  651. }
  652. }
  653. if (time != null) {
  654. if (time.equals("upstream")) { //$NON-NLS-1$
  655. if (name == null)
  656. name = new String(revChars, done, i);
  657. if (name.isEmpty())
  658. // Currently checked out branch, HEAD if
  659. // detached
  660. name = Constants.HEAD;
  661. if (!Repository.isValidRefName("x/" + name)) //$NON-NLS-1$
  662. throw new RevisionSyntaxException(MessageFormat
  663. .format(JGitText.get().invalidRefName,
  664. name),
  665. revstr);
  666. Ref ref = findRef(name);
  667. name = null;
  668. if (ref == null)
  669. return null;
  670. if (ref.isSymbolic())
  671. ref = ref.getLeaf();
  672. name = ref.getName();
  673. RemoteConfig remoteConfig;
  674. try {
  675. remoteConfig = new RemoteConfig(getConfig(),
  676. "origin"); //$NON-NLS-1$
  677. } catch (URISyntaxException e) {
  678. RevisionSyntaxException rse = new RevisionSyntaxException(
  679. revstr);
  680. rse.initCause(e);
  681. throw rse;
  682. }
  683. String remoteBranchName = getConfig()
  684. .getString(
  685. ConfigConstants.CONFIG_BRANCH_SECTION,
  686. Repository.shortenRefName(ref.getName()),
  687. ConfigConstants.CONFIG_KEY_MERGE);
  688. List<RefSpec> fetchRefSpecs = remoteConfig
  689. .getFetchRefSpecs();
  690. for (RefSpec refSpec : fetchRefSpecs) {
  691. if (refSpec.matchSource(remoteBranchName)) {
  692. RefSpec expandFromSource = refSpec
  693. .expandFromSource(remoteBranchName);
  694. name = expandFromSource.getDestination();
  695. break;
  696. }
  697. }
  698. if (name == null)
  699. throw new RevisionSyntaxException(revstr);
  700. } else if (time.matches("^-\\d+$")) { //$NON-NLS-1$
  701. if (name != null) {
  702. throw new RevisionSyntaxException(revstr);
  703. }
  704. String previousCheckout = resolveReflogCheckout(
  705. -Integer.parseInt(time));
  706. if (ObjectId.isId(previousCheckout)) {
  707. rev = parseSimple(rw, previousCheckout);
  708. } else {
  709. name = previousCheckout;
  710. }
  711. } else {
  712. if (name == null)
  713. name = new String(revChars, done, i);
  714. if (name.isEmpty())
  715. name = Constants.HEAD;
  716. if (!Repository.isValidRefName("x/" + name)) //$NON-NLS-1$
  717. throw new RevisionSyntaxException(MessageFormat
  718. .format(JGitText.get().invalidRefName,
  719. name),
  720. revstr);
  721. Ref ref = findRef(name);
  722. name = null;
  723. if (ref == null)
  724. return null;
  725. // @{n} means current branch, not HEAD@{1} unless
  726. // detached
  727. if (ref.isSymbolic())
  728. ref = ref.getLeaf();
  729. rev = resolveReflog(rw, ref, time);
  730. }
  731. i = m;
  732. } else
  733. throw new RevisionSyntaxException(revstr);
  734. break;
  735. case ':': {
  736. RevTree tree;
  737. if (rev == null) {
  738. if (name == null)
  739. name = new String(revChars, done, i);
  740. if (name.isEmpty())
  741. name = Constants.HEAD;
  742. rev = parseSimple(rw, name);
  743. name = null;
  744. }
  745. if (rev == null)
  746. return null;
  747. tree = rw.parseTree(rev);
  748. if (i == revChars.length - 1)
  749. return tree.copy();
  750. TreeWalk tw = TreeWalk.forPath(rw.getObjectReader(),
  751. new String(revChars, i + 1, revChars.length - i - 1),
  752. tree);
  753. return tw != null ? tw.getObjectId(0) : null;
  754. }
  755. default:
  756. if (rev != null)
  757. throw new RevisionSyntaxException(revstr);
  758. }
  759. }
  760. if (rev != null)
  761. return rev.copy();
  762. if (name != null)
  763. return name;
  764. if (done == revstr.length())
  765. return null;
  766. name = revstr.substring(done);
  767. if (!Repository.isValidRefName("x/" + name)) //$NON-NLS-1$
  768. throw new RevisionSyntaxException(
  769. MessageFormat.format(JGitText.get().invalidRefName, name),
  770. revstr);
  771. if (findRef(name) != null)
  772. return name;
  773. return resolveSimple(name);
  774. }
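// Illustrative sketch only, not part of JGit: the '^' and '~' cases above
// can be pictured as a walk over a commit graph. The toy graph (a Map from a
// commit name to its parent names) and the names RevSuffixSketch and walk are
// assumptions made for this example. Peeling, "^{type}", "@{...}" and ":path"
// are deliberately left out.

```java
import java.util.List;
import java.util.Map;

final class RevSuffixSketch {
    // Applies a suffix such as "^2~3" to a starting commit in a toy graph.
    // Returns null when the walk runs past the start of history.
    static String walk(Map<String, List<String>> parents, String start,
            String suffix) {
        String cur = start;
        int i = 0;
        while (i < suffix.length() && cur != null) {
            char op = suffix.charAt(i++);
            if (op != '^' && op != '~')
                throw new IllegalArgumentException(suffix);
            // Collect the optional run of digits after the operator.
            int j = i;
            while (j < suffix.length() && Character.isDigit(suffix.charAt(j)))
                j++;
            // A missing number means 1 for both operators.
            int n = j > i ? Integer.parseInt(suffix.substring(i, j)) : 1;
            i = j;
            if (op == '^') {
                // ^0 is the commit itself; ^N is the N-th parent (1-based).
                if (n != 0) {
                    List<String> ps = parents.getOrDefault(cur, List.of());
                    cur = n > ps.size() ? null : ps.get(n - 1);
                }
            } else {
                // ~N follows the first parent N times.
                for (; n > 0 && cur != null; n--) {
                    List<String> ps = parents.getOrDefault(cur, List.of());
                    cur = ps.isEmpty() ? null : ps.get(0);
                }
            }
        }
        return cur;
    }
}
```

// With a merge commit M whose parents are A and B, both children of a root
// R, walk(g, "M", "^2") yields "B" and walk(g, "M", "~2") yields "R".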
    private static boolean isHex(char c) {
        return ('0' <= c && c <= '9') //
                || ('a' <= c && c <= 'f') //
                || ('A' <= c && c <= 'F');
    }

    private static boolean isAllHex(String str, int ptr) {
        while (ptr < str.length()) {
            if (!isHex(str.charAt(ptr++)))
                return false;
        }
        return true;
    }

    @Nullable
    private RevObject parseSimple(RevWalk rw, String revstr) throws IOException {
        ObjectId id = resolveSimple(revstr);
        return id != null ? rw.parseAny(id) : null;
    }

    @Nullable
    private ObjectId resolveSimple(String revstr) throws IOException {
        if (ObjectId.isId(revstr))
            return ObjectId.fromString(revstr);

        if (Repository.isValidRefName("x/" + revstr)) { //$NON-NLS-1$
            Ref r = getRefDatabase().findRef(revstr);
            if (r != null)
                return r.getObjectId();
        }

        if (AbbreviatedObjectId.isId(revstr))
            return resolveAbbreviation(revstr);

        int dashg = revstr.indexOf("-g"); //$NON-NLS-1$
        if ((dashg + 5) < revstr.length() && 0 <= dashg
                && isHex(revstr.charAt(dashg + 2))
                && isHex(revstr.charAt(dashg + 3))
                && isAllHex(revstr, dashg + 4)) {
            // Possibly output from git describe?
            String s = revstr.substring(dashg + 2);
            if (AbbreviatedObjectId.isId(s))
                return resolveAbbreviation(s);
        }

        return null;
    }
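// Illustrative sketch only, not part of JGit: the "-g" branch of
// resolveSimple() above can be read as "extract the abbreviated id from
// something shaped like git describe output". The class and method names
// below are assumptions made for this example; the sketch stops at
// extracting the candidate string and does not look anything up.

```java
final class DescribeSuffixSketch {
    static boolean isHex(char c) {
        return ('0' <= c && c <= '9')
                || ('a' <= c && c <= 'f')
                || ('A' <= c && c <= 'F');
    }

    // Returns the hex suffix following the first "-g", or null if the input
    // does not look like describe output. At least four hex digits must
    // follow "-g", mirroring the (dashg + 5) < length test above.
    static String abbreviatedId(String revstr) {
        int dashg = revstr.indexOf("-g");
        if (dashg < 0 || dashg + 5 >= revstr.length())
            return null;
        for (int p = dashg + 2; p < revstr.length(); p++) {
            if (!isHex(revstr.charAt(p)))
                return null;
        }
        return revstr.substring(dashg + 2);
    }
}
```

// Like the method above, the sketch inspects only the first "-g" in the
// string, so a name such as "my-great-2-gabcdef1" is not recognised.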
    @Nullable
    private String resolveReflogCheckout(int checkoutNo)
            throws IOException {
        ReflogReader reader = getReflogReader(Constants.HEAD);
        if (reader == null) {
            return null;
        }
        List<ReflogEntry> reflogEntries = reader.getReverseEntries();
        for (ReflogEntry entry : reflogEntries) {
            CheckoutEntry checkout = entry.parseCheckout();
            if (checkout != null)
                if (checkoutNo-- == 1)
                    return checkout.getFromBranch();
        }
        return null;
    }

    private RevCommit resolveReflog(RevWalk rw, Ref ref, String time)
            throws IOException {
        int number;
        try {
            number = Integer.parseInt(time);
        } catch (NumberFormatException nfe) {
            RevisionSyntaxException rse = new RevisionSyntaxException(
                    MessageFormat.format(JGitText.get().invalidReflogRevision,
                            time));
            rse.initCause(nfe);
            throw rse;
        }
        assert number >= 0;
        ReflogReader reader = getReflogReader(ref.getName());
        if (reader == null) {
            throw new RevisionSyntaxException(
                    MessageFormat.format(JGitText.get().reflogEntryNotFound,
                            Integer.valueOf(number), ref.getName()));
        }
        ReflogEntry entry = reader.getReverseEntry(number);
        if (entry == null)
            throw new RevisionSyntaxException(MessageFormat.format(
                    JGitText.get().reflogEntryNotFound,
                    Integer.valueOf(number), ref.getName()));

        return rw.parseCommit(entry.getNewId());
    }

    @Nullable
    private ObjectId resolveAbbreviation(String revstr) throws IOException,
            AmbiguousObjectException {
        AbbreviatedObjectId id = AbbreviatedObjectId.fromString(revstr);
        try (ObjectReader reader = newObjectReader()) {
            Collection<ObjectId> matches = reader.resolve(id);
            if (matches.isEmpty())
                return null;
            else if (matches.size() == 1)
                return matches.iterator().next();
            else
                throw new AmbiguousObjectException(id, matches);
        }
    }

    /**
     * Increment the use counter by one, requiring a matched {@link #close()}.
     */
    public void incrementOpen() {
        useCnt.incrementAndGet();
    }

    /**
     * {@inheritDoc}
     * <p>
     * Decrement the use count, and maybe close resources.
     */
    @Override
    public void close() {
        int newCount = useCnt.decrementAndGet();
        if (newCount == 0) {
            if (RepositoryCache.isCached(this)) {
                closedAt.set(System.currentTimeMillis());
            } else {
                doClose();
            }
        } else if (newCount == -1) {
            // should not happen, only log when useCnt became negative to
            // minimize number of log entries
            String message = MessageFormat.format(JGitText.get().corruptUseCnt,
                    toString());
            if (LOG.isDebugEnabled()) {
                LOG.debug(message, new IllegalStateException());
            } else {
                LOG.warn(message);
            }
            if (RepositoryCache.isCached(this)) {
                closedAt.set(System.currentTimeMillis());
            }
        }
    }
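// Illustrative sketch only, not part of JGit: incrementOpen() and close()
// above form a reference-counting pair. The class below is a stand-alone toy
// with hypothetical names; it assumes the counter starts at 1 for the
// creator (the initial value is not shown in this listing) and it ignores
// the RepositoryCache branch entirely.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class UseCountSketch implements AutoCloseable {
    // Assumption: the creator holds the first reference.
    private final AtomicInteger useCnt = new AtomicInteger(1);

    private volatile boolean released;

    // Every additional holder must pair this call with one close().
    void incrementOpen() {
        useCnt.incrementAndGet();
    }

    @Override
    public void close() {
        // Only the close() that drops the count to zero releases resources.
        if (useCnt.decrementAndGet() == 0)
            released = true; // stands in for doClose()
    }

    boolean isReleased() {
        return released;
    }
}
```

// After one incrementOpen(), the first close() leaves the object usable and
// only the second close() releases it.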
    /**
     * Invoked when the use count drops to zero during {@link #close()}.
     * <p>
     * The default implementation closes the object and ref databases.
     */
    protected void doClose() {
        getObjectDatabase().close();
        getRefDatabase().close();
    }

    /** {@inheritDoc} */
    @Override
    @NonNull
    public String toString() {
        String desc;
        File directory = getDirectory();
        if (directory != null)
            desc = directory.getPath();
        else
            desc = getClass().getSimpleName() + "-" //$NON-NLS-1$
                    + System.identityHashCode(this);
        return "Repository[" + desc + "]"; //$NON-NLS-1$ //$NON-NLS-2$
    }

    /**
     * Get the name of the reference that {@code HEAD} points to.
     * <p>
     * This is essentially the same as doing:
     *
     * <pre>
     * return exactRef(Constants.HEAD).getTarget().getName()
     * </pre>
     *
     * Except when HEAD is detached, in which case this method returns the
     * current ObjectId in hexadecimal string format.
     *
     * @return name of current branch (for example {@code refs/heads/master}),
     *         an ObjectId in hex format if the current branch is detached, or
     *         {@code null} if the repository is corrupt and has no HEAD
     *         reference.
     * @throws java.io.IOException
     */
    @Nullable
    public String getFullBranch() throws IOException {
        Ref head = exactRef(Constants.HEAD);
        if (head == null) {
            return null;
        }
        if (head.isSymbolic()) {
            return head.getTarget().getName();
        }
        ObjectId objectId = head.getObjectId();
        if (objectId != null) {
            return objectId.name();
        }
        return null;
    }

    /**
     * Get the short name of the current branch that {@code HEAD} points to.
     * <p>
     * This is essentially the same as {@link #getFullBranch()}, except the
     * leading prefix {@code refs/heads/} is removed from the reference before
     * it is returned to the caller.
     *
     * @return name of current branch (for example {@code master}), an ObjectId
     *         in hex format if the current branch is detached, or {@code null}
     *         if the repository is corrupt and has no HEAD reference.
     * @throws java.io.IOException
     */
    @Nullable
    public String getBranch() throws IOException {
        String name = getFullBranch();
        if (name != null)
            return shortenRefName(name);
        return null;
    }

    /**
     * Get the initial branch name of a new repository
     *
     * @return the initial branch name of a new repository
     * @since 5.11
     */
    protected @NonNull String getInitialBranch() {
        return initialBranch;
    }

    /**
     * Objects known to exist but not expressed by {@link #getAllRefs()}.
     * <p>
     * When a repository borrows objects from another repository, it can
     * advertise that it safely has that other repository's references, without
     * exposing any other details about the other repository. This may help
     * a client trying to push changes avoid pushing more than it needs to.
     *
     * @return unmodifiable collection of other known objects.
     */
    @NonNull
    public Set<ObjectId> getAdditionalHaves() {
        return Collections.emptySet();
    }

    /**
     * Get a ref by name.
     *
     * @param name
     *            the name of the ref to lookup. Must not be a short-hand
     *            form; e.g., "master" is not automatically expanded to
     *            "refs/heads/master".
     * @return the Ref with the given name, or {@code null} if it does not exist
     * @throws java.io.IOException
     * @since 4.2
     */
    @Nullable
    public final Ref exactRef(String name) throws IOException {
        return getRefDatabase().exactRef(name);
    }
    /**
     * Search for a ref by (possibly abbreviated) name.
     *
     * @param name
     *            the name of the ref to lookup. May be a short-hand form, e.g.
     *            "master" which is automatically expanded to
     *            "refs/heads/master" if "refs/heads/master" already exists.
     * @return the Ref with the given name, or {@code null} if it does not exist
     * @throws java.io.IOException
     * @since 4.2
     */
    @Nullable
    public final Ref findRef(String name) throws IOException {
        return getRefDatabase().findRef(name);
    }

    /**
     * Get mutable map of all known refs, including symrefs like HEAD that may
     * not point to any object yet.
     *
     * @return mutable map of all known refs (heads, tags, remotes).
     * @deprecated use {@code getRefDatabase().getRefs()} instead.
     */
    @Deprecated
    @NonNull
    public Map<String, Ref> getAllRefs() {
        try {
            return getRefDatabase().getRefs(RefDatabase.ALL);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Get mutable map of all tags
     *
     * @return mutable map of all tags; key is short tag name ("v1.0") and value
     *         of the entry contains the ref with the full tag name
     *         ("refs/tags/v1.0").
     * @deprecated use {@code getRefDatabase().getRefsByPrefix(R_TAGS)} instead
     */
    @Deprecated
    @NonNull
    public Map<String, Ref> getTags() {
        try {
            return getRefDatabase().getRefs(Constants.R_TAGS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Peel a possibly unpeeled reference to an annotated tag.
     * <p>
     * If the ref cannot be peeled (as it does not refer to an annotated tag)
     * the peeled id stays null, but {@link org.eclipse.jgit.lib.Ref#isPeeled()}
     * will be true.
     *
     * @param ref
     *            The ref to peel
     * @return <code>ref</code> if <code>ref.isPeeled()</code> is true; else a
     *         new Ref object representing the same data as Ref, but isPeeled()
     *         will be true and getPeeledObjectId will contain the peeled object
     *         (or null).
     * @deprecated use {@code getRefDatabase().peel(ref)} instead.
     */
    @Deprecated
    @NonNull
    public Ref peel(Ref ref) {
        try {
            return getRefDatabase().peel(ref);
        } catch (IOException e) {
            // Historical accident; if the reference cannot be peeled due
            // to some sort of repository access problem we treat that the
            // same as if the reference was not an annotated tag.
            return ref;
        }
    }

    /**
     * Get a map with all objects referenced by a peeled ref.
     *
     * @return a map with all objects referenced by a peeled ref.
     */
    @NonNull
    public Map<AnyObjectId, Set<Ref>> getAllRefsByPeeledObjectId() {
        Map<String, Ref> allRefs = getAllRefs();
        Map<AnyObjectId, Set<Ref>> ret = new HashMap<>(allRefs.size());
        for (Ref ref : allRefs.values()) {
            ref = peel(ref);
            AnyObjectId target = ref.getPeeledObjectId();
            if (target == null)
                target = ref.getObjectId();
            // We assume most Sets here are singletons
            Set<Ref> oset = ret.put(target, Collections.singleton(ref));
            if (oset != null) {
                // that was not the case (rare)
                if (oset.size() == 1) {
                    // Was a read-only singleton, we must copy to a new Set
                    oset = new HashSet<>(oset);
                }
                ret.put(target, oset);
                oset.add(ref);
            }
        }
        return ret;
    }
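// Illustrative sketch only, not part of JGit: getAllRefsByPeeledObjectId()
// above stores an immutable singleton set first and upgrades it to a HashSet
// only when a second ref lands on the same object. The stand-alone version
// below shows the same pattern with plain strings; the class and method
// names are assumptions made for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SingletonUpgradeSketch {
    // Groups ref names by the object id they point at.
    static Map<String, Set<String>> group(Map<String, String> refToId) {
        Map<String, Set<String>> ret = new HashMap<>(refToId.size());
        for (Map.Entry<String, String> e : refToId.entrySet()) {
            String ref = e.getKey();
            String id = e.getValue();
            // Optimistically store a cheap read-only singleton.
            Set<String> old = ret.put(id, Collections.singleton(ref));
            if (old != null) {
                // The id was already present: restore the earlier set,
                // copying it first if it is still the read-only singleton.
                if (old.size() == 1)
                    old = new HashSet<>(old);
                ret.put(id, old);
                old.add(ref);
            }
        }
        return ret;
    }
}
```

// The pattern avoids allocating a HashSet for the common case of one ref per
// object, at the cost of a second put() when ids collide.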
    /**
     * Get the index file location or {@code null} if repository isn't local.
     *
     * @return the index file location or {@code null} if repository isn't
     *         local.
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @NonNull
    public File getIndexFile() throws NoWorkTreeException {
        if (isBare())
            throw new NoWorkTreeException();
        return indexFile;
    }

    /**
     * Locate a reference to a commit and immediately parse its content.
     * <p>
     * This method only returns successfully if the commit object exists,
     * is verified to be a commit, and was parsed without error.
     *
     * @param id
     *            name of the commit object.
     * @return reference to the commit object. Never null.
     * @throws org.eclipse.jgit.errors.MissingObjectException
     *             the supplied commit does not exist.
     * @throws org.eclipse.jgit.errors.IncorrectObjectTypeException
     *             the supplied id is not a commit or an annotated tag.
     * @throws java.io.IOException
     *             a pack file or loose object could not be read.
     * @since 4.8
     */
    public RevCommit parseCommit(AnyObjectId id) throws IncorrectObjectTypeException,
            IOException, MissingObjectException {
        if (id instanceof RevCommit && ((RevCommit) id).getRawBuffer() != null) {
            return (RevCommit) id;
        }
        try (RevWalk walk = new RevWalk(this)) {
            return walk.parseCommit(id);
        }
    }

    /**
     * Create a new in-core index representation and read an index from disk.
     * <p>
     * The new index will be read before it is returned to the caller. Read
     * failures are reported as exceptions and therefore prevent the method from
     * returning a partially populated index.
     *
     * @return a cache representing the contents of the specified index file (if
     *         it exists) or an empty cache if the file does not exist.
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     * @throws java.io.IOException
     *             the index file is present but could not be read.
     * @throws org.eclipse.jgit.errors.CorruptObjectException
     *             the index file is using a format or extension that this
     *             library does not support.
     */
    @NonNull
    public DirCache readDirCache() throws NoWorkTreeException,
            CorruptObjectException, IOException {
        return DirCache.read(this);
    }

    /**
     * Create a new in-core index representation, lock it, and read from disk.
     * <p>
     * The new index will be locked and then read before it is returned to the
     * caller. Read failures are reported as exceptions and therefore prevent
     * the method from returning a partially populated index.
     *
     * @return a cache representing the contents of the specified index file (if
     *         it exists) or an empty cache if the file does not exist.
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     * @throws java.io.IOException
     *             the index file is present but could not be read, or the lock
     *             could not be obtained.
     * @throws org.eclipse.jgit.errors.CorruptObjectException
     *             the index file is using a format or extension that this
     *             library does not support.
     */
    @NonNull
    public DirCache lockDirCache() throws NoWorkTreeException,
            CorruptObjectException, IOException {
        // we want DirCache to inform us so that we can inform registered
        // listeners about index changes
        IndexChangedListener l = (IndexChangedEvent event) -> {
            notifyIndexChanged(true);
        };
        return DirCache.lock(this, l);
    }

    /**
     * Get the repository state
     *
     * @return the repository state
     */
    @NonNull
    public RepositoryState getRepositoryState() {
        if (isBare() || getDirectory() == null)
            return RepositoryState.BARE;

        // Pre Git-1.6 logic
        if (new File(getWorkTree(), ".dotest").exists()) //$NON-NLS-1$
            return RepositoryState.REBASING;
        if (new File(getDirectory(), ".dotest-merge").exists()) //$NON-NLS-1$
            return RepositoryState.REBASING_INTERACTIVE;

        // From 1.6 onwards
        if (new File(getDirectory(), "rebase-apply/rebasing").exists()) //$NON-NLS-1$
            return RepositoryState.REBASING_REBASING;
        if (new File(getDirectory(), "rebase-apply/applying").exists()) //$NON-NLS-1$
            return RepositoryState.APPLY;
        if (new File(getDirectory(), "rebase-apply").exists()) //$NON-NLS-1$
            return RepositoryState.REBASING;

        if (new File(getDirectory(), "rebase-merge/interactive").exists()) //$NON-NLS-1$
            return RepositoryState.REBASING_INTERACTIVE;
        if (new File(getDirectory(), "rebase-merge").exists()) //$NON-NLS-1$
            return RepositoryState.REBASING_MERGE;

        // Both versions
        if (new File(getDirectory(), Constants.MERGE_HEAD).exists()) {
            // we are merging - now check whether we have unmerged paths
            try {
                if (!readDirCache().hasUnmergedPaths()) {
                    // no unmerged paths -> return the MERGING_RESOLVED state
                    return RepositoryState.MERGING_RESOLVED;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return RepositoryState.MERGING;
        }

        if (new File(getDirectory(), "BISECT_LOG").exists()) //$NON-NLS-1$
            return RepositoryState.BISECTING;

        if (new File(getDirectory(), Constants.CHERRY_PICK_HEAD).exists()) {
            try {
                if (!readDirCache().hasUnmergedPaths()) {
                    // no unmerged paths
                    return RepositoryState.CHERRY_PICKING_RESOLVED;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }

            return RepositoryState.CHERRY_PICKING;
        }

        if (new File(getDirectory(), Constants.REVERT_HEAD).exists()) {
            try {
                if (!readDirCache().hasUnmergedPaths()) {
                    // no unmerged paths
                    return RepositoryState.REVERTING_RESOLVED;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }

            return RepositoryState.REVERTING;
        }

        return RepositoryState.SAFE;
    }
    /**
     * Check validity of a ref name. It must not contain a character that has
     * a special meaning in a Git object reference expression. Some other
     * dangerous characters are also excluded.
     *
     * For portability reasons '\' is excluded.
     *
     * @param refName a {@link java.lang.String} object.
     * @return true if refName is a valid ref name
     */
    public static boolean isValidRefName(String refName) {
        final int len = refName.length();
        if (len == 0) {
            return false;
        }
        if (refName.endsWith(LOCK_SUFFIX)) {
            return false;
        }

        // Refs may be stored as loose files so invalid paths
        // on the local system must also be invalid refs.
        try {
            SystemReader.getInstance().checkPath(refName);
        } catch (CorruptObjectException e) {
            return false;
        }

        int components = 1;
        char p = '\0';
        for (int i = 0; i < len; i++) {
            final char c = refName.charAt(i);
            if (c <= ' ')
                return false;
            switch (c) {
            case '.':
                switch (p) {
                case '\0': case '/': case '.':
                    return false;
                }
                if (i == len - 1)
                    return false;
                break;
            case '/':
                if (i == 0 || i == len - 1)
                    return false;
                if (p == '/')
                    return false;
                components++;
                break;
            case '{':
                if (p == '@')
                    return false;
                break;
            case '~': case '^': case ':':
            case '?': case '[': case '*':
            case '\\':
            case '\u007F':
                return false;
            }
            p = c;
        }
        return components > 1;
    }
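// Illustrative sketch only, not part of JGit: a stand-alone restatement of
// the character rules in isValidRefName() above, useful for seeing why
// callers in this file validate "x/" + name. It assumes LOCK_SUFFIX is
// ".lock" and it omits the SystemReader.checkPath() platform check, so it
// may accept names that the real method rejects on some systems.

```java
final class RefNameRulesSketch {
    static boolean isValid(String refName) {
        int len = refName.length();
        // Assumption: the lock suffix is ".lock".
        if (len == 0 || refName.endsWith(".lock"))
            return false;
        int components = 1;
        char p = '\0'; // previous character
        for (int i = 0; i < len; i++) {
            char c = refName.charAt(i);
            // No control characters, spaces, or DEL.
            if (c <= ' ' || c == '\u007F')
                return false;
            switch (c) {
            case '.':
                // No leading dot in a component, no "..", no trailing dot.
                if (p == '\0' || p == '/' || p == '.' || i == len - 1)
                    return false;
                break;
            case '/':
                // No leading, trailing, or doubled slash.
                if (i == 0 || i == len - 1 || p == '/')
                    return false;
                components++;
                break;
            case '{':
                // "@{" is reserved for reflog syntax.
                if (p == '@')
                    return false;
                break;
            case '~': case '^': case ':': case '?': case '[': case '*':
            case '\\':
                return false;
            default:
                break;
            }
            p = c;
        }
        // A single-component name such as "master" is rejected.
        return components > 1;
    }
}
```

// Because at least two components are required, "master" alone fails while
// "x/master" passes, which is why short names are checked with an "x/"
// prefix elsewhere in this class.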
    /**
     * Normalizes the passed branch name into a possible valid branch name. The
     * validity of the returned name should be checked by a subsequent call to
     * {@link #isValidRefName(String)}.
     * <p>
     * Future implementations of this method could be more restrictive or more
     * lenient about the validity of specific characters in the returned name.
     * <p>
     * The current implementation returns the trimmed input string if this is
     * already a valid branch name. Otherwise it returns a trimmed string with
     * special characters not allowed by {@link #isValidRefName(String)}
     * replaced by hyphens ('-') and blanks replaced by underscores ('_').
     * Leading and trailing slashes, dots, hyphens, and underscores are removed.
     *
     * @param name
     *            to normalize
     * @return The normalized name or an empty String if it is {@code null} or
     *         empty.
     * @since 4.7
     * @see #isValidRefName(String)
     */
    public static String normalizeBranchName(String name) {
        if (name == null || name.isEmpty()) {
            return ""; //$NON-NLS-1$
        }
        String result = name.trim();
        String fullName = result.startsWith(Constants.R_HEADS) ? result
                : Constants.R_HEADS + result;
        if (isValidRefName(fullName)) {
            return result;
        }

        // All Unicode blanks to underscore
        result = result.replaceAll("(?:\\h|\\v)+", "_"); //$NON-NLS-1$ //$NON-NLS-2$
        StringBuilder b = new StringBuilder();
        char p = '/';
        for (int i = 0, len = result.length(); i < len; i++) {
            char c = result.charAt(i);
            if (c < ' ' || c == 127) {
                continue;
            }
            // Substitute a dash for problematic characters
            switch (c) {
            case '\\':
            case '^':
            case '~':
            case ':':
            case '?':
            case '*':
            case '[':
            case '@':
            case '<':
            case '>':
            case '|':
            case '"':
                c = '-';
                break;
            default:
                break;
            }
            // Collapse multiple slashes, dashes, dots, underscores, and omit
            // dashes, dots, and underscores following a slash.
            switch (c) {
            case '/':
                if (p == '/') {
                    continue;
                }
                p = '/';
                break;
            case '.':
            case '_':
            case '-':
                if (p == '/' || p == '-') {
                    continue;
                }
                p = '-';
                break;
            default:
                p = c;
                break;
            }
            b.append(c);
        }
        // Strip trailing special characters, and avoid the .lock extension
        result = b.toString().replaceFirst("[/_.-]+$", "") //$NON-NLS-1$ //$NON-NLS-2$
                .replaceAll("\\.lock($|/)", "_lock$1"); //$NON-NLS-1$ //$NON-NLS-2$
        return FORBIDDEN_BRANCH_NAME_COMPONENTS.matcher(result)
                .replaceAll("$1+$2$3"); //$NON-NLS-1$
    }
    /**
     * Strip work dir and return normalized repository path.
     *
     * @param workDir
     *            Work dir
     * @param file
     *            File whose path shall be stripped of its workdir
     * @return normalized repository relative path or the empty string if the
     *         file is not relative to the work directory.
     */
    @NonNull
    public static String stripWorkDir(File workDir, File file) {
        final String filePath = file.getPath();
        final String workDirPath = workDir.getPath();

        if (filePath.length() <= workDirPath.length()
                || filePath.charAt(workDirPath.length()) != File.separatorChar
                || !filePath.startsWith(workDirPath)) {
            File absWd = workDir.isAbsolute() ? workDir
                    : workDir.getAbsoluteFile();
            File absFile = file.isAbsolute() ? file : file.getAbsoluteFile();
            if (absWd.equals(workDir) && absFile.equals(file)) {
                return ""; //$NON-NLS-1$
            }
            return stripWorkDir(absWd, absFile);
        }

        String relName = filePath.substring(workDirPath.length() + 1);
        if (File.separatorChar != '/') {
            relName = relName.replace(File.separatorChar, '/');
        }
        return relName;
    }
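// Illustrative sketch only, not part of JGit: the contract of stripWorkDir()
// above (repository-relative path with '/' separators, or "" when the file
// is not below the work directory) can be approximated with java.nio.file.
// This is a different technique from the string comparison used above: it
// normalizes "." and ".." segments, which the method above does not, so
// results can differ for such inputs. All names here are assumptions made
// for this example.

```java
import java.io.File;
import java.nio.file.Path;

final class StripWorkDirSketch {
    static String strip(File workDir, File file) {
        Path wd = workDir.toPath().toAbsolutePath().normalize();
        Path f = file.toPath().toAbsolutePath().normalize();
        // Path.startsWith compares whole path elements, so "/repository"
        // is not treated as being inside "/repo".
        if (f.equals(wd) || !f.startsWith(wd))
            return "";
        // Git paths always use '/', whatever the platform separator is.
        return wd.relativize(f).toString().replace(File.separatorChar, '/');
    }
}
```

// For a work directory /repo, the file /repo/src/A.java maps to
// "src/A.java", while /other/A.java and /repo itself both map to "".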
    /**
     * Whether this repository is bare
     *
     * @return true if this is bare, which implies it has no working directory.
     */
    public boolean isBare() {
        return workTree == null;
    }

    /**
     * Get the root directory of the working tree, where files are checked out
     * for viewing and editing.
     *
     * @return the root directory of the working tree, where files are checked
     *         out for viewing and editing.
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @NonNull
    public File getWorkTree() throws NoWorkTreeException {
        if (isBare())
            throw new NoWorkTreeException();
        return workTree;
    }

    /**
     * Force a scan for changed refs. Fires an IndexChangedEvent(false) if
     * changes are detected.
     *
     * @throws java.io.IOException
     */
    public abstract void scanForRepoChanges() throws IOException;

    /**
     * Notify that the index changed by firing an IndexChangedEvent.
     *
     * @param internal
     *            {@code true} if the index was changed by the same
     *            JGit process
     * @since 5.0
     */
    public abstract void notifyIndexChanged(boolean internal);

    /**
     * Get a shortened, more user-friendly ref name
     *
     * @param refName
     *            a {@link java.lang.String} object.
     * @return a more user-friendly ref name
     */
    @NonNull
    public static String shortenRefName(String refName) {
        if (refName.startsWith(Constants.R_HEADS))
            return refName.substring(Constants.R_HEADS.length());
        if (refName.startsWith(Constants.R_TAGS))
            return refName.substring(Constants.R_TAGS.length());
        if (refName.startsWith(Constants.R_REMOTES))
            return refName.substring(Constants.R_REMOTES.length());
        return refName;
    }

    /**
     * Get a shortened, more user-friendly remote tracking branch name
     *
     * @param refName
     *            a {@link java.lang.String} object.
     * @return the remote branch name part of <code>refName</code>, i.e. without
     *         the <code>refs/remotes/&lt;remote&gt;/</code> prefix, if
     *         <code>refName</code> represents a remote tracking branch;
     *         otherwise {@code null}.
     * @since 3.4
     */
    @Nullable
    public String shortenRemoteBranchName(String refName) {
        for (String remote : getRemoteNames()) {
            String remotePrefix = Constants.R_REMOTES + remote + "/"; //$NON-NLS-1$
            if (refName.startsWith(remotePrefix))
                return refName.substring(remotePrefix.length());
        }
        return null;
    }

    /**
     * Get remote name
     *
     * @param refName
     *            a {@link java.lang.String} object.
     * @return the remote name part of <code>refName</code>, i.e. the
     *         <code>&lt;remote&gt;</code> in
     *         <code>refs/remotes/&lt;remote&gt;/...</code>, if
     *         <code>refName</code> represents a remote tracking branch;
     *         otherwise {@code null}.
     * @since 3.4
     */
    @Nullable
    public String getRemoteName(String refName) {
        for (String remote : getRemoteNames()) {
            String remotePrefix = Constants.R_REMOTES + remote + "/"; //$NON-NLS-1$
            if (refName.startsWith(remotePrefix))
                return remote;
        }
        return null;
    }

    /**
     * Read the {@code GIT_DIR/description} file for gitweb.
     *
     * @return description text; null if no description has been configured.
     * @throws java.io.IOException
     *             description cannot be accessed.
     * @since 4.6
     */
    @Nullable
    public String getGitwebDescription() throws IOException {
        return null;
    }

    /**
     * Set the {@code GIT_DIR/description} file for gitweb.
     *
     * @param description
     *            new description; null to clear the description.
     * @throws java.io.IOException
     *             description cannot be persisted.
     * @since 4.6
     */
    public void setGitwebDescription(@Nullable String description)
            throws IOException {
        throw new IOException(JGitText.get().unsupportedRepositoryDescription);
    }

    /**
     * Get the reflog reader
     *
     * @param refName
     *            a {@link java.lang.String} object.
     * @return a {@link org.eclipse.jgit.lib.ReflogReader} for the supplied
     *         refname, or {@code null} if the named ref does not exist.
     * @throws java.io.IOException
     *             the ref could not be accessed.
     * @since 3.0
     */
    @Nullable
    public abstract ReflogReader getReflogReader(String refName)
            throws IOException;

    /**
     * Return the information stored in the file $GIT_DIR/MERGE_MSG. Operations
     * that trigger a merge store a template for the merge commit message in
     * this file.
     *
     * @return a String containing the content of the MERGE_MSG file or
     *         {@code null} if this file doesn't exist
     * @throws java.io.IOException
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @Nullable
    public String readMergeCommitMsg() throws IOException, NoWorkTreeException {
        return readCommitMsgFile(Constants.MERGE_MSG);
    }

    /**
     * Write new content to the file $GIT_DIR/MERGE_MSG. Operations that
     * trigger a merge store a template for the merge commit message in this
     * file. If <code>null</code> is specified as the message, the file will be
     * deleted.
     *
     * @param msg
     *            the message which should be written or <code>null</code> to
     *            delete the file
     * @throws java.io.IOException
     */
    public void writeMergeCommitMsg(String msg) throws IOException {
        File mergeMsgFile = new File(gitDir, Constants.MERGE_MSG);
        writeCommitMsg(mergeMsgFile, msg);
    }

    /**
     * Return the information stored in the file $GIT_DIR/COMMIT_EDITMSG. Hooks
     * triggered by an operation may read or modify the current commit message
     * in this file.
     *
     * @return a String containing the content of the COMMIT_EDITMSG file or
     *         {@code null} if this file doesn't exist
     * @throws java.io.IOException
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     * @since 4.0
     */
    @Nullable
    public String readCommitEditMsg() throws IOException, NoWorkTreeException {
        return readCommitMsgFile(Constants.COMMIT_EDITMSG);
    }

    /**
     * Write new content to the file $GIT_DIR/COMMIT_EDITMSG. Hooks triggered
     * by an operation may read or modify the current commit message in this
     * file. If {@code null} is specified as the message, the file will be
     * deleted.
     *
     * @param msg
     *            the message which should be written or {@code null} to delete
     *            the file
     * @throws java.io.IOException
     * @since 4.0
     */
    public void writeCommitEditMsg(String msg) throws IOException {
        File commitEditMsgFile = new File(gitDir, Constants.COMMIT_EDITMSG);
        writeCommitMsg(commitEditMsgFile, msg);
    }

    /**
     * Return the information stored in the file $GIT_DIR/MERGE_HEAD. Operations
     * that trigger a merge store in this file the IDs of all heads which
     * should be merged together with HEAD.
     *
     * @return a list of the commits whose IDs are listed in the MERGE_HEAD
     *         file, or {@code null} if this file doesn't exist or is empty
     * @throws java.io.IOException
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @Nullable
    public List<ObjectId> readMergeHeads() throws IOException, NoWorkTreeException {
        if (isBare() || getDirectory() == null)
            throw new NoWorkTreeException();

        byte[] raw = readGitDirectoryFile(Constants.MERGE_HEAD);
        if (raw == null)
            return null;

        LinkedList<ObjectId> heads = new LinkedList<>();
        for (int p = 0; p < raw.length;) {
            heads.add(ObjectId.fromString(raw, p));
            p = RawParseUtils
                    .nextLF(raw, p + Constants.OBJECT_ID_STRING_LENGTH);
        }
        return heads;
    }

    /**
     * Write new merge-heads into $GIT_DIR/MERGE_HEAD. Operations that trigger
     * a merge store in this file the IDs of all heads which should be merged
     * together with HEAD. If <code>null</code> is specified as the list of
     * commits, the file will be deleted.
     *
     * @param heads
     *            a list of commits whose IDs should be written to
     *            $GIT_DIR/MERGE_HEAD or <code>null</code> to delete the file
     * @throws java.io.IOException
     */
    public void writeMergeHeads(List<? extends ObjectId> heads) throws IOException {
        writeHeadsFile(heads, Constants.MERGE_HEAD);
    }

    /**
     * Return the information stored in the file $GIT_DIR/CHERRY_PICK_HEAD.
     *
     * @return object id from CHERRY_PICK_HEAD file or {@code null} if this file
     *         doesn't exist or is empty
     * @throws java.io.IOException
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @Nullable
    public ObjectId readCherryPickHead() throws IOException,
            NoWorkTreeException {
        if (isBare() || getDirectory() == null)
            throw new NoWorkTreeException();

        byte[] raw = readGitDirectoryFile(Constants.CHERRY_PICK_HEAD);
        if (raw == null)
            return null;

        return ObjectId.fromString(raw, 0);
    }

    /**
     * Return the information stored in the file $GIT_DIR/REVERT_HEAD.
     *
     * @return object id from REVERT_HEAD file or {@code null} if this file
     *         doesn't exist or is empty
     * @throws java.io.IOException
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @Nullable
    public ObjectId readRevertHead() throws IOException, NoWorkTreeException {
        if (isBare() || getDirectory() == null)
            throw new NoWorkTreeException();

        byte[] raw = readGitDirectoryFile(Constants.REVERT_HEAD);
        if (raw == null)
            return null;
        return ObjectId.fromString(raw, 0);
    }

    /**
     * Write cherry pick commit into $GIT_DIR/CHERRY_PICK_HEAD. This is used in
     * case of conflicts to store the commit whose cherry-pick was attempted.
     *
     * @param head
     *            an object id of the cherry-picked commit or <code>null</code>
     *            to delete the file
     * @throws java.io.IOException
     */
    public void writeCherryPickHead(ObjectId head) throws IOException {
        List<ObjectId> heads = (head != null) ? Collections.singletonList(head)
                : null;
        writeHeadsFile(heads, Constants.CHERRY_PICK_HEAD);
    }

    /**
     * Write revert commit into $GIT_DIR/REVERT_HEAD. This is used in case of
     * conflicts to store the commit whose revert was attempted.
     *
     * @param head
     *            an object id of the revert commit or <code>null</code> to
     *            delete the file
     * @throws java.io.IOException
     */
    public void writeRevertHead(ObjectId head) throws IOException {
        List<ObjectId> heads = (head != null) ? Collections.singletonList(head)
                : null;
        writeHeadsFile(heads, Constants.REVERT_HEAD);
    }

    /**
     * Write original HEAD commit into $GIT_DIR/ORIG_HEAD.
     *
     * @param head
     *            an object id of the original HEAD commit or <code>null</code>
     *            to delete the file
     * @throws java.io.IOException
     */
    public void writeOrigHead(ObjectId head) throws IOException {
        List<ObjectId> heads = head != null ? Collections.singletonList(head)
                : null;
        writeHeadsFile(heads, Constants.ORIG_HEAD);
    }

    /**
     * Return the information stored in the file $GIT_DIR/ORIG_HEAD.
     *
     * @return object id from ORIG_HEAD file or {@code null} if this file
     *         doesn't exist or is empty
     * @throws java.io.IOException
     * @throws org.eclipse.jgit.errors.NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @Nullable
    public ObjectId readOrigHead() throws IOException, NoWorkTreeException {
        if (isBare() || getDirectory() == null)
            throw new NoWorkTreeException();

        byte[] raw = readGitDirectoryFile(Constants.ORIG_HEAD);
        return raw != null ? ObjectId.fromString(raw, 0) : null;
    }

    /**
     * Return the information stored in the file $GIT_DIR/SQUASH_MSG. Operations
     * that trigger a squashed merge store a template for the commit message of
     * the squash commit in this file.
     *
     * @return a String containing the content of the SQUASH_MSG file or
     *         {@code null} if this file doesn't exist
     * @throws java.io.IOException
     * @throws NoWorkTreeException
     *             if this is bare, which implies it has no working directory.
     *             See {@link #isBare()}.
     */
    @Nullable
    public String readSquashCommitMsg() throws IOException {
        return readCommitMsgFile(Constants.SQUASH_MSG);
    }

    /**
     * Write new content to the file $GIT_DIR/SQUASH_MSG. Operations that
     * trigger a squashed merge store a template for the commit message of the
     * squash commit in this file. If <code>null</code> is specified as the
     * message, the file will be deleted.
     *
     * @param msg
     *            the message which should be written or <code>null</code> to
     *            delete the file
     * @throws java.io.IOException
     */
    public void writeSquashCommitMsg(String msg) throws IOException {
        File squashMsgFile = new File(gitDir, Constants.SQUASH_MSG);
        writeCommitMsg(squashMsgFile, msg);
    }

    @Nullable
    private String readCommitMsgFile(String msgFilename) throws IOException {
        if (isBare() || getDirectory() == null)
            throw new NoWorkTreeException();

        File mergeMsgFile = new File(getDirectory(), msgFilename);
        try {
            return RawParseUtils.decode(IO.readFully(mergeMsgFile));
        } catch (FileNotFoundException e) {
            if (mergeMsgFile.exists()) {
                throw e;
            }
            // The file has disappeared in the meantime; ignore it.
            return null;
        }
    }

    private void writeCommitMsg(File msgFile, String msg) throws IOException {
        if (msg != null) {
            try (FileOutputStream fos = new FileOutputStream(msgFile)) {
                fos.write(msg.getBytes(UTF_8));
            }
        } else {
            FileUtils.delete(msgFile, FileUtils.SKIP_MISSING);
        }
    }

    /**
     * Read a file from the git directory.
     *
     * @param filename
     * @return the raw contents or {@code null} if the file doesn't exist or is
     *         empty
     * @throws IOException
     */
    private byte[] readGitDirectoryFile(String filename) throws IOException {
        File file = new File(getDirectory(), filename);
        try {
            byte[] raw = IO.readFully(file);
            return raw.length > 0 ? raw : null;
        } catch (FileNotFoundException notFound) {
            if (file.exists()) {
                throw notFound;
            }
            return null;
        }
    }

    /**
     * Write the given heads to a file in the git directory.
     *
     * @param heads
     *            a list of object ids to write or null if the file should be
     *            deleted.
     * @param filename
     * @throws FileNotFoundException
     * @throws IOException
     */
    private void writeHeadsFile(List<? extends ObjectId> heads, String filename)
            throws FileNotFoundException, IOException {
        File headsFile = new File(getDirectory(), filename);
        if (heads != null) {
            try (OutputStream bos = new BufferedOutputStream(
                    new FileOutputStream(headsFile))) {
                for (ObjectId id : heads) {
                    id.copyTo(bos);
                    bos.write('\n');
                }
            }
        } else {
            FileUtils.delete(headsFile, FileUtils.SKIP_MISSING);
        }
    }

    /**
     * Read a file formatted like the git-rebase-todo file. The "done" file is
     * also formatted like the git-rebase-todo file. These files can be found in
     * the .git/rebase-merge/ or .git/rebase-apply/ folders.
     *
     * @param path
     *            path to the file relative to the repository's git-dir. E.g.
     *            "rebase-merge/git-rebase-todo" or "rebase-apply/done"
     * @param includeComments
     *            <code>true</code> if comments should also be reported
     * @return the list of steps
     * @throws java.io.IOException
     * @since 3.2
     */
    @NonNull
    public List<RebaseTodoLine> readRebaseTodo(String path,
            boolean includeComments)
            throws IOException {
        return new RebaseTodoFile(this).readRebaseTodo(path, includeComments);
    }

    /**
     * Write a file formatted like a git-rebase-todo file.
     *
     * @param path
     *            path to the file relative to the repository's git-dir. E.g.
     *            "rebase-merge/git-rebase-todo" or "rebase-apply/done"
     * @param steps
     *            the steps to be written
     * @param append
     *            whether to append to an existing file or to write a new file
     * @throws java.io.IOException
     * @since 3.2
     */
    public void writeRebaseTodoFile(String path, List<RebaseTodoLine> steps,
            boolean append)
            throws IOException {
        new RebaseTodoFile(this).writeRebaseTodoFile(path, steps, append);
    }

    /**
     * Get the names of all known remotes
     *
     * @return the names of all known remotes
     * @since 3.4
     */
    @NonNull
    public Set<String> getRemoteNames() {
        return getConfig()
                .getSubsections(ConfigConstants.CONFIG_REMOTE_SECTION);
    }

    /**
     * Check whether any housekeeping is required; if yes, run garbage
     * collection; if not, exit without performing any work. Some JGit commands
     * run autoGC after performing operations that could create many loose
     * objects.
     * <p>
     * Currently this option is supported for repositories of type
     * {@code FileRepository} only. See
     * {@link org.eclipse.jgit.internal.storage.file.GC#setAuto(boolean)} for
     * configuration details.
     *
     * @param monitor
     *            to report progress
     * @since 4.6
     */
    public void autoGC(ProgressMonitor monitor) {
        // default does nothing
    }
}
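
The ref-name helpers above (shortenRefName, shortenRemoteBranchName and getRemoteName) all work by prefix matching. The following is a minimal stand-alone sketch of that logic, not JGit code: the class name RefNameSketch is hypothetical, the prefix strings are assumed to match Constants.R_HEADS, R_TAGS and R_REMOTES, and the list of remotes is passed in explicitly instead of being read from the repository configuration via getRemoteNames().

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-alone sketch of the prefix matching; not part of JGit. */
final class RefNameSketch {
    // Assumed to equal Constants.R_HEADS, R_TAGS and R_REMOTES.
    static final String R_HEADS = "refs/heads/";
    static final String R_TAGS = "refs/tags/";
    static final String R_REMOTES = "refs/remotes/";

    /** Strips the first matching well-known prefix, else returns the input. */
    static String shortenRefName(String refName) {
        for (String prefix : Arrays.asList(R_HEADS, R_TAGS, R_REMOTES)) {
            if (refName.startsWith(prefix)) {
                return refName.substring(prefix.length());
            }
        }
        return refName;
    }

    /** Returns the branch part after refs/remotes/<remote>/, or null. */
    static String shortenRemoteBranchName(List<String> remotes, String refName) {
        for (String remote : remotes) {
            String remotePrefix = R_REMOTES + remote + "/";
            if (refName.startsWith(remotePrefix)) {
                return refName.substring(remotePrefix.length());
            }
        }
        return null;
    }

    /** Returns the <remote> part of refs/remotes/<remote>/..., or null. */
    static String getRemoteName(List<String> remotes, String refName) {
        for (String remote : remotes) {
            String remotePrefix = R_REMOTES + remote + "/";
            if (refName.startsWith(remotePrefix)) {
                return remote;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The remote names here are illustrative only.
        List<String> remotes = Arrays.asList("origin", "upstream");
        String ref = "refs/remotes/origin/feature/x";
        System.out.println(shortenRefName("refs/heads/master")); // master
        System.out.println(shortenRefName(ref)); // origin/feature/x
        System.out.println(shortenRemoteBranchName(remotes, ref)); // feature/x
        System.out.println(getRemoteName(remotes, ref)); // origin
    }
}
```

Note that shortenRefName keeps the remote name in the result for remote tracking branches, while shortenRemoteBranchName drops it, and that the latter two return null for any ref outside a known remote.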