Repository.java 38KB

Rewrite reference handling to be abstract and accurate

This commit makes three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break up into smaller units.

Disambiguate symbolic references:
---------------------------------

Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications I have written.

Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object whose type the application can test and whose target it can examine.

With this change, Ref is now an abstract type with different subclasses for the different types.

In the classic example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return:

  Map<String, Ref> all = repository.getAllRefs();
  SymbolicRef HEAD = (SymbolicRef) all.get("HEAD");
  ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master");

  assertSame(master, HEAD.getTarget());
  assertSame(master.getObjectId(), HEAD.getObjectId());

  assertEquals("HEAD", HEAD.getName());
  assertEquals("refs/heads/master", master.getName());

A nice side effect of this change is that the storage type of the symbolic reference can no longer be confused with the storage type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true:

  assertSame(Ref.Storage.LOOSE, HEAD.getStorage());
  assertSame(Ref.Storage.PACKED, master.getStorage());

(Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk.)
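The type split described above can be sketched with a few simplified stand-in classes. This is only an illustration under stated assumptions: the classes below are hypothetical models that borrow the names Ref, SymbolicRef, ObjectIdRef and Storage from the commit message, they are not JGit's real implementation, and the object id "0123abc" is a made-up placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified stand-ins for illustration only; not JGit's real classes. */
public class RefSketch {
    /** Where a reference is stored on disk (reduced to two cases here). */
    enum Storage { LOOSE, PACKED }

    /** Abstract base type: every reference has a name and its own storage. */
    abstract static class Ref {
        private final String name;
        private final Storage storage;

        Ref(String name, Storage storage) {
            this.name = name;
            this.storage = storage;
        }

        String getName() { return name; }
        Storage getStorage() { return storage; }

        abstract boolean isSymbolic();
        abstract String getObjectId();
    }

    /** A normal reference that directly holds an object id. */
    static final class ObjectIdRef extends Ref {
        private final String objectId;

        ObjectIdRef(String name, Storage storage, String objectId) {
            super(name, storage);
            this.objectId = objectId;
        }

        @Override boolean isSymbolic() { return false; }
        @Override String getObjectId() { return objectId; }
    }

    /** A symbolic reference: it keeps its own name and storage and points at a target. */
    static final class SymbolicRef extends Ref {
        private final Ref target;

        SymbolicRef(String name, Storage storage, Ref target) {
            super(name, storage);
            this.target = target;
        }

        @Override boolean isSymbolic() { return true; }
        Ref getTarget() { return target; }

        /** The id is resolved through the target each time it is requested. */
        @Override String getObjectId() { return target.getObjectId(); }
    }

    /** Builds the HEAD -> refs/heads/master example with a placeholder id. */
    static Map<String, Ref> sampleRefs() {
        ObjectIdRef master = new ObjectIdRef("refs/heads/master", Storage.PACKED, "0123abc");
        SymbolicRef head = new SymbolicRef("HEAD", Storage.LOOSE, master);
        Map<String, Ref> all = new LinkedHashMap<>();
        all.put(head.getName(), head);
        all.put(master.getName(), master);
        return all;
    }

    public static void main(String[] args) {
        for (Ref ref : sampleRefs().values()) {
            System.out.println(ref.getName() + " symbolic=" + ref.isSymbolic()
                    + " storage=" + ref.getStorage() + " id=" + ref.getObjectId());
        }
    }
}
```

Because the symbolic reference carries its own storage field, HEAD can report LOOSE while its target reports PACKED, which is the distinction the assertions above rely on.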
Another nice side effect of this change is that all intermediate symbolic references are preserved, and are therefore visible to the application when it walks the target chain. We can now correctly inspect chains of symbolic references.

As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparison between properties.

Abstract the RefDatabase storage:
---------------------------------

RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes.

Optimize RefDirectory:
----------------------

The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible.

The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers.

The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information.

To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improve access on repositories with a large number of packed references.
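The idea of a sorted list with a copy-on-write contract can be illustrated roughly as follows. This is a minimal sketch, not the RefList class from the commit: the class name SortedRefList, its put and get methods, and the use of plain strings for names and ids are all assumptions made for the example.

```java
import java.util.Arrays;

/** Illustrative copy-on-write sorted list; a hypothetical model, not JGit's RefList. */
public class CowRefListSketch {
    /** Immutable list of (name, id) pairs kept sorted by name. */
    static final class SortedRefList {
        static final SortedRefList EMPTY = new SortedRefList(new String[0], new String[0]);

        private final String[] names;
        private final String[] ids;

        private SortedRefList(String[] names, String[] ids) {
            this.names = names;
            this.ids = ids;
        }

        int size() { return names.length; }

        /** Binary search by name; returns null when the name is absent. */
        String get(String name) {
            int i = Arrays.binarySearch(names, name);
            return i >= 0 ? ids[i] : null;
        }

        /** Returns a new list containing the entry; this list is never modified. */
        SortedRefList put(String name, String id) {
            int i = Arrays.binarySearch(names, name);
            if (i >= 0) {
                // Existing name: copy only the id array and replace one slot.
                String[] newIds = ids.clone();
                newIds[i] = id;
                return new SortedRefList(names, newIds);
            }
            // New name: copy both arrays, leaving a gap at the insertion point.
            int at = -(i + 1);
            String[] n = new String[names.length + 1];
            String[] v = new String[ids.length + 1];
            System.arraycopy(names, 0, n, 0, at);
            System.arraycopy(ids, 0, v, 0, at);
            n[at] = name;
            v[at] = id;
            System.arraycopy(names, at, n, at + 1, names.length - at);
            System.arraycopy(ids, at, v, at + 1, ids.length - at);
            return new SortedRefList(n, v);
        }
    }

    public static void main(String[] args) {
        SortedRefList snapshot = SortedRefList.EMPTY
                .put("refs/heads/master", "id-1")
                .put("refs/heads/dev", "id-2");
        // A later update produces a new list; the reader's snapshot is unaffected.
        SortedRefList updated = snapshot.put("refs/heads/master", "id-3");
        System.out.println(snapshot.get("refs/heads/master"));
        System.out.println(updated.get("refs/heads/master"));
    }
}
```

Readers holding an older list can keep iterating over it without locks, since an update only ever publishes a fresh list; the cost of copying is paid by the writer alone.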
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs).

Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high.

Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan.

Writing to $GIT_DIR/packed-refs during reference deletion is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file.

The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsibility of the database implementation, and not all implementations will use java.io for access. Future work remains to abstract the ReflogReader class away from local disk IO.

Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
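The merge-join traversal mentioned in the commit message can be sketched as a single pass over two name-sorted lists. This is an illustrative sketch only: the Entry class and merge method are hypothetical names, and the rule that a loose entry wins over a packed entry of the same name reflects Git's general precedence of loose refs, not code taken from this commit.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative merge-join over two name-sorted reference lists (hypothetical model). */
public class MergeJoinSketch {
    /** Minimal (name, id) pair used only for this example. */
    static final class Entry {
        final String name;
        final String id;

        Entry(String name, String id) {
            this.name = name;
            this.id = id;
        }
    }

    /**
     * Merges two lists that are each sorted by name. Every element is visited
     * once, so the cost is O(packed.size() + loose.size()). When both lists
     * contain the same name, the loose entry is emitted and the packed one skipped.
     */
    static List<Entry> merge(List<Entry> packed, List<Entry> loose) {
        List<Entry> out = new ArrayList<>(packed.size() + loose.size());
        int p = 0;
        int l = 0;
        while (p < packed.size() && l < loose.size()) {
            int cmp = packed.get(p).name.compareTo(loose.get(l).name);
            if (cmp < 0) {
                out.add(packed.get(p++));
            } else if (cmp > 0) {
                out.add(loose.get(l++));
            } else {
                out.add(loose.get(l++)); // same name: loose takes precedence
                p++;
            }
        }
        while (p < packed.size()) {
            out.add(packed.get(p++));
        }
        while (l < loose.size()) {
            out.add(loose.get(l++));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> packed = new ArrayList<>();
        packed.add(new Entry("refs/heads/a", "1"));
        packed.add(new Entry("refs/heads/b", "2"));
        List<Entry> loose = new ArrayList<>();
        loose.add(new Entry("refs/heads/b", "9"));
        for (Entry e : merge(packed, loose)) {
            System.out.println(e.name + " " + e.id);
        }
    }
}
```

Because both inputs are already sorted, no hashing or re-sorting is needed during traversal; the only sorting cost is the one paid when the loose directory is scanned.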
14 years ago
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
Rewrite reference handling to be abstract and accurate This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units. Disambiguate symbolic references: --------------------------------- Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself. Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object that the application can test the type and examine the target of. With this change, Ref is now an abstract type with different subclasses for the different types. In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return: Map<String, Ref> all = repository.getAllRefs(); SymbolicRef HEAD = (SymbolicRef) all.get("HEAD"); ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master"); assertSame(master, HEAD.getTarget()); assertSame(master.getObjectId(), HEAD.getObjectId()); assertEquals("HEAD", HEAD.getName()); assertEquals("refs/heads/master", master.getName()); A nice side-effect of this change is the storage type of the symbolic reference is no longer ambiguous with the storge type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true: assertSame(Ref.Storage.LOOSE, HEAD.getStorage()); assertSame(Ref.Storage.PACKED, master.getStorage()); (Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk). 
Another nice side-effect of this change is all intermediate symbolic references are preserved, and are therefore visible to the application when they walk the target chain. We can now correctly inspect chains of symbolic references. As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing for isSymbolic() and not by using an arcane string comparsion between properties. Abstract the RefDatabase storage: --------------------------------- RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes. Optimize RefDirectory: ---------------------- The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible. The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers. The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information. To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references. 
Iterator traversals of the returned Map<String,Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs). Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high. Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan. Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file. The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsiblity of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO. Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
/*
 * Copyright (C) 2007, Dave Watson <dwatson@mimvista.com>
 * Copyright (C) 2008-2010, Google Inc.
 * Copyright (C) 2006-2010, Robin Rosenberg <robin.rosenberg@dewire.com>
 * Copyright (C) 2006-2008, Shawn O. Pearce <spearce@spearce.org>
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

package org.eclipse.jgit.lib;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.Vector;
import java.util.concurrent.atomic.AtomicInteger;

import org.eclipse.jgit.dircache.DirCache;
import org.eclipse.jgit.errors.ConfigInvalidException;
import org.eclipse.jgit.errors.IncorrectObjectTypeException;
import org.eclipse.jgit.errors.RevisionSyntaxException;
import org.eclipse.jgit.util.FS;
import org.eclipse.jgit.util.SystemReader;

/**
 * Represents a Git repository. A repository holds all objects and refs used for
 * managing source code (could be any type of file, but source code is what
 * SCM's are typically used for).
 *
 * In Git terms all data is stored in GIT_DIR, typically a directory called
 * .git. A work tree is maintained unless the repository is a bare repository.
 * Typically the .git directory is located at the root of the work dir.
 *
 * <ul>
 * <li>GIT_DIR
 * 	<ul>
 * 		<li>objects/ - objects</li>
 * 		<li>refs/ - tags and heads</li>
 * 		<li>config - configuration</li>
 * 		<li>info/ - more configurations</li>
 * 	</ul>
 * </li>
 * </ul>
 * <p>
 * This class is thread-safe.
 * <p>
 * This implementation only handles a subtly undocumented subset of git features.
 *
 */
public class Repository {
	private final AtomicInteger useCnt = new AtomicInteger(1);

	private final File gitDir;

	private final FileBasedConfig userConfig;

	private final RepositoryConfig config;

	private final RefDatabase refs;

	private final ObjectDirectory objectDatabase;

	private GitIndex index;

	private final List<RepositoryListener> listeners = new Vector<RepositoryListener>(); // thread safe

	static private final List<RepositoryListener> allListeners = new Vector<RepositoryListener>(); // thread safe

	private File workDir;

	private File indexFile;

	/**
	 * Construct a representation of a Git repository.
	 *
	 * The work tree, object directory, alternate object directories and index
	 * file locations are deduced from the given git directory and the default
	 * rules.
	 *
	 * @param d
	 *            GIT_DIR (the location of the repository metadata).
	 * @throws IOException
	 *             the repository appears to already exist but cannot be
	 *             accessed.
	 */
	public Repository(final File d) throws IOException {
		this(d, null, null, null, null); // go figure it out
	}

	/**
	 * Construct a representation of a Git repository.
	 *
	 * The work tree, object directory, alternate object directories and index
	 * file locations are deduced from the given git directory and the default
	 * rules.
	 *
	 * @param d
	 *            GIT_DIR (the location of the repository metadata). May be
	 *            null if workTree is set
	 * @param workTree
	 *            GIT_WORK_TREE (the root of the checkout). May be null for
	 *            default value.
	 * @throws IOException
	 *             the repository appears to already exist but cannot be
	 *             accessed.
	 */
	public Repository(final File d, final File workTree) throws IOException {
		this(d, workTree, null, null, null); // go figure it out
	}
  139. /**
  140. * Construct a representation of a Git repository using the given parameters
  141. * possibly overriding default conventions.
  142. *
  143. * @param d
  144. * GIT_DIR (the location of the repository metadata). May be null
  145. * for default value in which case it depends on GIT_WORK_TREE.
  146. * @param workTree
  147. * GIT_WORK_TREE (the root of the checkout). May be null for
  148. * default value if GIT_DIR is provided.
  149. * @param objectDir
  150. * GIT_OBJECT_DIRECTORY (where objects are stored). May be
  151. * null for default value. Relative names are resolved against
  152. * GIT_WORK_TREE.
  153. * @param alternateObjectDir
  154. * GIT_ALTERNATE_OBJECT_DIRECTORIES (where more objects are read
  155. * from). May be null for default value. Relative names are
  156. * resolved against GIT_WORK_TREE.
  157. * @param indexFile
  158. * GIT_INDEX_FILE (the location of the index file). May be null
  159. * for default value. Relative names are resolved against
  160. * GIT_WORK_TREE.
  161. * @throws IOException
  162. * the repository appears to already exist but cannot be
  163. * accessed.
  164. */
  165. public Repository(final File d, final File workTree, final File objectDir,
  166. final File[] alternateObjectDir, final File indexFile) throws IOException {
  167. if (workTree != null) {
  168. workDir = workTree;
  169. if (d == null)
  170. gitDir = new File(workTree, Constants.DOT_GIT);
  171. else
  172. gitDir = d;
  173. } else {
  174. if (d != null)
  175. gitDir = d;
  176. else
  177. throw new IllegalArgumentException("Either GIT_DIR or GIT_WORK_TREE must be passed to Repository constructor");
  178. }
  179. userConfig = SystemReader.getInstance().openUserConfig();
  180. config = new RepositoryConfig(userConfig, FS.resolve(gitDir, "config"));
  181. loadUserConfig();
  182. loadConfig();
  183. if (workDir == null) {
  184. String workTreeConfig = getConfig().getString("core", null, "worktree");
  185. if (workTreeConfig != null) {
  186. workDir = FS.resolve(d, workTreeConfig);
  187. } else {
  188. workDir = gitDir.getParentFile();
  189. }
  190. }
  191. refs = new RefDirectory(this);
  192. if (objectDir != null)
  193. objectDatabase = new ObjectDirectory(FS.resolve(objectDir, ""),
  194. alternateObjectDir);
  195. else
  196. objectDatabase = new ObjectDirectory(FS.resolve(gitDir, "objects"),
  197. alternateObjectDir);
  198. if (indexFile != null)
  199. this.indexFile = indexFile;
  200. else
  201. this.indexFile = new File(gitDir, "index");
  202. if (objectDatabase.exists()) {
  203. final String repositoryFormatVersion = getConfig().getString(
  204. "core", null, "repositoryFormatVersion");
  205. if (!"0".equals(repositoryFormatVersion)) {
  206. throw new IOException("Unknown repository format \""
  207. + repositoryFormatVersion + "\"; expected \"0\".");
  208. }
  209. }
  210. }
  211. private void loadUserConfig() throws IOException {
  212. try {
  213. userConfig.load();
  214. } catch (ConfigInvalidException e1) {
  215. IOException e2 = new IOException("User config file "
  216. + userConfig.getFile().getAbsolutePath() + " invalid: "
  217. + e1);
  218. e2.initCause(e1);
  219. throw e2;
  220. }
  221. }
  222. private void loadConfig() throws IOException {
  223. try {
  224. config.load();
  225. } catch (ConfigInvalidException e1) {
  226. IOException e2 = new IOException("Unknown repository format");
  227. e2.initCause(e1);
  228. throw e2;
  229. }
  230. }
  231. /**
  232. * Create a new Git repository, initializing the necessary files and
  233. * directories. This method creates a repository with a working tree.
  234. *
  235. * @throws IOException
  236. * @see #create(boolean)
  237. */
  238. public synchronized void create() throws IOException {
  239. create(false);
  240. }
  241. /**
  242. * Create a new Git repository initializing the necessary files and
  243. * directories.
  244. *
  245. * @param bare
  246. * if true, a bare repository is created.
  247. *
  248. * @throws IOException
  249. * in case of IO problem
  250. */
  251. public void create(boolean bare) throws IOException {
  252. final RepositoryConfig cfg = getConfig();
  253. if (cfg.getFile().exists()) {
  254. throw new IllegalStateException("Repository already exists: "
  255. + gitDir);
  256. }
  257. gitDir.mkdirs();
  258. refs.create();
  259. objectDatabase.create();
  260. new File(gitDir, "branches").mkdir();
  261. RefUpdate head = updateRef(Constants.HEAD);
  262. head.disableRefLog();
  263. head.link(Constants.R_HEADS + Constants.MASTER);
  264. cfg.setInt("core", null, "repositoryformatversion", 0);
  265. cfg.setBoolean("core", null, "filemode", true);
  266. if (bare)
  267. cfg.setBoolean("core", null, "bare", true);
  268. cfg.setBoolean("core", null, "logallrefupdates", !bare);
  269. cfg.setBoolean("core", null, "autocrlf", false);
  270. cfg.save();
  271. }
  272. /**
  273. * @return GIT_DIR
  274. */
  275. public File getDirectory() {
  276. return gitDir;
  277. }
  278. /**
  279. * @return the directory containing the objects owned by this repository.
  280. */
  281. public File getObjectsDirectory() {
  282. return objectDatabase.getDirectory();
  283. }
  284. /**
  285. * @return the object database which stores this repository's data.
  286. */
  287. public ObjectDatabase getObjectDatabase() {
  288. return objectDatabase;
  289. }
  290. /** @return the reference database which stores the reference namespace. */
  291. public RefDatabase getRefDatabase() {
  292. return refs;
  293. }
  294. /**
  295. * @return the configuration of this repository
  296. */
  297. public RepositoryConfig getConfig() {
  298. if (userConfig.isOutdated()) {
  299. try {
  300. loadUserConfig();
  301. } catch (IOException e) {
  302. throw new RuntimeException(e);
  303. }
  304. }
  305. if (config.isOutdated()) {
  306. try {
  307. loadConfig();
  308. } catch (IOException e) {
  309. throw new RuntimeException(e);
  310. }
  311. }
  312. return config;
  313. }
  314. /**
  315. * Construct a filename where the loose object having a specified SHA-1
  316. * should be stored. If the object is stored in a shared repository the path
  317. * to the alternative repo will be returned. If the object is not yet stored,
  318. * a usable path in this repo will be returned. It is assumed that callers
  319. * will look for objects in a pack first.
  320. *
  321. * @param objectId
  322. * @return suggested file name
  323. */
  324. public File toFile(final AnyObjectId objectId) {
  325. return objectDatabase.fileFor(objectId);
  326. }
  327. /**
  328. * @param objectId
  329. * @return true if the specified object is stored in this repo or any of the
  330. * known shared repositories.
  331. */
  332. public boolean hasObject(final AnyObjectId objectId) {
  333. return objectDatabase.hasObject(objectId);
  334. }
  335. /**
  336. * @param id
  337. * SHA-1 of an object.
  338. *
  339. * @return a {@link ObjectLoader} for accessing the data of the named
  340. * object, or null if the object does not exist.
  341. * @throws IOException
  342. */
  343. public ObjectLoader openObject(final AnyObjectId id)
  344. throws IOException {
  345. final WindowCursor wc = new WindowCursor();
  346. try {
  347. return openObject(wc, id);
  348. } finally {
  349. wc.release();
  350. }
  351. }
  352. /**
  353. * @param curs
  354. * temporary working space associated with the calling thread.
  355. * @param id
  356. * SHA-1 of an object.
  357. *
  358. * @return a {@link ObjectLoader} for accessing the data of the named
  359. * object, or null if the object does not exist.
  360. * @throws IOException
  361. */
  362. public ObjectLoader openObject(final WindowCursor curs, final AnyObjectId id)
  363. throws IOException {
  364. return objectDatabase.openObject(curs, id);
  365. }
  366. /**
  367. * Open object in all packs containing specified object.
  368. *
  369. * @param objectId
  370. * id of object to search for
  371. * @param curs
  372. * temporary working space associated with the calling thread.
  373. * @return collection of loaders for this object, from all packs containing
  374. * this object
  375. * @throws IOException
  376. */
  377. public Collection<PackedObjectLoader> openObjectInAllPacks(
  378. final AnyObjectId objectId, final WindowCursor curs)
  379. throws IOException {
  380. Collection<PackedObjectLoader> result = new LinkedList<PackedObjectLoader>();
  381. openObjectInAllPacks(objectId, result, curs);
  382. return result;
  383. }
  384. /**
  385. * Open object in all packs containing specified object.
  386. *
  387. * @param objectId
  388. * id of object to search for
  389. * @param resultLoaders
  390. * result collection of loaders for this object, filled with
  391. * loaders from all packs containing specified object
  392. * @param curs
  393. * temporary working space associated with the calling thread.
  394. * @throws IOException
  395. */
  396. void openObjectInAllPacks(final AnyObjectId objectId,
  397. final Collection<PackedObjectLoader> resultLoaders,
  398. final WindowCursor curs) throws IOException {
  399. objectDatabase.openObjectInAllPacks(resultLoaders, curs, objectId);
  400. }
  401. /**
  402. * @param id
  403. * SHA-1 of a blob
  404. * @return an {@link ObjectLoader} for accessing the data of a named blob
  405. * @throws IOException
  406. */
  407. public ObjectLoader openBlob(final ObjectId id) throws IOException {
  408. return openObject(id);
  409. }
  410. /**
  411. * @param id
  412. * SHA-1 of a tree
  413. * @return an {@link ObjectLoader} for accessing the data of a named tree
  414. * @throws IOException
  415. */
  416. public ObjectLoader openTree(final ObjectId id) throws IOException {
  417. return openObject(id);
  418. }
  419. /**
  420. * Access a Commit object using a symbolic reference. This reference may
  421. * be a SHA-1 or ref in combination with a number of symbols translating
  422. * from one ref or SHA-1 to another, such as HEAD^ etc.
  423. *
  424. * @param revstr a reference to a git commit object
  425. * @return a Commit named by the specified string
  426. * @throws IOException for I/O error or unexpected object type.
  427. *
  428. * @see #resolve(String)
  429. */
  430. public Commit mapCommit(final String revstr) throws IOException {
  431. final ObjectId id = resolve(revstr);
  432. return id != null ? mapCommit(id) : null;
  433. }
  434. /**
  435. * Access any type of Git object by id.
  436. *
  437. * @param id
  438. * SHA-1 of object to read
  439. * @param refName optional, only relevant for simple tags
  440. * @return The Git object if found or null
  441. * @throws IOException
  442. */
  443. public Object mapObject(final ObjectId id, final String refName) throws IOException {
  444. final ObjectLoader or = openObject(id);
  445. if (or == null)
  446. return null;
  447. final byte[] raw = or.getBytes();
  448. switch (or.getType()) {
  449. case Constants.OBJ_TREE:
  450. return makeTree(id, raw);
  451. case Constants.OBJ_COMMIT:
  452. return makeCommit(id, raw);
  453. case Constants.OBJ_TAG:
  454. return makeTag(id, refName, raw);
  455. case Constants.OBJ_BLOB:
  456. return raw;
  457. default:
  458. throw new IncorrectObjectTypeException(id,
  459. "COMMIT nor TREE nor BLOB nor TAG");
  460. }
  461. }
  462. /**
  463. * Access a Commit by SHA-1 id.
  464. * @param id
  465. * @return Commit or null
  466. * @throws IOException for I/O error or unexpected object type.
  467. */
  468. public Commit mapCommit(final ObjectId id) throws IOException {
  469. final ObjectLoader or = openObject(id);
  470. if (or == null)
  471. return null;
  472. final byte[] raw = or.getBytes();
  473. if (Constants.OBJ_COMMIT == or.getType())
  474. return new Commit(this, id, raw);
  475. throw new IncorrectObjectTypeException(id, Constants.TYPE_COMMIT);
  476. }
  477. private Commit makeCommit(final ObjectId id, final byte[] raw) {
  478. Commit ret = new Commit(this, id, raw);
  479. return ret;
  480. }
  481. /**
  482. * Access a Tree object using a symbolic reference. This reference may
  483. * be a SHA-1 or ref in combination with a number of symbols translating
  484. * from one ref or SHA-1 to another, such as HEAD^{tree} etc.
  485. *
  486. * @param revstr a reference to a git commit object
  487. * @return a Tree named by the specified string
  488. * @throws IOException
  489. *
  490. * @see #resolve(String)
  491. */
  492. public Tree mapTree(final String revstr) throws IOException {
  493. final ObjectId id = resolve(revstr);
  494. return id != null ? mapTree(id) : null;
  495. }
  496. /**
  497. * Access a Tree by SHA-1 id.
  498. * @param id
  499. * @return Tree or null
  500. * @throws IOException for I/O error or unexpected object type.
  501. */
  502. public Tree mapTree(final ObjectId id) throws IOException {
  503. final ObjectLoader or = openObject(id);
  504. if (or == null)
  505. return null;
  506. final byte[] raw = or.getBytes();
  507. switch (or.getType()) {
  508. case Constants.OBJ_TREE:
  509. return new Tree(this, id, raw);
  510. case Constants.OBJ_COMMIT:
  511. return mapTree(ObjectId.fromString(raw, 5));
  512. default:
  513. throw new IncorrectObjectTypeException(id, Constants.TYPE_TREE);
  514. }
  515. }
  516. private Tree makeTree(final ObjectId id, final byte[] raw) throws IOException {
  517. Tree ret = new Tree(this, id, raw);
  518. return ret;
  519. }
  520. private Tag makeTag(final ObjectId id, final String refName, final byte[] raw) {
  521. Tag ret = new Tag(this, id, refName, raw);
  522. return ret;
  523. }
  524. /**
  525. * Access a tag by symbolic name.
  526. *
  527. * @param revstr
  528. * @return a Tag or null
  529. * @throws IOException on I/O error or unexpected type
  530. */
  531. public Tag mapTag(String revstr) throws IOException {
  532. final ObjectId id = resolve(revstr);
  533. return id != null ? mapTag(revstr, id) : null;
  534. }
  535. /**
  536. * Access a Tag by SHA-1 id.
  537. * @param refName
  538. * @param id
  539. * @return Tag or null
  540. * @throws IOException for I/O error or unexpected object type.
  541. */
  542. public Tag mapTag(final String refName, final ObjectId id) throws IOException {
  543. final ObjectLoader or = openObject(id);
  544. if (or == null)
  545. return null;
  546. final byte[] raw = or.getBytes();
  547. if (Constants.OBJ_TAG == or.getType())
  548. return new Tag(this, id, refName, raw);
  549. return new Tag(this, id, refName, null);
  550. }
  551. /**
  552. * Create a command to update, create or delete a ref in this repository.
  553. *
  554. * @param ref
  555. * name of the ref the caller wants to modify.
  556. * @return an update command. The caller must finish populating this command
  557. * and then invoke one of the update methods to actually make a
  558. * change.
  559. * @throws IOException
  560. * a symbolic ref was passed in and could not be resolved back
  561. * to the base ref, as the symbolic ref could not be read.
  562. */
  563. public RefUpdate updateRef(final String ref) throws IOException {
  564. return updateRef(ref, false);
  565. }
  566. /**
  567. * Create a command to update, create or delete a ref in this repository.
  568. *
  569. * @param ref
  570. * name of the ref the caller wants to modify.
  571. * @param detach
  572. * true to create a detached head
  573. * @return an update command. The caller must finish populating this command
  574. * and then invoke one of the update methods to actually make a
  575. * change.
  576. * @throws IOException
  577. * a symbolic ref was passed in and could not be resolved back
  578. * to the base ref, as the symbolic ref could not be read.
  579. */
  580. public RefUpdate updateRef(final String ref, final boolean detach) throws IOException {
  581. return refs.newUpdate(ref, detach);
  582. }
  583. /**
  584. * Create a command to rename a ref in this repository
  585. *
  586. * @param fromRef
  587. * name of ref to rename from
  588. * @param toRef
  589. * name of ref to rename to
  590. * @return an update command that knows how to rename a branch to another.
  591. * @throws IOException
  592. * the rename could not be performed.
  593. *
  594. */
  595. public RefRename renameRef(final String fromRef, final String toRef) throws IOException {
  596. return refs.newRename(fromRef, toRef);
  597. }
  598. /**
  599. * Parse a git revision string and return an object id.
  600. *
  601. * Combinations of the following are currently supported:
  602. * <ul>
  603. * <li>SHA-1 - a complete SHA-1</li>
  604. * <li>refs/... - a ref name</li>
  605. * <li>ref^n - nth parent reference</li>
  606. * <li>ref~n - distance via parent reference</li>
  607. * <li>ref^{tree} - tree referenced by ref</li>
  608. * <li>ref^{commit} - commit referenced by ref</li>
  609. * <li>ref^{blob} - blob referenced by ref</li>
  610. * </ul>
  611. *
  612. * Not supported:
  613. * <ul>
  614. * <li>reflog lookups, ref@{n} or ref@{full or relative timestamp}</li>
  615. * <li>abbreviated SHA-1s</li>
  616. * </ul>
  617. *
  618. * @param revstr A git object reference expression
  619. * @return an ObjectId or null if revstr can't be resolved to any ObjectId
  620. * @throws IOException on serious errors
  621. */
  622. public ObjectId resolve(final String revstr) throws IOException {
  623. char[] rev = revstr.toCharArray();
  624. Object ref = null;
  625. ObjectId refId = null;
  626. for (int i = 0; i < rev.length; ++i) {
  627. switch (rev[i]) {
  628. case '^':
  629. if (refId == null) {
  630. String refstr = new String(rev,0,i);
  631. refId = resolveSimple(refstr);
  632. if (refId == null)
  633. return null;
  634. }
  635. if (i + 1 < rev.length) {
  636. switch (rev[i + 1]) {
  637. case '0':
  638. case '1':
  639. case '2':
  640. case '3':
  641. case '4':
  642. case '5':
  643. case '6':
  644. case '7':
  645. case '8':
  646. case '9':
  647. int j;
  648. ref = mapObject(refId, null);
  649. while (ref instanceof Tag) {
  650. Tag tag = (Tag)ref;
  651. refId = tag.getObjId();
  652. ref = mapObject(refId, null);
  653. }
  654. if (!(ref instanceof Commit))
  655. throw new IncorrectObjectTypeException(refId, Constants.TYPE_COMMIT);
  656. for (j=i+1; j<rev.length; ++j) {
  657. if (!Character.isDigit(rev[j]))
  658. break;
  659. }
  660. String parentnum = new String(rev, i+1, j-i-1);
  661. int pnum;
  662. try {
  663. pnum = Integer.parseInt(parentnum);
  664. } catch (NumberFormatException e) {
  665. throw new RevisionSyntaxException(
  666. "Invalid commit parent number",
  667. revstr);
  668. }
  669. if (pnum != 0) {
  670. final ObjectId parents[] = ((Commit) ref)
  671. .getParentIds();
  672. if (pnum > parents.length)
  673. refId = null;
  674. else
  675. refId = parents[pnum - 1];
  676. }
  677. i = j - 1;
  678. break;
  679. case '{':
  680. int k;
  681. String item = null;
  682. for (k=i+2; k<rev.length; ++k) {
  683. if (rev[k] == '}') {
  684. item = new String(rev, i+2, k-i-2);
  685. break;
  686. }
  687. }
  688. i = k;
  689. if (item != null)
  690. if (item.equals("tree")) {
  691. ref = mapObject(refId, null);
  692. while (ref instanceof Tag) {
  693. Tag t = (Tag)ref;
  694. refId = t.getObjId();
  695. ref = mapObject(refId, null);
  696. }
  697. if (ref instanceof Treeish)
  698. refId = ((Treeish)ref).getTreeId();
  699. else
  700. throw new IncorrectObjectTypeException(refId, Constants.TYPE_TREE);
  701. }
  702. else if (item.equals("commit")) {
  703. ref = mapObject(refId, null);
  704. while (ref instanceof Tag) {
  705. Tag t = (Tag)ref;
  706. refId = t.getObjId();
  707. ref = mapObject(refId, null);
  708. }
  709. if (!(ref instanceof Commit))
  710. throw new IncorrectObjectTypeException(refId, Constants.TYPE_COMMIT);
  711. }
  712. else if (item.equals("blob")) {
  713. ref = mapObject(refId, null);
  714. while (ref instanceof Tag) {
  715. Tag t = (Tag)ref;
  716. refId = t.getObjId();
  717. ref = mapObject(refId, null);
  718. }
  719. if (!(ref instanceof byte[]))
  720. throw new IncorrectObjectTypeException(refId, Constants.TYPE_BLOB);
  721. }
  722. else if (item.equals("")) {
  723. ref = mapObject(refId, null);
  724. while (ref instanceof Tag) {
  725. Tag t = (Tag)ref;
  726. refId = t.getObjId();
  727. ref = mapObject(refId, null);
  728. }
  729. }
  730. else
  731. throw new RevisionSyntaxException(revstr);
  732. else
  733. throw new RevisionSyntaxException(revstr);
  734. break;
  735. default:
  736. ref = mapObject(refId, null);
  737. if (ref instanceof Commit) {
  738. final ObjectId parents[] = ((Commit) ref)
  739. .getParentIds();
  740. if (parents.length == 0)
  741. refId = null;
  742. else
  743. refId = parents[0];
  744. } else
  745. throw new IncorrectObjectTypeException(refId, Constants.TYPE_COMMIT);
  746. }
  747. } else {
  748. ref = mapObject(refId, null);
  749. while (ref instanceof Tag) {
  750. Tag tag = (Tag)ref;
  751. refId = tag.getObjId();
  752. ref = mapObject(refId, null);
  753. }
  754. if (ref instanceof Commit) {
  755. final ObjectId parents[] = ((Commit) ref)
  756. .getParentIds();
  757. if (parents.length == 0)
  758. refId = null;
  759. else
  760. refId = parents[0];
  761. } else
  762. throw new IncorrectObjectTypeException(refId, Constants.TYPE_COMMIT);
  763. }
  764. break;
  765. case '~':
  766. if (ref == null) {
  767. String refstr = new String(rev,0,i);
  768. refId = resolveSimple(refstr);
  769. if (refId == null)
  770. return null;
  771. ref = mapObject(refId, null);
  772. }
  773. while (ref instanceof Tag) {
  774. Tag tag = (Tag)ref;
  775. refId = tag.getObjId();
  776. ref = mapObject(refId, null);
  777. }
  778. if (!(ref instanceof Commit))
  779. throw new IncorrectObjectTypeException(refId, Constants.TYPE_COMMIT);
  780. int l;
  781. for (l = i + 1; l < rev.length; ++l) {
  782. if (!Character.isDigit(rev[l]))
  783. break;
  784. }
  785. String distnum = new String(rev, i+1, l-i-1);
  786. int dist;
  787. try {
  788. dist = Integer.parseInt(distnum);
  789. } catch (NumberFormatException e) {
  790. throw new RevisionSyntaxException(
  791. "Invalid ancestry length", revstr);
  792. }
  793. while (dist > 0) {
  794. final ObjectId[] parents = ((Commit) ref).getParentIds();
  795. if (parents.length == 0) {
  796. refId = null;
  797. break;
  798. }
  799. refId = parents[0];
  800. ref = mapCommit(refId);
  801. --dist;
  802. }
  803. i = l - 1;
  804. break;
  805. case '@':
  806. int m;
  807. String time = null;
  808. for (m=i+2; m<rev.length; ++m) {
  809. if (rev[m] == '}') {
  810. time = new String(rev, i+2, m-i-2);
  811. break;
  812. }
  813. }
  814. if (time != null)
  815. throw new RevisionSyntaxException("reflogs not yet supported by revision parser", revstr);
  816. i = m - 1;
  817. break;
  818. default:
  819. if (refId != null)
  820. throw new RevisionSyntaxException(revstr);
  821. }
  822. }
  823. if (refId == null)
  824. refId = resolveSimple(revstr);
  825. return refId;
  826. }
  827. private ObjectId resolveSimple(final String revstr) throws IOException {
  828. if (ObjectId.isId(revstr))
  829. return ObjectId.fromString(revstr);
  830. final Ref r = refs.getRef(revstr);
  831. return r != null ? r.getObjectId() : null;
  832. }
  833. /** Increment the use counter by one, requiring a matched {@link #close()}. */
  834. public void incrementOpen() {
  835. useCnt.incrementAndGet();
  836. }
  837. /**
  838. * Close all resources used by this repository
  839. */
  840. public void close() {
  841. if (useCnt.decrementAndGet() == 0) {
  842. objectDatabase.close();
  843. refs.close();
  844. }
  845. }
  846. /**
  847. * Add a single existing pack to the list of available pack files.
  848. *
  849. * @param pack
  850. * path of the pack file to open.
  851. * @param idx
  852. * path of the corresponding index file.
  853. * @throws IOException
  854. * index file could not be opened, read, or is not recognized as
  855. * a Git pack file index.
  856. */
  857. public void openPack(final File pack, final File idx) throws IOException {
  858. objectDatabase.openPack(pack, idx);
  859. }
  860. public String toString() {
  861. return "Repository[" + getDirectory() + "]";
  862. }
  863. /**
  864. * Get the name of the reference that {@code HEAD} points to.
  865. * <p>
  866. * This is essentially the same as doing:
  867. *
  868. * <pre>
  869. * return getRef(Constants.HEAD).getTarget().getName()
  870. * </pre>
  871. *
  872. * Except when HEAD is detached, in which case this method returns the
  873. * current ObjectId in hexadecimal string format.
  874. *
  875. * @return name of current branch (for example {@code refs/heads/master}) or
  876. * an ObjectId in hex format if HEAD is detached.
  877. * @throws IOException
  878. */
  879. public String getFullBranch() throws IOException {
  880. Ref head = getRef(Constants.HEAD);
  881. if (head == null)
  882. return null;
  883. if (head.isSymbolic())
  884. return head.getTarget().getName();
  885. if (head.getObjectId() != null)
  886. return head.getObjectId().name();
  887. return null;
  888. }
  889. /**
  890. * Get the short name of the current branch that {@code HEAD} points to.
  891. * <p>
  892. * This is essentially the same as {@link #getFullBranch()}, except the
  893. * leading prefix {@code refs/heads/} is removed from the reference before
  894. * it is returned to the caller.
  895. *
  896. * @return name of current branch (for example {@code master}), or an
  897. * ObjectId in hex format if HEAD is detached.
  898. * @throws IOException
  899. */
  900. public String getBranch() throws IOException {
  901. String name = getFullBranch();
  902. if (name != null)
  903. return shortenRefName(name);
  904. return name;
  905. }
  906. /**
  907. * Get a ref by name.
  908. *
  909. * @param name
  910. * the name of the ref to lookup. May be a short-hand form, e.g.
  911. * "master" which is automatically expanded to
  912. * "refs/heads/master" if "refs/heads/master" already exists.
  913. * @return the Ref with the given name, or null if it does not exist
  914. * @throws IOException
  915. */
  916. public Ref getRef(final String name) throws IOException {
  917. return refs.getRef(name);
  918. }
  919. /**
  920. * @return mutable map of all known refs (heads, tags, remotes).
  921. */
  922. public Map<String, Ref> getAllRefs() {
  923. try {
  924. return refs.getRefs(RefDatabase.ALL);
  925. } catch (IOException e) {
  926. return new HashMap<String, Ref>();
  927. }
  928. }
  929. /**
  930. * @return mutable map of all tags; key is short tag name ("v1.0") and value
  931. * of the entry contains the ref with the full tag name
  932. * ("refs/tags/v1.0").
  933. */
  934. public Map<String, Ref> getTags() {
  935. try {
  936. return refs.getRefs(Constants.R_TAGS);
  937. } catch (IOException e) {
  938. return new HashMap<String, Ref>();
  939. }
  940. }
  941. /**
  942. * Peel a possibly unpeeled reference to an annotated tag.
  943. * <p>
  944. * If the ref cannot be peeled (as it does not refer to an annotated tag)
  945. * the peeled id stays null, but {@link Ref#isPeeled()} will be true.
  946. *
  947. * @param ref
  948. * The ref to peel
  949. * @return <code>ref</code> if <code>ref.isPeeled()</code> is true; else a
  950. * new Ref object representing the same data as Ref, but isPeeled()
  951. * will be true and getPeeledObjectId will contain the peeled object
  952. * (or null).
  953. */
  954. public Ref peel(final Ref ref) {
  955. try {
  956. return refs.peel(ref);
  957. } catch (IOException e) {
  958. // Historical accident; if the reference cannot be peeled due
  959. // to some sort of repository access problem we treat it the
  960. // same as if the reference was not an annotated tag.
  961. return ref;
  962. }
  963. }
  964. /**
  965. * @return a map with all objects referenced by a peeled ref.
  966. */
  967. public Map<AnyObjectId, Set<Ref>> getAllRefsByPeeledObjectId() {
  968. Map<String, Ref> allRefs = getAllRefs();
  969. Map<AnyObjectId, Set<Ref>> ret = new HashMap<AnyObjectId, Set<Ref>>(allRefs.size());
  970. for (Ref ref : allRefs.values()) {
  971. ref = peel(ref);
  972. AnyObjectId target = ref.getPeeledObjectId();
  973. if (target == null)
  974. target = ref.getObjectId();
  975. // We assume most Sets here are singletons
  976. Set<Ref> oset = ret.put(target, Collections.singleton(ref));
  977. if (oset != null) {
  978. // that was not the case (rare)
  979. if (oset.size() == 1) {
  980. // Was a read-only singleton, we must copy to a new Set
  981. oset = new HashSet<Ref>(oset);
  982. }
  983. ret.put(target, oset);
  984. oset.add(ref);
  985. }
  986. }
  987. return ret;
  988. }
  989. /**
  990. * @return a representation of the index associated with this repo
  991. * @throws IOException
  992. */
  993. public GitIndex getIndex() throws IOException {
  994. if (index == null) {
  995. index = new GitIndex(this);
  996. index.read();
  997. } else {
  998. index.rereadIfNecessary();
  999. }
  1000. return index;
  1001. }
  1002. /**
  1003. * @return the index file location
  1004. */
  1005. public File getIndexFile() {
  1006. return indexFile;
  1007. }
  1008. static byte[] gitInternalSlash(byte[] bytes) {
  1009. if (File.separatorChar == '/')
  1010. return bytes;
  1011. for (int i=0; i<bytes.length; ++i)
  1012. if (bytes[i] == File.separatorChar)
  1013. bytes[i] = '/';
  1014. return bytes;
  1015. }
  1016. /**
  1017. * @return the current state of the repository, such as rebasing, merging, bisecting or safe
  1018. */
  1019. public RepositoryState getRepositoryState() {
  1020. // Pre Git-1.6 logic
  1021. if (new File(getWorkDir(), ".dotest").exists())
  1022. return RepositoryState.REBASING;
  1023. if (new File(gitDir,".dotest-merge").exists())
  1024. return RepositoryState.REBASING_INTERACTIVE;
  1025. // From 1.6 onwards
  1026. if (new File(getDirectory(),"rebase-apply/rebasing").exists())
  1027. return RepositoryState.REBASING_REBASING;
  1028. if (new File(getDirectory(),"rebase-apply/applying").exists())
  1029. return RepositoryState.APPLY;
  1030. if (new File(getDirectory(),"rebase-apply").exists())
  1031. return RepositoryState.REBASING;
  1032. if (new File(getDirectory(),"rebase-merge/interactive").exists())
  1033. return RepositoryState.REBASING_INTERACTIVE;
  1034. if (new File(getDirectory(),"rebase-merge").exists())
  1035. return RepositoryState.REBASING_MERGE;
  1036. // Both versions
  1037. if (new File(gitDir, "MERGE_HEAD").exists()) {
  1038. // we are merging - now check whether we have unmerged paths
  1039. try {
  1040. if (!DirCache.read(this).hasUnmergedPaths()) {
  1041. // no unmerged paths -> return the MERGING_RESOLVED state
  1042. return RepositoryState.MERGING_RESOLVED;
  1043. }
  1044. } catch (IOException e) {
  1045. // Can't decide whether unmerged paths exist. Return
  1046. // MERGING state to be on the safe side (in state MERGING
  1047. // you are not allowed to do anything)
  1048. e.printStackTrace();
  1049. }
  1050. return RepositoryState.MERGING;
  1051. }
  1052. if (new File(gitDir,"BISECT_LOG").exists())
  1053. return RepositoryState.BISECTING;
  1054. return RepositoryState.SAFE;
  1055. }
  1056. /**
  1057. * Check validity of a ref name. It must not contain a character that has
  1058. * a special meaning in a Git object reference expression. Some other
  1059. * dangerous characters are also excluded.
  1060. *
  1061. * For portability reasons '\' is excluded.
  1062. *
  1063. * @param refName
  1064. *
  1065. * @return true if refName is a valid ref name
  1066. */
  1067. public static boolean isValidRefName(final String refName) {
  1068. final int len = refName.length();
  1069. if (len == 0)
  1070. return false;
  1071. if (refName.endsWith(LockFile.SUFFIX))
  1072. return false;
  1073. int components = 1;
  1074. char p = '\0';
  1075. for (int i = 0; i < len; i++) {
  1076. final char c = refName.charAt(i);
  1077. if (c <= ' ')
  1078. return false;
  1079. switch (c) {
  1080. case '.':
  1081. switch (p) {
  1082. case '\0': case '/': case '.':
  1083. return false;
  1084. }
  1085. if (i == len -1)
  1086. return false;
  1087. break;
  1088. case '/':
  1089. if (i == 0 || i == len - 1)
  1090. return false;
  1091. components++;
  1092. break;
  1093. case '{':
  1094. if (p == '@')
  1095. return false;
  1096. break;
  1097. case '~': case '^': case ':':
  1098. case '?': case '[': case '*':
  1099. case '\\':
  1100. return false;
  1101. }
  1102. p = c;
  1103. }
  1104. return components > 1;
  1105. }
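
	// The rules enforced by isValidRefName are easier to see on concrete names.
	// What follows is a minimal standalone sketch, not part of this class: it
	// copies the checks into a hypothetical RefNameRulesSketch class so they can
	// be run without JGit. LOCK_SUFFIX stands in for LockFile.SUFFIX and is
	// assumed here to be ".lock".

```java
final class RefNameRulesSketch {
	// Assumption: stand-in for LockFile.SUFFIX.
	static final String LOCK_SUFFIX = ".lock";

	static boolean isValidRefName(final String refName) {
		final int len = refName.length();
		if (len == 0 || refName.endsWith(LOCK_SUFFIX))
			return false;
		int components = 1;
		char p = '\0'; // previous character
		for (int i = 0; i < len; i++) {
			final char c = refName.charAt(i);
			if (c <= ' ') // space and control characters
				return false;
			switch (c) {
			case '.':
				// no leading '.', no '.' after '/', no "..", no trailing '.'
				if (p == '\0' || p == '/' || p == '.' || i == len - 1)
					return false;
				break;
			case '/':
				if (i == 0 || i == len - 1)
					return false;
				components++;
				break;
			case '{':
				if (p == '@') // "@{" is reflog syntax
					return false;
				break;
			case '~': case '^': case ':':
			case '?': case '[': case '*':
			case '\\':
				return false;
			}
			p = c;
		}
		// a name such as "master" with no '/' is rejected
		return components > 1;
	}

	public static void main(String[] args) {
		String[] samples = { "refs/heads/master", "master",
				"refs/heads/a..b", "refs/heads/.hidden",
				"refs/heads/topic.lock", "refs/heads/x@{1}" };
		for (String s : samples)
			System.out.println(s + " -> " + isValidRefName(s));
	}
}
```

	// Only the first sample passes: the others fail on the component count,
	// the "..", the leading '.', the lock suffix and the "@{" sequence.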

	/**
	 * Strip work dir and return normalized repository path.
	 *
	 * @param workDir Work dir
	 * @param file File whose path shall be stripped of its workdir
	 * @return normalized repository relative path or the empty
	 *         string if the file is not relative to the work directory.
	 */
	public static String stripWorkDir(File workDir, File file) {
		final String filePath = file.getPath();
		final String workDirPath = workDir.getPath();

		if (filePath.length() <= workDirPath.length() ||
				filePath.charAt(workDirPath.length()) != File.separatorChar ||
				!filePath.startsWith(workDirPath)) {
			File absWd = workDir.isAbsolute() ? workDir : workDir.getAbsoluteFile();
			File absFile = file.isAbsolute() ? file : file.getAbsoluteFile();
			if (absWd == workDir && absFile == file)
				return "";
			return stripWorkDir(absWd, absFile);
		}

		String relName = filePath.substring(workDirPath.length() + 1);
		if (File.separatorChar != '/')
			relName = relName.replace(File.separatorChar, '/');
		return relName;
	}
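
	// stripWorkDir first tries a plain prefix match and only falls back to
	// absolute paths when that fails. The sketch below copies the logic into a
	// hypothetical StripWorkDirSketch class so the three outcomes can be tried
	// without JGit; the directory and file names are made up for illustration.

```java
import java.io.File;

final class StripWorkDirSketch {
	static String stripWorkDir(File workDir, File file) {
		final String filePath = file.getPath();
		final String workDirPath = workDir.getPath();
		if (filePath.length() <= workDirPath.length()
				|| filePath.charAt(workDirPath.length()) != File.separatorChar
				|| !filePath.startsWith(workDirPath)) {
			// Prefix match failed: retry once with absolute paths.
			File absWd = workDir.isAbsolute() ? workDir : workDir.getAbsoluteFile();
			File absFile = file.isAbsolute() ? file : file.getAbsoluteFile();
			if (absWd == workDir && absFile == file)
				return ""; // both were already absolute: not under workDir
			return stripWorkDir(absWd, absFile);
		}
		String relName = filePath.substring(workDirPath.length() + 1);
		if (File.separatorChar != '/')
			relName = relName.replace(File.separatorChar, '/');
		return relName;
	}

	public static void main(String[] args) {
		File wd = new File("repo").getAbsoluteFile();
		// file below the work dir: prints the '/'-separated relative path
		System.out.println(stripWorkDir(wd, new File(wd, "src/Main.java")));
		// relative work dir, absolute file: resolved through the fallback
		System.out.println(stripWorkDir(new File("repo"), new File(wd, "a.txt")));
		// file outside the work dir: prints an empty line
		System.out.println(stripWorkDir(wd, new File(wd.getParentFile(), "elsewhere/x.txt")));
	}
}
```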

	/**
	 * @return the workdir file, i.e. where the files are checked out
	 */
	public File getWorkDir() {
		return workDir;
	}

	/**
	 * Override default workdir
	 *
	 * @param workTree
	 *            the work tree directory
	 */
	public void setWorkDir(File workTree) {
		this.workDir = workTree;
	}

	/**
	 * Register a {@link RepositoryListener} which will be notified
	 * when ref changes are detected.
	 *
	 * @param l
	 *            the listener to register
	 */
	public void addRepositoryChangedListener(final RepositoryListener l) {
		listeners.add(l);
	}

	/**
	 * Remove a registered {@link RepositoryListener}.
	 *
	 * @param l
	 *            the listener to remove
	 */
	public void removeRepositoryChangedListener(final RepositoryListener l) {
		listeners.remove(l);
	}

	/**
	 * Register a global {@link RepositoryListener} which will be notified
	 * when ref changes in any repository are detected.
	 *
	 * @param l
	 *            the listener to register
	 */
	public static void addAnyRepositoryChangedListener(final RepositoryListener l) {
		allListeners.add(l);
	}

	/**
	 * Remove a globally registered {@link RepositoryListener}.
	 *
	 * @param l
	 *            the listener to remove
	 */
	public static void removeAnyRepositoryChangedListener(final RepositoryListener l) {
		allListeners.remove(l);
	}

	void fireRefsChanged() {
		final RefsChangedEvent event = new RefsChangedEvent(this);
		List<RepositoryListener> all;
		synchronized (listeners) {
			all = new ArrayList<RepositoryListener>(listeners);
		}
		synchronized (allListeners) {
			all.addAll(allListeners);
		}
		for (final RepositoryListener l : all) {
			l.refsChanged(event);
		}
	}

	void fireIndexChanged() {
		final IndexChangedEvent event = new IndexChangedEvent(this);
		List<RepositoryListener> all;
		synchronized (listeners) {
			all = new ArrayList<RepositoryListener>(listeners);
		}
		synchronized (allListeners) {
			all.addAll(allListeners);
		}
		for (final RepositoryListener l : all) {
			l.indexChanged(event);
		}
	}
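
	// Both fire methods copy the listener lists while holding the lock and
	// invoke the callbacks only after releasing it. A minimal sketch of that
	// snapshot-then-dispatch pattern follows; SnapshotDispatchSketch and its
	// Listener interface are hypothetical names, not JGit types.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotDispatchSketch {
	interface Listener {
		void changed(String what);
	}

	private final List<Listener> listeners = new ArrayList<Listener>();

	void add(Listener l) {
		synchronized (listeners) {
			listeners.add(l);
		}
	}

	void remove(Listener l) {
		synchronized (listeners) {
			listeners.remove(l);
		}
	}

	/** Notifies every listener registered at call time; returns how many. */
	int fire(String what) {
		List<Listener> snapshot;
		synchronized (listeners) {
			// Copy under the lock so registration changes cannot race the copy.
			snapshot = new ArrayList<Listener>(listeners);
		}
		// Dispatch outside the lock: a callback may add or remove listeners
		// without a ConcurrentModificationException or a self-deadlock.
		for (Listener l : snapshot)
			l.changed(what);
		return snapshot.size();
	}

	public static void main(String[] args) {
		final SnapshotDispatchSketch d = new SnapshotDispatchSketch();
		d.add(new Listener() {
			public void changed(String what) {
				System.out.println("one-shot listener saw " + what);
				d.remove(this); // safe: iteration runs over the snapshot
			}
		});
		d.fire("refs");
		d.fire("refs"); // the one-shot listener is no longer registered
	}
}
```

	// The cost of this design is that a listener removed while a dispatch is
	// in progress can still receive that one event.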

	/**
	 * Force a scan for changed refs and for changes to the index.
	 *
	 * @throws IOException
	 *             the refs or the index could not be read.
	 */
	public void scanForRepoChanges() throws IOException {
		getAllRefs(); // This will look for changes to refs
		getIndex(); // This will detect changes in the index
	}

	/**
	 * Strip a well-known prefix (heads, tags or remotes) from a ref name.
	 *
	 * @param refName
	 *            the full ref name
	 *
	 * @return a more user-friendly ref name
	 */
	public String shortenRefName(String refName) {
		if (refName.startsWith(Constants.R_HEADS))
			return refName.substring(Constants.R_HEADS.length());
		if (refName.startsWith(Constants.R_TAGS))
			return refName.substring(Constants.R_TAGS.length());
		if (refName.startsWith(Constants.R_REMOTES))
			return refName.substring(Constants.R_REMOTES.length());
		return refName;
	}
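
	// A short illustration of what shortenRefName returns. This standalone
	// sketch uses a hypothetical ShortenRefNameSketch class; its three prefix
	// constants are assumed to match Constants.R_HEADS, Constants.R_TAGS and
	// Constants.R_REMOTES.

```java
final class ShortenRefNameSketch {
	// Assumed values of the corresponding JGit constants.
	static final String R_HEADS = "refs/heads/";
	static final String R_TAGS = "refs/tags/";
	static final String R_REMOTES = "refs/remotes/";

	static String shortenRefName(String refName) {
		if (refName.startsWith(R_HEADS))
			return refName.substring(R_HEADS.length());
		if (refName.startsWith(R_TAGS))
			return refName.substring(R_TAGS.length());
		if (refName.startsWith(R_REMOTES))
			return refName.substring(R_REMOTES.length());
		return refName; // e.g. HEAD or refs/notes/... stay unchanged
	}

	public static void main(String[] args) {
		String[] samples = { "refs/heads/master", "refs/tags/v1.0",
				"refs/remotes/origin/master", "HEAD" };
		for (String s : samples)
			System.out.println(s + " -> " + shortenRefName(s));
	}
}
```

	// The shortened form is for display only: a branch and a tag with the same
	// leaf name, such as refs/heads/v1 and refs/tags/v1, both shorten to "v1".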

	/**
	 * @param refName
	 * @return a {@link ReflogReader} for the supplied refname, or null if the
	 *         named ref does not exist.
	 * @throws IOException the ref could not be accessed.
	 */
	public ReflogReader getReflogReader(String refName) throws IOException {
		Ref ref = getRef(refName);
		if (ref != null)
			return new ReflogReader(this, ref.getName());
		return null;
	}
}