
T0003_BasicTest.java 28KB

Config: Rewrite subsection and value escaping and parsing

Previously, Config was using the same method for both escaping and parsing subsection names and config values. The goal was presumably code savings, but unfortunately, these two pieces of the git config format are simply different.

In git v2.15.1, Documentation/config.txt says the following about subsection names:

  "Subsection names are case sensitive and can contain any characters except newline (doublequote `"` and backslash can be included by escaping them as `\"` and `\\`, respectively). Section headers cannot span multiple lines. Variables may belong directly to a section or to a given subsection."

And, later in the same documentation section, about values:

  "A line that defines a value can be continued to the next line by ending it with a `\`; the backquote and the end-of-line are stripped. Leading whitespaces after 'name =', the remainder of the line after the first comment character '#' or ';', and trailing whitespaces of the line are discarded unless they are enclosed in double quotes. Internal whitespaces within the value are retained verbatim. Inside double quotes, double quote `"` and backslash `\` characters must be escaped: use `\"` for `"` and `\\` for `\`. The following escape sequences (beside `\"` and `\\`) are recognized: `\n` for newline character (NL), `\t` for horizontal tabulation (HT, TAB) and `\b` for backspace (BS). Other char escape sequences (including octal escape sequences) are invalid."

The most important differences are that subsection names have a limited set of supported escape sequences and do not support newlines at all, either escaped or unescaped. Arguably, it would be easy to support escaped newlines, but C git simply does not:

  $ git config -f foo.config $'foo.bar\nbaz.quux' value
  error: invalid key (newline): foo.bar
  baz.quux

I468106ac was an attempt to fix one bug in escapeValue, around leading whitespace, without having to rewrite the whole escaping/parsing code. Unfortunately, because escapeValue was also used for escaping subsection names, this made it possible to write invalid config files any time Config#toText was called with a subsection name containing trailing whitespace, like {foo }. Rather than pile hacks on top of hacks, fix it for real by largely rewriting the escaping and parsing code.

In addition to fixing escape sequences, fix (and write tests for) a few more issues in the old implementation:

* Now that we can properly parse it, always emit newlines as "\n" from escapeValue, rather than the weird (but still supported) syntax with a non-quoted trailing literal "\n\" before the newline. In addition to producing more readable output and matching the behavior of C git, this makes the escaping code much simpler.

* Disallow '\0' entirely within both subsection names and values, since due to Unix command line argument conventions it is impossible to pass such values to "git config".

* Properly preserve intra-value whitespace when parsing, rather than collapsing it all to a single space.

Change-Id: I304f626b9d0ad1592c4e4e449a11b136c0f8b3e3
6 years ago
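To make the subsection-name versus value distinction above concrete, here is a minimal sketch against the org.eclipse.jgit.lib.Config API (setString, toText, fromText, getString); it is not part of the change itself, and the section, subsection, and value strings are invented for illustration. It round-trips a subsection name with trailing whitespace and a value containing a newline and a tab, which is exactly what the rewritten escaping code must preserve.

  import org.eclipse.jgit.errors.ConfigInvalidException;
  import org.eclipse.jgit.lib.Config;

  public class ConfigEscapingSketch {
      public static void main(String[] args) throws ConfigInvalidException {
          Config cfg = new Config();

          // Subsection names may contain spaces and quotes but never newlines;
          // only \" and \\ are legal escapes inside the [section "subsection"] header.
          cfg.setString("remote", "origin mirror ", "url", "https://example.org/repo.git");

          // Values may contain newlines and tabs, which serialize as \n and \t.
          cfg.setString("user", null, "signature", "line one\n\tline two");

          String text = cfg.toText();

          // Round-trip: parsing the serialized text must preserve the trailing
          // space in the subsection name and the embedded newline/tab in the value.
          Config parsed = new Config();
          parsed.fromText(text);
          System.out.println(parsed.getString("remote", "origin mirror ", "url"));
          System.out.println(parsed.getString("user", null, "signature"));
      }
  }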
Rewrite reference handling to be abstract and accurate

This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units.

Disambiguate symbolic references:
---------------------------------

Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself.

Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object whose type the application can test and whose target it can examine. With this change, Ref is now an abstract type with different subclasses for the different types.

In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return:

  Map<String, Ref> all = repository.getAllRefs();
  SymbolicRef HEAD = (SymbolicRef) all.get("HEAD");
  ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master");

  assertSame(master, HEAD.getTarget());
  assertSame(master.getObjectId(), HEAD.getObjectId());

  assertEquals("HEAD", HEAD.getName());
  assertEquals("refs/heads/master", master.getName());

A nice side-effect of this change is that the storage type of the symbolic reference is no longer ambiguous with the storage type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true:

  assertSame(Ref.Storage.LOOSE, HEAD.getStorage());
  assertSame(Ref.Storage.PACKED, master.getStorage());

(Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk.)

Another nice side-effect of this change is that all intermediate symbolic references are preserved, and are therefore visible to the application when it walks the target chain. We can now correctly inspect chains of symbolic references.

As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing isSymbolic(), not by an arcane string comparison between properties.

Abstract the RefDatabase storage:
---------------------------------

RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes.

Optimize RefDirectory:
----------------------

The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible.

The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers.

The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information.

To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improves access on repositories with a large number of packed references.

Iterator traversals of the returned Map<String, Ref> are performed using a simple merge-join of the two cache lists, ensuring we can perform the entire traversal in linear time as a function of the number of references: O(PackedRefs + LooseRefs).

Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there typically are only a handful of reference names to be sorted, so the sorting cost should not be very high.

Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan.

Writing to the $GIT_DIR/packed-refs during reference delete is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file.

The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsibility of the database implementation, and not all implementations will use java.io for access. Future work still remains to be done to abstract the ReflogReader class away from local disk IO.

Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
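The merge-join traversal described above can be illustrated with a small standalone sketch; this is not JGit's RefList or RefDirectory code, just the underlying idea: two name-sorted lists (packed and loose) are walked once, a loose entry shadows a packed entry with the same name, and the whole iteration costs O(PackedRefs + LooseRefs).

  import java.util.ArrayList;
  import java.util.List;

  public class RefMergeJoinSketch {
      /** Merge two name-sorted ref lists; on equal names the loose entry wins. */
      static List<String> mergeJoin(List<String> packed, List<String> loose) {
          List<String> out = new ArrayList<>();
          int p = 0, l = 0;
          while (p < packed.size() || l < loose.size()) {
              if (p == packed.size()) {
                  out.add(loose.get(l++));
              } else if (l == loose.size()) {
                  out.add(packed.get(p++));
              } else {
                  int cmp = packed.get(p).compareTo(loose.get(l));
                  if (cmp < 0) {
                      out.add(packed.get(p++));
                  } else if (cmp > 0) {
                      out.add(loose.get(l++));
                  } else {
                      out.add(loose.get(l++)); // same name: loose shadows packed
                      p++;
                  }
              }
          }
          return out;
      }

      public static void main(String[] args) {
          List<String> packed = List.of("refs/heads/master", "refs/tags/v1.0");
          List<String> loose = List.of("refs/heads/master", "refs/heads/topic");
          // Prints [refs/heads/master, refs/heads/topic, refs/tags/v1.0]
          System.out.println(mergeJoin(packed, loose));
      }
  }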
/*
 * Copyright (C) 2007, Dave Watson <dwatson@mimvista.com>
 * Copyright (C) 2007-2010, Robin Rosenberg <robin.rosenberg@dewire.com>
 * Copyright (C) 2006-2008, Shawn O. Pearce <spearce@spearce.org>
 * Copyright (C) 2010, Chris Aniszczyk <caniszczyk@gmail.com> and others
 *
 * This program and the accompanying materials are made available under the
 * terms of the Eclipse Distribution License v. 1.0 which is available at
 * https://www.eclipse.org/org/documents/edl-v10.php.
 *
 * SPDX-License-Identifier: BSD-3-Clause
 */

package org.eclipse.jgit.internal.storage.file;
import static java.nio.charset.StandardCharsets.ISO_8859_1;
import static java.nio.charset.StandardCharsets.UTF_8;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNotSame;
import static org.junit.Assert.assertThrows;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.time.Instant;

import org.eclipse.jgit.errors.ConfigInvalidException;
import org.eclipse.jgit.errors.IncorrectObjectTypeException;
import org.eclipse.jgit.errors.MissingObjectException;
import org.eclipse.jgit.internal.JGitText;
import org.eclipse.jgit.lib.AnyObjectId;
import org.eclipse.jgit.lib.CommitBuilder;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.FileMode;
import org.eclipse.jgit.lib.ObjectDatabase;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectInserter;
import org.eclipse.jgit.lib.PersonIdent;
import org.eclipse.jgit.lib.RefUpdate;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.lib.TagBuilder;
import org.eclipse.jgit.lib.TreeFormatter;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.revwalk.RevTag;
import org.eclipse.jgit.revwalk.RevWalk;
import org.eclipse.jgit.storage.file.FileBasedConfig;
import org.eclipse.jgit.storage.file.FileRepositoryBuilder;
import org.eclipse.jgit.test.resources.SampleDataRepositoryTestCase;
import org.eclipse.jgit.util.FS;
import org.eclipse.jgit.util.FileUtils;
import org.eclipse.jgit.util.IO;
import org.junit.Test;
public class T0003_BasicTest extends SampleDataRepositoryTestCase {

	@Test
	public void test001_Initalize() {
		final File gitdir = new File(trash, Constants.DOT_GIT);
		final File hooks = new File(gitdir, "hooks");
		final File objects = new File(gitdir, Constants.OBJECTS);
		final File objects_pack = new File(objects, "pack");
		final File objects_info = new File(objects, "info");
		final File refs = new File(gitdir, "refs");
		final File refs_heads = new File(refs, "heads");
		final File refs_tags = new File(refs, "tags");
		final File HEAD = new File(gitdir, "HEAD");
		assertTrue("Exists " + trash, trash.isDirectory());
		assertTrue("Exists " + hooks, hooks.isDirectory());
		assertTrue("Exists " + objects, objects.isDirectory());
		assertTrue("Exists " + objects_pack, objects_pack.isDirectory());
		assertTrue("Exists " + objects_info, objects_info.isDirectory());
		assertEquals(2L, objects.listFiles().length);
		assertTrue("Exists " + refs, refs.isDirectory());
		assertTrue("Exists " + refs_heads, refs_heads.isDirectory());
		assertTrue("Exists " + refs_tags, refs_tags.isDirectory());
		assertTrue("Exists " + HEAD, HEAD.isFile());
		assertEquals(23, HEAD.length());
	}

	@Test
	public void test000_openRepoBadArgs() throws IOException {
		try {
			new FileRepositoryBuilder().build();
			fail("Must pass either GIT_DIR or GIT_WORK_TREE");
		} catch (IllegalArgumentException e) {
			assertEquals(JGitText.get().eitherGitDirOrWorkTreeRequired, e
					.getMessage());
		}
	}
	/**
	 * Check the default rules for looking up directories and files within a
	 * repo when the gitDir is given.
	 *
	 * @throws IOException
	 */
	@Test
	public void test000_openrepo_default_gitDirSet() throws IOException {
		File repo1Parent = new File(trash.getParentFile(), "r1");
		try (Repository repo1initial = new FileRepository(
				new File(repo1Parent, Constants.DOT_GIT))) {
			repo1initial.create();
		}
		File theDir = new File(repo1Parent, Constants.DOT_GIT);
		try (FileRepository r = (FileRepository) new FileRepositoryBuilder()
				.setGitDir(theDir).build()) {
			assertEqualsPath(theDir, r.getDirectory());
			assertEqualsPath(repo1Parent, r.getWorkTree());
			assertEqualsPath(new File(theDir, "index"), r.getIndexFile());
			assertEqualsPath(new File(theDir, Constants.OBJECTS),
					r.getObjectDatabase().getDirectory());
		}
	}

	/**
	 * Check that we can pass both a git directory and a work tree repo when the
	 * gitDir is given.
	 *
	 * @throws IOException
	 */
	@Test
	public void test000_openrepo_default_gitDirAndWorkTreeSet()
			throws IOException {
		File repo1Parent = new File(trash.getParentFile(), "r1");
		try (Repository repo1initial = new FileRepository(
				new File(repo1Parent, Constants.DOT_GIT))) {
			repo1initial.create();
		}
		File theDir = new File(repo1Parent, Constants.DOT_GIT);
		try (FileRepository r = (FileRepository) new FileRepositoryBuilder()
				.setGitDir(theDir).setWorkTree(repo1Parent.getParentFile())
				.build()) {
			assertEqualsPath(theDir, r.getDirectory());
			assertEqualsPath(repo1Parent.getParentFile(), r.getWorkTree());
			assertEqualsPath(new File(theDir, "index"), r.getIndexFile());
			assertEqualsPath(new File(theDir, Constants.OBJECTS),
					r.getObjectDatabase().getDirectory());
		}
	}
	/**
	 * Check the default rules for looking up directories and files within a
	 * repo when the workTree is given.
	 *
	 * @throws IOException
	 */
	@Test
	public void test000_openrepo_default_workDirSet() throws IOException {
		File repo1Parent = new File(trash.getParentFile(), "r1");
		try (Repository repo1initial = new FileRepository(
				new File(repo1Parent, Constants.DOT_GIT))) {
			repo1initial.create();
		}
		File theDir = new File(repo1Parent, Constants.DOT_GIT);
		try (FileRepository r = (FileRepository) new FileRepositoryBuilder()
				.setWorkTree(repo1Parent).build()) {
			assertEqualsPath(theDir, r.getDirectory());
			assertEqualsPath(repo1Parent, r.getWorkTree());
			assertEqualsPath(new File(theDir, "index"), r.getIndexFile());
			assertEqualsPath(new File(theDir, Constants.OBJECTS),
					r.getObjectDatabase().getDirectory());
		}
	}

	/**
	 * Check that worktree config has an effect, given absolute path.
	 *
	 * @throws IOException
	 */
	@Test
	public void test000_openrepo_default_absolute_workdirconfig()
			throws IOException {
		File repo1Parent = new File(trash.getParentFile(), "r1");
		File workdir = new File(trash.getParentFile(), "rw");
		FileUtils.mkdir(workdir);
		try (FileRepository repo1initial = new FileRepository(
				new File(repo1Parent, Constants.DOT_GIT))) {
			repo1initial.create();
			final FileBasedConfig cfg = repo1initial.getConfig();
			cfg.setString("core", null, "worktree", workdir.getAbsolutePath());
			cfg.save();
		}
		File theDir = new File(repo1Parent, Constants.DOT_GIT);
		try (FileRepository r = (FileRepository) new FileRepositoryBuilder()
				.setGitDir(theDir).build()) {
			assertEqualsPath(theDir, r.getDirectory());
			assertEqualsPath(workdir, r.getWorkTree());
			assertEqualsPath(new File(theDir, "index"), r.getIndexFile());
			assertEqualsPath(new File(theDir, Constants.OBJECTS),
					r.getObjectDatabase().getDirectory());
		}
	}
  187. /**
  188. * Check that worktree config has an effect, given a relative path.
  189. *
  190. * @throws IOException
  191. */
  192. @Test
  193. public void test000_openrepo_default_relative_workdirconfig()
  194. throws IOException {
  195. File repo1Parent = new File(trash.getParentFile(), "r1");
  196. File workdir = new File(trash.getParentFile(), "rw");
  197. FileUtils.mkdir(workdir);
  198. try (FileRepository repo1initial = new FileRepository(
  199. new File(repo1Parent, Constants.DOT_GIT))) {
  200. repo1initial.create();
  201. final FileBasedConfig cfg = repo1initial.getConfig();
  202. cfg.setString("core", null, "worktree", "../../rw");
  203. cfg.save();
  204. }
  205. File theDir = new File(repo1Parent, Constants.DOT_GIT);
  206. try (FileRepository r = (FileRepository) new FileRepositoryBuilder()
  207. .setGitDir(theDir).build()) {
  208. assertEqualsPath(theDir, r.getDirectory());
  209. assertEqualsPath(workdir, r.getWorkTree());
  210. assertEqualsPath(new File(theDir, "index"), r.getIndexFile());
  211. assertEqualsPath(new File(theDir, Constants.OBJECTS),
  212. r.getObjectDatabase().getDirectory());
  213. }
  214. }
  215. /**
  216. * Check that the given index file is honored and the alternate object
  217. * directories too
  218. *
  219. * @throws IOException
  220. */
	@Test
	public void test000_openrepo_alternate_index_file_and_objdirs()
			throws IOException {
		File repo1Parent = new File(trash.getParentFile(), "r1");
		File indexFile = new File(trash, "idx");
		File objDir = new File(trash, "../obj");
		File altObjDir = db.getObjectDatabase().getDirectory();
		try (Repository repo1initial = new FileRepository(
				new File(repo1Parent, Constants.DOT_GIT))) {
			repo1initial.create();
		}
		File theDir = new File(repo1Parent, Constants.DOT_GIT);
		try (FileRepository r = (FileRepository) new FileRepositoryBuilder() //
				.setGitDir(theDir).setObjectDirectory(objDir) //
				.addAlternateObjectDirectory(altObjDir) //
				.setIndexFile(indexFile) //
				.build()) {
			assertEqualsPath(theDir, r.getDirectory());
			assertEqualsPath(theDir.getParentFile(), r.getWorkTree());
			assertEqualsPath(indexFile, r.getIndexFile());
			assertEqualsPath(objDir, r.getObjectDatabase().getDirectory());
			assertNotNull(r.open(ObjectId
					.fromString("6db9c2ebf75590eef973081736730a9ea169a0c4")));
		}
	}
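
	// Compare paths by their canonical form so that symlinks and relative
	// segments do not cause spurious mismatches.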
	protected void assertEqualsPath(File expected, File actual)
			throws IOException {
		assertEquals(expected.getCanonicalPath(), actual.getCanonicalPath());
	}

	@Test
	public void test002_WriteEmptyTree() throws IOException {
		// One of our test packs contains the empty tree object. If the pack is
		// open when we create it we won't write the object file out as a loose
		// object (as it already exists in the pack).
		//
		try (Repository newdb = createBareRepository()) {
			try (ObjectInserter oi = newdb.newObjectInserter()) {
				final ObjectId treeId = oi.insert(new TreeFormatter());
				assertEquals("4b825dc642cb6eb9a060e54bf8d69288fbee4904",
						treeId.name());
			}
			final File o = new File(
					new File(new File(newdb.getDirectory(), Constants.OBJECTS),
							"4b"),
					"825dc642cb6eb9a060e54bf8d69288fbee4904");
			assertTrue("Exists " + o, o.isFile());
			assertTrue("Read-only " + o, !o.canWrite());
		}
	}

	@Test
	public void test002_WriteEmptyTree2() throws IOException {
		// File shouldn't exist as it is in a test pack.
		//
		final ObjectId treeId = insertTree(new TreeFormatter());
		assertEquals("4b825dc642cb6eb9a060e54bf8d69288fbee4904", treeId.name());
		final File o = new File(new File(
				new File(db.getDirectory(), Constants.OBJECTS), "4b"),
				"825dc642cb6eb9a060e54bf8d69288fbee4904");
		assertFalse("Exists " + o, o.isFile());
	}

	@Test
	public void test002_CreateBadTree() throws Exception {
		// We won't create a tree entry with an empty filename
		//
		final TreeFormatter formatter = new TreeFormatter();
		assertThrows(JGitText.get().invalidTreeZeroLengthName,
				IllegalArgumentException.class,
				() -> formatter.append("", FileMode.TREE, ObjectId.fromString(
						"4b825dc642cb6eb9a060e54bf8d69288fbee4904")));
	}
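
	// Parse a deliberately ugly config: inline comments, a quoted value with
	// escapes, and backslash line continuations; then check that saving it
	// writes normalized output.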
	@Test
	public void test006_ReadUglyConfig() throws IOException,
			ConfigInvalidException {
		final File cfg = new File(db.getDirectory(), Constants.CONFIG);
		final FileBasedConfig c = new FileBasedConfig(cfg, db.getFS());
		final String configStr = " [core];comment\n\tfilemode = yes\n"
				+ "[user]\n"
				+ " email = A U Thor <thor@example.com> # Just an example...\n"
				+ " name = \"A Thor \\\\ \\\"\\t \"\n"
				+ " defaultCheckInComment = a many line\\n\\\ncomment\\n\\\n"
				+ " to test\n";
		write(cfg, configStr);
		c.load();
		assertEquals("yes", c.getString("core", null, "filemode"));
		assertEquals("A U Thor <thor@example.com>", c.getString("user", null,
				"email"));
		assertEquals("A Thor \\ \"\t ", c.getString("user", null, "name"));
		assertEquals("a many line\ncomment\n to test", c.getString("user",
				null, "defaultCheckInComment"));
		c.save();
		// Saving normalizes out the weird "\n\" continuation syntax to a
		// single escaped newline within the value.
		final String expectedStr = " [core];comment\n\tfilemode = yes\n"
				+ "[user]\n"
				+ " email = A U Thor <thor@example.com> # Just an example...\n"
				+ " name = \"A Thor \\\\ \\\"\\t \"\n"
				+ " defaultCheckInComment = a many line\\ncomment\\n to test\n";
		assertEquals(expectedStr, new String(IO.readFully(cfg), UTF_8));
	}
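
	// Opening an existing repository by its .git directory shares the same
	// directories but yields an independent Config instance.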
	@Test
	public void test007_Open() throws IOException {
		try (FileRepository db2 = new FileRepository(db.getDirectory())) {
			assertEquals(db.getDirectory(), db2.getDirectory());
			assertEquals(db.getObjectDatabase().getDirectory(), db2
					.getObjectDatabase().getDirectory());
			assertNotSame(db.getConfig(), db2.getConfig());
		}
	}
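
	// An unparseable core.repositoryFormatVersion must prevent the repository
	// from being opened.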
	@Test
	public void test008_FailOnWrongVersion() throws IOException {
		final File cfg = new File(db.getDirectory(), Constants.CONFIG);
		final String badvers = "ihopethisisneveraversion";
		final String configStr = "[core]\n" + "\trepositoryFormatVersion="
				+ badvers + "\n";
		write(cfg, configStr);
		try (FileRepository unused = new FileRepository(db.getDirectory())) {
			fail("incorrectly opened a bad repository");
		} catch (IllegalArgumentException ioe) {
			assertNotNull(ioe.getMessage());
		}
	}
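
	// Write a commit with fixed author/committer timestamps, pin its object
	// id, check that the loose object starts with a valid zlib header
	// (0x78 0x9C, divisible by 31), and parse it back.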
	@Test
	public void test009_CreateCommitOldFormat() throws IOException {
		final ObjectId treeId = insertTree(new TreeFormatter());
		final CommitBuilder c = new CommitBuilder();
		c.setAuthor(new PersonIdent(author, 1154236443000L, -4 * 60));
		c.setCommitter(new PersonIdent(committer, 1154236443000L, -4 * 60));
		c.setMessage("A Commit\n");
		c.setTreeId(treeId);
		assertEquals(treeId, c.getTreeId());
		ObjectId actid = insertCommit(c);
		final ObjectId cmtid = ObjectId
				.fromString("9208b2459ea6609a5af68627cc031796d0d9329b");
		assertEquals(cmtid, actid);
		// Verify the commit we just wrote is in the correct format.
		ObjectDatabase odb = db.getObjectDatabase();
		assertTrue("is ObjectDirectory", odb instanceof ObjectDirectory);
		try (XInputStream xis = new XInputStream(
				new FileInputStream(((ObjectDirectory) odb).fileFor(cmtid)))) {
			assertEquals(0x78, xis.readUInt8());
			assertEquals(0x9c, xis.readUInt8());
			assertEquals(0, 0x789c % 31);
		}
		// Verify we can read it.
		RevCommit c2 = parseCommit(actid);
		assertNotNull(c2);
		assertEquals(c.getMessage(), c2.getFullMessage());
		assertEquals(c.getTreeId(), c2.getTree());
		assertEquals(c.getAuthor(), c2.getAuthorIdent());
		assertEquals(c.getCommitter(), c2.getCommitterIdent());
	}
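
	// The next three tests create annotated tags pointing at a blob, a tree,
	// and a commit, and verify the resulting object ids and parsed fields.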
	@Test
	public void test020_createBlobTag() throws IOException {
		final ObjectId emptyId = insertEmptyBlob();
		final TagBuilder t = new TagBuilder();
		t.setObjectId(emptyId, Constants.OBJ_BLOB);
		t.setTag("test020");
		t.setTagger(new PersonIdent(author, 1154236443000L, -4 * 60));
		t.setMessage("test020 tagged\n");
		ObjectId actid = insertTag(t);
		assertEquals("6759556b09fbb4fd8ae5e315134481cc25d46954", actid.name());
		RevTag mapTag = parseTag(actid);
		assertEquals(Constants.OBJ_BLOB, mapTag.getObject().getType());
		assertEquals("test020 tagged\n", mapTag.getFullMessage());
		assertEquals(new PersonIdent(author, 1154236443000L, -4 * 60), mapTag
				.getTaggerIdent());
		assertEquals("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391", mapTag
				.getObject().getId().name());
	}

	@Test
	public void test021_createTreeTag() throws IOException {
		final ObjectId emptyId = insertEmptyBlob();
		TreeFormatter almostEmptyTree = new TreeFormatter();
		almostEmptyTree.append("empty", FileMode.REGULAR_FILE, emptyId);
		final ObjectId almostEmptyTreeId = insertTree(almostEmptyTree);
		final TagBuilder t = new TagBuilder();
		t.setObjectId(almostEmptyTreeId, Constants.OBJ_TREE);
		t.setTag("test021");
		t.setTagger(new PersonIdent(author, 1154236443000L, -4 * 60));
		t.setMessage("test021 tagged\n");
		ObjectId actid = insertTag(t);
		assertEquals("b0517bc8dbe2096b419d42424cd7030733f4abe5", actid.name());
		RevTag mapTag = parseTag(actid);
		assertEquals(Constants.OBJ_TREE, mapTag.getObject().getType());
		assertEquals("test021 tagged\n", mapTag.getFullMessage());
		assertEquals(new PersonIdent(author, 1154236443000L, -4 * 60), mapTag
				.getTaggerIdent());
		assertEquals("417c01c8795a35b8e835113a85a5c0c1c77f67fb", mapTag
				.getObject().getId().name());
	}

	@Test
	public void test022_createCommitTag() throws IOException {
		final ObjectId emptyId = insertEmptyBlob();
		TreeFormatter almostEmptyTree = new TreeFormatter();
		almostEmptyTree.append("empty", FileMode.REGULAR_FILE, emptyId);
		final ObjectId almostEmptyTreeId = insertTree(almostEmptyTree);
		final CommitBuilder almostEmptyCommit = new CommitBuilder();
		almostEmptyCommit.setAuthor(new PersonIdent(author, 1154236443000L,
				-2 * 60)); // not exactly the same
		almostEmptyCommit.setCommitter(new PersonIdent(author, 1154236443000L,
				-2 * 60));
		almostEmptyCommit.setMessage("test022\n");
		almostEmptyCommit.setTreeId(almostEmptyTreeId);
		ObjectId almostEmptyCommitId = insertCommit(almostEmptyCommit);
		final TagBuilder t = new TagBuilder();
		t.setObjectId(almostEmptyCommitId, Constants.OBJ_COMMIT);
		t.setTag("test022");
		t.setTagger(new PersonIdent(author, 1154236443000L, -4 * 60));
		t.setMessage("test022 tagged\n");
		ObjectId actid = insertTag(t);
		assertEquals("0ce2ebdb36076ef0b38adbe077a07d43b43e3807", actid.name());
		RevTag mapTag = parseTag(actid);
		assertEquals(Constants.OBJ_COMMIT, mapTag.getObject().getType());
		assertEquals("test022 tagged\n", mapTag.getFullMessage());
		assertEquals(new PersonIdent(author, 1154236443000L, -4 * 60), mapTag
				.getTaggerIdent());
		assertEquals("b5d3b45a96b340441f5abb9080411705c51cc86c", mapTag
				.getObject().getId().name());
	}
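
	// The next two tests write the same commit content with UTF-8 and
	// ISO-8859-1 as the declared encoding and pin the (different) object ids.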
	@Test
	public void test023_createCommitNonAnullii() throws IOException {
		final ObjectId emptyId = insertEmptyBlob();
		TreeFormatter almostEmptyTree = new TreeFormatter();
		almostEmptyTree.append("empty", FileMode.REGULAR_FILE, emptyId);
		final ObjectId almostEmptyTreeId = insertTree(almostEmptyTree);
		CommitBuilder commit = new CommitBuilder();
		commit.setTreeId(almostEmptyTreeId);
		commit.setAuthor(new PersonIdent("Joe H\u00e4cker", "joe@example.com",
				4294967295000L, 60));
		commit.setCommitter(new PersonIdent("Joe Hacker", "joe2@example.com",
				4294967295000L, 60));
		commit.setEncoding(UTF_8);
		commit.setMessage("\u00dcbergeeks");
		ObjectId cid = insertCommit(commit);
		assertEquals("4680908112778718f37e686cbebcc912730b3154", cid.name());
		RevCommit loadedCommit = parseCommit(cid);
		assertEquals(commit.getMessage(), loadedCommit.getFullMessage());
	}

	@Test
	public void test024_createCommitNonAscii() throws IOException {
		final ObjectId emptyId = insertEmptyBlob();
		TreeFormatter almostEmptyTree = new TreeFormatter();
		almostEmptyTree.append("empty", FileMode.REGULAR_FILE, emptyId);
		final ObjectId almostEmptyTreeId = insertTree(almostEmptyTree);
		CommitBuilder commit = new CommitBuilder();
		commit.setTreeId(almostEmptyTreeId);
		commit.setAuthor(new PersonIdent("Joe H\u00e4cker", "joe@example.com",
				4294967295000L, 60));
		commit.setCommitter(new PersonIdent("Joe Hacker", "joe2@example.com",
				4294967295000L, 60));
		commit.setEncoding(ISO_8859_1);
		commit.setMessage("\u00dcbergeeks");
		ObjectId cid = insertCommit(commit);
		assertEquals("2979b39d385014b33287054b87f77bcb3ecb5ebf", cid.name());
	}
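
	// ObjectInserter.Formatter computes an object id without storing anything
	// in the repository.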
	@Test
	public void test025_computeSha1NoStore() {
		byte[] data = "test025 some data, more than 16 bytes to get good coverage"
				.getBytes(ISO_8859_1);
		try (ObjectInserter.Formatter formatter = new ObjectInserter.Formatter()) {
			final ObjectId id = formatter.idFor(Constants.OBJ_BLOB, data);
			assertEquals("4f561df5ecf0dfbd53a0dc0f37262fef075d9dde", id.name());
		}
	}
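
	// Build commits with one, two, and three parents on top of the same tree
	// and verify that each round-trips through RevWalk with the expected id,
	// parents, and metadata.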
	@Test
	public void test026_CreateCommitMultipleparents() throws IOException {
		final ObjectId treeId;
		try (ObjectInserter oi = db.newObjectInserter()) {
			final ObjectId blobId = oi.insert(Constants.OBJ_BLOB,
					"and this is the data in me\n".getBytes(UTF_8
							.name()));
			TreeFormatter fmt = new TreeFormatter();
			fmt.append("i-am-a-file", FileMode.REGULAR_FILE, blobId);
			treeId = oi.insert(fmt);
			oi.flush();
		}
		assertEquals(ObjectId
				.fromString("00b1f73724f493096d1ffa0b0f1f1482dbb8c936"), treeId);
		final CommitBuilder c1 = new CommitBuilder();
		c1.setAuthor(new PersonIdent(author, 1154236443000L, -4 * 60));
		c1.setCommitter(new PersonIdent(committer, 1154236443000L, -4 * 60));
		c1.setMessage("A Commit\n");
		c1.setTreeId(treeId);
		assertEquals(treeId, c1.getTreeId());
		ObjectId actid1 = insertCommit(c1);
		final ObjectId cmtid1 = ObjectId
				.fromString("803aec4aba175e8ab1d666873c984c0308179099");
		assertEquals(cmtid1, actid1);
		final CommitBuilder c2 = new CommitBuilder();
		c2.setAuthor(new PersonIdent(author, 1154236443000L, -4 * 60));
		c2.setCommitter(new PersonIdent(committer, 1154236443000L, -4 * 60));
		c2.setMessage("A Commit 2\n");
		c2.setTreeId(treeId);
		assertEquals(treeId, c2.getTreeId());
		c2.setParentIds(actid1);
		ObjectId actid2 = insertCommit(c2);
		final ObjectId cmtid2 = ObjectId
				.fromString("95d068687c91c5c044fb8c77c5154d5247901553");
		assertEquals(cmtid2, actid2);
		RevCommit rm2 = parseCommit(cmtid2);
		assertNotSame(c2, rm2); // assert the parsed object is not from the
								// cache
		assertEquals(c2.getAuthor(), rm2.getAuthorIdent());
		assertEquals(actid2, rm2.getId());
		assertEquals(c2.getMessage(), rm2.getFullMessage());
		assertEquals(c2.getTreeId(), rm2.getTree().getId());
		assertEquals(1, rm2.getParentCount());
		assertEquals(actid1, rm2.getParent(0));
		final CommitBuilder c3 = new CommitBuilder();
		c3.setAuthor(new PersonIdent(author, 1154236443000L, -4 * 60));
		c3.setCommitter(new PersonIdent(committer, 1154236443000L, -4 * 60));
		c3.setMessage("A Commit 3\n");
		c3.setTreeId(treeId);
		assertEquals(treeId, c3.getTreeId());
		c3.setParentIds(actid1, actid2);
		ObjectId actid3 = insertCommit(c3);
		final ObjectId cmtid3 = ObjectId
				.fromString("ce6e1ce48fbeeb15a83f628dc8dc2debefa066f4");
		assertEquals(cmtid3, actid3);
		RevCommit rm3 = parseCommit(cmtid3);
		assertNotSame(c3, rm3); // assert the parsed object is not from the
								// cache
		assertEquals(c3.getAuthor(), rm3.getAuthorIdent());
		assertEquals(actid3, rm3.getId());
		assertEquals(c3.getMessage(), rm3.getFullMessage());
		assertEquals(c3.getTreeId(), rm3.getTree().getId());
		assertEquals(2, rm3.getParentCount());
		assertEquals(actid1, rm3.getParent(0));
		assertEquals(actid2, rm3.getParent(1));
		final CommitBuilder c4 = new CommitBuilder();
		c4.setAuthor(new PersonIdent(author, 1154236443000L, -4 * 60));
		c4.setCommitter(new PersonIdent(committer, 1154236443000L, -4 * 60));
		c4.setMessage("A Commit 4\n");
		c4.setTreeId(treeId);
		assertEquals(treeId, c4.getTreeId());
		c4.setParentIds(actid1, actid2, actid3);
		ObjectId actid4 = insertCommit(c4);
		final ObjectId cmtid4 = ObjectId
				.fromString("d1fca9fe3fef54e5212eb67902c8ed3e79736e27");
		assertEquals(cmtid4, actid4);
		RevCommit rm4 = parseCommit(cmtid4);
		assertNotSame(c4, rm4); // assert the parsed object is not from the
								// cache
		assertEquals(c4.getAuthor(), rm4.getAuthorIdent());
		assertEquals(actid4, rm4.getId());
		assertEquals(c4.getMessage(), rm4.getFullMessage());
		assertEquals(c4.getTreeId(), rm4.getTree().getId());
		assertEquals(3, rm4.getParentCount());
		assertEquals(actid1, rm4.getParent(0));
		assertEquals(actid2, rm4.getParent(1));
		assertEquals(actid3, rm4.getParent(2));
	}
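
	// A loose ref file under refs/heads must win over any packed-refs entry:
	// write refs/heads/a directly and expect resolve() to return that id.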
	@Test
	public void test027_UnpackedRefHigherPriorityThanPacked()
			throws IOException {
		String unpackedId = "7f822839a2fe9760f386cbbbcb3f92c5fe81def7";
		write(new File(db.getDirectory(), "refs/heads/a"), unpackedId + "\n");
		ObjectId resolved = db.resolve("refs/heads/a");
		assertEquals(unpackedId, resolved.name());
	}
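
	// Updating HEAD, which points at a branch known only from packed-refs,
	// must materialize the loose ref file with the new id, both for the
	// initial update and for a later forced update.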
	@Test
	public void test028_LockPackedRef() throws IOException {
		ObjectId id1;
		ObjectId id2;
		try (ObjectInserter ins = db.newObjectInserter()) {
			id1 = ins.insert(
					Constants.OBJ_BLOB, "contents1".getBytes(UTF_8));
			id2 = ins.insert(
					Constants.OBJ_BLOB, "contents2".getBytes(UTF_8));
			ins.flush();
		}
		writeTrashFile(".git/packed-refs",
				id1.name() + " refs/heads/foobar");
		writeTrashFile(".git/HEAD", "ref: refs/heads/foobar\n");
		BUG_WorkAroundRacyGitIssues("packed-refs");
		BUG_WorkAroundRacyGitIssues("HEAD");
		ObjectId resolve = db.resolve("HEAD");
		assertEquals(id1, resolve);
		RefUpdate lockRef = db.updateRef("HEAD");
		lockRef.setNewObjectId(id2);
		assertEquals(RefUpdate.Result.FORCED, lockRef.forceUpdate());
		assertTrue(new File(db.getDirectory(), "refs/heads/foobar").exists());
		assertEquals(id2, db.resolve("refs/heads/foobar"));
		// Again. The ref already exists
		RefUpdate lockRef2 = db.updateRef("HEAD");
		lockRef2.setNewObjectId(id1);
		assertEquals(RefUpdate.Result.FORCED, lockRef2.forceUpdate());
		assertTrue(new File(db.getDirectory(), "refs/heads/foobar").exists());
		assertEquals(id1, db.resolve("refs/heads/foobar"));
	}
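
	// stripWorkDir must return the work-tree-relative path regardless of
	// whether its inputs are absolute or relative, and an empty string for
	// files outside the work tree.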
	@Test
	public void test30_stripWorkDir() {
		File relCwd = new File(".");
		File absCwd = relCwd.getAbsoluteFile();
		File absBase = new File(new File(absCwd, "repo"), "workdir");
		File relBase = new File(new File(relCwd, "repo"), "workdir");
		assertEquals(absBase.getAbsolutePath(), relBase.getAbsolutePath());
		File relBaseFile = new File(new File(relBase, "other"), "module.c");
		File absBaseFile = new File(new File(absBase, "other"), "module.c");
		assertEquals("other/module.c", Repository.stripWorkDir(relBase,
				relBaseFile));
		assertEquals("other/module.c", Repository.stripWorkDir(relBase,
				absBaseFile));
		assertEquals("other/module.c", Repository.stripWorkDir(absBase,
				relBaseFile));
		assertEquals("other/module.c", Repository.stripWorkDir(absBase,
				absBaseFile));
		File relNonFile = new File(new File(relCwd, "not-repo"), ".gitignore");
		File absNonFile = new File(new File(absCwd, "not-repo"), ".gitignore");
		assertEquals("", Repository.stripWorkDir(relBase, relNonFile));
		assertEquals("", Repository.stripWorkDir(absBase, absNonFile));
		assertEquals("", Repository.stripWorkDir(db.getWorkTree(), db
				.getWorkTree()));
		File file = new File(new File(db.getWorkTree(), "subdir"), "File.java");
		assertEquals("subdir/File.java", Repository.stripWorkDir(db
				.getWorkTree(), file));
	}
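
	// Helpers: each inserts or parses a single object, using a fresh
	// ObjectInserter or RevWalk in a try-with-resources block.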
	private ObjectId insertEmptyBlob() throws IOException {
		final ObjectId emptyId;
		try (ObjectInserter oi = db.newObjectInserter()) {
			emptyId = oi.insert(Constants.OBJ_BLOB, new byte[] {});
			oi.flush();
		}
		return emptyId;
	}

	private ObjectId insertTree(TreeFormatter tree) throws IOException {
		try (ObjectInserter oi = db.newObjectInserter()) {
			ObjectId id = oi.insert(tree);
			oi.flush();
			return id;
		}
	}

	private ObjectId insertCommit(CommitBuilder builder)
			throws IOException, UnsupportedEncodingException {
		try (ObjectInserter oi = db.newObjectInserter()) {
			ObjectId id = oi.insert(builder);
			oi.flush();
			return id;
		}
	}

	private RevCommit parseCommit(AnyObjectId id)
			throws MissingObjectException, IncorrectObjectTypeException,
			IOException {
		try (RevWalk rw = new RevWalk(db)) {
			return rw.parseCommit(id);
		}
	}

	private ObjectId insertTag(TagBuilder tag) throws IOException,
			UnsupportedEncodingException {
		try (ObjectInserter oi = db.newObjectInserter()) {
			ObjectId id = oi.insert(tag);
			oi.flush();
			return id;
		}
	}

	private RevTag parseTag(AnyObjectId id) throws MissingObjectException,
			IncorrectObjectTypeException, IOException {
		try (RevWalk rw = new RevWalk(db)) {
			return rw.parseTag(id);
		}
	}

	/**
	 * Kick the timestamp of a local file.
	 * <p>
	 * We shouldn't have to make these method calls. The cache is using file
	 * system timestamps, and on many systems unit tests run faster than the
	 * modification clock. Dumping the cache after we make an edit behind
	 * RefDirectory's back allows the tests to pass.
	 *
	 * @param name
	 *            the file in the repository to force a time change on.
	 * @throws IOException
	 */
	private void BUG_WorkAroundRacyGitIssues(String name) throws IOException {
		File path = new File(db.getDirectory(), name);
		FS fs = db.getFS();
		Instant old = fs.lastModifiedInstant(path);
		long set = 1250379778668L; // Sat Aug 15 20:12:58 GMT-03:30 2009
		fs.setLastModified(path.toPath(), Instant.ofEpochMilli(set));
		assertFalse("time changed", old.equals(fs.lastModifiedInstant(path)));
	}
}