
WalkRemoteObjectDatabase.java 19KB

Rewrite reference handling to be abstract and accurate

This commit actually does three major changes to the way references are handled within JGit. Unfortunately they were easier to do as a single massive commit than to break them up into smaller units.

Disambiguate symbolic references:
---------------------------------

Reporting a symbolic reference such as HEAD as though it were any other normal reference like refs/heads/master causes subtle programming errors. We have been bitten by this error on several occasions, as have some downstream applications written by myself.

Instead of reporting HEAD as a reference whose name differs from its "original name", report it as an actual SymbolicRef object whose type the application can test and whose target it can examine.

With this change, Ref is now an abstract type with different subclasses for the different types.

In the classical example of "HEAD" being a symbolic reference to branch "refs/heads/master", the Repository.getAllRefs() method will now return:

  Map<String, Ref> all = repository.getAllRefs();
  SymbolicRef HEAD = (SymbolicRef) all.get("HEAD");
  ObjectIdRef master = (ObjectIdRef) all.get("refs/heads/master");

  assertSame(master, HEAD.getTarget());
  assertSame(master.getObjectId(), HEAD.getObjectId());

  assertEquals("HEAD", HEAD.getName());
  assertEquals("refs/heads/master", master.getName());

A nice side-effect of this change is that the storage type of the symbolic reference is no longer ambiguous with the storage type of the underlying reference it targets. In the above example, if master was only available in the packed-refs file, then the following is also true:

  assertSame(Ref.Storage.LOOSE, HEAD.getStorage());
  assertSame(Ref.Storage.PACKED, master.getStorage());

(Prior to this change we returned the ambiguous storage of LOOSE_PACKED for HEAD, which was confusing since it wasn't actually true on disk.)

Another nice side-effect of this change is that all intermediate symbolic references are preserved, and are therefore visible to the application when it walks the target chain. We can now correctly inspect chains of symbolic references.

As a result of this change the Ref.getOrigName() method has been removed from the API. Applications should identify a symbolic reference by testing isSymbolic(), not by an arcane string comparison between properties.

Abstract the RefDatabase storage:
---------------------------------

RefDatabase is now abstract, similar to ObjectDatabase, and a new concrete implementation called RefDirectory is used for the traditional on-disk storage layout. In the future we plan to support additional implementations, such as a pure in-memory RefDatabase for unit testing purposes.

Optimize RefDirectory:
----------------------

The implementation of the in-memory reference cache, reading, and update routines has been completely rewritten. Much of the code was heavily borrowed or cribbed from the prior implementation, so copyright notices have been left intact as much as possible.

The RefDirectory cache no longer confuses symbolic references with normal references. This permits the cache to resolve the value of a symbolic reference as late as possible, ensuring it is always current, without needing to maintain reverse pointers.

The cache is now 2 sorted RefLists, rather than 3 HashMaps. Using sorted lists allows the implementation to reduce the in-memory footprint when storing many refs. Using specialized types for the elements allows the code to avoid additional map lookups for auxiliary stat information.

To improve scan time during getRefs(), the lists are returned via a copy-on-write contract. Most callers of getRefs() do not modify the returned collections, so the copy-on-write semantics improve access on repositories with a large number of packed references.

Iterator traversals of the returned Map<String, Ref> are performed using a simple merge-join of the two cache lists, ensuring the entire traversal runs in time linear in the number of references: O(PackedRefs + LooseRefs).

Scans of the loose reference space to update the cache run in O(LooseRefs log LooseRefs) time, as the directory contents are sorted before being merged against the in-memory cache. Since the majority of stable references are kept packed, there are typically only a handful of reference names to be sorted, so the sorting cost should not be very high.

Locking is reduced during getRefs() by taking advantage of the copy-on-write semantics of the improved cache data structure. This permits concurrent readers to pull back references without blocking each other. If there is contention updating the cache during a scan, one or more updates are simply skipped and will get picked up again in a future scan.

Writing to the $GIT_DIR/packed-refs file during reference deletion is now fully atomic. The file is locked, reparsed fresh, and written back out if a change is necessary. This avoids all race conditions with concurrent external updates of the packed-refs file.

The RefLogWriter class has been fully folded into RefDirectory and is therefore deleted. Maintaining the reference's log is the responsibility of the database implementation, and not all implementations will use java.io for access. Future work still remains to abstract the ReflogReader class away from local disk IO.

Change-Id: I26b9287c45a4b2d2be35ba2849daa316f5eec85d
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
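The merge-join traversal described above is easier to picture with a small, self-contained sketch. This is not JGit's RefList or RefDirectory code, only an illustration using plain java.util types (and a Java 16 record for brevity); the assumption that a loose entry shadows a packed entry of the same name mirrors Git's on-disk ref layout.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of a merge-join over two name-sorted reference lists, one for
 * packed refs and one for loose refs. Not JGit code; names and types here
 * are illustrative only.
 */
public class RefMergeJoinSketch {
	// Hypothetical entry type standing in for JGit's Ref objects.
	record RefEntry(String name, String objectId) {}

	static List<RefEntry> mergeJoin(List<RefEntry> packed, List<RefEntry> loose) {
		final List<RefEntry> out = new ArrayList<>(packed.size() + loose.size());
		int p = 0, l = 0;
		while (p < packed.size() && l < loose.size()) {
			final int cmp = packed.get(p).name().compareTo(loose.get(l).name());
			if (cmp < 0) {
				out.add(packed.get(p++));   // name appears only in the packed list
			} else if (cmp > 0) {
				out.add(loose.get(l++));    // name appears only in the loose list
			} else {
				out.add(loose.get(l++));    // same name in both: loose value wins (assumption)
				p++;
			}
		}
		while (p < packed.size())
			out.add(packed.get(p++));       // drain whichever list is left
		while (l < loose.size())
			out.add(loose.get(l++));
		return out;
	}

	public static void main(String[] args) {
		final List<RefEntry> packed = Arrays.asList(
				new RefEntry("refs/heads/master", "1111111"),
				new RefEntry("refs/tags/v1.0", "2222222"));
		final List<RefEntry> loose = Arrays.asList(
				new RefEntry("refs/heads/master", "3333333"),  // shadows the packed value
				new RefEntry("refs/heads/topic", "4444444"));
		mergeJoin(packed, loose).forEach(r ->
				System.out.println(r.name() + " -> " + r.objectId()));
	}
}

Because both lists are kept sorted by name, each element is visited exactly once, which is where the O(PackedRefs + LooseRefs) traversal bound claimed in the commit message comes from.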
14 years ago
/*
 * Copyright (C) 2008, Shawn O. Pearce <spearce@spearce.org>
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
package org.eclipse.jgit.transport;

import static java.nio.charset.StandardCharsets.UTF_8;

import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.text.MessageFormat;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;

import org.eclipse.jgit.errors.TransportException;
import org.eclipse.jgit.internal.JGitText;
import org.eclipse.jgit.internal.storage.file.RefDirectory;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectIdRef;
import org.eclipse.jgit.lib.ProgressMonitor;
import org.eclipse.jgit.lib.Ref;
import org.eclipse.jgit.util.IO;

/**
 * Transfers object data through a dumb transport.
 * <p>
 * Implementations are responsible for resolving path names relative to the
 * <code>objects/</code> subdirectory of a single remote Git repository or
 * naked object database, making the content available as a Java input stream
 * for reading during fetch. The actual object traversal logic to determine the
 * names of files to retrieve is handled through the generic, protocol
 * independent {@link WalkFetchConnection}.
 */
abstract class WalkRemoteObjectDatabase {
	static final String ROOT_DIR = "../"; //$NON-NLS-1$

	static final String INFO_PACKS = "info/packs"; //$NON-NLS-1$

	static final String INFO_ALTERNATES = "info/alternates"; //$NON-NLS-1$

	static final String INFO_HTTP_ALTERNATES = "info/http-alternates"; //$NON-NLS-1$

	static final String INFO_REFS = ROOT_DIR + Constants.INFO_REFS;

	abstract URIish getURI();

	/**
	 * Obtain the list of available packs (if any).
	 * <p>
	 * Pack names should be the file name in the packs directory, that is
	 * <code>pack-035760ab452d6eebd123add421f253ce7682355a.pack</code>. Index
	 * names should not be included in the returned collection.
	 *
	 * @return list of pack names; null or empty list if none are available.
	 * @throws IOException
	 *             The connection is unable to read the remote repository's list
	 *             of available pack files.
	 */
	abstract Collection<String> getPackNames() throws IOException;

	/**
	 * Obtain alternate connections to alternate object databases (if any).
	 * <p>
	 * Alternates are typically read from the file {@link #INFO_ALTERNATES} or
	 * {@link #INFO_HTTP_ALTERNATES}. The content of each line must be resolved
	 * by the implementation and a new database reference should be returned to
	 * represent the additional location.
	 * <p>
	 * Alternates may reuse the same network connection handle, however the
	 * fetch connection will {@link #close()} each created alternate.
	 *
	 * @return list of additional object databases the caller could fetch from;
	 *         null or empty list if none are configured.
	 * @throws IOException
	 *             The connection is unable to read the remote repository's list
	 *             of configured alternates.
	 */
	abstract Collection<WalkRemoteObjectDatabase> getAlternates()
			throws IOException;

	/**
	 * Open a single file for reading.
	 * <p>
	 * Implementors should make every attempt possible to ensure
	 * {@link FileNotFoundException} is used when the remote object does not
	 * exist. However when fetching over HTTP some misconfigured servers may
	 * generate a 200 OK status message (rather than a 404 Not Found) with an
	 * HTML formatted message explaining the requested resource does not exist.
	 * Callers such as {@link WalkFetchConnection} are prepared to handle this
	 * by validating the content received, and assuming content that fails to
	 * match its hash is an incorrectly phrased FileNotFoundException.
	 * <p>
	 * This method is recommended for already compressed files like loose
	 * objects and pack files. For text files, see {@link #openReader(String)}.
	 *
	 * @param path
	 *            location of the file to read, relative to this objects
	 *            directory (e.g.
	 *            <code>cb/95df6ab7ae9e57571511ef451cf33767c26dd2</code> or
	 *            <code>pack/pack-035760ab452d6eebd123add421f253ce7682355a.pack</code>).
	 * @return a stream to read from the file. Never null.
	 * @throws FileNotFoundException
	 *             the requested file does not exist at the given location.
	 * @throws IOException
	 *             The connection is unable to read the remote's file, and the
	 *             failure occurred prior to being able to determine if the file
	 *             exists, or after it was determined to exist but before the
	 *             stream could be created.
	 */
	abstract FileStream open(String path) throws FileNotFoundException,
			IOException;
	/**
	 * Create a new connection for a discovered alternate object database.
	 * <p>
	 * This method is typically called by {@link #readAlternates(String)} when
	 * subclasses use the generic alternate parsing logic for their
	 * implementation of {@link #getAlternates()}.
	 *
	 * @param location
	 *            the location of the new alternate, relative to the current
	 *            object database.
	 * @return a new database connection that can read from the specified
	 *         alternate.
	 * @throws IOException
	 *             The database connection cannot be established with the
	 *             alternate, such as if the alternate location does not
	 *             actually exist and the connection's constructor attempts to
	 *             verify that.
	 */
	abstract WalkRemoteObjectDatabase openAlternate(String location)
			throws IOException;
	/**
	 * Close any resources used by this connection.
	 * <p>
	 * If the remote repository is contacted by a network socket this method
	 * must close that network socket, disconnecting the two peers. If the
	 * remote repository is actually local (same system) this method must close
	 * any open file handles used to read the "remote" repository.
	 */
	abstract void close();

	/**
	 * Delete a file from the object database.
	 * <p>
	 * Path may start with <code>../</code> to request deletion of a file that
	 * resides in the repository itself.
	 * <p>
	 * When possible empty directories must be removed, up to but not including
	 * the current object database directory itself.
	 * <p>
	 * This method does not support deletion of directories.
	 *
	 * @param path
	 *            name of the item to be removed, relative to the current
	 *            object database.
	 * @throws IOException
	 *             deletion is not supported, or deletion failed.
	 */
	void deleteFile(String path) throws IOException {
		throw new IOException(MessageFormat.format(JGitText.get().deletingNotSupported, path));
	}

	/**
	 * Open a remote file for writing.
	 * <p>
	 * Path may start with <code>../</code> to request writing of a file that
	 * resides in the repository itself.
	 * <p>
	 * The requested path may or may not exist. If the path already exists as a
	 * file the file should be truncated and completely replaced.
	 * <p>
	 * This method creates any missing parent directories, if necessary.
	 *
	 * @param path
	 *            name of the file to write, relative to the current object
	 *            database.
	 * @param monitor
	 *            (optional) progress monitor to post write completion to
	 *            during the stream's close method.
	 * @param monitorTask
	 *            (optional) task name to display during the close method.
	 * @return stream to write into this file. Caller must close the stream to
	 *         complete the write request. The stream is not buffered and each
	 *         write may cause a network request/response so callers should
	 *         buffer to smooth out small writes.
	 * @throws IOException
	 *             writing is not supported, or attempting to write the file
	 *             failed, possibly due to permissions or remote disk full, etc.
	 */
	OutputStream writeFile(final String path, final ProgressMonitor monitor,
			final String monitorTask) throws IOException {
		throw new IOException(MessageFormat.format(JGitText.get().writingNotSupported, path));
	}

	/**
	 * Atomically write a remote file.
	 * <p>
	 * This method attempts to perform as atomic of an update as it can,
	 * reducing (or eliminating) the time that clients might be able to see
	 * partial file content. This method is not suitable for very large
	 * transfers as the complete content must be passed as an argument.
	 * <p>
	 * Path may start with <code>../</code> to request writing of a file that
	 * resides in the repository itself.
	 * <p>
	 * The requested path may or may not exist. If the path already exists as a
	 * file the file should be truncated and completely replaced.
	 * <p>
	 * This method creates any missing parent directories, if necessary.
	 *
	 * @param path
	 *            name of the file to write, relative to the current object
	 *            database.
	 * @param data
	 *            complete new content of the file.
	 * @throws IOException
	 *             writing is not supported, or attempting to write the file
	 *             failed, possibly due to permissions or remote disk full, etc.
	 */
	void writeFile(String path, byte[] data) throws IOException {
		try (OutputStream os = writeFile(path, null, null)) {
			os.write(data);
		}
	}

	/**
	 * Delete a loose ref from the remote repository.
	 *
	 * @param name
	 *            name of the ref within the ref space, for example
	 *            <code>refs/heads/pu</code>.
	 * @throws IOException
	 *             deletion is not supported, or deletion failed.
	 */
	void deleteRef(String name) throws IOException {
		deleteFile(ROOT_DIR + name);
	}

	/**
	 * Delete a reflog from the remote repository.
	 *
	 * @param name
	 *            name of the ref within the ref space, for example
	 *            <code>refs/heads/pu</code>.
	 * @throws IOException
	 *             deletion is not supported, or deletion failed.
	 */
	void deleteRefLog(String name) throws IOException {
		deleteFile(ROOT_DIR + Constants.LOGS + "/" + name); //$NON-NLS-1$
	}

	/**
	 * Overwrite (or create) a loose ref in the remote repository.
	 * <p>
	 * This method creates any missing parent directories, if necessary.
	 *
	 * @param name
	 *            name of the ref within the ref space, for example
	 *            <code>refs/heads/pu</code>.
	 * @param value
	 *            new value to store in this ref. Must not be null.
	 * @throws IOException
	 *             writing is not supported, or attempting to write the file
	 *             failed, possibly due to permissions or remote disk full, etc.
	 */
	void writeRef(String name, ObjectId value) throws IOException {
		final ByteArrayOutputStream b;
		b = new ByteArrayOutputStream(Constants.OBJECT_ID_STRING_LENGTH + 1);
		value.copyTo(b);
		b.write('\n');
		writeFile(ROOT_DIR + name, b.toByteArray());
	}

	/**
	 * Rebuild the {@link #INFO_PACKS} for dumb transport clients.
	 * <p>
	 * This method rebuilds the contents of the {@link #INFO_PACKS} file to
	 * match the passed list of pack names.
	 *
	 * @param packNames
	 *            names of available pack files, in the order they should
	 *            appear in the file. Valid pack name strings are of the form
	 *            <code>pack-035760ab452d6eebd123add421f253ce7682355a.pack</code>.
	 * @throws IOException
	 *             writing is not supported, or attempting to write the file
	 *             failed, possibly due to permissions or remote disk full, etc.
	 */
	void writeInfoPacks(Collection<String> packNames) throws IOException {
		final StringBuilder w = new StringBuilder();
		for (String n : packNames) {
			w.append("P "); //$NON-NLS-1$
			w.append(n);
			w.append('\n');
		}
		writeFile(INFO_PACKS, Constants.encodeASCII(w.toString()));
	}
	/**
	 * Open a buffered reader around a file.
	 * <p>
	 * This method is suitable for reading line-oriented resources like
	 * <code>info/packs</code>, <code>info/refs</code>, and the alternates list.
	 *
	 * @param path
	 *            location of the file to read, relative to this objects
	 *            directory (e.g. <code>info/packs</code>).
	 * @return a stream to read from the file. Never null.
	 * @throws FileNotFoundException
	 *             the requested file does not exist at the given location.
	 * @throws IOException
	 *             The connection is unable to read the remote's file, and the
	 *             failure occurred prior to being able to determine if the file
	 *             exists, or after it was determined to exist but before the
	 *             stream could be created.
	 */
	BufferedReader openReader(String path) throws IOException {
		final InputStream is = open(path).in;
		return new BufferedReader(new InputStreamReader(is, UTF_8));
	}
	/**
	 * Read a standard Git alternates file to discover other object databases.
	 * <p>
	 * This method is suitable for reading the standard formats of the
	 * alternates file, such as found in <code>objects/info/alternates</code>
	 * or <code>objects/info/http-alternates</code> within a Git repository.
	 * <p>
	 * Alternates appear one per line, with paths expressed relative to this
	 * object database.
	 *
	 * @param listPath
	 *            location of the alternate file to read, relative to this
	 *            object database (e.g. <code>info/alternates</code>).
	 * @return the list of discovered alternates. Empty list if the file
	 *         exists, but no entries were discovered.
	 * @throws FileNotFoundException
	 *             the requested file does not exist at the given location.
	 * @throws IOException
	 *             The connection is unable to read the remote's file, and the
	 *             failure occurred prior to being able to determine if the file
	 *             exists, or after it was determined to exist but before the
	 *             stream could be created.
	 */
	Collection<WalkRemoteObjectDatabase> readAlternates(final String listPath)
			throws IOException {
		try (BufferedReader br = openReader(listPath)) {
			final Collection<WalkRemoteObjectDatabase> alts = new ArrayList<>();
			for (;;) {
				String line = br.readLine();
				if (line == null)
					break;
				if (!line.endsWith("/")) //$NON-NLS-1$
					line += "/"; //$NON-NLS-1$
				alts.add(openAlternate(line));
			}
			return alts;
		}
	}

	/**
	 * Read a standard Git packed-refs file to discover known references.
	 *
	 * @param avail
	 *            return collection of references. Any existing entries will be
	 *            replaced if they are found in the packed-refs file.
	 * @throws org.eclipse.jgit.errors.TransportException
	 *             an error occurred reading from the packed refs file.
	 */
	protected void readPackedRefs(Map<String, Ref> avail)
			throws TransportException {
		try (BufferedReader br = openReader(ROOT_DIR + Constants.PACKED_REFS)) {
			readPackedRefsImpl(avail, br);
		} catch (FileNotFoundException notPacked) {
			// Perhaps it wasn't worthwhile, or is just an older repository.
		} catch (IOException e) {
			throw new TransportException(getURI(), JGitText.get().errorInPackedRefs, e);
		}
	}
	private void readPackedRefsImpl(final Map<String, Ref> avail,
			final BufferedReader br) throws IOException {
		Ref last = null;
		boolean peeled = false;
		for (;;) {
			String line = br.readLine();
			if (line == null)
				break;
			if (line.charAt(0) == '#') {
				// Header comment; note whether the file advertises peeled tags.
				if (line.startsWith(RefDirectory.PACKED_REFS_HEADER)) {
					line = line.substring(RefDirectory.PACKED_REFS_HEADER.length());
					peeled = line.contains(RefDirectory.PACKED_REFS_PEELED);
				}
				continue;
			}
			if (line.charAt(0) == '^') {
				// "^<id>" records the peeled object id of the preceding tag ref.
				if (last == null)
					throw new TransportException(JGitText.get().peeledLineBeforeRef);
				final ObjectId id = ObjectId.fromString(line.substring(1));
				last = new ObjectIdRef.PeeledTag(Ref.Storage.PACKED, last
						.getName(), last.getObjectId(), id);
				avail.put(last.getName(), last);
				continue;
			}
			// Ordinary entry: "<object id> <ref name>".
			final int sp = line.indexOf(' ');
			if (sp < 0)
				throw new TransportException(MessageFormat.format(JGitText.get().unrecognizedRef, line));
			final ObjectId id = ObjectId.fromString(line.substring(0, sp));
			final String name = line.substring(sp + 1);
			if (peeled)
				last = new ObjectIdRef.PeeledNonTag(Ref.Storage.PACKED, name, id);
			else
				last = new ObjectIdRef.Unpeeled(Ref.Storage.PACKED, name, id);
			avail.put(last.getName(), last);
		}
	}
	static final class FileStream {
		final InputStream in;

		final long length;

		/**
		 * Create a new stream of unknown length.
		 *
		 * @param i
		 *            stream containing the file data. This stream will be
		 *            closed by the caller when reading is complete.
		 */
		FileStream(InputStream i) {
			in = i;
			length = -1;
		}

		/**
		 * Create a new stream of known length.
		 *
		 * @param i
		 *            stream containing the file data. This stream will be
		 *            closed by the caller when reading is complete.
		 * @param n
		 *            total number of bytes available for reading through
		 *            <code>i</code>.
		 */
		FileStream(InputStream i, long n) {
			in = i;
			length = n;
		}

		byte[] toArray() throws IOException {
			try {
				if (length >= 0) {
					final byte[] r = new byte[(int) length];
					IO.readFully(in, r, 0, r.length);
					return r;
				}

				final ByteArrayOutputStream r = new ByteArrayOutputStream();
				final byte[] buf = new byte[2048];
				int n;
				while ((n = in.read(buf)) >= 0)
					r.write(buf, 0, n);
				return r.toByteArray();
			} finally {
				in.close();
			}
		}
	}
}