
Store Git on any DHT

jgit.storage.dht is a storage provider implementation for JGit that
permits storing the Git repository in a distributed hashtable, NoSQL
system, or other database. The actual underlying storage system is
undefined, and can be plugged in by implementing 7 small interfaces:

* Database
* RepositoryIndexTable
* RepositoryTable
* RefTable
* ChunkTable
* ObjectIndexTable
* WriteBuffer

The storage provider interface tries to assume very little about the
underlying storage system, and requires only three key features:

* key -> value lookup (a hashtable is suitable)
* atomic updates on single rows
* asynchronous operations (Java's ExecutorService is easy to use)

Most NoSQL database products offer all three of these features in
their clients, as does any decent network-based cache system like the
open source memcache product. Relying only on key equality for data
retrieval makes it simple for the storage engine to distribute across
multiple machines. Traditional SQL systems could also be used with a
JDBC-based SPI implementation.

Before submitting this change I implemented six storage systems for
the SPI layer:

* Apache HBase [1]
* Apache Cassandra [2]
* Google Bigtable [3]
* an in-memory implementation for unit testing
* a JDBC implementation for SQL
* a generic cache provider that can ride on top of memcache

All six systems came in with an SPI layer of around 1,000 lines of
code to implement the 7 interfaces above. This is a huge reduction in
size compared to prior attempts to implement a new JGit storage layer:
as this package shows, a complete JGit storage implementation is more
than 17,000 lines of fairly complex code.

A simple cache is provided in storage.dht.spi.cache. Implementers can
use CacheDatabase to wrap any other type of Database and perform fast
reads against a network-based cache service, such as the open source
memcached [4]. An implementation of CacheService must be provided to
glue this SPI onto the network cache.

[1] https://github.com/spearce/jgit_hbase
[2] https://github.com/spearce/jgit_cassandra
[3] http://labs.google.com/papers/bigtable.html
[4] http://memcached.org/

Change-Id: I0aa4072781f5ccc019ca421c036adff2c40c4295
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
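
To make the three requirements concrete, the sketch below shows roughly
what a single table in the SPI layer could look like. The type and
method names are illustrative only; they are not the actual
org.eclipse.jgit.storage.dht.spi API.

    import java.util.Collection;
    import java.util.Map;

    /** Illustrative sketch of one key -> value table in the SPI style. */
    interface SketchChunkTable {
      /** Asynchronous point lookup; the callback fires once the row arrives. */
      void get(String rowKey, AsyncCallback<byte[]> callback);

      /** Asynchronous batch lookup; keys missing from the store are absent from the result. */
      void getAll(Collection<String> rowKeys, AsyncCallback<Map<String, byte[]>> callback);

      /** Atomically replace a single row; no cross-row transactions are required. */
      void put(String rowKey, byte[] chunkData);
    }

    /** Minimal completion callback, typically run on the provider's ExecutorService. */
    interface AsyncCallback<T> {
      void onSuccess(T result);
      void onFailure(Throwable error);
    }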
JGit Storage on DHT
-------------------

This implementation still has some pending issues:
* DhtInserter must skip existing objects

  DirCache writes all trees to the ObjectInserter, letting the
  inserter figure out which trees we already have, and which are new.
  DhtInserter should buffer trees into a chunk, then before writing
  the chunk to the DHT do a batch lookup to find the existing
  ObjectInfo (if any). If any exist, the chunk should be compacted to
  eliminate these objects, and if there is room in the chunk for more
  objects, it should go back to the DhtInserter to be filled further
  before flushing.

  This implies the DhtInserter needs to work on multiple chunks at
  once, and may need to combine chunks together when there is more
  than one partial chunk.
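
  A rough sketch of the intended lookup-and-compact step; the types
  and names here are made up for illustration and are not the real
  inserter internals:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Set;

    /** Illustrative pending object buffered inside a not-yet-written chunk. */
    class PendingObject {
      final String objectId; // hex SHA-1 of the buffered object
      final int size;        // encoded size inside the chunk, in bytes

      PendingObject(String objectId, int size) {
        this.objectId = objectId;
        this.size = size;
      }
    }

    class ChunkCompactor {
      static final int TARGET_CHUNK_SIZE = 1 << 20; // assumed 1 MiB chunk target

      /**
       * Drop objects the repository already has; alreadyStored would come
       * from a batch lookup against ObjectIndexTable just before writing.
       */
      static List<PendingObject> compact(List<PendingObject> buffered,
                                         Set<String> alreadyStored) {
        List<PendingObject> kept = new ArrayList<>();
        for (PendingObject obj : buffered)
          if (!alreadyStored.contains(obj.objectId))
            kept.add(obj);
        return kept;
      }

      /** If the compacted chunk is now under-full, it can be filled further. */
      static boolean hasRoomForMore(List<PendingObject> kept) {
        int used = 0;
        for (PendingObject obj : kept)
          used += obj.size;
        return used < TARGET_CHUNK_SIZE;
      }
    }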
* DhtPackParser must check for collisions

  Because ChunkCache blindly assumes any copy of an object is an OK
  copy of an object, DhtPackParser needs to validate all new objects
  at the end of its importing phase, before it links the objects into
  the ObjectIndexTable. Most objects won't already exist, but some
  may, and those that do must either be removed from their chunk, or
  have their content byte-for-byte validated.

  Removal from a chunk just means deleting it from the chunk's local
  index, and not writing it to the global ObjectIndexTable. This
  creates a hole in the chunk which is wasted space, and that isn't
  very useful. Fortunately objects that fit fully within one chunk
  may be easy to inflate and double check, as they are small. Objects
  that are big span multiple chunks, and the new chunks can simply be
  deleted from the ChunkTable, leaving the original chunks.

  Deltas can be checked quickly by inflating the delta and checking
  only the insertion point text, comparing that to the existing data
  in the repository. Unfortunately the repository is likely to use a
  different delta representation, which means at least one of them
  will need to be fully inflated to check the delta against.
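
  One possible shape for the end-of-import decision, with illustrative
  names only:

    import java.util.Arrays;

    /** Illustrative end-of-import decision for an object that may already exist. */
    class CollisionCheck {
      enum Action { LINK_NEW, KEEP_VALIDATED, DELETE_NEW_CHUNKS }

      static Action resolve(boolean existsAlready, boolean fitsInOneChunk,
                            byte[] newContent, byte[] existingContent) {
        if (!existsAlready)
          return Action.LINK_NEW; // common case: link into ObjectIndexTable

        if (fitsInOneChunk) {
          // Small object: cheap to inflate both copies and compare byte-for-byte.
          if (Arrays.equals(newContent, existingContent))
            return Action.KEEP_VALIDATED; // same bytes, the new copy is safe
          throw new IllegalStateException("SHA-1 collision detected");
        }

        // A big object spans several chunks: simply delete the new chunks
        // from the ChunkTable and keep the original copy.
        return Action.DELETE_NEW_CHUNKS;
      }
    }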
* DhtPackParser should handle small-huge-small-huge

  Multiple chunks need to be open at once, in case we get a bad
  pattern of small-object, huge-object, small-object, huge-object. In
  this case the small objects should be put together into the same
  chunk, to prevent having too many tiny chunks. This is tricky to do
  with OFS_DELTA. A long OFS_DELTA requires all prior chunks to be
  closed out so we know their lengths.
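
  A possible buffering shape, sketched with made-up names (and setting
  the OFS_DELTA problem aside):

    import java.util.ArrayList;
    import java.util.List;

    /**
     * Illustrative router: small objects accumulate into one shared chunk
     * that stays open across interleaved huge objects, which stream into
     * chunks of their own.
     */
    class ChunkRouter {
      static final int CHUNK_SIZE = 1 << 20;    // assumed 1 MiB chunk target
      static final int SMALL_LIMIT = 64 * 1024; // assumed "small object" threshold

      private final List<String> openSmallChunk = new ArrayList<>();
      private int openSmallBytes;

      void add(String objectId, int size) {
        if (size >= SMALL_LIMIT) {
          writeHugeObject(objectId, size); // one or more dedicated chunks
          return;                          // the small chunk stays open
        }
        if (openSmallBytes + size > CHUNK_SIZE)
          flushSmallChunk();
        openSmallChunk.add(objectId);
        openSmallBytes += size;
      }

      private void flushSmallChunk() {
        // Write the accumulated small objects out as one chunk, then reset.
        openSmallChunk.clear();
        openSmallBytes = 0;
      }

      private void writeHugeObject(String objectId, int size) {
        // Stream the huge object into ceil(size / CHUNK_SIZE) chunks of its own.
      }
    }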
* RepresentationSelector performance bad on Cassandra

  The 1.8 million batch lookups done for linux-2.6 kill Cassandra; it
  cannot handle this read load.
* READ_REPAIR isn't fully accurate

  There are a lot of places where the generic DHT code should be
  helping to validate the local replica is consistent, and where it is
  not, help the underlying storage system to heal the local replica by
  reading from a remote replica and putting it back to the local one.
  Most of this should be handled in the DHT SPI layer, but the generic
  DHT code should be giving better hints during get() method calls.
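
  The hint could be as simple as an extra argument on reads; the names
  below are illustrative, not the existing SPI:

    import java.util.concurrent.CompletableFuture;

    /** Illustrative read hints a caller could pass down to the SPI layer. */
    enum ReadHint {
      LOCAL_OK,    // any locally available copy is acceptable
      READ_REPAIR  // verify against a remote replica and heal a stale local row
    }

    interface RepairingTable {
      /** With READ_REPAIR the provider should write the repaired row back locally. */
      CompletableFuture<byte[]> get(String rowKey, ReadHint hint);
    }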
* LOCAL / WORLD writes

  Many writes should be done locally first, before they replicate to
  the other replicas, as they might be backed out on an abort.
  Likewise some writes must take place across sufficient replicas to
  ensure the write is not lost... and this may include ensuring that
  earlier local-only writes have actually been committed to all
  replicas. This committing to replicas might be happening in the
  background automatically after the local write (e.g. Cassandra will
  start to send writes made by one node to other nodes, but doesn't
  promise they finish). But parts of the code may need to force this
  replication to complete before the higher level git operation ends.
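
  Sketched as a per-write durability flag (the names are illustrative
  only):

    /** Illustrative durability levels for buffered writes. */
    enum WriteLevel {
      LOCAL, // fast node-local write; may be discarded if the git operation aborts
      WORLD  // must reach enough replicas before the operation is allowed to finish
    }

    interface DurableWriteBuffer {
      /** Buffer a row write at the requested durability level. */
      void put(String rowKey, byte[] value, WriteLevel level);

      /** Block until every WORLD write, and any LOCAL write it depends on, is replicated. */
      void flush() throws InterruptedException;
    }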
* Forks/alternates

  Forking is common, but we should avoid duplicating content into the
  fork if the base repository has it. This requires some sort of
  change to the key structure so that chunks are owned by an object
  pool, and the object pool owns the repositories that use it. GC
  proceeds at the object pool level, rather than the repository level,
  but might want to take some of the reference namespace into account
  to avoid placing forked less-common content near primary content.
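
  Purely as an illustration, the key layout could become pool-scoped
  along these lines (names and formats are hypothetical):

    /** Illustrative pool-scoped keys: chunks belong to a pool, not to one repository. */
    class PoolScopedKeys {
      /** e.g. "pool-42:chunk:9f8e..." -- every fork sharing the pool can read this chunk. */
      static String chunkKey(String poolId, String chunkHash) {
        return poolId + ":chunk:" + chunkHash;
      }

      /** e.g. "repo:linux-fork" -> a row naming the pool that owns the repository's objects. */
      static String repositoryKey(String repositoryName) {
        return "repo:" + repositoryName;
      }
    }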