mirrors/jgit - jgit - source @ dussan.org

Commit graph

Author	SHA1	Message	Date
Matthias Sohn	27ee334213	Don't remove pack from pack list for problems which could be transient If we hit a corrupt object or invalid pack, remove the pack from the pack list. Other IOExceptions could be transient, hence we should not remove the pack from the list to avoid the problem reported on the Gerrit list [1]. It looks like in the reported case the pack was removed from the pack list, causing MissingObjectExceptions which disappear when the server is restarted. [1] https://groups.google.com/forum/#!topic/repo-discuss/Qdmbl-YZ4NU Change-Id: I331626110d54b190e46cddc2c40f29ddeb9613cd Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Matthias Sohn	9b86ebb4f6	Log reason for ignoring pack when IOException occurred This should help to identify the root cause of the problem discussed on the Gerrit list [1]. [1] https://groups.google.com/forum/#!topic/repo-discuss/Qdmbl-YZ4NU Change-Id: I871f70e4bb1227952e1544b789013583b14e2b96 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Christian Halstrick	1b9130e8db	Make sure modifications to config-param trustFolderStat are detected ObjectDirectory.searchPacksAgain() should always read trustFolderStat from the config and not rely on a cached value. Change-Id: I90edbaae3c64eea0c9894d05acde4267991575ee	9 years ago
Christian Halstrick	0fc8b05a71	Introduce config parameter core.trustfolderstat JGit's ObjectDirectory implements the optimization that it remembers the pack folder's (.git/objects/pack) lastModified timestamp and doesn't check for new packfiles in this folder if the lastModified attribute has not changed. In environments using NFS this can cause trouble. If multiple JGit instances from multiple machines work on the same repository and one instance creates a new ref and a new packfile (e.g. by doing a fetch), then the other machines may detect the new ref but can't resolve the referenced object because they don't detect that the pack folder has a new packfile. That's because NFS may cache file/folder metadata for quite a long time, so the pack folder's modification time is not updated although a new packfile is there and could be read. The new config parameter core.trustfolderstat controls this behaviour. The default is true and JGit's behaviour is unchanged. But if this parameter is set to false then JGit doesn't trust the pack directory's lastModified timestamp anymore. Instead it will always iterate through the content of that folder to detect new packfiles. Change-Id: Ie3b4e92933286aa9916070a22422e629b3147f54 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	9 years ago
Marc Strapetz	59a2dc801c	Files should be deleted with "retry" option Some of our Windows users have reported sporadic file system access problems related to ObjectDirectory(Inserter) file deletion code in combination with antivirus/firewall tools. For one of these users the problem was fairly reproducible and changing deletion to RETRY solved his problem. Change-Id: I1e4001d5557fca693b7bac401268599467cb0c9e Signed-off-by: Marc Strapetz <marc.strapetz@syntevo.com>	10 years ago
Robin Rosenberg	5ef6d69532	Use the new FS.exists method in commonly occurring places Allegedly this should improve performance, but I could not see it. Change-Id: Id2057cb2cfcb46e94ff954483ce23f9c4a7edc5e	11 years ago
Shawn Pearce	d1aacc415a	Fix MissingObjectException race in ObjectDirectory Johannes Carlsson identified a race condition[1] that can lead to spurious MissingObjectExceptions at read time. If two threads are active inside of ObjectDirectory looking for a packed object and the packList is currently the empty NO_PACKS list, thread A will find no object and eventually consider tryAgain1(). If thread A is put to sleep at this point and thread B also does not find the object and loads the packs, then when thread A wakes up its tryAgain1() would return false and the thread never considers the packs. Rework the internal API of ObjectDirectory to keep a handle on the exact PackList that was iterated by thread A, allowing it to always retry walking through the packs if the new PackList is different. This had some ripple effect into the CachedObjectDirectory and the shared FileObjectDatabase interface. The new code should be slightly easier to follow, especially from the perspective of the CachedObjectDirectory trying to minimize the number of open system calls it makes to files matching "$GIT_DIR/objects/??/?x{38}". [1] http://dev.eclipse.org/mhonarc/lists/jgit-dev/msg02401.html Change-Id: I9a1c9d6ad6cb38404b7b9178167b714077561353	10 years ago
Shawn Pearce	f32b861243	JGit 3.0: move internal classes into an internal subpackage This breaks all existing callers once. Applications are not supposed to build against the internal storage API unless they can accept API churn and make necessary updates as versions change. Change-Id: I2ab1327c202ef2003565e1b0770a583970e432e9	11 years ago
Shawn Pearce	3760e4319b	Remove cached_packs support in favor of bitmaps The bitmap code in PackWriter knows exactly when to use a pack as a "cached pack". It enables cached pack usage only when the pack has a bitmap and its entire closure of objects needs to be sent. This is a much simpler code path to maintain, and JGit actually has a way to write the necessary index. Change-Id: I2645d482f8733fdf0c4120cc59ba9aa4d4ba6881	11 years ago
Colby Ranger	3b325917a5	Added read/write support for pack bitmap index. A pack bitmap index is an additional index of compressed bitmaps of the object graph. Furthermore, a logical API of the index functionality is included, as it is expected to be used by the PackWriter. Compressed bitmaps are created using the javaewah library, which is a word-aligned compressed variant of the Java bitset class based on run-length encoding. The library only works with positive integer values. Thus, the maximum number of ObjectIds in a pack file that this index can currently support is limited to Integer.MAX_VALUE. Every ObjectId is given an integer mapping. The integer is the position of the ObjectId in the complete ObjectId list, sorted by offset, for the pack file. That integer is what the bitmaps use to reference the ObjectId. Currently, the new index format can only be used with pack files that contain a complete closure of the object graph e.g. the result of a garbage collection. The index file includes four bitmaps for the Git object types i.e. commits, trees, blobs, and tags. In addition, a collection of bitmaps keyed by an ObjectId is also included. The bitmap for each entry in the collection represents the full closure of ObjectIds reachable from the keyed ObjectId (including the keyed ObjectId itself). The bitmaps are further compressed by XORing the current bitmaps against prior bitmaps in the index, and selecting the smallest representation. The XOR'd bitmap and offset from the current entry to the position of the bitmap to XOR against is the actual representation of the entry in the index file. Each entry contains one byte, which is currently used to note whether the bitmap should be blindly reused. Change-Id: Id328724bf6b4c8366a088233098c18643edcf40f	11 years ago
Colby Ranger	4a317a1790	Include supported extensions in PackFile constructor. Previously a PackFile class was assumed to only support a .pack and .idx file. Update the constructor to enumerate the supported extensions for the pack file. This will allow the bitmap code to only be executed if the bitmap extension file is known to exist. Change-Id: Ie59041dffec5f60d7ea2771026ffd945106bd4bf	11 years ago
Roberto Tyley	5dcc8693d7	Fix concurrent creation of fan-out object directories If multiple threads attempted to insert loose objects into the same new fan-out directory, the creation of that directory was subject to a race condition that could lead to an unnecessary IOException being thrown - because an inserter could not 'create' a directory that had just been generated by a different thread. All we require is that the directory does indeed exist, so not being able to _create_ it is not actually a fatal problem. Setting 'skipExisting' to 'true' on the call to mkdir() fixes the issue. I found this issue as a real world occurrence while working on The BFG Repo Cleaner (https://github.com/rtyley/bfg-repo-cleaner), a tool which concurrently performs a lot of object creation. In order to demonstrate the problem here I've added a small test case which reliably reproduces the issue on the few different hardware systems I've tried. The error thrown when the race-condition arises is this: java.io.IOException: Creating directory /home/roberto/repo.git/objects/e6 failed at org.eclipse.jgit.util.FileUtils.mkdir(FileUtils.java:182) at org.eclipse.jgit.storage.file.ObjectDirectory.insertUnpackedObject(ObjectDirectory.java:590) at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insertOneObject(ObjectDirectoryInserter.java:113) at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insert(ObjectDirectoryInserter.java:91) at org.eclipse.jgit.lib.ObjectInserter.insert(ObjectInserter.java:329) Change-Id: I88eac49bc600c56ba9ad290e6133d8a7113125ab	11 years ago
Colby Ranger	82ecfb3e31	Remove packIndex field from FileObjDatabase openPack method. Previously, the FileObjDatabase required both the pack file path and index file path to be passed to openPack(). A future change to add a bitmap index will add a .bitmap file parallel to the pack file (similar to the .idx file). Update the PackFile to support automatically loading pack index extensions based on the pack file path. Change-Id: Ifc8fc3e57f4afa177ba5a88df87334dbfa799f01	11 years ago
Robin Rosenberg	c310fa0c80	Mark non-externalizable strings as such A few classes such as Constants are marked with @SuppressWarnings, as are toString() methods with many literals, but otherwise $NLS-n$ is used for strings containing text that should not be translated. A few literals may fall into the gray zone, but mostly I've tried to only tag the obvious ones. Change-Id: I22e50a77e2bf9e0b842a66bdf674e8fa1692f590	11 years ago
Marc Strapetz	67edd3eda7	RevWalk support for shallow clones StartGenerator now processes .git/shallow to have the RevWalk stop for shallow commits. See RevWalkShallowTest for tests. Bug: 394543 CQ: 6908 Change-Id: Ia5af1dab3fe9c7888f44eeecab1e1bcf2e8e48fe Signed-off-by: Chris Aniszczyk <zx@twitter.com>	11 years ago
Robin Rosenberg	95d311f888	Move JGitText to an internal package Change-Id: I763590a45d75f00a09097ab6f89581a3bbd3c797	12 years ago
Shawn O. Pearce	461b012e95	PackWriter: Support reuse of entire packs The most expensive part of packing a repository for transport to another system is enumerating all of the objects in the repository. Once this gets to the size of the linux-2.6 repository (1.8 million objects), enumeration can take several CPU minutes and costs a lot of temporary working set memory. Teach PackWriter to efficiently reuse an existing "cached pack" by answering a clone request with a thin pack followed by a larger cached pack appended to the end. This requires the repository owner to first construct the cached pack by hand, and record the tip commits inside of $GIT_DIR/objects/info/cached-packs: cd $GIT_DIR; root=$(git rev-parse master); tmp=objects/.tmp-$$; names=$(echo $root | git pack-objects --keep-true-parents --revs $tmp); for n in $names; do chmod a-w $tmp-$n.pack $tmp-$n.idx; touch objects/pack/pack-$n.keep; mv $tmp-$n.pack objects/pack/pack-$n.pack; mv $tmp-$n.idx objects/pack/pack-$n.idx; done; (echo "+ $root"; for n in $names; do echo "P $n"; done; echo) >>objects/info/cached-packs; git repack -a -d When a clone request needs to include $root, the corresponding cached pack will be copied as-is, rather than enumerating all of the objects that are reachable from $root. For a linux-2.6 kernel repository that should be about 376 MiB, the above process creates two packs of 368 MiB and 38 MiB[1]. This is a local disk usage increase of ~26 MiB, due to reduced delta compression between the large cached pack and the smaller recent activity pack. The overhead is similar to 1 full copy of the compressed project sources. With this cached pack in hand, JGit daemon completes a clone request in 1m17s less time, but with a slightly larger data transfer (+2.39 MiB): Before: remote: Counting objects: 1861830, done remote: Finding sources: 100% (1861830/1861830) remote: Getting sizes: 100% (88243/88243) remote: Compressing objects: 100% (88184/88184) Receiving objects: 100% (1861830/1861830), 376.01 MiB | 19.01 MiB/s, done. remote: Total 1861830 (delta 4706), reused 1851053 (delta 1553844) Resolving deltas: 100% (1564621/1564621), done. real 3m19.005s After: remote: Counting objects: 1601, done remote: Counting objects: 1828460, done remote: Finding sources: 100% (50475/50475) remote: Getting sizes: 100% (18843/18843) remote: Compressing objects: 100% (7585/7585) remote: Total 1861830 (delta 2407), reused 1856197 (delta 37510) Receiving objects: 100% (1861830/1861830), 378.40 MiB | 31.31 MiB/s, done. Resolving deltas: 100% (1559477/1559477), done. real 2m2.938s Repository owners can periodically refresh their cached packs by repacking their repository, folding all newer objects into a larger cached pack. Since repacking is already considered to be a normal Git maintenance activity, this isn't a very big burden. [1] In this test $root was set back about two weeks. Change-Id: Ib87131d5c4b5e8c5cacb0f4fe16ff4ece554734b Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Matthias Sohn	38eec8f4a2	[findbugs] Do not ignore exceptional return value of mkdir java.io.File.mkdir() and mkdirs() report failure as an exceptional return value false. Fix the code which silently ignored this exceptional return value. Change-Id: I41244f4b9d66176e68e2c07e2329cf08492f8619 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Robin Rosenberg	24e7f0f6fa	Fix tests broken by fix for adding files in a network share The change Ie0350e032a97e0d09626d6143c5c692873a5f6a2 was not done properly. The renamed file was not write protected, and this broke a test. Bug: 335388 Change-Id: I41b2235b7677bc5fddc70dda2a56cdd2cb53ce5d Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>	13 years ago
Robin Rosenberg	c4c8d80fd3	Fix adding files in a network share We cannot always rename read-only files on network shares, so rename the temp file for a new loose object first, and then set it as read-only. Bug: 335388 Change-Id: Ie0350e032a97e0d09626d6143c5c692873a5f6a2 Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Shawn O. Pearce	1bf0c3cdb1	Refactor IndexPack to not require local filesystem By moving the logic that parses a pack stream from the network (or a bundle) into a type that can be constructed by an ObjectInserter, repository implementations have a chance to inject their own logic for storing object data received into the destination repository. The API isn't completely generic yet; there are still quite a few assumptions that the PackParser subclass is storing the data onto the local filesystem as a single file. But it's about the simplest split of IndexPack I can come up with without completely ripping the code apart. Change-Id: I5b167c9cc6d7a7c56d0197c62c0fd0036a83ec6c Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>	13 years ago
Shawn O. Pearce	c8db22f355	Extract pack directory last modified check code Pulling the last modified checking logic out of ObjectDirectory makes it possible to reuse this code for other files, such as the $GIT_DIR/config or $GIT_DIR/packed-refs files. Change-Id: If2f27a89fc3b7adde7e65ff40bbca5d55b98b772 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Matthias Sohn	45731756a5	[findbugs] Do not ignore exceptional return value java.io.File.delete() reports failure as an exceptional return value false. Fix the code which silently ignored this exceptional return value. Also remove some duplicate deletion helper methods. Change-Id: I80ed20ca1f07a2bc6e779957a4ad0c713789c5be Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Shawn O. Pearce	d00420ae6e	Make ObjectDirectory getPacks() work the first time If an object hasn't been accessed yet the pack list for a repository may not have been scanned from disk. If an application (e.g. the dumb transport servlet support code) asks for the pack list for an ObjectDirectory, we should load it immediately. Change-Id: I93d7b1bca422d905948e8e83b2afa83c8894a68b Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>	13 years ago
Shawn O. Pearce	e51e06946f	Update CachedObjectDirectory when inserting objects If an ObjectInserter is created from a CachedObjectDirectory, we need to ensure the cache is updated whenever a new loose object is actually added to the loose objects directory, otherwise a future read from an ObjectReader on the CachedObjectDirectory might not be able to open the newly created object. We mostly had the infrastructure in place to implement this due to the injection of unpacked large deltas, but we didn't have a way to pass the ObjectId from ObjectDirectoryInserter to CachedObjectDirectory, because the inserter was using the underlying ObjectDirectory and not the CachedObjectDirectory. Redirecting to CachedObjectDirectory ensures the cache is updated. Change-Id: I1f7bdfacc7ad77ebdb885f655e549cc570652225 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	5fce8d81d8	Fix cloning of repositories with big objects When running IndexPack we use a CachedObjectDirectory, which knows what objects are loose and tries to avoid stat(2) calls for objects that do not exist in the repository, as stat(2) on Win32 is very slow. However large delta objects found in a pack file are expanded into a loose object, in order to avoid costly delta chain processing when that object is used as a base for another delta. If this expand occurs while working with the CachedObjectDirectory, we need to update the cached directory data to include this new object, otherwise it won't be available when we try to open it during the object verify phase. Bug: 324868 Change-Id: Idf0c76d4849d69aa415ead32e46a435622395d68 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	41dd9ed1c0	Unpack and cache large deltas as loose objects Instead of spooling large delta bases into temporary files and then immediately deleting them afterwards, spool the large delta out to a normal loose object. Later any requests for that large delta can be answered by reading from the loose object, which is much easier to stream efficiently for readers. Since the object is now duplicated, once in the pack as a delta and again as a loose object, any future prune-packed will automatically delete the loose object variant, releasing the wasted disk space. As prune-packed is run automatically during either repack or gc, and gc --auto triggers automatically based on the number of loose objects, we get automatic cache management for free. Large objects that were unpacked will be periodically cleared out, and will simply be restored later if they are needed again. After a short offline discussion with Junio Hamano today, we may want to propose a change to prune-packed to hold onto larger loose objects which also exist in pack files as deltas, if the loose object was recently accessed or modified in the last 2 days. Change-Id: I3668a3967c807010f48cd69f994dcbaaf582337c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	3f66e65e71	Remember loose objects and fast-track their lookup Recently created objects are usually what branches point to, and are usually written out as loose objects. But due to the high cost of asking the operating system if a file exists, these are the last thing that ObjectDirectory examines when looking for an object by its ObjectId. Caching recently seen loose objects permits the opening code to jump directly to the loose object, accelerating lookup for branch heads that are accessed often. To avoid exploding the cache it's limited to approximately 2048 entries. When more ids are added, the table is simply cleared and reset in size. Change-Id: I18f483217412b102f754ffd496c87061d592e535 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	e29cd27961	Move ObjectDirectory streaming limit to WindowCacheConfig IDEs like Eclipse offer up the settings in WindowCacheConfig to the user as a global set of options that are configured for the entire JVM process, not per-repository, as the cache is shared across the entire JVM. The limit on how much we are willing to allocate for an object buffer is similar to the limit on how much we can use for data caches: allocating that much space impacts the entire JVM and not just a single repository, so it should be a global limit. Change-Id: I22eafb3e223bf8dea57ece82cd5df8bfe5badebc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	1c3f3fdbd2	Fix ObjectDirectory abbreviation resolution to notice new packs If we can't resolve an abbreviation, it might be because there is a new pack file we haven't picked up yet. Try scanning the packs again and recheck each pack if there were differences from the last scan we did. Because of this, we don't have to open a pack during the test where we generate a pack on the fly. We'll miss on the first loop during which the PackList is the NO_PACKS magic initialization constant, and pick up the newly created index during this retry logic. Change-Id: I7b97efb29a695ee60c90818be380f7ea23ad13a3 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	a5c18fcfc7	Fully implement SHA-1 abbreviations ObjectReader implementations are now responsible for creating the unique abbreviation of an ObjectId, or for resolving an abbreviation back to its full form. In this latter case the reader can offer up multiple candidates to the caller, who may be able to disambiguate them based on context. Repository.resolve() doesn't take multiple candidates into account right now, but it could in the future by looking for a remaining ^0 or ^{commit} suffix and take an expansion if there is only one commit that matches the input abbreviation. It could also use the distance from an annotated tag to resolve "tag-NNN-gcommit" style strings that are often output by `git describe`. Change-Id: Icd3250adc8177ae05278b858933afdca0cbbdb56 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	13 years ago
Shawn O. Pearce	b584cb8754	Add getObjectSize to ObjectReader This is an informational function used by PackWriter to help it better organize objects for delta compression. Storage systems can implement it to provide more detailed size information, or they can simply rely on the default behavior that uses the ObjectLoader obtained from open. For local file storage, we can obtain this information faster through specialized routines that parse a pack object header. Change-Id: I13a09b4effb71ea5151b51547f7d091564531e58 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	113577617b	Use core.streamFileThreshold to set our streaming limit We default this to 1 MiB for now, but we allow users to modify it through the Repository's configuration file to be a different value. A new repository listener is used to identify when the setting has been updated and trigger a reconfiguration of any active ObjectReaders. To prevent a horrible explosion we cap core.streamFileThreshold at no more than 1/4 of the maximum JVM heap size. We do this because we need at least 2 byte arrays equal in size to the stream threshold for the worst case delta inflation scenario, and our host application probably also needs some amount of the heap for their working set size. Change-Id: I103b3a541dc970bbf1a6d92917a12c5a1ee34d6c Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	13e0218a25	Replace PackedObjectLoader with ObjectLoader.SmallObject The class is identical, but ObjectLoader.SmallObject is part of our public API for storage implementations to build on top of. Change-Id: I381a3953b14870b6d3d74a9c295769ace78869dc Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	fa23482ca7	Support large loose objects as streams Big loose objects can now be streamed if they are over the large object size threshold. This prevents the JVM heap from exploding with a very large byte array to hold the slurped file, and then again with its uncompressed copy. We may have slightly slowed down the simple case for small loose objects, as the loader no longer slurps the entire thing and decompresses in memory. To try and keep good performance for the very common small objects that are below 8 KiB in size, buffers are set to 8 KiB, causing the reader to slurp most of the file anyway. However the data has to be copied at least once, from the BufferedInputStream into the InflaterInputStream. New unit tests are supplied to get nearly 100% code coverage on the unpacked code paths, for both standard and pack style loose objects. We tested a fair chunk of the code elsewhere, but these new tests are better isolated to the specific branches in the code path. Change-Id: I87b764ab1b84225e9b5619a2a55fd8eaa640e1fe Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	ea21c111cb	Move PackWriter over to storage.pack.PackWriter Similar to what we did with the file code, move the pack writer into its own package so the related classes and their package private methods are hidden from the rest of the library. Change-Id: Ic1b5c7c8c8d266e90c910d8d68dfc8e93586854f Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	71aace52f7	Simplify ObjectLoaders coming from PackFile We no longer need an ObjectLoader to be lazy and try to delay the materialization of the object content. That was done only to support PackWriter searching for a good reuse candidate. Instead, simplify the code base by doing the materialization immediately when the loader asks for it, because any caller asking for the loader is going to need the content. Change-Id: Id867b1004529744f234ab8f9cfab3d2c52ca3bd0 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	86547022f0	Tighten up local packed object representation during packing Rather than making a loader, and then using that to fill the object representation, parse the header and set up our data directly. This saves some time, as we don't waste cycles on information we won't use right now. The weight computed for a representation is now its actual stored size in the pack file, rather than its inflated size. This accounts for changes made when the compression level is modified on the repository. It is however more costly to determine the weight of the object, since we have to find its length in the pack. To try and recover that cost we now cache the length as part of our ObjectToPack record, so it doesn't have to be found during the output phase. A LocalObjectToPack now costs us (assuming 32 bit pointers), listed as 32 bit / 64 bit: vm header: 8 bytes / 8 bytes; ObjectId: 20 bytes / 20 bytes; PackedObjectInfo: 12 bytes / 12 bytes; ObjectToPack: 8 bytes / 12 bytes; LocalOTP: 20 bytes / 24 bytes; total: 68 bytes / 74 bytes. Change-Id: I923d2736186eb2ac8ab498d3eb137e17930fcb50 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	ad5238dc67	Move FileRepository to storage.file.FileRepository This move isolates all of the local file specific implementation code into a single package, where their package-private methods and support classes are properly hidden away from the rest of the core library. Because of the sheer number of files impacted, I have limited this change to only the renames and the updated imports. Change-Id: Icca4884e1a418f83f8b617d0c4c78b73d8a4bd17 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	bf4ffff07f	Redo PackWriter object reuse selection The new selection implementation uses a public API on the ObjectReader, allowing the storage library to enumerate its candidates and select the best one for this packer without needing to build a temporary list of the candidates first. Change-Id: Ie01496434f7d3581d6d3bbb9e33c8f9fa649b6cd Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	5cfc29b491	Replace WindowCache with ObjectReader The WindowCache is an implementation detail of PackFile and how it's used by ObjectDirectory. Let's start to hide it and replace the public API with a more generic concept, ObjectReader. Because PackedObjectLoader is also considered a private detail of PackFile, we have to make PackWriter temporarily dependent upon the WindowCursor and thus FileRepository and ObjectDirectory in order to just start the refactoring. In later changes we will clean up the APIs more, exposing sufficient support to PackWriter without needing the file specific implementation details. Change-Id: I676be12b57f3534f1285854ee5de1aa483895398 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	133c987f4d	Refactor alternate object databases below ObjectDirectory Not every object storage system will have the concept of alternate object databases to search, and even if they do, they may not have the notion of fast-access / slow-access split like we do within the ObjectDirectory code for pack files and loose objects. Push all of that down below the generic API so that it is a hidden detail of the ObjectDirectory and its related supporting classes. Change-Id: I54bc1ca5ff2ac94dfffad1f9a9dad7af202b9523 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	cad10e6640	Refactor object writing responsibilities to ObjectDatabase The ObjectInserter API permits ObjectDatabase implementations to control their own object insertion behavior, rather than forcing it to always be a new loose file created in the local filesystem. Inserted objects can also be queued and written asynchronously to the main application, such as by appending into a pack file that is later closed and added to the repository. This change also starts to open the door to non-file based object storage, such as an in-memory HashMap for unit testing, or a more complex system built on top of a distributed hash table. To help existing application code port to the newer interface we are keeping ObjectWriter as a delegation wrapper to the new API. Each ObjectWriter instance holds a reference to an ObjectInserter for the Repository's top-level ObjectDatabase, and it flushes and releases that instance on each object processed. Change-Id: I413224fb95563e7330c82748deb0aada4e0d6ace Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Marc Strapetz	936e4ab2f2	Repository can be configured with FS On Windows, FS_Win32_Cygwin has been used if a Cygwin Git installation is present in the PATH. Assuming that the user works with the Cygwin Git installation may result in unnecessary overhead if he actually does not. Applications built on top of jgit may have more knowledge on the actually used Git client (Cygwin or not) and hence should be able to configure which FS to use accordingly. Change-Id: Ifc4278078b298781d55cf5421e9647a21fa5db24	14 years ago
Sasa Zivkov	f3d8a8ecad	Externalize strings from JGit The strings are externalized into the root resource bundles. The resource bundles are stored under the new "resources" source folder to get proper maven build. Strings from tests are, in general, not externalized. Only in cases where it was necessary to make the test pass the strings were externalized. This was typically necessary in cases where e.getMessage() was used in assert and the exception message was slightly changed due to reuse of the externalized strings. Change-Id: Ic0f29c80b9a54fcec8320d8539a3e112852a1f7b Signed-off-by: Sasa Zivkov <sasa.zivkov@sap.com>	14 years ago
Shawn O. Pearce	374c28057a	Don't insert the same pack twice into a pack list If a concurrent thread picks up a newly created PackFile and adds it to the pack list before the IndexPack thread itself can insert the item onto the front of the list, do nothing and use the item that was picked up by that other concurrent scanning thread. This avoids a potential condition where the same pack exists in memory twice, which causes confusion later during a rescan of the directory because we don't know exactly which PackFile instance should be retained into the new list, and which should be discarded. We can stop searching through the old pack list as soon as the sort function declares that the item to insert should be before the item already in the list. Because the list is always sorted by modification time (in seconds), we should never encounter a case where the pack is positioned at the wrong spot in the list. This early break out still permits an efficient implementation of the common case, inserting a new pack at the head of the list. Change-Id: Ice4459bbd4ee9487078aff5257893883d04f05fb Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	a0a52897ed	Favor earlier PackFile instances over later duplicates There is a potential race condition during insertPack that can lead to us having the same pack file open twice in the same directory. A different thread can miss an object on disk, and trigger a scan of the directory, and notice the pack that was put in by IndexPack. So the pack winds up in the newly created PackList. The IndexPack thread then wakes up and finishes its insertPack by creating a new PackFile and inserting it into position 0 of the list. We now have the same pack listed twice. Readers will favor the earlier PackFile instance, because it's the first one they come across as they iterate through the list. Keep that earlier one when we scan the pack directory again, as this will avoid needing to purge out all of the windows that may have been cached. Of course we should also fix that race condition, but this block was taking the wrong resolution if this error ever shows up, so let's first fix the block to use a more sane resolution. Change-Id: I0d339b9fd1dd8012e8fe5a564b893c0f69109e28 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Constantine Plotnikov	cc64794b24	Added caching for loose object lookup during pack indexing On Windows systems, file system lookup is a slow operation, so checking each object if it exists during indexing (after receiving the pack) could take a significant time. This patch introduces CachedObjectDirectory that pre-caches lookup results. Bug: 300397 Change-Id: I471b93f9bb3ee173eb37cae1d75e9e4eb49985e7 Signed-off-by: Constantine Plotnikov <constantine.plotnikov@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Shawn O. Pearce	0b821817fc	Add getPacks to ObjectDirectory This exposes the list of known packs, allowing callers to list them into a context like the objects/info/packs file. Change-Id: I0b889564bd176836ff5c77ba310c6d229409dcd5 Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
Robin Rosenberg	eb63bfc1b8	Recognize Git repository environment variables This makes the jgit command line behave like the C Git implementation in this respect. These variables are not recognized in the core, though we add support to do the overrides there. Hence other users of the JGit library, like the Eclipse plugin and others, will not be affected. GIT_DIR The location of the ".git" directory. GIT_WORK_TREE The location of the work tree. GIT_INDEX_FILE The location of the index file. GIT_CEILING_DIRECTORIES A colon (semicolon on Windows) separated list of paths that JGit will not cross when looking for the .git directory. GIT_OBJECT_DIRECTORY The location of the objects directory under which objects are stored. GIT_ALTERNATE_OBJECT_DIRECTORIES A colon (semicolon on Windows) separated list of object directories to search for objects. In addition to these we support the core.worktree config setting when the git directory is set deliberately instead of being found. Change-Id: I2b9bceb13c0f66b25e9e3cefd2e01534a286e04c Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	14 years ago
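
The duplicate check described in commit 374c28057a (skip the insert if the pack is already listed, and stop scanning as soon as the new pack sorts before an existing entry) could be sketched roughly as follows. This is an illustration only, not JGit's actual code: the `PackListSketch` class, the `Pack` type, and the `insertOnce` method are hypothetical names, and the list is assumed to be ordered newest first by modification time in seconds.

```java
import java.util.ArrayList;
import java.util.List;

class PackListSketch {
    /** Hypothetical stand-in for a pack: file name plus modification time in seconds. */
    static final class Pack {
        final String name;
        final long mtimeSeconds;

        Pack(String name, long mtimeSeconds) {
            this.name = name;
            this.mtimeSeconds = mtimeSeconds;
        }
    }

    /**
     * Returns a list that contains the candidate exactly once.
     * Assumes {@code sorted} is ordered newest first by mtimeSeconds.
     */
    static List<Pack> insertOnce(List<Pack> sorted, Pack candidate) {
        for (int i = 0; i < sorted.size(); i++) {
            Pack existing = sorted.get(i);
            if (existing.name.equals(candidate.name)) {
                // Another thread already listed this pack: keep the existing entry.
                return sorted;
            }
            if (candidate.mtimeSeconds > existing.mtimeSeconds) {
                // Candidate sorts before this entry, so no later (older) entry
                // can be a duplicate. Stop searching and insert here.
                List<Pack> next = new ArrayList<>(sorted);
                next.add(i, candidate);
                return next;
            }
        }
        // Candidate is older than (or ties with) every entry: append at the end.
        List<Pack> next = new ArrayList<>(sorted);
        next.add(candidate);
        return next;
    }

    public static void main(String[] args) {
        List<Pack> packs = new ArrayList<>();
        packs = insertOnce(packs, new Pack("pack-a.pack", 100));
        packs = insertOnce(packs, new Pack("pack-b.pack", 200));
        packs = insertOnce(packs, new Pack("pack-b.pack", 200)); // duplicate, ignored
        for (Pack p : packs) {
            System.out.println(p.name + " " + p.mtimeSeconds);
        }
    }
}
```

In the common case the new pack is the newest, so the loop exits on the first comparison and the insert lands at the head of the list. Returning a new list instead of mutating the old one is a choice made for this sketch so that readers iterating the old list are not disturbed.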

8 commits (27ee3342136a588adbc1eee4b333179d8f6f1aa7)