
ObjectDirectoryTest.java 8.5KB

Fix concurrent creation of fan-out object directories

If multiple threads attempted to insert loose objects into the same new fan-out directory, the creation of that directory was subject to a race condition that could lead to an unnecessary IOException being thrown - because an inserter could not 'create' a directory that had just been generated by a different thread.

All we require is that the directory does indeed *exist*, so not being able to _create_ it is not actually a fatal problem. Setting 'skipExisting' to 'true' on the call to mkdir() fixes the issue.

I found this issue as a real world occurrence while working on The BFG Repo Cleaner (https://github.com/rtyley/bfg-repo-cleaner), a tool which concurrently performs a lot of object creation.

In order to demonstrate the problem here I've added a small test case which reliably reproduces the issue on the few different hardware systems I've tried.

The error thrown when the race-condition arises is this:

java.io.IOException: Creating directory /home/roberto/repo.git/objects/e6 failed
  at org.eclipse.jgit.util.FileUtils.mkdir(FileUtils.java:182)
  at org.eclipse.jgit.storage.file.ObjectDirectory.insertUnpackedObject(ObjectDirectory.java:590)
  at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insertOneObject(ObjectDirectoryInserter.java:113)
  at org.eclipse.jgit.storage.file.ObjectDirectoryInserter.insert(ObjectDirectoryInserter.java:91)
  at org.eclipse.jgit.lib.ObjectInserter.insert(ObjectInserter.java:329)

Change-Id: I88eac49bc600c56ba9ad290e6133d8a7113125ab
11 years ago
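
The race described in this commit message can be illustrated with a minimal sketch. It is not JGit's FileUtils.mkdir() and does not use the JGit API; the class name FanOutMkdirSketch and the helpers ensureDirectory and raceToCreate are hypothetical names introduced only for this illustration. The idea is the same as 'skipExisting': a failed mkdir() is tolerated as long as the directory exists afterwards.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FanOutMkdirSketch {

    // Hypothetical helper: succeed whenever the directory exists afterwards,
    // no matter which thread actually created it.
    static void ensureDirectory(File dir) throws IOException {
        if (!dir.mkdir() && !dir.isDirectory()) {
            throw new IOException("Creating directory " + dir + " failed");
        }
    }

    // Starts `threads` concurrent attempts to create the same directory and
    // returns how many of them ended with an IOException.
    static int raceToCreate(File dir, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Boolean>> results = new ArrayList<>();
        try {
            for (int i = 0; i < threads; i++) {
                results.add(pool.submit(() -> {
                    start.await(); // release all workers at the same moment
                    try {
                        ensureDirectory(dir);
                        return true;
                    } catch (IOException e) {
                        return false;
                    }
                }));
            }
            start.countDown();
            int failures = 0;
            for (Future<Boolean> result : results) {
                if (!result.get()) {
                    failures++;
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        File objects = Files.createTempDirectory("objects").toFile();
        File fanOut = new File(objects, "e6"); // fan-out name taken from the stack trace
        int failures = raceToCreate(fanOut, 8);
        System.out.println("failures=" + failures + " exists=" + fanOut.isDirectory());
    }
}
```

Treating "already exists" as success makes the operation idempotent, so the losing threads in the race simply continue with their insert instead of failing.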
Fix FileSnapshot.isModified

FileSnapshot.isModified may have reported a file to be clean although it was actually dirty.

Imagine you have a FileSnapshot on file f. lastModified and lastRead are both t0. Now time is t1 and you

1) modify the file
2) update the FileSnapshot of the file (lastModified=t1, lastRead=t1)
3) modify the file again
4) wait 3 seconds
5) ask the FileSnapshot whether the file is dirty or not. It erroneously answered it's clean.

Any file which had been modified more than 2.5 seconds ago was reported to be clean. As the test shows, that's not always correct.

The real-world problem fixed by this change is the following:

* A Gerrit server using JGit to serve git repositories is processing fetch requests while simultaneously a native git garbage collection runs on the repo.
* At time t1 native git writes temporary files in the pack folder, setting the mtime of the pack folder to t1.
* A fetch request causes JGit to search for new packfiles and JGit remembers this scan in a FileSnapshot on the pack folder. Since the gc is not finished JGit doesn't see any new packfiles.
* The fetch is processed and the gc ends while the filesystem timer is still t1. GC writes a new packfile and deletes the old packfile.
* 3 seconds later another request arrives. JGit does not yet know about the new packfile but is also not rescanning the pack folder because it cached that the last scan happened at time t1 and the pack folder's mtime is also t1. Now JGit will not be able to resolve any object contained in this new pack. This behavior may be persistent if objects referenced by the refs/meta/config branch are affected, so Gerrit can't read permissions stored in the refs/meta/config branch anymore and will not allow any pushes anymore. The pack folder will not change its mtime and therefore no rescan will take place.

Change-Id: I3efd0ccffeb97b01207dc3e7a6b85c6b06928fad
Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
7 years ago
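
The five-step scenario in this commit message can be modelled with plain timestamps. The following is a simplified sketch under stated assumptions, not JGit's FileSnapshot code: the class RacySnapshotSketch, both method names, and the constant RACY_WINDOW_MS (set from the "2.5 seconds" mentioned above) are hypothetical. It contrasts a check that trusts an unchanged mtime once the file is old enough relative to the current time with a more conservative check that trusts it only if the snapshot itself was read well after the mtime.

```java
public class RacySnapshotSketch {

    // Assumed racy window, taken from the "2.5 seconds" in the commit message.
    static final long RACY_WINDOW_MS = 2500;

    final long lastModified; // mtime observed when the snapshot was taken
    final long lastRead;     // clock time at which the snapshot was taken

    RacySnapshotSketch(long lastModified, long lastRead) {
        this.lastModified = lastModified;
        this.lastRead = lastRead;
    }

    // Faulty behaviour as described: an unchanged mtime counts as clean
    // as soon as it is older than the window relative to "now".
    boolean isModifiedByAge(long currentMtime, long now) {
        if (currentMtime != lastModified) {
            return true;
        }
        return now - lastModified <= RACY_WINDOW_MS;
    }

    // Conservative behaviour: an unchanged mtime counts as clean only if
    // the snapshot was read more than the window after that mtime.
    boolean isModifiedByReadTime(long currentMtime) {
        if (currentMtime != lastModified) {
            return true;
        }
        return lastRead - lastModified <= RACY_WINDOW_MS;
    }

    public static void main(String[] args) {
        long t1 = 10_000L;
        // Steps 1 and 2: file modified and snapshot updated within the same tick.
        RacySnapshotSketch snapshot = new RacySnapshotSketch(t1, t1);
        // Step 3: second modification in the same tick, so the mtime stays t1.
        long currentMtime = t1;
        // Steps 4 and 5: ask three seconds later.
        long now = t1 + 3_000L;
        System.out.println("age-based check says modified: "
                + snapshot.isModifiedByAge(currentMtime, now));
        System.out.println("read-time check says modified: "
                + snapshot.isModifiedByReadTime(currentMtime));
    }
}
```

In this model the age-based check reports the file as clean after three seconds although it was changed after the snapshot, while the read-time check keeps reporting it as possibly modified because the snapshot was taken inside the racy window.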
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220
/*
 * Copyright (C) 2012, Roberto Tyley <roberto.tyley@gmail.com>
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 * notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 * copyright notice, this list of conditions and the following
 * disclaimer in the documentation and/or other materials provided
 * with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 * names of its contributors may be used to endorse or promote
 * products derived from this software without specific prior
 * written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */
package org.eclipse.jgit.internal.storage.file;

import static java.nio.charset.StandardCharsets.UTF_8;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertThrows;
import static org.junit.Assert.assertTrue;

import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.text.MessageFormat;
import java.util.Collection;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.eclipse.jgit.internal.JGitText;
import org.eclipse.jgit.junit.RepositoryTestCase;
import org.eclipse.jgit.lib.ConfigConstants;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.revwalk.RevCommit;
import org.eclipse.jgit.storage.file.FileBasedConfig;
import org.eclipse.jgit.util.FS;
import org.junit.Assume;
import org.junit.Test;

public class ObjectDirectoryTest extends RepositoryTestCase {

	@Test
	public void testConcurrentInsertionOfBlobsToTheSameNewFanOutDirectory()
			throws Exception {
		ExecutorService e = Executors.newCachedThreadPool();
		// repeat with a fresh repository each time to make the race likely
		for (int i = 0; i < 100; ++i) {
			ObjectDirectory dir = createBareRepository().getObjectDatabase();
			for (Future<ObjectId> f : e
					.invokeAll(blobInsertersForTheSameFanOutDir(dir))) {
				// get() rethrows any exception raised by an inserter
				f.get();
			}
		}
	}

	/**
	 * Test packfile scanning while a gc is done from the outside (different
	 * process or different Repository instance). This situation occurs e.g. if
	 * a gerrit server is serving fetch requests while native git is doing a
	 * garbage collection. The test shows that when core.trustfolderstat==true
	 * JGit may fail to detect that a new packfile was created. This situation
	 * is persistent until a new full rescan of the pack directory is triggered.
	 *
	 * The test works with two Repository instances working on the same disk
	 * location. One (db) for all write operations (creating commits, doing gc)
	 * and another one (receivingDB) which just reads and which in the end shows
	 * the bug.
	 *
	 * @throws Exception
	 */
	@Test
	public void testScanningForPackfiles() throws Exception {
		ObjectId unknownID = ObjectId
				.fromString("c0ffee09d0b63d694bf49bc1e6847473f42d4a8c");
		GC gc = new GC(db);
		gc.setExpireAgeMillis(0);
		gc.setPackExpireAgeMillis(0);

		// the default repo db is used to create the objects and to run gc.
		// The receivingDB repo only reads
		try (FileRepository receivingDB = new FileRepository(
				db.getDirectory())) {
			// set trustfolderstat to true. If set to false the test always
			// succeeds.
			FileBasedConfig cfg = receivingDB.getConfig();
			cfg.setBoolean(ConfigConstants.CONFIG_CORE_SECTION, null,
					ConfigConstants.CONFIG_KEY_TRUSTFOLDERSTAT, true);
			cfg.save();

			// set up a repo which has at least one pack file and trigger
			// scanning of the packs directory
			ObjectId id = commitFile("file.txt", "test", "master").getId();
			gc.gc();
			assertFalse(receivingDB.getObjectDatabase().has(unknownID));
			assertTrue(receivingDB.getObjectDatabase().hasPackedObject(id));

			// preparations
			File packsFolder = receivingDB.getObjectDatabase()
					.getPackDirectory();
			// prepare creation of a temporary file in the pack folder. This
			// simulates a native git gc which has started to write
			// temporary files but has not yet finished
			File tmpFile = new File(packsFolder, "1.tmp");
			RevCommit id2 = commitFile("file.txt", "test2", "master");
			// wait until the filesystem timer ticks. This raises the
			// probability that the next statements are executed in the same
			// tick of the filesystem timer
			fsTick(null);

			// create a temp file in the packs folder and trigger a rescan of
			// the packs folder. This lets receivingDB think it has scanned the
			// packs folder at the current fs timestamp t1. The following gc
			// will create new files which have the same timestamp t1 but this
			// will not update the mtime of the packs folder. Because of that
			// JGit will not rescan the packs folder later on and fails to see
			// the pack file created during gc.
			assertTrue(tmpFile.createNewFile());
			assertFalse(receivingDB.getObjectDatabase().has(unknownID));

			// trigger a gc. This will create packfiles which likely have the
			// same mtime as the pack folder
			gc.gc();

			// To deal with racy-git situations JGit's FileSnapshot class will
			// report a file/folder potentially dirty if
			// cachedLastReadTime-cachedLastModificationTime < filesystem
			// timestamp resolution. This causes JGit to always rescan a file
			// after modification. But: this was true only if the difference
			// between current system time and cachedLastModification time was
			// less than 2500ms. If the modification is more than 2500ms ago we
			// may have reported a file/folder to be clean although it has not
			// been rescanned. A bug. To show the bug we sleep for more than
			// 2500ms
			Thread.sleep(2600);

			File[] ret = packsFolder.listFiles(
					(File dir, String name) -> name.endsWith(".pack"));
			assertTrue(ret != null && ret.length == 1);
			FS fs = db.getFS();
			// only meaningful if the temp file and the new packfile share
			// the same mtime; otherwise skip the remaining assertions
			Assume.assumeTrue(fs.lastModifiedInstant(tmpFile)
					.equals(fs.lastModifiedInstant(ret[0])));
			// all objects are in a new packfile but we will not detect it
			assertFalse(receivingDB.getObjectDatabase().has(unknownID));
			assertTrue(receivingDB.getObjectDatabase().has(id2));
		}
	}

	@Test
	public void testShallowFile()
			throws Exception {
		FileRepository repository = createBareRepository();
		ObjectDirectory dir = repository.getObjectDatabase();

		String commit = "d3148f9410b071edd4a4c85d2a43d1fa2574b0d2";
		try (PrintWriter writer = new PrintWriter(
				new File(repository.getDirectory(), Constants.SHALLOW),
				UTF_8.name())) {
			writer.println(commit);
		}
		Set<ObjectId> shallowCommits = dir.getShallowCommits();
		assertTrue(shallowCommits.remove(ObjectId.fromString(commit)));
		assertTrue(shallowCommits.isEmpty());
	}

	@Test
	public void testShallowFileCorrupt() throws Exception {
		FileRepository repository = createBareRepository();
		ObjectDirectory dir = repository.getObjectDatabase();

		String commit = "X3148f9410b071edd4a4c85d2a43d1fa2574b0d2";
		try (PrintWriter writer = new PrintWriter(
				new File(repository.getDirectory(), Constants.SHALLOW),
				UTF_8.name())) {
			writer.println(commit);
		}
		assertThrows(
				MessageFormat.format(JGitText.get().badShallowLine, commit),
				IOException.class, () -> dir.getShallowCommits());
	}

	private Collection<Callable<ObjectId>> blobInsertersForTheSameFanOutDir(
			final ObjectDirectory dir) {
		Callable<ObjectId> callable = () -> dir.newInserter()
				.insert(Constants.OBJ_BLOB, new byte[0]);
		return Collections.nCopies(4, callable);
	}

}
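
The race exercised by testConcurrentInsertionOfBlobsToTheSameNewFanOutDirectory can also be sketched without JGit. The example below is a minimal illustration using only java.io.File, not JGit's FileUtils.mkdir: the class `FanOutDirSketch` and its methods are hypothetical, and the directory name "e6" is borrowed from the stack trace in the commit message purely as an example. The idea it shows is the one the fix relies on: failing to create the directory is not an error as long as the directory exists afterwards.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Collections;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of tolerant directory creation; not JGit code. */
final class FanOutDirSketch {

    /** Succeeds if the directory exists afterwards, whoever created it. */
    static void ensureDirectory(File dir) throws IOException {
        // mkdir() returns false when another thread created the directory
        // first, so only fail if the directory is really missing.
        if (!dir.mkdir() && !dir.isDirectory()) {
            throw new IOException("Creating directory " + dir + " failed");
        }
    }

    /** Lets several threads create the same new directory at once. */
    static boolean raceOnce(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            File parent = Files.createTempDirectory("objects").toFile();
            File fanOut = new File(parent, "e6");
            Callable<Void> task = () -> {
                ensureDirectory(fanOut);
                return null;
            };
            for (Future<Void> f : pool
                    .invokeAll(Collections.nCopies(threads, task))) {
                f.get(); // rethrows a failure from any of the tasks
            }
            return fanOut.isDirectory();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `FanOutDirSketch.raceOnce(4)` in a loop mirrors the structure of the test above: four identical callables target one directory that does not exist yet, and every one of them is expected to complete without an exception.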