Yunjie Li [Wed, 29 Apr 2020 22:27:47 +0000 (15:27 -0700)]
PackBitmapIndex: Set distance threshold
Setting the distance threshold to 2000 in PackWriterBitmapPreparer to
reduce memory usage in garbage collection. When the threshold is 0, GC
for the msm repository would use about 37 GB memory to complete. After
setting it to 2000, GC can finish in 75 min with about 10 GB memory.
Change-Id: I39783eeecbae58261c883735499e61ee1cac75fe Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Thu, 23 Apr 2020 22:12:15 +0000 (15:12 -0700)]
PackBitmapIndex: Not buffer inflated bitmap in BasePackBitmapIndex
Currently we're buffering the inflated bitmap entry in BasePackBitmapIndex
to optimize running time. However, this will use lots of memory during
the construction of the pack bitmap index file which may cause failure of
garbage collection.
The running time didn't increase significantly, if there's any increase,
after removing the buffering here. The report about usage of time/memory
will come in the next commit.
Change-Id: I874503ecc85714acab7ca62a6a7968c2dc0b56b3 Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Thu, 23 Apr 2020 22:11:14 +0000 (15:11 -0700)]
PackBitmapIndex: Remove convertedBitmaps in the Remapper
The convertedBitmaps serves for time-optimization purpose. But it's
actually not saving time much but using lots of memory. So remove the
field here to save memory.
Currently the remapper class is only used in the construction of the
bitmap index file. And during the preparation of the file, we're only
getting bitmaps from the remapper when finding objects accessible from
a commit, so bitmap associated with each commit will only be fetched once
and thus the convertedBitmaps would hardly be read, which means that it's
not saving time.
Change-Id: Ic942a8e485135fb177ec21d09282d08ca6646fdb Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Tue, 11 Feb 2020 19:06:33 +0000 (11:06 -0800)]
PackBitmapIndex: Reduce memory usage in GC
Currently, the garbage collection is consistently failing for some large
repositories in the building bitmap phase, e.g.Linux-MSM project:
https://source.codeaurora.org/quic/la/kernel/msm-3.18
Historically, bitmap index creation happened in 3 phases:
1. Select the commits to which bitmaps should be attached.
2. Create all bitmaps for these commits, stored in uncompressed format
in the PackBitmapIndexBuilder.
3. Deltify the bitmaps and write them to disk.
We investigated the process. For phase 2 it's most efficient to create
bitmaps starting with oldest commit and moving to the newest commit,
because the newer commits are able to reuse the work for the old ones.
But for bitmap deltification in phase 3, it's better when a newer
commit's bitmap is the base, and the current disk format writes bitmaps
out for the newest commits first.
This change introduces a new collection to hold the deltified and
compressed representations of the bitmaps, keeping a smaller subset of
commits in the PackBitmapIndexBuilder to help make the bitmap index
creation more memory efficient.
And in this commit, we're setting DISTANCE_THRESHOLD to 0 in the
PackWriterBitmapPreparer, which means the garbage collection will not
have much behavoir change and will still use as much memory as before.
Change-Id: I6ec2c3e8dde11805af47874d67d33cf1ef83660e Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Tue, 11 Feb 2020 01:02:13 +0000 (17:02 -0800)]
PackBitmapIndex: Add AddToBitmapWithCacheFilter class
Add a new revwalk filter, AddToBitmapWithCachedFilter. This filter updates
a client-provided {@code BitmapBuilder} as a side effect of a revwalk.
Similar to {@code AddToBitmapFilter}, it short circuits the walk when it
encounters a commit which is included in the provided bitmap's BitmapIndex.
It also short circuits the walk if it encounters the client-provided
cached commit.
Change-Id: I62cb503016f4d3995d648d92b82baab7f93549a9 Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Wed, 22 Apr 2020 20:12:05 +0000 (13:12 -0700)]
PackBitmapIndex: Move BitmapCommit to a top-level class
Move BitmapCommit from inside the PackWriterBitmapPreparer to a new
top-level class in preparation for improving the memory footprint of GC's
bitmap generation phase.
Change-Id: I4d404a5b3a34998b441d23105197f33d32d39670 Signed-off-by: Yunjie Li <yunjieli@google.com>
Yunjie Li [Mon, 10 Feb 2020 23:22:31 +0000 (15:22 -0800)]
Refactor: Make retriveCompressed an method of the Bitmap class
Make retrieveCompressed() a method of Bitmap interface to avoid type
casting and later reuse in improving the memory footprint of GC's bitmap
generation phase.
Change-Id: I098d85105cf17af845d43b8c71b4ca48b02fd7da Signed-off-by: Yunjie Li <yunjieli@google.com>
David Ostrovsky [Fri, 17 Apr 2020 22:00:32 +0000 (00:00 +0200)]
Bazel: Disable SecurityManagerMissingPermissionsTest test
In Id5376f09f0d a test with dependency on log4j library was added, but
the library was missed to be added to the Bazel build tool chain.
Given that Bazel test runner doesn't suport custom security manager the
test wouldn't pass even if the missing dependency would be added. The
only solution we have for now is to exclude that test from Bazel tool
chain.
Filed a feature request for bazel to support such tests at
https://github.com/bazelbuild/bazel/issues/11146
Bug: 562274
Change-Id: I873a0e09addc583455b68122f66cd3952e485f0e Signed-off-by: David Ostrovsky <david@ostrovsky.org> Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Michael Keppler [Tue, 14 Apr 2020 07:31:50 +0000 (09:31 +0200)]
Remove double blank from sentence start
Multiple whitespaces are not normalized when reading properties files,
therefore leading to unwanted space/indentation in console or UI output.
Change-Id: I1f5224fe359e0cac493e0237872afc75dc8b9fbe Signed-off-by: Michael Keppler <Michael.Keppler@gmx.de> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
(cherry picked from commit ebbc3efce73278d6e0dbb1acd099db2446b1bed9)
Thomas Wolf [Thu, 2 Apr 2020 19:15:18 +0000 (21:15 +0200)]
FS.runInShell(): handle quoted filters and hooksPath containing blanks
Revert commit 2323d7a. Using $0 in the shell command call results in
the command string being taken literally. That was introduced to fix
a problem with backslashes, but is actually not correct.
First, the problem with backslashes occurred only on Win32/Cygwin,
and has been properly fixed in commit 6f268f8.
Second, this is used only for hooks (which don't have backslashes in
their names) and filter commands from the git config, where the user
is responsible for properly quoting or escaping such that the commands
work.
Third, using $0 actually breaks correctly quoted filter commands
like in the bug report. The shell really takes the command literally,
and then doesn't find the command because of quotes.
So revert this change.
At the same time there's a related problem with hooks. If the path to
the hook contains blanks, runInShell() would also fail to find the
hook. In this case, the command doesn't come from user input but is
just a Java File object with an absolute path containing blanks. (Can
occur if core.hooksPath points to such a path with blanks, or if the
repository has such a path.)
The path to the hook as obtained from the file system must be quoted.
Matthias Sohn [Sun, 29 Mar 2020 11:08:26 +0000 (13:08 +0200)]
Define constants for pack config option keys
Change-Id: Ifb8227cb62370029d6774f2a22b15d6478c713ca Signed-off-by: Matthias Sohn <matthias.sohn@sap.com> Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
Masaya Suzuki [Tue, 24 Mar 2020 18:50:34 +0000 (11:50 -0700)]
ReceivePack: Use error message if set
ReceiveCommand can have an error message. This is shown only for some
cases even if it's set. This change uses the error message if it's set,
and fallback to the default message if unset.
Thomas Wolf [Wed, 25 Mar 2020 08:13:20 +0000 (09:13 +0100)]
Handle non-normalized index also for executable files
Commit 60cf85a4 corrected the handling of check-in for files where
the index version is non-normalized, i.e., contains CR-LF line endings.
However, it did so only for regular files, not executable files.
Bug: 561438
Change-Id: I372cc990c5efeb00315460f36459c0652d5d1e77 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Ivan Frade [Wed, 18 Mar 2020 05:29:59 +0000 (22:29 -0700)]
ResolveMerger: Ignore merge conflicts if asked so
The recursive merge strategy builds a virtual ancestor merging
recursively the common bases (when more than one) between the
want-to-merge commits. While building this virtual ancestor, content
conflicts are ignored, but current code doesn't do so when a file is
removed.
This was spotted in [1], for example. Merging two commits to build the
virtual ancestor bumped into a conflict (modified in one side, deleted
in the other) that stopped the process.
Follow the "spec" and in case of conflict leave the unmerged content in
the index and working trees.
Alex Spradlin [Thu, 12 Mar 2020 16:04:36 +0000 (09:04 -0700)]
RevWalk: fix bad topo flags error message
The error message for an Exception thrown by StartGenerator when given
both the TOPO flag and the TOPO_KEEP_BRANCH_TOGETHER flag mentions a
non-existent flag, TOPO_NON_INTERMIX. The error message was introduced
in commit e498d43.
Replace TOPO_NON_INTERMIX with TOPO_KEEP_BRANCH_TOGETHER in the error
message of an Exception thrown by the StartGenerator when the TOPO flag
is provided together with the TOPO_KEEP_BRANCH_TOGETHER flag.
Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Change-Id: Id24640dc08e96a196508fe38ce144aa7e035082f
Alex Spradlin [Wed, 26 Feb 2020 23:31:37 +0000 (15:31 -0800)]
RevWalk: new topo sort to not mix lines of history
The topological sort algorithm in TopoSortGenerator for RevWalk may mix
multiple lines of history, producing results that differ from C git's
git-log whose man page states: "Show no parents before all of its
children are shown, and avoid showing commits on multiple lines of
history intermixed." Lines of history are mixed because
TopoSortGenerator merely delays producing a commit until all of its
children have been produced; it does not immediately produce a commit
after its last child has been produced.
Therefore, add a new RevSort option called TOPO_KEEP_BRANCH_TOGETHER
with a new topo sort algorithm in TopoNonIntermixGenerator. In the
Generator, when the last child of a commit has been produced, unpop
that commit so that it will be returned upon the subsequent call to
next(). To avoid producing duplicates, mark commits that have not yet
been produced as TOPO_QUEUED so that when a commit is popped, it is
produced if and only if TOPO_QUEUED is set.
To support nesting with other generators that may produce the same
commit multiple times like DepthGenerator (for example, StartGenerator
does this), do not increment parent inDegree for the same child commit
more than once.
Thomas Wolf [Mon, 9 Mar 2020 22:41:37 +0000 (23:41 +0100)]
TransportHttp: support HTTP response 308 Permanent Redirect
RFC 7538[1] added HTTP response code 308, signifying a permanent
redirect that, contrary to the older 301, does not allow changing
the request method from POST to GET.
[1] https://tools.ietf.org/html/rfc7538
Bug: 560936
Change-Id: Ib65f3a3ed75db51d74d1fe81d4abe6fe92b0ca12 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Matthias Sohn [Sat, 7 Mar 2020 20:09:48 +0000 (21:09 +0100)]
Merge branch 'master' into stable-5.7
* master:
Silence API errors introduced by 093fbbd1
Bump Bazel version to 2.2.0
Add validation to hex decoder
Expose FileStoreAttributes.setBackground()
Update reftable storage repo layout
Add 4.14 and 4.15-staging target platforms
Update Orbit to R20200224183213 for final 2020-03
Update Orbit to S20200224183213 for 2020-03 RC1
Cygwin expects forward slashes for commands to be run via sh.exe
[releng] Update year in copyright notices for features
Using for-each loop in jdt
Make Logger instances final
Move array designators from the variable to the type
ObjectWalk: Add null check before skip tree.
Revert "RevWalk: stop mixing lines of history in topo sort"
Do not fail if known hosts file does not contain valid host key
Matthias Sohn [Thu, 5 Mar 2020 14:43:48 +0000 (15:43 +0100)]
Merge branch 'stable-5.6'
* stable-5.6:
Silence API errors introduced by 093fbbd1
Bump Bazel version to 2.2.0
Expose FileStoreAttributes.setBackground()
Update reftable storage repo layout
Change-Id: I237eaaed7991e8bbd56a7624f47bbba985330026 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Alex Blewitt [Tue, 25 Feb 2020 21:14:59 +0000 (21:14 +0000)]
Expose FileStoreAttributes.setBackground()
The FS.setAsyncFileStoreAttributes() static method calls
FileStoreAttributes.setBackground() as its implementation, but there are
other public attributes on this inner class already and there isn't a
real reason why this needs to be private.
By making it public we allow callers to be able to invoke it directly.
Although it doesn't appear that it would make a difference, by calling a
static method on the FS class, all static fields and the transitive
closure of class dependencies must be loaded and initialised, which can
be non-trivial.
Callers referring to FS.setAsyncFileStoreAttributes() may be replaced
with FS.FileStoreAttributes.setBackground() with no change of behaviour
other than improved performance due to less class loading required.
Bug: 560527
Change-Id: I9538acc90da8d18f53fd60d74eb54496857f93a5 Signed-off-by: Alex Blewitt <alex.blewitt@gmail.com>
Han-Wen Nienhuys [Tue, 18 Feb 2020 19:44:10 +0000 (20:44 +0100)]
Update reftable storage repo layout
The change Ic0b974fa (c217d33, "Documentation/technical/reftable:
improve repo layout") defines a new repository layout, which was
agreed with the git-core mailing list.
It addresses the following problems:
* old git clients will not recognize reftable-based repositories, and
look at encompassing directories.
* Poorly written tools might write directly into
.git/refs/heads/BRANCH.
Since we consider JGit reftable as experimental (git-core doesn't
support it yet), we have no backward compatibility. If you created a
repository with reftable between mid-Nov 2019 and now, you can do the
following to convert:
Matthias Sohn [Fri, 28 Feb 2020 22:53:17 +0000 (23:53 +0100)]
Merge branch 'stable-5.6'
* stable-5.6:
Cygwin expects forward slashes for commands to be run via sh.exe
Make Logger instances final
Move array designators from the variable to the type
Change-Id: I9a5dc570deb478525bf48ef526d8cba5b19418bf Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Thomas Wolf [Thu, 27 Feb 2020 19:04:47 +0000 (20:04 +0100)]
Cygwin expects forward slashes for commands to be run via sh.exe
FS_Win32_Cygwin replaces backslashes by / as a side-effect of
relativize(). When support for core.hooksPath was added, paths were
relativized in a different place using Path.resolve(), which doesn't
do that transformation. As a result hooks could not be run on Cygwin
in some cases.
Do the transformation in FS_Win32_Cygwin.runInShell(). In all other
places, File or Path objects are used, which give no guarantee about
the file separator (typically the system-dependent default separator),
so doing the transformation earlier still wouldn't guarantee that
sh.exe indeed gets a command string using forward slashes.
Bug: 558577
Change-Id: I3c07eb85f0ac7c5628a2e92f990e5cdb7ecf532f Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
Thomas Wolf [Thu, 27 Feb 2020 23:32:50 +0000 (00:32 +0100)]
[releng] Update year in copyright notices for features
Bump upper end of range to 2020.
These copyright notices are user-facing; they're visible in several
dialogs in Eclipse. It is strange or even misleading to see a copyright
notice for JGit saying "2005, 2010" when there have been *many*
developments in the past ten years.
Change-Id: Idaa6244b2b3d9cecb29cc690085f8d008195cf12 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>
David Pursehouse [Thu, 27 Feb 2020 11:27:31 +0000 (20:27 +0900)]
Move array designators from the variable to the type
As reported by Sonar Lint:
Array designators should always be located on the type for better code
readability. Otherwise, developers must look both at the type and the
variable name to know whether or not a variable is an array.
Change-Id: If6b41fed3483d0992d402d8680552ab4bef89ffb Signed-off-by: David Pursehouse <david.pursehouse@gmail.com>
PlotWalk uses the TopoSortGenerator, which is causing problems for EGit users
who rely on the emission of commits being somewhat based on date as in the
previous topo-sort algorithm.
Bug: 560529
Change-Id: I3dbd3598a7aeb960de3fc39352699b4f11a8c226 Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Matthias Sohn [Mon, 24 Feb 2020 20:13:22 +0000 (21:13 +0100)]
Merge branch 'master' into stable-5.7
* master:
Update Orbit to S20200219023850 for 2012-03 M3
Revert "Prepend hostname to subsection used to store file timestamp resolution"
Remove use of org.bouncycastle.util.encoders.Hex
Remove use of org.bouncycastle.util.io.TeeOutputStream
Make the IMatcher public API
SimilarityRenameDetector: Fix inconsistent indentation
Use indexOf(char) and lastIndexOf(char) rather than String versions
Reorder modifiers to follow Java Language Specification
GitmoduleEntry: Remove redundant import of class from same package
Remove redundant "static" qualifier from enum declarations
RevWalk: stop mixing lines of history in topo sort
Upgrade plexus-compiler-{eclipse|javac|javac-errorprone} to 2.8.6
Upgrade maven-shade-plugin to 3.2.2
Removed unused imports
Documentation/technical/reftable: improve repo layout
Change-Id: I558eff2abda44342fbaf1662fda07e2bcc6d4ee3 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Sat, 22 Feb 2020 22:29:14 +0000 (23:29 +0100)]
Merge branch 'stable-5.6'
* stable-5.6:
Revert "Prepend hostname to subsection used to store file timestamp resolution"
SimilarityRenameDetector: Fix inconsistent indentation
Use indexOf(char) and lastIndexOf(char) rather than String versions
Reorder modifiers to follow Java Language Specification
GitmoduleEntry: Remove redundant import of class from same package
Remove redundant "static" qualifier from enum declarations
Change-Id: Ibb66bef7e8373f81e3e653c9843d986243446d68 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Resolving the hostname comes with a performance penalty. We no longer
store the timestamp resolution in the global git config which might be
copied around to other machines but in a dedicated jgit config meant for
automatically determined options like timestamp resolution. Hence there
is no strong reason anymore to have a hardware specific identifier in
the subsection name of file timestamp resolution options.
Bug: 560414
Change-Id: If8dcabe981eb1792db84643850faa6033f14b1cf Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Matthias Sohn [Wed, 19 Feb 2020 00:44:18 +0000 (01:44 +0100)]
Merge branch 'stable-5.6' into stable-5.7
# By Matthias Sohn (3) and Han-Wen Nienhuys (1)
* stable-5.6:
Update API problem filter
Prepare 5.6.2-SNAPSHOT builds
JGit v5.6.1.202002131546-r
Simplify ReftableCompactor
Change-Id: I16ed174f9fc662934c3ebaea85a60690efbed1c6 Signed-off-by: Matthias Sohn <matthias.sohn@sap.com>
Alex Spradlin [Tue, 11 Feb 2020 21:27:51 +0000 (13:27 -0800)]
RevWalk: stop mixing lines of history in topo sort
The topological sort algorithm in TopoSortGenerator for RevWalk may mix
multiple lines of history, producing results that differ from C git's
git log whose man page states: "Show no parents before all of its
children are shown, and avoid showing commits on multiple lines of
history intermixed." Lines of history are mixed because
TopoSortGenerator merely delays a commit until all of its children have
been produced; it does not immediately produce a commit after its last
child has been produced.
Therefore, when the last child of a commit has been produced, unpop the
commit so that it will be returned upon the subsequent call to next() in
TopoSortGenerator. To avoid producing duplicates, mark commits that
have not yet been produced as TOPO_QUEUED so that when a commit is
popped, it is produced if and only if TOPO_QUEUED is set.
To support nesting with other generators that may produce the same
commit multiple times like DepthGenerator (for example, StartGenerator
does this), do not increment parent inDegree for the same child commit
more than once.
Modify tests that assert that TopoSortGenerator mixes lines of commit
history.
Change-Id: I4ee03c7a8e5265d61230b2a01ae3858745b2432b Signed-off-by: Alex Spradlin <alexaspradlin@google.com>
Thomas Wolf [Sat, 15 Feb 2020 16:35:42 +0000 (17:35 +0100)]
Merge branch 'master' into stable-5.7
* master:
Use lambdas where possible
Upgrade maven-pmd-plugin to 3.13.0
Include org.apache.commons.codec 1.13 in the JGit http.apache.feature
Update Maven plugins
AmazonS3: Speed up fetch, try recent packs first
Update orbit to S20200128200239 for 2020-03 M2
FS: re-order fields and use all uppercase for true constants
FS: Don't use innocuous threads for CompletableFuture
ErrorProne: Enable and fix UnusedException check
Update Orbit to I20200126235943 and org.junit to 4.13.0.v20200126-2018
Fix SshSessionFactory#setInstance to use service loader
Use ServiceLoader to define the default SSH session factory.
Remove Error-Prone ExpectedExceptionChecker
ReceivePack: enable overriding filterCommands and executeCommands
Replace ExpectedException which was deprecated in junit 4.13
Add org.assertj 3.14.0.v20200120-1926 to target platform
Replace deprecated junit assertion methods with hamcrest
Update to Orbit I20200120214610 and JUnit to 4.13
Update to Tycho 1.6.0
Extract method refactoring in class DirCacheCheckout
Update Orbit to I20200115225246 and update dependencies
Update bazlets and bazel version
Change-Id: Ib6cd6593484cd79def9d5298181411189575c9f7 Signed-off-by: Thomas Wolf <thomas.wolf@paranor.ch>