
PackFileTest.java 13KB

Increase core.streamFileThreshold default to 50 MiB

Projects like org.eclipse.mdt contain large XML files about 6 MiB in size. So does the Android project platform/frameworks/base. Doing a clone of either project with JGit takes forever to check out the files into the working directory, because delta decompression tends to be very expensive as we need to constantly reposition the base stream for each copy instruction. This can be made worse by a very bad ordering of offsets, possibly due to an XML editor that doesn't preserve the order of elements in the file very well.

Increasing the threshold to the same limit PackWriter uses when doing delta compression (50 MiB) permits a default-configured JGit to decompress these XML file objects using the faster random-access arrays, rather than re-seeking through an inflate stream, significantly reducing checkout time after a clone.

Since this new limit may be dangerously close to the JVM maximum heap size, every allocation attempt is now wrapped in a try/catch so that JGit can degrade by switching to the large object stream mode when the allocation is refused. It will run slower, but the operation will still complete.

The large stream mode will run very well for big objects that aren't delta compressed, and is acceptable for delta compressed objects that are using only forward referencing copy instructions. Copies using prior offsets are still going to be horrible, and there is nothing we can do about it except increase core.streamFileThreshold.

We might in the future want to consider changing the way the delta generators work in JGit and native C Git to avoid prior offsets once an object reaches a certain size, even if that causes the delta instruction stream to be slightly larger. Unfortunately native C Git won't want to do that until it's also able to stream objects rather than malloc them as contiguous blocks.

Change-Id: Ief7a3896afce15073e80d3691bed90c6a3897307
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
13 years ago
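The threshold the commit raises is the same knob this test exercises through WindowCacheConfig in its setUp() below. A minimal sketch of overriding the default programmatically, assuming only the classes the test itself already uses (the class name StreamThresholdExample is ours):

import org.eclipse.jgit.storage.file.WindowCacheConfig;

public class StreamThresholdExample {
	public static void main(String[] args) {
		// Objects at or below the threshold are inflated into a byte[]
		// for fast random access; larger objects fall back to streaming.
		WindowCacheConfig cfg = new WindowCacheConfig();
		cfg.setStreamFileThreshold(50 * 1024 * 1024); // the new 50 MiB default
		cfg.install(); // applies process-wide, as setUp() does below
	}
}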
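The degrade-on-refused-allocation behavior the message describes is easiest to see as a pattern. The sketch below is illustrative only; the class and method names are ours, not JGit internals:

import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class DegradeSketch {
	static final int STREAM_FILE_THRESHOLD = 50 * 1024 * 1024;

	// Hypothetical loader: try the fast whole-object array first; if the
	// JVM refuses the allocation, fall back to the large object stream
	// mode rather than failing the whole checkout.
	static InputStream open(long size, InputStream packStream) {
		if (size <= STREAM_FILE_THRESHOLD) {
			try {
				byte[] buf = new byte[(int) size]; // may exceed available heap
				// ... inflate the object into buf here ...
				return new ByteArrayInputStream(buf);
			} catch (OutOfMemoryError noHeap) {
				// degrade: slower, but the operation still completes
			}
		}
		return packStream; // bounded-memory streaming path
	}
}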
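The forward versus prior-offset distinction is visible in DeltaEncoder, which the test below uses to build its delta chains. A sketch of both cases (forward mirrors the test's own delta() helper; the class name CopyOrderSketch and the 8-byte minimum base assumption are ours):

import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.eclipse.jgit.internal.storage.pack.DeltaEncoder;

public class CopyOrderSketch {
	// Forward-referencing delta: copies walk the base front-to-back,
	// the cheap case for the large object stream mode.
	static byte[] forward(byte[] base, byte[] dest) throws IOException {
		ByteArrayOutputStream tmp = new ByteArrayOutputStream();
		DeltaEncoder de = new DeltaEncoder(tmp, base.length, dest.length);
		de.insert(dest, 0, 1); // literal first byte
		de.copy(1, base.length - 1); // single forward copy of the rest
		return tmp.toByteArray();
	}

	// Prior-offset delta: the second copy jumps back to offset 0, forcing
	// the reader to reposition the base stream, the slow case the commit
	// message calls out (assumes base.length >= 8).
	static byte[] priorOffset(byte[] base) throws IOException {
		ByteArrayOutputStream tmp = new ByteArrayOutputStream();
		DeltaEncoder de = new DeltaEncoder(tmp, base.length, base.length);
		de.copy(base.length - 4, 4); // tail first
		de.copy(0, base.length - 4); // then back to the start
		return tmp.toByteArray();
	}
}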
/*
 * Copyright (C) 2010, Google Inc.
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

package org.eclipse.jgit.internal.storage.file;

import static org.junit.Assert.assertArrayEquals;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.text.MessageFormat;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.zip.Deflater;

import org.eclipse.jgit.errors.LargeObjectException;
import org.eclipse.jgit.internal.JGitText;
import org.eclipse.jgit.internal.storage.pack.DeltaEncoder;
import org.eclipse.jgit.internal.storage.pack.PackExt;
import org.eclipse.jgit.junit.JGitTestUtil;
import org.eclipse.jgit.junit.LocalDiskRepositoryTestCase;
import org.eclipse.jgit.junit.TestRepository;
import org.eclipse.jgit.junit.TestRng;
import org.eclipse.jgit.lib.Constants;
import org.eclipse.jgit.lib.NullProgressMonitor;
import org.eclipse.jgit.lib.ObjectId;
import org.eclipse.jgit.lib.ObjectInserter;
import org.eclipse.jgit.lib.ObjectLoader;
import org.eclipse.jgit.lib.ObjectStream;
import org.eclipse.jgit.lib.Repository;
import org.eclipse.jgit.revwalk.RevBlob;
import org.eclipse.jgit.storage.file.WindowCacheConfig;
import org.eclipse.jgit.transport.PackParser;
import org.eclipse.jgit.transport.PackedObjectInfo;
import org.eclipse.jgit.util.IO;
import org.eclipse.jgit.util.NB;
import org.eclipse.jgit.util.TemporaryBuffer;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class PackFileTest extends LocalDiskRepositoryTestCase {
	private int streamThreshold = 16 * 1024;

	private TestRng rng;

	private FileRepository repo;

	private TestRepository<Repository> tr;

	private WindowCursor wc;

	private TestRng getRng() {
		if (rng == null)
			rng = new TestRng(JGitTestUtil.getName());
		return rng;
	}

	@Override
	@Before
	public void setUp() throws Exception {
		super.setUp();

		WindowCacheConfig cfg = new WindowCacheConfig();
		cfg.setStreamFileThreshold(streamThreshold);
		cfg.install();

		repo = createBareRepository();
		tr = new TestRepository<>(repo);
		wc = (WindowCursor) repo.newObjectReader();
	}

	@Override
	@After
	public void tearDown() throws Exception {
		if (wc != null)
			wc.close();
		new WindowCacheConfig().install();
		super.tearDown();
	}

	@Test
	public void testWhole_SmallObject() throws Exception {
		final int type = Constants.OBJ_BLOB;
		byte[] data = getRng().nextBytes(300);
		RevBlob id = tr.blob(data);
		tr.branch("master").commit().add("A", id).create();
		tr.packAndPrune();
		assertTrue("has blob", wc.has(id));

		ObjectLoader ol = wc.open(id);
		assertNotNull("created loader", ol);
		assertEquals(type, ol.getType());
		assertEquals(data.length, ol.getSize());
		assertFalse("is not large", ol.isLarge());
		assertTrue("same content", Arrays.equals(data, ol.getCachedBytes()));

		ObjectStream in = ol.openStream();
		assertNotNull("have stream", in);
		assertEquals(type, in.getType());
		assertEquals(data.length, in.getSize());
		byte[] data2 = new byte[data.length];
		IO.readFully(in, data2, 0, data.length);
		assertTrue("same content", Arrays.equals(data2, data));
		assertEquals("stream at EOF", -1, in.read());
		in.close();
	}

	@Test
	public void testWhole_LargeObject() throws Exception {
		final int type = Constants.OBJ_BLOB;
		byte[] data = getRng().nextBytes(streamThreshold + 5);
		RevBlob id = tr.blob(data);
		tr.branch("master").commit().add("A", id).create();
		tr.packAndPrune();
		assertTrue("has blob", wc.has(id));

		ObjectLoader ol = wc.open(id);
		assertNotNull("created loader", ol);
		assertEquals(type, ol.getType());
		assertEquals(data.length, ol.getSize());
		assertTrue("is large", ol.isLarge());
		try {
			ol.getCachedBytes();
			fail("Should have thrown LargeObjectException");
		} catch (LargeObjectException tooBig) {
			assertEquals(MessageFormat.format(
					JGitText.get().largeObjectException, id.name()), tooBig
					.getMessage());
		}

		ObjectStream in = ol.openStream();
		assertNotNull("have stream", in);
		assertEquals(type, in.getType());
		assertEquals(data.length, in.getSize());
		byte[] data2 = new byte[data.length];
		IO.readFully(in, data2, 0, data.length);
		assertTrue("same content", Arrays.equals(data2, data));
		assertEquals("stream at EOF", -1, in.read());
		in.close();
	}

	@Test
	public void testDelta_SmallObjectChain() throws Exception {
		try (ObjectInserter.Formatter fmt = new ObjectInserter.Formatter()) {
			byte[] data0 = new byte[512];
			Arrays.fill(data0, (byte) 0xf3);
			ObjectId id0 = fmt.idFor(Constants.OBJ_BLOB, data0);

			TemporaryBuffer.Heap pack = new TemporaryBuffer.Heap(64 * 1024);
			packHeader(pack, 4);
			objectHeader(pack, Constants.OBJ_BLOB, data0.length);
			deflate(pack, data0);

			byte[] data1 = clone(0x01, data0);
			byte[] delta1 = delta(data0, data1);
			ObjectId id1 = fmt.idFor(Constants.OBJ_BLOB, data1);
			objectHeader(pack, Constants.OBJ_REF_DELTA, delta1.length);
			id0.copyRawTo(pack);
			deflate(pack, delta1);

			byte[] data2 = clone(0x02, data1);
			byte[] delta2 = delta(data1, data2);
			ObjectId id2 = fmt.idFor(Constants.OBJ_BLOB, data2);
			objectHeader(pack, Constants.OBJ_REF_DELTA, delta2.length);
			id1.copyRawTo(pack);
			deflate(pack, delta2);

			byte[] data3 = clone(0x03, data2);
			byte[] delta3 = delta(data2, data3);
			ObjectId id3 = fmt.idFor(Constants.OBJ_BLOB, data3);
			objectHeader(pack, Constants.OBJ_REF_DELTA, delta3.length);
			id2.copyRawTo(pack);
			deflate(pack, delta3);

			digest(pack);
			PackParser ip = index(pack.toByteArray());
			ip.setAllowThin(true);
			ip.parse(NullProgressMonitor.INSTANCE);

			assertTrue("has blob", wc.has(id3));

			ObjectLoader ol = wc.open(id3);
			assertNotNull("created loader", ol);
			assertEquals(Constants.OBJ_BLOB, ol.getType());
			assertEquals(data3.length, ol.getSize());
			assertFalse("is large", ol.isLarge());
			assertNotNull(ol.getCachedBytes());
			assertArrayEquals(data3, ol.getCachedBytes());

			ObjectStream in = ol.openStream();
			assertNotNull("have stream", in);
			assertEquals(Constants.OBJ_BLOB, in.getType());
			assertEquals(data3.length, in.getSize());
			byte[] act = new byte[data3.length];
			IO.readFully(in, act, 0, data3.length);
			assertTrue("same content", Arrays.equals(act, data3));
			assertEquals("stream at EOF", -1, in.read());
			in.close();
		}
	}

	@Test
	public void testDelta_FailsOver2GiB() throws Exception {
		try (ObjectInserter.Formatter fmt = new ObjectInserter.Formatter()) {
			byte[] base = new byte[] { 'a' };
			ObjectId idA = fmt.idFor(Constants.OBJ_BLOB, base);
			ObjectId idB = fmt.idFor(Constants.OBJ_BLOB, new byte[] { 'b' });

			PackedObjectInfo a = new PackedObjectInfo(idA);
			PackedObjectInfo b = new PackedObjectInfo(idB);

			TemporaryBuffer.Heap pack = new TemporaryBuffer.Heap(64 * 1024);
			packHeader(pack, 2);
			a.setOffset(pack.length());
			objectHeader(pack, Constants.OBJ_BLOB, base.length);
			deflate(pack, base);

			ByteArrayOutputStream tmp = new ByteArrayOutputStream();
			DeltaEncoder de = new DeltaEncoder(tmp, base.length, 3L << 30);
			de.copy(0, 1);
			byte[] delta = tmp.toByteArray();
			b.setOffset(pack.length());
			objectHeader(pack, Constants.OBJ_REF_DELTA, delta.length);
			idA.copyRawTo(pack);
			deflate(pack, delta);
			byte[] footer = digest(pack);

			File dir = new File(repo.getObjectDatabase().getDirectory(),
					"pack");
			File packName = new File(dir, idA.name() + ".pack");
			File idxName = new File(dir, idA.name() + ".idx");

			FileOutputStream f = new FileOutputStream(packName);
			try {
				f.write(pack.toByteArray());
			} finally {
				f.close();
			}

			f = new FileOutputStream(idxName);
			try {
				List<PackedObjectInfo> list = new ArrayList<>();
				list.add(a);
				list.add(b);
				Collections.sort(list);
				new PackIndexWriterV1(f).write(list, footer);
			} finally {
				f.close();
			}

			PackFile packFile = new PackFile(packName, PackExt.INDEX.getBit());
			try {
				packFile.get(wc, b);
				fail("expected LargeObjectException.ExceedsByteArrayLimit");
			} catch (LargeObjectException.ExceedsByteArrayLimit bad) {
				assertNull(bad.getObjectId());
			} finally {
				packFile.close();
			}
		}
	}

	@Test
	public void testConfigurableStreamFileThreshold() throws Exception {
		byte[] data = getRng().nextBytes(300);
		RevBlob id = tr.blob(data);
		tr.branch("master").commit().add("A", id).create();
		tr.packAndPrune();
		assertTrue("has blob", wc.has(id));

		ObjectLoader ol = wc.open(id);
		ObjectStream in = ol.openStream();
		assertTrue(in instanceof ObjectStream.SmallStream);
		assertEquals(300, in.available());
		in.close();

		wc.setStreamFileThreshold(299);
		ol = wc.open(id);
		in = ol.openStream();
		assertTrue(in instanceof ObjectStream.Filter);
		assertEquals(1, in.available());
	}

	private static byte[] clone(int first, byte[] base) {
		byte[] r = new byte[base.length];
		System.arraycopy(base, 1, r, 1, r.length - 1);
		r[0] = (byte) first;
		return r;
	}

	private static byte[] delta(byte[] base, byte[] dest) throws IOException {
		ByteArrayOutputStream tmp = new ByteArrayOutputStream();
		DeltaEncoder de = new DeltaEncoder(tmp, base.length, dest.length);
		de.insert(dest, 0, 1);
		de.copy(1, base.length - 1);
		return tmp.toByteArray();
	}

	private static void packHeader(TemporaryBuffer.Heap pack, int cnt)
			throws IOException {
		final byte[] hdr = new byte[8];
		NB.encodeInt32(hdr, 0, 2);
		NB.encodeInt32(hdr, 4, cnt);
		pack.write(Constants.PACK_SIGNATURE);
		pack.write(hdr, 0, 8);
	}

	private static void objectHeader(TemporaryBuffer.Heap pack, int type, int sz)
			throws IOException {
		byte[] buf = new byte[8];
		int nextLength = sz >>> 4;
		buf[0] = (byte) ((nextLength > 0 ? 0x80 : 0x00) | (type << 4) | (sz & 0x0F));
		sz = nextLength;
		int n = 1;
		while (sz > 0) {
			nextLength >>>= 7;
			buf[n++] = (byte) ((nextLength > 0 ? 0x80 : 0x00) | (sz & 0x7F));
			sz = nextLength;
		}
		pack.write(buf, 0, n);
	}

	private static void deflate(TemporaryBuffer.Heap pack, final byte[] content)
			throws IOException {
		final Deflater deflater = new Deflater();
		final byte[] buf = new byte[128];
		deflater.setInput(content, 0, content.length);
		deflater.finish();
		do {
			final int n = deflater.deflate(buf, 0, buf.length);
			if (n > 0)
				pack.write(buf, 0, n);
		} while (!deflater.finished());
		deflater.end();
	}

	private static byte[] digest(TemporaryBuffer.Heap buf)
			throws IOException {
		MessageDigest md = Constants.newMessageDigest();
		md.update(buf.toByteArray());
		byte[] footer = md.digest();
		buf.write(footer);
		return footer;
	}

	private ObjectInserter inserter;

	@After
	public void release() {
		if (inserter != null) {
			inserter.close();
		}
	}

	private PackParser index(byte[] raw) throws IOException {
		if (inserter == null)
			inserter = repo.newObjectInserter();
		return inserter.newPackParser(new ByteArrayInputStream(raw));
	}
}