
FileSender.java 6.9KB

Don't use interruptable pread() to access pack files

The J2SE NIO APIs require that FileChannel close the underlying file descriptor if a thread is interrupted while it is inside of a read or write operation on that channel. This is insane, because it means we cannot share the file descriptor between threads. If a thread is in the middle of the FileChannel variant of IO.readFully() and it receives an interrupt, the pack will be automatically closed on us. This causes the other threads trying to use that same FileChannel to receive IOExceptions, which leads to the pack getting marked as invalid. Once the pack is marked invalid, JGit loses access to its entire contents and starts to report MissingObjectExceptions.

Because PackWriter must ensure that the chosen pack file stays available until the current object's data is fully copied to the output, JGit cannot simply reopen the pack when it's automatically closed due to an interrupt being sent at the wrong time. The pack may have been deleted by a concurrent `git gc` process, and that open file descriptor might be the last reference to the inode on disk. Once it's closed, the PackWriter loses access to that object representation, and it cannot complete sending the object to the client.

Fortunately, RandomAccessFile's readFully method does not have this problem. Interrupts during readFully() are ignored. However, it requires us to first seek to the offset we need to read, then issue the read call. This requires locking around the file descriptor to prevent concurrent threads from moving the pointer before the read.

This reduces the concurrency level, as now only one window can be paged in at a time from each pack. However, the WindowCache should already be holding most of the pages required to handle the working set for a process, and its own internal locking was already limiting us on the number of concurrent loads possible. Provided that most concurrent accesses are getting hits in the WindowCache, or are for different repositories on the same server, we shouldn't see a major performance hit due to the more serialized loading.

I would have preferred to use a pool of RandomAccessFiles for each pack, with threads borrowing an instance dedicated to that thread whenever they needed to page in a window. This would permit much higher levels of concurrency by using multiple file descriptors (and file pointers) for each pack. However, the code became too complex to develop in any reasonable period of time, so I've chosen to retrofit the existing code with more serialization instead.

Bug: 308945
Change-Id: I2e6e11c6e5a105e5aef68871b66200fd725134c9
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
14 years ago
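
The two behaviors the commit message contrasts can be sketched in a few lines. This is only an illustration, not JGit code: the class `PackReadSketch` and its methods `channelClosedByInterrupt` and `lockedRead` are hypothetical names. The first method relies on the documented rule that a blocking operation on an interruptible channel closes the channel when the calling thread's interrupt status is set. The second shows the seek-then-readFully pattern guarded by a lock, assuming one shared RandomAccessFile per file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;

/** Hypothetical illustration; none of these names exist in JGit. */
final class PackReadSketch {
	/** Returns true if an interrupted positional read left the channel closed. */
	static boolean channelClosedByInterrupt(File path) throws IOException {
		try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
			final FileChannel ch = raf.getChannel();
			// Simulate an interrupt arriving around the time of the read.
			Thread.currentThread().interrupt();
			try {
				ch.read(ByteBuffer.allocate(1), 0);
			} catch (ClosedChannelException e) {
				// Expected: ClosedByInterruptException is a subclass.
			} finally {
				// Clear the flag so the caller is not affected.
				Thread.interrupted();
			}
			return !ch.isOpen();
		}
	}

	/** Reads exactly cnt bytes at pos; the lock keeps seek and read together. */
	static byte[] lockedRead(RandomAccessFile fd, long pos, int cnt)
			throws IOException {
		final byte[] buf = new byte[cnt];
		synchronized (fd) {
			// No other thread holding this lock can move the file pointer
			// between the seek and the read.
			fd.seek(pos);
			fd.readFully(buf, 0, cnt);
		}
		return buf;
	}

	public static void main(String[] args) throws IOException {
		final File tmp = File.createTempFile("pack-sketch", ".bin");
		tmp.deleteOnExit();
		Files.write(tmp.toPath(), new byte[] { 10, 20, 30, 40, 50, 60 });

		System.out.println("channel closed by interrupt: "
				+ channelClosedByInterrupt(tmp));
		try (RandomAccessFile fd = new RandomAccessFile(tmp, "r")) {
			final byte[] b = lockedRead(fd, 2, 3);
			System.out.println(b[0] + " " + b[1] + " " + b[2]);
		}
	}
}
```

The lock is what costs concurrency here: every caller sharing the same RandomAccessFile waits its turn, which matches the trade-off the commit message describes.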
/*
 * Copyright (C) 2009-2010, Google Inc.
 * and other copyright owners as documented in the project's IP log.
 *
 * This program and the accompanying materials are made available
 * under the terms of the Eclipse Distribution License v1.0 which
 * accompanies this distribution, is reproduced below, and is
 * available at http://www.eclipse.org/org/documents/edl-v10.php
 *
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials provided
 *   with the distribution.
 *
 * - Neither the name of the Eclipse Foundation, Inc. nor the
 *   names of its contributors may be used to endorse or promote
 *   products derived from this software without specific prior
 *   written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
 * CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
 * INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
 * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
 * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

package org.eclipse.jgit.http.server;

import static javax.servlet.http.HttpServletResponse.SC_PARTIAL_CONTENT;
import static javax.servlet.http.HttpServletResponse.SC_REQUESTED_RANGE_NOT_SATISFIABLE;
import static org.eclipse.jgit.util.HttpSupport.HDR_ACCEPT_RANGES;
import static org.eclipse.jgit.util.HttpSupport.HDR_CONTENT_LENGTH;
import static org.eclipse.jgit.util.HttpSupport.HDR_CONTENT_RANGE;
import static org.eclipse.jgit.util.HttpSupport.HDR_IF_RANGE;
import static org.eclipse.jgit.util.HttpSupport.HDR_RANGE;

import java.io.EOFException;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.text.MessageFormat;
import java.util.Enumeration;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.eclipse.jgit.lib.ObjectId;

/**
 * Dumps a file over HTTP GET (or its information via HEAD).
 * <p>
 * Supports a single byte range requested via {@code Range} HTTP header. This
 * feature supports a dumb client to resume download of a larger object file.
 */
final class FileSender {
	private final File path;

	private final RandomAccessFile source;

	private final long lastModified;

	private final long fileLen;

	private long pos;

	private long end;

	FileSender(final File path) throws FileNotFoundException {
		this.path = path;
		this.source = new RandomAccessFile(path, "r");

		try {
			this.lastModified = path.lastModified();
			this.fileLen = source.getChannel().size();
			this.end = fileLen;
		} catch (IOException e) {
			try {
				source.close();
			} catch (IOException closeError) {
				// Ignore any error closing the stream.
			}

			final FileNotFoundException r;
			r = new FileNotFoundException(MessageFormat.format(HttpServerText.get().cannotGetLengthOf, path));
			r.initCause(e);
			throw r;
		}
	}

	void close() {
		try {
			source.close();
		} catch (IOException e) {
			// Ignore close errors on a read-only stream.
		}
	}

	long getLastModified() {
		return lastModified;
	}

	String getTailChecksum() throws IOException {
		// Read the final 20 bytes of the file (the size of a raw ObjectId)
		// and return them as a hex name.
		final int n = 20;
		final byte[] buf = new byte[n];
		source.seek(fileLen - n);
		source.readFully(buf, 0, n);
		return ObjectId.fromRaw(buf).getName();
	}

	void serve(final HttpServletRequest req, final HttpServletResponse rsp,
			final boolean sendBody) throws IOException {
		if (!initRangeRequest(req, rsp)) {
			rsp.sendError(SC_REQUESTED_RANGE_NOT_SATISFIABLE);
			return;
		}

		rsp.setHeader(HDR_ACCEPT_RANGES, "bytes");
		rsp.setHeader(HDR_CONTENT_LENGTH, Long.toString(end - pos));

		if (sendBody) {
			final OutputStream out = rsp.getOutputStream();
			try {
				// Copy the bytes in [pos, end) to the client in 4 KiB chunks.
				final byte[] buf = new byte[4096];
				source.seek(pos);
				while (pos < end) {
					final int r = (int) Math.min(buf.length, end - pos);
					final int n = source.read(buf, 0, r);
					if (n < 0) {
						throw new EOFException(MessageFormat.format(HttpServerText.get().unexpectedeOFOn, path));
					}
					out.write(buf, 0, n);
					pos += n;
				}
				out.flush();
			} finally {
				out.close();
			}
		}
	}

	private boolean initRangeRequest(final HttpServletRequest req,
			final HttpServletResponse rsp) throws IOException {
		final Enumeration<String> rangeHeaders = getRange(req);
		if (!rangeHeaders.hasMoreElements()) {
			// No range headers, the request is fine.
			return true;
		}

		final String range = rangeHeaders.nextElement();
		if (rangeHeaders.hasMoreElements()) {
			// To simplify the code we support only one range.
			return false;
		}

		final int eq = range.indexOf('=');
		final int dash = range.indexOf('-');
		if (eq < 0 || dash < 0 || !range.startsWith("bytes=")) {
			return false;
		}

		final String ifRange = req.getHeader(HDR_IF_RANGE);
		if (ifRange != null && !getTailChecksum().equals(ifRange)) {
			// If the client asked us to verify the ETag and it's not
			// what they expected we need to send the entire content.
			return true;
		}

		try {
			if (eq + 1 == dash) {
				// "bytes=-500" means last 500 bytes
				pos = Long.parseLong(range.substring(dash + 1));
				// Clamp so a suffix longer than the file starts at
				// offset 0 instead of producing a negative position.
				pos = Math.max(0, fileLen - pos);
			} else {
				// "bytes=500-" (position 500 to end)
				// "bytes=500-1000" (position 500 to 1000)
				pos = Long.parseLong(range.substring(eq + 1, dash));
				if (dash < range.length() - 1) {
					end = Long.parseLong(range.substring(dash + 1));
					end++; // range was inclusive, want exclusive
				}
			}
		} catch (NumberFormatException e) {
			// We probably hit here because of a non-digit such as
			// "," appearing at the end of the first range telling
			// us there is a second range following. To simplify
			// the code we support only one range.
			return false;
		}

		if (end > fileLen) {
			end = fileLen;
		}
		if (pos >= end) {
			return false;
		}

		rsp.setStatus(SC_PARTIAL_CONTENT);
		rsp.setHeader(HDR_CONTENT_RANGE, "bytes " + pos + "-" + (end - 1) + "/"
				+ fileLen);
		source.seek(pos);
		return true;
	}

	private static Enumeration<String> getRange(final HttpServletRequest req) {
		return req.getHeaders(HDR_RANGE);
	}
}
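
The single-range parsing in initRangeRequest can be tried in isolation, without a servlet container. The sketch below is a hypothetical standalone helper (`RangeSketch.parse` is not part of JGit) that follows the same three cases: a suffix range, an open-ended range, and a closed range. It returns the start offset and an exclusive end offset, or null where FileSender would answer with a 416 status. Clamping an oversized suffix to the start of the file is an assumption based on HTTP range semantics.

```java
/** Hypothetical standalone helper; not part of JGit. */
final class RangeSketch {
	/**
	 * Parses one "bytes=..." range against a file of length fileLen.
	 *
	 * @return {pos, end} with end exclusive, or null if the range is
	 *         malformed, lists several ranges, or cannot be satisfied.
	 */
	static long[] parse(String range, long fileLen) {
		final int eq = range.indexOf('=');
		final int dash = range.indexOf('-');
		if (eq < 0 || dash < 0 || !range.startsWith("bytes=")) {
			return null;
		}

		long pos;
		long end = fileLen;
		try {
			if (eq + 1 == dash) {
				// "bytes=-500": the last 500 bytes, clamped to the file start.
				final long suffix = Long.parseLong(range.substring(dash + 1));
				pos = Math.max(0, fileLen - suffix);
			} else {
				// "bytes=500-" or "bytes=500-1000".
				pos = Long.parseLong(range.substring(eq + 1, dash));
				if (dash < range.length() - 1) {
					// The header's end is inclusive; convert to exclusive.
					end = Long.parseLong(range.substring(dash + 1)) + 1;
				}
			}
		} catch (NumberFormatException e) {
			// Typically a "," introducing a second range, which is unsupported.
			return null;
		}

		end = Math.min(end, fileLen);
		return pos < end ? new long[] { pos, end } : null;
	}

	public static void main(String[] args) {
		final long fileLen = 2000;
		final String[] samples = { "bytes=-500", "bytes=500-", "bytes=500-1000",
				"bytes=0-99,200-299", "bytes=5000-" };
		for (String header : samples) {
			final long[] r = parse(header, fileLen);
			if (r == null) {
				System.out.println(header + " -> not satisfiable");
			} else {
				// Same shape as the Content-Range value built by FileSender.
				System.out.println(header + " -> bytes " + r[0] + "-"
						+ (r[1] - 1) + "/" + fileLen);
			}
		}
	}
}
```

Keeping the end offset exclusive makes the Content-Length simply `end - pos`, which is how serve() computes it.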