Diffstat (limited to 'src/documentation/content/xdocs/components/slideshow')
6 files changed, 1727 insertions, 0 deletions
diff --git a/src/documentation/content/xdocs/components/slideshow/how-to-shapes.xml b/src/documentation/content/xdocs/components/slideshow/how-to-shapes.xml new file mode 100644 index 0000000000..f1183c357d --- /dev/null +++ b/src/documentation/content/xdocs/components/slideshow/how-to-shapes.xml @@ -0,0 +1,642 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ==================================================================== + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ ==================================================================== +--> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd"> + +<document> + <header> + <title>Busy Developers' Guide to HSLF drawing layer</title> + <authors> + <person email="yegor@dinom.ru" name="Yegor Kozlov" id="CO"/> + </authors> + </header> + <body> + <section><title>Busy Developers' Guide to HSLF drawing layer</title> + <section><title>Index of Features</title> + <ul> + <li><a href="#NewPresentation">How to create a new presentation and add new slides to it</a></li> + <li><a href="#PageSize">How to retrieve or change slide size</a></li> + <li><a href="#GetShapes">How to get shapes contained in a particular slide</a></li> + <li><a href="#Shapes">Drawing a shape on a slide</a></li> + <li><a href="#Pictures">How to work with pictures</a></li> + <li><a href="#SlideTitle">How to set slide title</a></li> + <li><a href="#Fill">How to work with slide/shape background</a></li> + <li><a href="#Bullets">How to create bulleted lists</a></li> + <li><a href="#Hyperlinks">Hyperlinks</a></li> + <li><a href="#Tables">Tables</a></li> + <li><a href="#RemoveShape">How to remove shapes</a></li> + <li><a href="#OLE">How to retrieve embedded OLE objects</a></li> + <li><a href="#Sound">How to retrieve embedded sounds</a></li> + <li><a href="#Freeform">How to create shapes of arbitrary geometry</a></li> + <li><a href="#Graphics2D">Shapes and Graphics2D</a></li> + <li><a href="#Render">How to convert slides into images</a></li> + <li><a href="#HeadersFooters">Headers / Footers</a></li> + </ul> + </section> + <section><title>Features</title> + <anchor id="NewPresentation"/> + <section><title>New Presentation</title> + <source> + //create a new empty slide show + HSLFSlideShow ppt = new HSLFSlideShow(); + + //add first slide + HSLFSlide s1 = ppt.createSlide(); + + //add second slide + HSLFSlide s2 = ppt.createSlide(); + + //save changes in a file + FileOutputStream out = new 
FileOutputStream("slideshow.ppt"); + ppt.write(out); + out.close(); + </source> + </section> + <anchor id="PageSize"/> + <section><title>How to retrieve or change slide size</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(new HSLFSlideShowImpl("slideshow.ppt")); + //retrieve page size. Coordinates are expressed in points (72 dpi) + java.awt.Dimension pgsize = ppt.getPageSize(); + int pgx = pgsize.width; //slide width + int pgy = pgsize.height; //slide height + + //set new page size + ppt.setPageSize(new java.awt.Dimension(1024, 768)); + //save changes + FileOutputStream out = new FileOutputStream("slideshow.ppt"); + ppt.write(out); + out.close(); + </source> + </section> + <anchor id="GetShapes"/> + <section><title>How to get shapes contained in a particular slide</title> + <p> + The following code demonstrates how to iterate over the shapes of each slide. + </p> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(new HSLFSlideShowImpl("slideshow.ppt")); + // get slides + for (HSLFSlide slide : ppt.getSlides()) { + for (HSLFShape sh : slide.getShapes()) { + // name of the shape + String name = sh.getShapeName(); + + // the shape's anchor, which defines the position of this shape in the slide + java.awt.Rectangle anchor = sh.getAnchor(); + + if (sh instanceof Line) { + Line line = (Line) sh; + // work with Line + } else if (sh instanceof HSLFAutoShape) { + HSLFAutoShape shape = (HSLFAutoShape) sh; + // work with AutoShape + } else if (sh instanceof HSLFTextBox) { + HSLFTextBox shape = (HSLFTextBox) sh; + // work with TextBox + } else if (sh instanceof HSLFPictureShape) { + HSLFPictureShape shape = (HSLFPictureShape) sh; + // work with Picture + } + } + } + </source> + </section> + <anchor id="Shapes"/> + <section><title>Drawing a shape on a slide</title> + <warning> + To work with graphic objects HSLF uses Java2D classes + that may throw exceptions if a graphical environment is not available. 
If a graphical environment + is not available, you must tell Java that you are running in headless mode by + setting the following system property: <code> java.awt.headless=true </code> + (either via the <code>-Djava.awt.headless=true</code> startup parameter or via <code>System.setProperty("java.awt.headless", "true")</code>). + </warning> + <p> + When you add a shape, you usually specify the dimensions of the shape and the position + of the upper left corner of the bounding box for the shape relative to the upper left + corner of the slide. Distances in the drawing layer are measured in points (72 points = 1 inch). + </p> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + + HSLFSlide slide = ppt.createSlide(); + + //Line shape + Line line = new Line(); + line.setAnchor(new java.awt.Rectangle(50, 50, 100, 20)); + line.setLineColor(new Color(0, 128, 0)); + line.setLineCompound(LineCompound.DOUBLE); + slide.addShape(line); + + //TextBox + HSLFTextBox txt = new HSLFTextBox(); + txt.setText("Hello, World!"); + txt.setAnchor(new java.awt.Rectangle(300, 100, 300, 50)); + + // use TextRun to work with the text format + HSLFTextParagraph tp = txt.getTextParagraphs().get(0); + tp.setAlignment(TextAlign.RIGHT); + HSLFTextRun rt = tp.getTextRuns().get(0); + rt.setFontSize(32.); + rt.setFontFamily("Arial"); + rt.setBold(true); + rt.setItalic(true); + rt.setUnderlined(true); + rt.setFontColor(Color.red); + + slide.addShape(txt); + + // Autoshape + // 32-point star + HSLFAutoShape sh1 = new HSLFAutoShape(ShapeType.STAR_32); + sh1.setAnchor(new java.awt.Rectangle(50, 50, 100, 200)); + sh1.setFillColor(Color.red); + slide.addShape(sh1); + + //Trapezoid + HSLFAutoShape sh2 = new HSLFAutoShape(ShapeType.TRAPEZOID); + sh2.setAnchor(new java.awt.Rectangle(150, 150, 100, 200)); + sh2.setFillColor(Color.blue); + slide.addShape(sh2); + + FileOutputStream out = new FileOutputStream("slideshow.ppt"); + ppt.write(out); + out.close(); + </source> + </section> + <anchor id="Pictures"/> +
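The point-based coordinates used in the shape-drawing example above can be tried out with JDK classes alone. The following is a minimal sketch and not part of HSLF: `AnchorUnits`, `inchesToPoints` and `anchorInInches` are hypothetical helper names introduced here for illustration; only `java.awt.Rectangle`, the 72 points per inch rule and the `java.awt.headless` property come from the text.

```java
import java.awt.Rectangle;

// Hypothetical helper, not part of POI: converts inches to the point units
// (72 points = 1 inch) that the drawing layer expects for shape anchors.
final class AnchorUnits {
    static final int POINTS_PER_INCH = 72;

    static int inchesToPoints(double inches) {
        return (int) Math.round(inches * POINTS_PER_INCH);
    }

    // Builds an anchor from a top-left corner and a size, all given in inches.
    static Rectangle anchorInInches(double x, double y, double width, double height) {
        return new Rectangle(inchesToPoints(x), inchesToPoints(y),
                             inchesToPoints(width), inchesToPoints(height));
    }

    public static void main(String[] args) {
        // Request headless mode up front, as the warning above recommends.
        System.setProperty("java.awt.headless", "true");

        // A 2 x 1 inch box whose corner sits 1 inch from the slide's top-left corner.
        Rectangle anchor = anchorInInches(1.0, 1.0, 2.0, 1.0);
        System.out.println(anchor);
    }
}
```

The resulting rectangle could then be passed to a shape's `setAnchor` call, as in the example above.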
<section><title>How to work with pictures</title> + + <p> + Currently, HSLF API supports the following types of pictures: + </p> + <ul> + <li>Windows Metafiles (WMF)</li> + <li>Enhanced Metafiles (EMF)</li> + <li>JPEG Interchange Format</li> + <li>Portable Network Graphics (PNG)</li> + <li>Macintosh PICT</li> + </ul> + + <source> + HSLFSlideShow ppt = new HSLFSlideShow(new HSLFSlideShowImpl("slideshow.ppt")); + + // extract all pictures contained in the presentation + int idx = 1; + for (HSLFPictureData pict : ppt.getPictureData()) { + // picture data + byte[] data = pict.getData(); + + PictureData.PictureType type = pict.getType(); + String ext = type.extension; + FileOutputStream out = new FileOutputStream("pict_" + idx + ext); + out.write(data); + out.close(); + idx++; + } + + // add a new picture to this slideshow and insert it in a new slide + HSLFPictureData pd = ppt.addPicture(new File("clock.jpg"), PictureData.PictureType.JPEG); + + HSLFPictureShape pictNew = new HSLFPictureShape(pd); + + // set image position in the slide + pictNew.setAnchor(new java.awt.Rectangle(100, 100, 300, 200)); + + HSLFSlide slide = ppt.createSlide(); + slide.addShape(pictNew); + + // now retrieve pictures containes in the first slide and save them on disk + idx = 1; + slide = ppt.getSlides().get(0); + for (HSLFShape sh : slide.getShapes()) { + if (sh instanceof HSLFPictureShape) { + HSLFPictureShape pict = (HSLFPictureShape) sh; + HSLFPictureData pictData = pict.getPictureData(); + byte[] data = pictData.getData(); + PictureData.PictureType type = pictData.getType(); + FileOutputStream out = new FileOutputStream("slide0_" + idx + type.extension); + out.write(data); + out.close(); + idx++; + } + } + + FileOutputStream out = new FileOutputStream("slideshow.ppt"); + ppt.write(out); + out.close(); + </source> + </section> + <anchor id="SlideTitle"/> + <section><title>How to set slide title</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + HSLFSlide slide = 
ppt.createSlide(); + HSLFTextBox title = slide.addTitle(); + title.setText("Hello, World!"); + + // save changes + FileOutputStream out = new FileOutputStream("slideshow.ppt"); + ppt.write(out); + out.close(); + </source> + <p> + Below is the equivalent code in PowerPoint VBA: + </p> + <source> + Set myDocument = ActivePresentation.Slides(1) + myDocument.Shapes.AddTitle.TextFrame.TextRange.Text = "Hello, World!" + </source> + </section> + <anchor id="Fill"/> + <section><title>How to modify background of a slide master</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + HSLFSlideMaster master = ppt.getSlideMasters().get(0); + + HSLFFill fill = master.getBackground().getFill(); + HSLFPictureData pd = ppt.addPicture(new File("background.png"), PictureData.PictureType.PNG); + fill.setFillType(HSLFFill.FILL_PICTURE); + fill.setPictureData(pd); + </source> + </section> + <section><title>How to modify background of a slide</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + HSLFSlide slide = ppt.createSlide(); + + // This slide has its own background. + // Without this line it will use master's background. 
+ slide.setFollowMasterBackground(false); + HSLFFill fill = slide.getBackground().getFill(); + HSLFPictureData pd = ppt.addPicture(new File("background.png"), PictureData.PictureType.PNG); + fill.setFillType(HSLFFill.FILL_PATTERN); + fill.setPictureData(pd); + </source> + </section> + <section><title>How to modify background of a shape</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + HSLFSlide slide = ppt.createSlide(); + + HSLFShape shape = new HSLFAutoShape(ShapeType.RECT); + shape.setAnchor(new java.awt.Rectangle(100, 100, 200, 200)); + HSLFFill fill = shape.getFill(); + fill.setFillType(HSLFFill.FILL_SHADE); + fill.setBackgroundColor(Color.red); + fill.setForegroundColor(Color.green); + + slide.addShape(shape); + </source> + </section> + <anchor id="Bullets"/> + <section><title>How to create bulleted lists</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + + HSLFSlide slide = ppt.createSlide(); + + HSLFTextBox shape = new HSLFTextBox(); + HSLFTextParagraph tp = shape.getTextParagraphs().get(0); + tp.setBullet(true); + tp.setBulletChar('\u263A'); //bullet character + tp.setIndent(0.); //bullet offset + tp.setLeftMargin(50.); //text offset (should be greater than bullet offset) + HSLFTextRun rt = tp.getTextRuns().get(0); + shape.setText( + "January\r" + + "February\r" + + "March\r" + + "April"); + rt.setFontSize(42.); + slide.addShape(shape); + + shape.setAnchor(new java.awt.Rectangle(50, 50, 500, 300)); //position of the text box in the slide + slide.addShape(shape); + + FileOutputStream out = new FileOutputStream("bullets.ppt"); + ppt.write(out); + out.close(); + </source> + </section> + <anchor id="Hyperlinks"/> + <section><title>How to read hyperlinks from a slide show</title> + <source> + FileInputStream is = new FileInputStream("slideshow.ppt"); + HSLFSlideShow ppt = new HSLFSlideShow(is); + is.close(); + + for (HSLFSlide slide : ppt.getSlides()) { + //read hyperlinks from the text runs + for (List<HSLFTextParagraph> txt : 
slide.getTextParagraphs()) { + for (HSLFTextParagraph para : txt) { + for (HSLFTextRun run : para) { + HSLFHyperlink link = run.getHyperlink(); + if (link != null) { + String title = link.getLabel(); + String address = link.getAddress(); + String text = run.getRawText(); + } + } + } + } + + //in PowerPoint you can assign a hyperlink to a shape without text, + //for example to a Line object. The code below demonstrates how to + //read such hyperlinks + for (HSLFShape sh : slide.getShapes()) { + if (sh instanceof HSLFSimpleShape) { + HSLFHyperlink link = ((HSLFSimpleShape)sh).getHyperlink(); + if(link != null) { + String title = link.getLabel(); + String address = link.getAddress(); + } + } + } + } + </source> + </section> + <anchor id="Tables"/> + <section><title>How to create tables</title> + <source> + //table data + String[][] data = { + {"INPUT FILE", "NUMBER OF RECORDS"}, + {"Item File", "11,559"}, + {"Vendor File", "300"}, + {"Purchase History File", "10,000"}, + {"Total # of requisitions", "10,200,038"} + }; + + HSLFSlideShow ppt = new HSLFSlideShow(); + + HSLFSlide slide = ppt.createSlide(); + //create a table of 5 rows and 2 columns + HSLFTable table = new HSLFTable(5, 2); + for (int i = 0; i < data.length; i++) { + for (int j = 0; j < data[i].length; j++) { + HSLFTableCell cell = table.getCell(i, j); + cell.setText(data[i][j]); + + HSLFTextRun rt = cell.getTextParagraphs().get(0).getTextRuns().get(0); + rt.setFontFamily("Arial"); + rt.setFontSize(10.); + + cell.setVerticalAlignment(VerticalAlignment.MIDDLE); + cell.setHorizontalCentered(true); + } + } + + //set table borders + Line border = table.createBorder(); + border.setLineColor(Color.black); + border.setLineWidth(1.0); + table.setAllBorders(border); + + //set width of the 1st column + table.setColumnWidth(0, 300); + //set width of the 2nd column + table.setColumnWidth(1, 150); + + slide.addShape(table); + table.moveTo(100, 100); + + FileOutputStream out = new FileOutputStream("hslf-table.ppt"); + 
ppt.write(out); + out.close(); + </source> + </section> + + <anchor id="RemoveShape"/> + <section><title>How to remove shapes from a slide</title> + <source> + for (HSLFShape shape : slide.getShapes()) { + // remove the shape + boolean ok = slide.removeShape(shape); + if (ok) { + // the shape was removed. Do something. + } + } + </source> + </section> + <anchor id="OLE"/> + <section><title>How to retrieve embedded OLE objects</title> + <source> + for (HSLFShape shape : slide.getShapes()) { + if (shape instanceof OLEShape) { + OLEShape ole = (OLEShape) shape; + HSLFObjectData data = ole.getObjectData(); + String name = ole.getInstanceName(); + if ("Worksheet".equals(name)) { + HSSFWorkbook wb = new HSSFWorkbook(data.getData()); + } else if ("Document".equals(name)) { + HWPFDocument doc = new HWPFDocument(data.getData()); + } + } + } + </source> + </section> + + <anchor id="Sound"/> + <section><title>How to retrieve embedded sounds</title> + <source> + FileInputStream is = new FileInputStream(args[0]); + HSLFSlideShow ppt = new HSLFSlideShow(is); + is.close(); + + for (HSLFSoundData sound : ppt.getSoundData()) { + // save *WAV sounds on disk + if (sound.getSoundType().equals(".WAV")) { + FileOutputStream out = new FileOutputStream(sound.getSoundName()); + out.write(sound.getData()); + out.close(); + } + } + </source> + </section> + + <anchor id="Freeform"/> + <section><title>How to create shapes of arbitrary geometry</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + HSLFSlide slide = ppt.createSlide(); + + java.awt.geom.GeneralPath path = new java.awt.geom.GeneralPath(); + path.moveTo(100, 100); + path.lineTo(200, 100); + path.curveTo(50, 45, 134, 22, 78, 133); + path.curveTo(10, 45, 134, 56, 78, 100); + path.lineTo(100, 200); + path.closePath(); + + HSLFFreeformShape shape = new HSLFFreeformShape(); + shape.setPath(path); + slide.addShape(shape); + </source> + </section> + + <anchor id="Graphics2D"/> + <section><title>How to draw into a slide using 
Graphics2D</title> + <warning> + Current implementation of the PowerPoint Graphics2D driver is not fully compliant with the java.awt.Graphics2D specification. + Some features like clipping, drawing of images are not yet supported. + </warning> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + HSLFSlide slide = ppt.createSlide(); + + // draw a simple bar graph + // bar chart data. + // The first value is the bar color, + // the second is the width + Object[] def = new Object[]{ + Color.yellow, new Integer(100), + Color.green, new Integer(150), + Color.gray, new Integer(75), + Color.red, new Integer(200), + }; + + // all objects are drawn into a shape group so we need to create one + + HSLFGroupShape group = new HSLFGroupShape(); + // define position of the drawing in the slide + Rectangle bounds = new java.awt.Rectangle(200, 100, 350, 300); + // if you want to draw in the entire slide area then define the anchor + // as follows: + // Dimension pgsize = ppt.getPageSize(); + // java.awt.Rectangle bounds = new java.awt.Rectangle(0, 0, + // pgsize.width, pgsize.height); + + group.setAnchor(bounds); + slide.addShape(group); + + // draw a simple bar chart + Graphics2D graphics = new PPGraphics2D(group); + int x = bounds.x + 50, y = bounds.y + 50; + graphics.setFont(new Font("Arial", Font.BOLD, 10)); + for (int i = 0, idx = 1; i < def.length; i += 2, idx++) { + graphics.setColor(Color.black); + int width = ((Integer) def[i + 1]).intValue(); + graphics.drawString("Q" + idx, x - 20, y + 20); + graphics.drawString(width + "%", x + width + 10, y + 20); + graphics.setColor((Color) def[i]); + graphics.fill(new Rectangle(x, y, width, 30)); + y += 40; + } + graphics.setColor(Color.black); + graphics.setFont(new Font("Arial", Font.BOLD, 14)); + graphics.draw(bounds); + graphics.drawString("Performance", x + 70, y + 40); + + FileOutputStream out = new FileOutputStream("hslf-graphics2d.ppt"); + ppt.write(out); + out.close(); + </source> + </section> + + <anchor id="Render"/> + 
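The Graphics2D example above only calls standard `java.awt.Graphics2D` methods, so the drawing logic can be factored into a method that accepts any `Graphics2D`. The sketch below uses JDK classes only and draws the bars into a `BufferedImage`, which is a convenient way to check the layout. `BarChartSketch` and `drawBars` are hypothetical names, the 720 x 540 canvas is an arbitrary choice, and the text labels are left out to keep the sketch independent of installed fonts. Passing the `PPGraphics2D` instance from the example above to `drawBars` instead is an assumption and has not been verified here.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of POI: the bar-drawing loop from the example
// above, written against the plain Graphics2D API.
final class BarChartSketch {
    // Fills one horizontal bar per (color, width) pair, outlines the bounds,
    // and returns the y coordinate just below the last bar.
    static int drawBars(Graphics2D graphics, Rectangle bounds, Color[] colors, int[] widths) {
        int x = bounds.x + 50;
        int y = bounds.y + 50;
        for (int i = 0; i < widths.length; i++) {
            graphics.setColor(colors[i]);
            graphics.fill(new Rectangle(x, y, widths[i], 30));
            y += 40;
        }
        graphics.setColor(Color.black);
        graphics.draw(bounds);
        return y;
    }

    public static void main(String[] args) {
        System.setProperty("java.awt.headless", "true");

        Rectangle bounds = new Rectangle(200, 100, 350, 300);
        BufferedImage img = new BufferedImage(720, 540, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = img.createGraphics();

        Color[] colors = {Color.yellow, Color.green, Color.gray, Color.red};
        int[] widths = {100, 150, 75, 200};
        int nextY = drawBars(graphics, bounds, colors, widths);
        graphics.dispose();

        System.out.println("next free y = " + nextY);
    }
}
```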
<section><title>Export PowerPoint slides into java.awt.Graphics2D</title> + <p> + HSLF provides a way to export slides into images. You can render slides into a java.awt.Graphics2D object (or any other) + and serialize the result into PNG or JPEG format. Please note that although HSLF attempts to render slides as close to PowerPoint as possible, + the output may look different from PowerPoint for the following reasons: + </p> + <ul> + <li>Java2D renders fonts differently from PowerPoint. There are always some differences in the way the font glyphs are painted</li> + <li>HSLF uses java.awt.font.LineBreakMeasurer to break text into lines. PowerPoint may do it in a different way.</li> + <li>If a font from the presentation is not available, then the JDK default font will be used.</li> + </ul> + <p> + Current Limitations: + </p> + <ul> + <li>Some types of shapes are not yet supported (WordArt, complex auto-shapes)</li> + <li>Only Bitmap images (PNG, JPEG, DIB) can be rendered in Java</li> + </ul> + <source> + FileInputStream is = new FileInputStream("slideshow.ppt"); + HSLFSlideShow ppt = new HSLFSlideShow(is); + is.close(); + + Dimension pgsize = ppt.getPageSize(); + + int idx = 1; + for (HSLFSlide slide : ppt.getSlides()) { + + BufferedImage img = new BufferedImage(pgsize.width, pgsize.height, BufferedImage.TYPE_INT_RGB); + Graphics2D graphics = img.createGraphics(); + // clear the drawing area + graphics.setPaint(Color.white); + graphics.fill(new Rectangle2D.Float(0, 0, pgsize.width, pgsize.height)); + + // render + slide.draw(graphics); + + // save the output + FileOutputStream out = new FileOutputStream("slide-" + idx + ".png"); + javax.imageio.ImageIO.write(img, "png", out); + out.close(); + + idx++; + } + </source> + </section> + + </section> + <anchor id="HeadersFooters"/> + <section><title>How to extract Headers / Footers from an existing presentation</title> + <source> + FileInputStream is = new FileInputStream("slideshow.ppt"); + HSLFSlideShow ppt = new 
HSLFSlideShow(is); + is.close(); + + // presentation-scope headers / footers + HeadersFooters hdd = ppt.getSlideHeadersFooters(); + if (hdd.isFooterVisible()) { + String footerText = hdd.getFooterText(); + } + + // per-slide headers / footers + for (HSLFSlide slide : ppt.getSlides()) { + HeadersFooters hdd2 = slide.getHeadersFooters(); + if (hdd2.isFooterVisible()) { + String footerText = hdd2.getFooterText(); + } + if (hdd2.isUserDateVisible()) { + String customDate = hdd2.getDateTimeText(); + } + if (hdd2.isSlideNumberVisible()) { + int slideNUm = slide.getSlideNumber(); + } + } + </source> + </section> + <section><title>How to set Headers / Footers</title> + <source> + HSLFSlideShow ppt = new HSLFSlideShow(); + + // presentation-scope headers / footers + HeadersFooters hdd = ppt.getSlideHeadersFooters(); + hdd.setSlideNumberVisible(true); + hdd.setFootersText("Created by POI-HSLF"); + </source> + </section> + </section> + </body> +</document> diff --git a/src/documentation/content/xdocs/components/slideshow/index.xml b/src/documentation/content/xdocs/components/slideshow/index.xml new file mode 100644 index 0000000000..b963d928a7 --- /dev/null +++ b/src/documentation/content/xdocs/components/slideshow/index.xml @@ -0,0 +1,72 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ==================================================================== + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + ==================================================================== +--> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd"> + +<document> + <header> + <title>POI-HSLF and POI-XSLF - Java API To Access Microsoft PowerPoint Format Files</title> + <subtitle>Overview</subtitle> + <authors> + <person name="Avik Sengupta" email="avik at apache dot org"/> + <person name="Nick Burch" email="nick at apache dot org"/> + <person name="Yegor Kozlov" email="yegor at apache dot org"/> + </authors> + </header> + + <body> + <section> + <title>POI-HSLF</title> + + <p>HSLF is the POI Project's pure Java implementation of the PowerPoint '97(-2007) file format. </p> + <p>HSLF provides a way to read, create or modify PowerPoint presentations. In particular, it provides: + </p> + <ul> + <li>API for data extraction (text, pictures, embedded objects, sounds)</li> + <li>usermodel API for creating, reading and modifying ppt files</li> + </ul> + <note> + This code currently lives in the + <a href="https://github.com/apache/poi/tree/trunk/poi-scratchpad/">scratchpad area</a> + of the POI Git repository. To use this component, ensure + you have the Scratchpad Jar on your classpath, or a dependency + defined on the <em>poi-scratchpad</em> artifact - the main POI + jar is not enough! See the + <a href="site:components">POI Components Map</a> + for more details. + </note> + <p>The <a href="./quick-guide.html">quick guide</a> documentation provides + information on using this API. 
Comments and fixes gratefully accepted on the POI + dev mailing lists.</p> + </section> + <section> + <title>POI-XSLF</title> + <p> + XSLF is the POI Project's pure Java implementation of the PowerPoint 2007 OOXML (.pptx) file format. + Whilst HSLF and XSLF provide similar features, there is not a common interface across the two of them at this time. + </p> + <p> + Please note that XSLF is still in early development and is subject to incompatible changes in the future. + </p> + <p> + A quick guide is available in the <a href="./xslf-cookbook.html">XSLF Cookbook</a> + </p> + </section> + </body> +</document> diff --git a/src/documentation/content/xdocs/components/slideshow/ppt-file-format.xml b/src/documentation/content/xdocs/components/slideshow/ppt-file-format.xml new file mode 100644 index 0000000000..202df1d436 --- /dev/null +++ b/src/documentation/content/xdocs/components/slideshow/ppt-file-format.xml @@ -0,0 +1,367 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ==================================================================== + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ ==================================================================== +--> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd"> + +<document> + <header> + <title>POI-HSLF - A Guide to the PowerPoint File Format</title> + <subtitle>Overview</subtitle> + <authors> + <person name="Nick Burch" email="nick at torchbox dot com"/> + <person name="Yegor Kozlov" email="yegor at dinom dot ru"/> + </authors> + </header> + + <body> + <section><title>Records, Containers and Atoms</title> + <p> + PowerPoint documents are made up of a tree of records. A record may + contain either other records (in which case it is a Container), + or data (in which case it's an Atom). A record can't hold both. + </p> + <p> + PowerPoint documents don't have one overall container record. Instead, + there are a number of different container records to be found at + the top level. + </p> + <p> + Any numbers or strings stored in the records are always stored in + Little Endian format (least important bytes first). This is the case + no matter what platform the file was written on - be that a + Little Endian or a Big Endian system. + </p> + <p> + PowerPoint may have Escher (DDF) records embedded in it. These + are always held as the children of a PPDrawing record (record + type 1036). Escher records have the same format as PowerPoint + records. + </p> + </section> + + <section><title>Record Headers</title> + <p> + All records, be they containers or atoms, have the same standard + 8 byte header. It is: + </p> + <ul><li>1/2 byte container flag</li> + <li>1.5 byte option field</li> + <li>2 byte record type</li> + <li>4 byte record length</li></ul> + <p> + If the first byte of the header, BINARY_AND with 0x0f, is 0x0f, + then the record is a container. Otherwise, it's an atom. The rest + of the first two bytes are used to store the "options" for the + record. Most commonly, this is used to indicate the version of + the record, but the exact usage is record specific. 
+ </p> + <p> + The record type is a little endian number, which tells you what + kind of record you're dealing with. Each different kind of record + has its own value that gets stored here. PowerPoint records have + a type that's normally less than 6000 (decimal). Escher records + normally have a type between 0xF000 and 0xF1FF. + </p> + <p> + The record length is another little endian number. For an atom, + it's the size of the data part of the record, i.e. the length + of the record <em>less</em> its 8 byte record header. For a + container, it's the size of all the records that are children of + this record. That means that the size of a container record is the + length, plus 8 bytes for its record header. + </p> + </section> + + <section><title>CurrentUserAtom, UserEditAtom and PersistPtrIncrementalBlock</title> + <p><strong>aka Records that care about the byte level position of other records</strong></p> + <p> + A small number of records contain byte level position offsets to other + records. If you change the position of any records in the file, then + there's a good chance that you will need to update some of these + special records. + </p> + <p> + First up, CurrentUserAtom. This is actually stored in a different + OLE2 (POIFS) stream to the main PowerPoint document. It contains + a few bits of information on who last edited the file. Most + importantly, at byte 8 of its contents, it stores (as a 32 bit + little endian number) the offset in the main stream to the most + recent UserEditAtom. + </p> + <p> + The UserEditAtom contains two byte level offsets (again as 32 bit + little endian numbers). At byte 12 is the offset to the + PersistPtrIncrementalBlock associated with this UserEditAtom + (each UserEditAtom has one and only one PersistPtrIncrementalBlock). + At byte 8, there's the offset to the previous UserEditAtom. If this + is 0, then you're at the first one. 
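Following these offsets can be sketched with JDK classes alone. The code below is an illustration, not POI's implementation: `UserEditChainSketch` and its methods are hypothetical names, and it assumes that "byte 8" and "byte 12" are counted from the start of the atom's data, i.e. after the 8 byte record header. The sample offsets are the ones used in this document's chain diagram.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: follows the UserEditAtom chain in the
// main stream. Assumes the two offset fields are counted from the start of
// the atom's data, i.e. after its 8 byte record header.
final class UserEditChainSketch {
    static final int HEADER_SIZE = 8;
    static final int PREVIOUS_EDIT_FIELD = 8;     // offset to the previous UserEditAtom
    static final int PERSIST_POINTERS_FIELD = 12; // offset to the PersistPtrIncrementalBlock

    // Reads a 32 bit little endian unsigned number at the given stream position.
    static long readUInt32(byte[] stream, int pos) {
        return ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN).getInt(pos) & 0xFFFFFFFFL;
    }

    static int previousEditOffset(byte[] stream, int editOffset) {
        return (int) readUInt32(stream, editOffset + HEADER_SIZE + PREVIOUS_EDIT_FIELD);
    }

    static int persistPointersOffset(byte[] stream, int editOffset) {
        return (int) readUInt32(stream, editOffset + HEADER_SIZE + PERSIST_POINTERS_FIELD);
    }

    // Returns the offsets of all UserEditAtoms, newest first, starting from the
    // offset stored in the CurrentUserAtom.
    static List<Integer> userEditOffsets(byte[] stream, int currentEditOffset) {
        List<Integer> offsets = new ArrayList<>();
        int offset = currentEditOffset;
        while (true) {
            offsets.add(offset);
            int previous = previousEditOffset(stream, offset);
            if (previous == 0) {
                return offsets; // reached the first UserEditAtom
            }
            if (previous >= offset) {
                // Assumption: older edits sit earlier in the stream; anything else would loop.
                throw new IllegalStateException("UserEditAtom chain does not move backwards");
            }
            offset = previous;
        }
    }

    // Builds a zero-filled stream holding only the two offset fields of three
    // UserEditAtoms, using the sample offsets from this document's chain diagram.
    static byte[] sampleStream() {
        byte[] stream = new byte[10600];
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        int[][] edits = { {6176, 0, 6144}, {8674, 6176, 8646}, {10562, 8674, 10538} };
        for (int[] e : edits) {
            buf.putInt(e[0] + HEADER_SIZE + PREVIOUS_EDIT_FIELD, e[1]);
            buf.putInt(e[0] + HEADER_SIZE + PERSIST_POINTERS_FIELD, e[2]);
        }
        return stream;
    }

    public static void main(String[] args) {
        byte[] stream = sampleStream();
        for (int offset : userEditOffsets(stream, 10562)) {
            System.out.println("UserEditAtom @ " + offset
                    + " -> PersistPtrIncrementalBlock @ " + persistPointersOffset(stream, offset));
        }
    }
}
```

The backwards-only check is a design choice of this sketch: it stops a damaged file from sending the walk into an endless loop.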
+ </p> + <p> + Every time you do a non full save in PowerPoint, it tacks on another + UserEditAtom and another PersistPtrIncrementalBlock. The + CurrentUserAtom is updated to point to this new UserEditAtom, and the + new UserEditAtom points back to the previous UserEditAtom. You then + end up with a chain, starting from the CurrentUserAtom, linking + back through all the UserEditAtoms, until you reach the first one + from a full save. + </p> +<source> +/-------------------------------\ +| CurrentUserAtom (own stream) | +| OffsetToCurrentEdit = 10562 |==\ +\-------------------------------/ | + | +/==================================/ +| /-----------------------------------\ +| | PersistPtrIncrementalBlock @ 6144 | +| \-----------------------------------/ +| /---------------------------------\ | +| | UserEditAtom @ 6176 | | +| | LastUserEditAtomOffset = 0 | | +| | PersistPointersOffset = 6144 |==================/ +| \---------------------------------/ +| | /-----------------------------------\ +| \====================\ | PersistPtrIncrementalBlock @ 8646 | +| | \-----------------------------------/ +| /---------------------------------\ | | +| | UserEditAtom @ 8674 | | | +| | LastUserEditAtomOffset = 6176 |=/ | +| | PersistPointersOffset = 8646 |==================/ +| \---------------------------------/ +| | /------------------------------------\ +| \====================\ | PersistPtrIncrementalBlock @ 10538 | +| | \------------------------------------/ +| /---------------------------------\ | | +\==| UserEditAtom @ 10562 | | | + | LastUserEditAtomOffset = 8674 |=/ | + | PersistPointersOffset = 10538 |==================/ + \---------------------------------/ +</source> + <p> + The PersistPtrIncrementalBlock contains byte offsets to all the + Slides, Notes, Documents and MasterSlides in the file. The first + PersistPtrIncrementalBlock will point to all the ones that + were present the first time the file was saved. 
Subsequent + PersistPtrIncrementalBlocks will contain pointers to all the ones + that were changed in that edit. To find the offset to a given + sheet in the latest version, start with the most recent + PersistPtrIncrementalBlock. If this knows about the sheet, use the + offset it has. If it doesn't, then work back through older + PersistPtrIncrementalBlocks until you find one which does, and + use that. + </p> + <p> + Each PersistPtrIncrementalBlock can contain a number of entry + blocks. Each block holds information on a sequence of sheets. + Each block starts with a 32 bit little endian integer. Once read + into memory, the lower 20 bits contain the starting number for the + sequence of sheets to be described. The higher 12 bits contain + the count of the number of sheets described. Following that is + one 32 bit little endian integer for each sheet in the sequence, + the value being the offset to that sheet. If there is any data + left after parsing a block, then it corresponds to the next block. + </p> +<source> +hex on disk decimal description +----------- ------- ----------- +0000 0 No options +7217 6002 Record type is 6002 +2000 0000 32 Length of data is 32 bytes +0100 5000 5242881 Count is 5 (12 highest bits) + Starting number is 1 (20 lowest bits) +0000 0000 0 Sheet (1+0)=1 starts at offset 0 +900D 0000 3472 Sheet (1+1)=2 starts at offset 3472 +E403 0000 996 Sheet (1+2)=3 starts at offset 996 +9213 0000 5010 Sheet (1+3)=4 starts at offset 5010 +BE15 0000 5566 Sheet (1+4)=5 starts at offset 5566 +0900 1000 1048585 Count is 1 (12 highest bits) + Starting number is 9 (20 lowest bits) +4418 0000 6212 Sheet (9+0)=9 starts at offset 6212 +</source> + </section> + + <section><title>Paragraph and Text Styling</title> + <p> + There are quite a number of records that affect the styling + of text, and a smaller number that are responsible for the + styling of paragraphs. 
+ </p> + <p> + By default, a given set of text will inherit paragraph and text + stylings from the appropriate master sheet. If anything differs + from the master sheet, then appropriate styling records will + follow the text record. + </p> + <p> + <em>(We don't currently know enough about master sheet styling + to write about it)</em> + </p> + <p> + Normally, PowerPoint will have one text record (TextBytesAtom + or TextCharsAtom) for every paragraph, with a preceding + TextHeaderAtom to describe what sort of paragraph it is. + If any of the stylings differ from the master's, then a + StyleTextPropAtom will follow the text record. This contains + the paragraph style information, and the styling information + for each section of the text which has a different style. + (More on StyleTextPropAtom later) + </p> + <p> + For every font used, a FontEntityAtom must exist for that font. + The FontEntityAtoms live inside a FontCollection record, and + there's one of those inside the Environment record inside the + Document record. <em>(More on Fonts to be discovered)</em> + </p> + </section> + + <section><title>StyleTextPropAtom</title> + <p> + If the text or paragraph stylings for a given text record + differ from those of the appropriate master, then there will + be one of these records. + </p> + <p> + This record is made up of two lists of lists. Firstly, + there's a list of paragraph stylings - each made up of the + number of characters it applies to, followed by the matching + styling elements. Following that is the equivalent for + character stylings. + </p> + <p> + Each styling list (in either list) starts with the number + of characters it applies to, stored in a 2 byte little + endian number. If it is a paragraph styling, it will be + followed by a 2 byte number (of unknown use). After this is + a four byte number, which is a mask indicating which stylings + will follow. You then have an entry for each of the stylings + indicated in the mask.
Finally, you move onto the next set + of stylings. + </p> + <p> + Each styling has a specific mask flag to indicate its + presence. (The list may be found towards the top of + org.apache.poi.hslf.record.StyleTextPropAtom.java, and is + too long to sensibly include here). The styling entries + occur in the order of their mask values (so one with mask + 1 will come first, followed by the next highest mask value). + Depending on the styling, it is either made up of a 2 byte + or 4 byte numeric value. The meaning of the value will + depend on the styling (eg for font.size, it is the font + size in points). + </p> + <p> + Some stylings are actually mask stylings. For these, the + value will be a 4 byte number. This is then processed as + a mask, to indicate a number of different sub-stylings. + The styling for bold/italic/underline is one such example. + </p> +<source> +hex on disk decimal description +----------- ------- ----------- + +0000 0 No options +A10F 4001 Record type is 4001 +8000 0000 128 Length of data is 128 bytes +1E00 0000 30 The paragraph styling applies to 30 characters +0000 0 Paragraph options are 0 +0018 0000 6144 0x0800=Text Alignment, 0x1000=Line Spacing +0000 0 Text Alignment = Left +5000 80 Line Spacing = 80 + +1C00 0000 28 The paragraph styling applies to 28 characters +0000 0 Paragraph options are 0 +0010 0000 4096 0x1000=Line Spacing +5000 80 Line Spacing = 80 + +1900 0000 25 The paragraph styling applies to 25 characters +0000 0 Paragraph options are 0 +0018 0000 6144 0x0800=Text Alignment, 0x1000=Line Spacing +0200 2 Text Alignment = Right +5000 80 Line Spacing = 80 + +6100 0000 97 The paragraph styling applies to 97 characters + (includes final CR) +0000 0 Paragraph options are 0 +0018 0000 6144 0x0800=Text Alignment, 0x1000=Line Spacing +0000 0 Text Alignment = Left +5000 80 Line Spacing = 80 + +1E00 0000 30 The character styling applies to 30 characters +0100 0200 131073 0x0001=Char Props Mask, 0x20000=Font Size +0100 1 Char Props
0x0001=Bold +1400 20 Font Size = 20 + +1C00 0000 28 The character styling applies to 28 characters +0200 0600 393218 0x0002=Char Props Mask, 0x20000=Font Size, 0x40000=Font Color +0200 2 Char Props 0x0002=Italic +1400 20 Font Size = 20 +0000 0005 83886080 Blue + +1900 0000 25 The character styling applies to 25 characters +0000 0600 393216 0x20000=Font Size, 0x40000=Font Color +1400 20 Font Size = 20 +FF33 00FE 4261426175 Red + +6000 0000 96 The character styling applies to 96 characters +0400 0300 196612 0x0004=Char Props Mask, 0x10000=Font Index, 0x20000=Font Size +0400 4 Char Props 0x0004=Underlined +0100 1 Font Index = 1 (2nd Font in table) +1800 24 Font Size = 24 +</source> + </section> + + <section><title>Fonts in PowerPoint</title> + <p> + PowerPoint stores information about the fonts used in FontEntityAtoms, + which live inside Document.Environment.FontCollection. For every different + font used, a FontEntityAtom must exist for that font. There is always at + least one FontEntityAtom in Document.Environment.FontCollection, + which describes the default font. + </p> + </section> + + <section><title>FontEntityAtom</title> + <p> + The instance field of the record header contains the zero based index of the + font. Font index entries in StyleTextPropAtoms will refer to their required + font via this index. + </p> + <p> + The length of FontEntityAtoms is always 68 bytes. The first 64 bytes of + it hold the typeface name of the font to be used. This is stored as + a null-terminated string, and encoded as little endian unicode. (The + length of the string must not exceed 32 characters including the null + termination, so the typeface name cannot exceed 31 characters). + </p> + + <p> + After the typeface name there are 4 bytes of bitmask flags. The details of these + can be found in the Windows API, under the LOGFONT structure. 
+ The 65th byte is the output precision, which defines how closely the system chosen + font must match the requested font, in terms of height, width, pitch etc. + The 66th byte is the clipping precision, which defines how to clip characters + that occur partly outside the clipping region. + The 67th byte is the output quality, which defines how closely the system + must match the logical font's attributes to those of the physical font used. + The 68th (and final) byte is the pitch and family, which is used by the + system when matching fonts. + </p> + </section> + </body> +</document> diff --git a/src/documentation/content/xdocs/components/slideshow/ppt-wmf-emf-renderer.xml b/src/documentation/content/xdocs/components/slideshow/ppt-wmf-emf-renderer.xml new file mode 100644 index 0000000000..7421db5733 --- /dev/null +++ b/src/documentation/content/xdocs/components/slideshow/ppt-wmf-emf-renderer.xml @@ -0,0 +1,209 @@ +<?xml version="1.0" encoding="UTF-8"?><!-- + ==================================================================== + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+ ==================================================================== +--> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd"> + +<document> + <header> + <title>Rendering slideshows, WMF, EMF and EMF+</title> + </header> + <body> + <note>Please be aware that the documentation on this page reflects the current development, which might not + have been released. If you rely on an unreleased feature, either use a + <a href="site:download">nightly development build</a> or feel free to ask on the + <a href="site:mailinglists">mailing list</a> for the release schedule.</note> + <section> + <title>Rendering slideshows, WMF, EMF and EMF+</title> + <p> + For rendering slideshows (HSLF/XSLF), WMF, EMF and EMF+ pictures, POI provides a utility class + <a href="https://github.com/apache/poi/tree/trunk/poi-ooxml/src/main/java/org/apache/poi/xslf/util/PPTX2PNG.java?view=markup"> + PPTX2PNG</a>: + </p> + + <source><![CDATA[ + Usage: PPTX2PNG [options] <.ppt/.pptx/.emf/.wmf file or 'stdin'> + + Options: + -scale <float> scale factor + -fixSide <side> specify side (long,short,width,height) to fix - use <scale> as amount of pixels + -slide <integer> 1-based index of a slide to render + -format <type> png,gif,jpg,svg,pdf (log,null for testing) + -outdir <dir> output directory, defaults to origin of the ppt/pptx file + -outfile <file> output filename, defaults to "${basename}-${slideno}.${format}" + -outpat <pattern> output filename pattern, defaults to "${basename}-${slideno}.${format}" + patterns: basename, slideno, format, ext + -dump <file> dump the annotated records to a file + -quiet do not write to console (for normal processing) + -ignoreParse ignore parsing error and continue with the records read until the error + -extractEmbedded extract embedded parts + -inputType <type> default input file type (OLE2,WMF,EMF), default is OLE2 = Powerpoint + some files (usually wmf) don't have a header, i.e.
an identifiable file magic + -textAsShapes text elements are saved as shapes in SVG, necessary for variable spacing + often found in math formulas + -charset <cs> sets the default charset to be used, defaults to Windows-1252 + -emfHeaderBounds force the usage of the emf header bounds to calculate the bounding box + + -fontdir <dir> (PDF only) font directories separated by ";" - use $HOME for current users home dir + defaults to the usual plattform directories + -fontTtf <regex> (PDF only) regex to match the .ttf filenames + -fontMap <map> ";"-separated list of font mappings <typeface from>:<typeface to> + ]]> + </source> + + <section> + <title>Instructions to run</title> + <p> + Download the <a href="https://ci-builds.apache.org/job/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/build/dist/">current nightly</a> + and for SVG/PDF the <a href="site:components/index/batikpdf">additional dependencies</a>.</p> + <p>Execute the java command (Unix paths need to be adapted for Windows - use "-charset" for non-western WMF/EMFs):</p> + <source> + java -cp poi-5.4.1.jar:poi-ooxml-5.4.1.jar:poi-ooxml-lite-5.4.1.jar:poi-scratchpad-5.4.1.jar:lib/*:ooxml-lib/*:auxiliary/* org.apache.poi.xslf.util.PPTX2PNG -format png -fixside long -scale 1000 -charset GBK file.pptx + </source> + <p> + If you want to use the renderer on the module path (JPMS), there are currently a few more steps necessary: + </p> + <ul> + <li>Create a build project using Maven, Gradle or your favorite build tool.</li> + <li>Alternatively, download the jars from https://repo1.maven.org/maven2/org/apache/poi/</li> + <li>Move poi-ooxml-full-5.4.1.jar, poi-javadoc-5.4.1.jar and auxiliary/xml-apis-1.4.01.jar (Java 11+) into a new subdirectory "unused" to exclude them</li> + <li>Move all other jars in the current directory into a new subdirectory "poi"</li> + <li>Invoke PPTX2PNG: + <source> + java --module-path poi:lib:auxiliary:ooxml-lib --module org.apache.poi.ooxml/org.apache.poi.xslf.util.PPTX2PNG -format png -fixside long -scale
1000 file.pptx + </source> + </li> + </ul> + <note> + JDK 1.8 uses the PiscesRenderingEngine by default and is affected by + <a href="https://github.com/AdoptOpenJDK/openjdk-build/issues/716">Busy loop hangs</a>. + To work around this, use the MarlinRenderingEngine, which is provided experimentally starting from + <a href="https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8143849">openjdk8u252 (JDK-8143849)</a> + via <code>-Dsun.java2d.renderer=sun.java2d.marlin.MarlinRenderingEngine</code> or, for older JDK builds, + <a href="https://github.com/bourgesl/marlin-renderer/wiki/How-to-use">preload the marlin jar</a>. + </note> + </section> + + </section> + <section> + <title>Integrate rendering in your code</title> + <section> + <title>#1 - Use PPTX2PNG via file or stdin</title> + <p>For file system access, you need to save your slideshow/WMF/EMF/EMF+ to disk first and then call <code> + PPTX2PNG.main() + </code> with the corresponding parameters. + </p> + + <p>For stdin access, you need to redirect <code>System.in</code> before calling it: + </p> + <source><![CDATA[ + /* the file content */ + InputStream is = ...; + /* Save and set System.in */ + InputStream oldIn = System.in; + try { + System.setIn(is); + + String[] args = { + "-format", "png", // png,gif,jpg,svg or null for test + "-outdir", new File("out/").getCanonicalPath(), + "-outfile", "export.png", + "-fixside", "long", + "-scale", "800", + "-ignoreParse", + "stdin" + }; + PPTX2PNG.main(args); + + } finally { + System.setIn(oldIn); + } + ]]></source> + </section> + <section> + <title>#2 - Render WMF / EMF / EMF+ via the *Picture classes</title> + <source><![CDATA[ + File f = samples.getFile("santa.wmf"); + try (FileInputStream fis = new FileInputStream(f)) { + // for WMF + HwmfPicture wmf = new HwmfPicture(fis); + + // for EMF / EMF+ + HemfPicture emf = new HemfPicture(fis); + + Dimension dim = wmf.getSize(); + int width = Units.pointsToPixel(dim.getWidth()); + // keep aspect ratio for height + int height =
Units.pointsToPixel(dim.getHeight()); + double max = Math.max(width, height); + if (max > 1500) { + width *= 1500/max; + height *= 1500/max; + } + + BufferedImage bufImg = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB); + Graphics2D g = bufImg.createGraphics(); + g.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON); + g.setRenderingHint(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY); + g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC); + g.setRenderingHint(RenderingHints.KEY_FRACTIONALMETRICS, RenderingHints.VALUE_FRACTIONALMETRICS_ON); + + wmf.draw(g, new Rectangle2D.Double(0,0,width,height)); + + g.dispose(); + + ImageIO.write(bufImg, "PNG", new File("bla.png")); + } + ]]> + </source> + </section> + <section> + <title>#3 - Render slideshows directly</title> + <source><![CDATA[ + File file = new File("example.pptx"); + double scale = 1.5; + try (SlideShow<?, ?> ss = SlideShowFactory.create(file, null, true)) { + Dimension pgsize = ss.getPageSize(); + int width = (int) (pgsize.width * scale); + int height = (int) (pgsize.height * scale); + + for (Slide<?, ?> slide : ss.getSlides()) { + BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB); + Graphics2D graphics = img.createGraphics(); + + // default rendering options + graphics.setRenderingHint(RenderingHints.KEY_ANTIALIASING, RenderingHints.VALUE_ANTIALIAS_ON); + graphics.setRenderingHint(RenderingHints.KEY_RENDERING, RenderingHints.VALUE_RENDER_QUALITY); + graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION, RenderingHints.VALUE_INTERPOLATION_BICUBIC); + graphics.setRenderingHint(RenderingHints.KEY_FRACTIONALMETRICS, RenderingHints.VALUE_FRACTIONALMETRICS_ON); + graphics.setRenderingHint(Drawable.BUFFERED_IMAGE, new WeakReference<>(img)); + + graphics.scale(scale, scale); + + // draw stuff + slide.draw(graphics); + + ImageIO.write(img, "PNG", new 
File("output.png")); + graphics.dispose(); + img.flush(); + } + } + ]]></source> + </section> + </section> + </body> +</document>
\ No newline at end of file diff --git a/src/documentation/content/xdocs/components/slideshow/quick-guide.xml b/src/documentation/content/xdocs/components/slideshow/quick-guide.xml new file mode 100644 index 0000000000..88d85d877c --- /dev/null +++ b/src/documentation/content/xdocs/components/slideshow/quick-guide.xml @@ -0,0 +1,133 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ==================================================================== + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + ==================================================================== +--> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd"> + +<document> + <header> + <title>POI-HSLF - A Quick Guide</title> + <subtitle>Overview</subtitle> + <authors> + <person name="Nick Burch" email="nick at torchbox dot com"/> + </authors> + </header> + + <body> + <section><title>Basic Text Extraction</title> + <p>For basic text extraction, make use of + <code>org.apache.poi.sl.extractor.SlideShowExtractor</code>. + It accepts a slideshow which can be created from a file or stream via <code>org.apache.poi.sl.usermodel.SlideShowFactory</code>. + The <code>getText()</code> method can be used to get the text from the slides. 
+ </p> + </section> + + <section><title>Specific Text Extraction</title> + <p>To get specific bits of text, first create a <code>org.apache.poi.hslf.usermodel.HSLFSlideShow</code> +(from a <code>org.apache.poi.hslf.usermodel.HSLFSlideShowImpl</code>, which accepts a file or an input +stream). Use <code>getSlides()</code> and <code>getNotes()</code> to get the slides and notes. +These can be queried to get their page ID (though they should be returned +in the right order).</p> + <p>You can then call <code>getTextParagraphs()</code> on these, to get +their blocks of text. (A list of <code>HSLFTextParagraph</code> normally holds all the text in a +given area of the page, eg in the title bar, or in a box). +From the <code>HSLFTextParagraph</code>, you can extract the text, and check +what type of text it is (eg Body, Title). You can also call +<code>getTextRuns()</code>, which will return the +<code>HSLFTextRun</code>s that make up the <code>TextParagraph</code>. A +<code>HSLFTextRun</code> is a text fragment, having the same character formatting. +The paragraph formatting is defined in the parent <code>HSLFTextParagraph</code>. + </p> + </section> + + <section><title>Poor Quality Text Extraction</title> + <p>If speed is the most important thing for you, you don't care + about getting duplicate blocks of text, you don't care about + getting text from master sheets, and you don't care about getting + old text, then + <code>org.apache.poi.hslf.extractor.QuickButCruddyTextExtractor</code> + might be of use.</p> + <p>QuickButCruddyTextExtractor doesn't use the normal record + parsing code, instead it uses a tree structure blind search + method to get all text holding records. You will get all the text, + including lots of text you normally wouldn't ever want. However, + you will get it back very very fast!</p> + <p>There are two ways of getting the text back. + <code>getTextAsString()</code> will return a single string with all + the text in it. 
<code>getTextAsVector()</code> will return a + vector of strings, one for each text record found in the file. + </p> + </section> + + <section><title>Changing Text</title> + <p>It is possible to change the text via + <code>HSLFTextParagraph.setText(List<HSLFTextParagraph>,String)</code> or + <code>HSLFTextRun.setText(String)</code>. It is possible to add additional TextRuns + with <code>HSLFTextParagraph.appendText(List<HSLFTextParagraph>,String,boolean)</code> + or <code>HSLFTextParagraph.addTextRun(HSLFTextRun)</code>.</p> + <p>When calling <code>HSLFTextParagraph.setText(List<HSLFTextParagraph>,String)</code>, all + the text will end up with the same formatting. When calling + <code>HSLFTextRun.setText(String)</code>, the text will retain + the old formatting of that <code>HSLFTextRun</code>. + </p> + </section> + + <section><title>Adding Slides</title> + <p>You may add new slides by calling + <code>HSLFSlideShow.createSlide()</code>, which will add a new slide + to the end of the SlideShow. It is possible to re-order slides with <code>HSLFSlideShow.reorderSlide(...)</code>. + </p> + </section> + + <section><title>Guide to key classes</title> + <ul> + <li><code>org.apache.poi.hslf.usermodel.HSLFSlideShowImpl</code> + Handles reading in and writing out files. Calls + <code>org.apache.poi.hslf.record.Record</code> to build a tree + of all the records in the file, which it allows access to. + </li> + <li><code>org.apache.poi.hslf.record.Record</code> + Base class of all records. Also provides the main record generation + code, which will build up a tree of records for a file. + </li> + <li><code>org.apache.poi.hslf.usermodel.HSLFSlideShow</code> + Builds up model entries from the records, and presents a user facing + view of the file. + </li> + <li><code>org.apache.poi.hslf.usermodel.HSLFSlide</code> + A user facing view of a Slide in a slideshow. Allows you to get at the + Text of the slide, and at any drawing objects on it.
+ </li> + <li><code>org.apache.poi.hslf.usermodel.HSLFTextParagraph</code> + A list of <code>HSLFTextParagraph</code>s holds all the text in a given area of the Slide, and will + contain one or more <code>HSLFTextRun</code>s. + </li> + <li><code>org.apache.poi.hslf.usermodel.HSLFTextRun</code> + Holds a run of text, all having the same character stylings. It is possible to modify text, and/or text stylings. + </li> + <li><code>org.apache.poi.sl.extractor.SlideShowExtractor</code> + Uses the model code to allow extraction of text from files + </li> + <li><code>org.apache.poi.hslf.extractor.QuickButCruddyTextExtractor</code> + Uses the record code to extract all the text from files very fast, + but including deleted text (and other bits of Crud). + </li> + </ul> + </section> + </body> +</document> diff --git a/src/documentation/content/xdocs/components/slideshow/xslf-cookbook.xml b/src/documentation/content/xdocs/components/slideshow/xslf-cookbook.xml new file mode 100644 index 0000000000..4f72295b5f --- /dev/null +++ b/src/documentation/content/xdocs/components/slideshow/xslf-cookbook.xml @@ -0,0 +1,304 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!-- + ==================================================================== + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. + ==================================================================== +--> +<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "document-v20.dtd"> + +<document> + <header> + <title>XSLF Cookbook</title> + <authors> + <person email="yegor@apache.org" name="Yegor Kozlov" id="YK"/> + </authors> + </header> + <body> + <section><title>XSLF Cookbook</title> + <p> + This page offers a short introduction to the XSLF API. More examples can be found in the + <a href="https://github.com/apache/poi/tree/trunk/poi-examples/src/main/java/org/apache/poi/examples/xslf/">XSLF Examples</a> + in the POI Git repository. + </p> + <note> + Please note that XSLF is still in early development and is subject to incompatible changes in a future release. + </note> + <section><title>Index of Features</title> + <ul> + <li><a href="#NewPresentation">Create a new presentation</a></li> + <li><a href="#ReadPresentation">Read an existing presentation</a></li> + <li><a href="#SlideLayout">Create a slide with a predefined layout</a></li> + <li><a href="#DeleteSlide">Delete slide</a></li> + <li><a href="#MoveSlide">Re-order slides</a></li> + <li><a href="#SlideSize">Change slide size</a></li> + <li><a href="#GetShapes">Read shapes</a></li> + <li><a href="#AddImage">Add image</a></li> + <li><a href="#ReadImages">Read images contained in a presentation</a></li> + <li><a href="#Text">Format text</a></li> + <li><a href="#Hyperlinks">Hyperlinks</a></li> + <li><a href="#PPTX2PNG">Convert .pptx slides into images</a></li> + <li><a href="#Merge">Merge multiple presentations together</a></li> + </ul> + </section> + <section><title>Cookbook</title> + <anchor id="NewPresentation"/> + <section><title>New Presentation</title> + <p> + The following code creates a new .pptx slide show and adds a blank slide to it: + </p> + <source> + //create a new empty slide show + XMLSlideShow ppt = new
XMLSlideShow(); + + //add first slide + XSLFSlide blankSlide = ppt.createSlide(); + </source> + </section> + <anchor id="ReadPresentation"/> + <section><title>Read an existing presentation and append a slide to it</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx")); + + //append a new slide to the end + XSLFSlide blankSlide = ppt.createSlide(); + </source> + </section> + + <anchor id="SlideLayout"/> + <section><title>Create a new slide from a predefined slide layout</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx")); + + // first see what slide layouts are available : + System.out.println("Available slide layouts:"); + for(XSLFSlideMaster master : ppt.getSlideMasters()){ + for(XSLFSlideLayout layout : master.getSlideLayouts()){ + System.out.println(layout.getType()); + } + } + + // blank slide + XSLFSlide blankSlide = ppt.createSlide(); + + // there can be multiple masters each referencing a number of layouts + // for demonstration purposes we use the first (default) slide master + XSLFSlideMaster defaultMaster = ppt.getSlideMasters().get(0); + + // title slide + XSLFSlideLayout titleLayout = defaultMaster.getLayout(SlideLayout.TITLE); + // fill the placeholders + XSLFSlide slide1 = ppt.createSlide(titleLayout); + XSLFTextShape title1 = slide1.getPlaceholder(0); + title1.setText("First Title"); + + // title and content + XSLFSlideLayout titleBodyLayout = defaultMaster.getLayout(SlideLayout.TITLE_AND_CONTENT); + XSLFSlide slide2 = ppt.createSlide(titleBodyLayout); + + XSLFTextShape title2 = slide2.getPlaceholder(0); + title2.setText("Second Title"); + + XSLFTextShape body2 = slide2.getPlaceholder(1); + body2.clearText(); // unset any existing text + body2.addNewTextParagraph().addNewTextRun().setText("First paragraph"); + body2.addNewTextParagraph().addNewTextRun().setText("Second paragraph"); + body2.addNewTextParagraph().addNewTextRun().setText("Third paragraph"); + </source> 
+ </section> + + <anchor id="DeleteSlide"/> + <section><title>Delete slide</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx")); + + ppt.removeSlide(0); // 0-based index of a slide to be removed + </source> + </section> + + <anchor id="MoveSlide"/> + <section><title>Re-order slides</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx")); + List<XSLFSlide> slides = ppt.getSlides(); + + XSLFSlide thirdSlide = slides.get(2); + ppt.setSlideOrder(thirdSlide, 0); // move the third slide to the beginning + </source> + </section> + + <anchor id="SlideSize"/> + <section><title>How to retrieve or change slide size</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(); + //retrieve page size. Coordinates are expressed in points (72 dpi) + java.awt.Dimension pgsize = ppt.getPageSize(); + int pgx = pgsize.width; //slide width in points + int pgy = pgsize.height; //slide height in points + + //set new page size + ppt.setPageSize(new java.awt.Dimension(1024, 768)); + </source> + </section> + <anchor id="GetShapes"/> + <section><title>How to read shapes contained in a particular slide</title> + <p> + The following code demonstrates how to iterate over shapes for each slide. 
+ </p> + <source> + XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx")); + // get slides + for (XSLFSlide slide : ppt.getSlides()) { + for (XSLFShape sh : slide.getShapes()) { + // name of the shape + String name = sh.getShapeName(); + + // shape's anchor which defines the position of this shape in the slide + if (sh instanceof PlaceableShape) { + java.awt.geom.Rectangle2D anchor = ((PlaceableShape)sh).getAnchor(); + } + + if (sh instanceof XSLFConnectorShape) { + XSLFConnectorShape line = (XSLFConnectorShape) sh; + // work with Line + } else if (sh instanceof XSLFTextShape) { + XSLFTextShape shape = (XSLFTextShape) sh; + // work with a shape that can hold text + } else if (sh instanceof XSLFPictureShape) { + XSLFPictureShape shape = (XSLFPictureShape) sh; + // work with Picture + } + } + } + </source> + </section> + <anchor id="AddImage"/> + <section><title>Add Image to Slide</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(); + XSLFSlide slide = ppt.createSlide(); + + byte[] pictureData = IOUtils.toByteArray(new FileInputStream("image.png")); + + XSLFPictureData pd = ppt.addPicture(pictureData, PictureData.PictureType.PNG); + XSLFPictureShape pic = slide.createPicture(pd); + </source> + </section> + + <anchor id="ReadImages"/> + <section><title>Read Images contained within a presentation</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(new FileInputStream("slideshow.pptx")); + for(XSLFPictureData data : ppt.getAllPictures()){ + byte[] bytes = data.getData(); + String fileName = data.getFileName(); + + } + </source> + </section> + + <anchor id="Text"/> + <section><title>Basic text formatting</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(); + XSLFSlide slide = ppt.createSlide(); + + XSLFTextBox shape = slide.createTextBox(); + XSLFTextParagraph p = shape.addNewTextParagraph(); + + XSLFTextRun r1 = p.addNewTextRun(); + r1.setText("The"); + r1.setFontColor(Color.blue); + r1.setFontSize(24.); + + XSLFTextRun r2 =
p.addNewTextRun(); + r2.setText(" quick"); + r2.setFontColor(Color.red); + r2.setBold(true); + + XSLFTextRun r3 = p.addNewTextRun(); + r3.setText(" brown"); + r3.setFontSize(12.); + r3.setItalic(true); + r3.setStrikethrough(true); + + XSLFTextRun r4 = p.addNewTextRun(); + r4.setText(" fox"); + r4.setUnderline(true); + </source> + </section> + <anchor id="Hyperlinks"/> + <section><title>How to create a hyperlink</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(); + XSLFSlide slide = ppt.createSlide(); + + // assign a hyperlink to a text run + XSLFTextBox shape = slide.createTextBox(); + XSLFTextRun r = shape.addNewTextParagraph().addNewTextRun(); + r.setText("Apache POI"); + XSLFHyperlink link = r.createHyperlink(); + link.setAddress("https://poi.apache.org"); + </source> + </section> + <anchor id="PPTX2PNG"/> + <section><title>PPTX2PNG is an application that converts each slide of a .pptx slideshow into a PNG image</title> + <source> +Usage: PPTX2PNG [options] <pptx file> +Options: + -scale <float> scale factor (default is 1.0) + -slide <integer> 1-based index of a slide to render. Default is to render all slides. + </source> + <p>How it works:</p> + <p> + The XSLFSlide object implements a draw(Graphics2D graphics) method that recursively paints all shapes + in the slide into the supplied graphics canvas: + </p> + <source> + slide.draw(graphics); + </source> + <p> + where graphics is an instance of a class implementing java.awt.Graphics2D. In PPTX2PNG the graphics canvas is derived from + java.awt.image.BufferedImage, i.e. the destination is an image in memory, but in the general case you can pass + any compliant implementation of java.awt.Graphics2D. + Find more information in the designated <a href="site:slrender">render page</a>, e.g. on how to render SVG images.
+ </p> + </section> + <anchor id="Merge"/> + <section> + <title>Merge multiple presentations together</title> + <source> + XMLSlideShow ppt = new XMLSlideShow(); + String[] inputs = {"presentation1.pptx", "presentation2.pptx"}; + for(String arg : inputs){ + FileInputStream is = new FileInputStream(arg); + XMLSlideShow src = new XMLSlideShow(is); + is.close(); + + for(XSLFSlide srcSlide : src.getSlides()){ + ppt.createSlide().importContent(srcSlide); + } + } + + FileOutputStream out = new FileOutputStream("merged.pptx"); + ppt.write(out); + out.close(); + </source> + </section> + + </section> + </section> + </body> +</document>