Online Document Preview (Part 1): Implementing online preview by converting txt, Word, and PDF files to images


2023-06-05 22:45 | Source: compiled from the web | Views: 265


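Contents

I. Converting files to images and saving them locally
    1. Converting a Word file to images
    2. Converting a txt file to images (same as Word)
    3. Converting a PDF file to images
II. Using multithreading to speed up writing files locally
III. Converting files to image streams
    1. Converting a Word file to an image stream
    2. Converting a txt file to an image stream
    3. Converting a PDF file to an image stream
    4. Converting multiple file types to image streams
Finally, the complete utility class

What if you don't want the articles on your web page to be copied (yes, that means you, a certain site, 某点)? What if you want documents to be previewable online without being downloaded first, as is common on paid document-download sites and in email attachment previews? A common approach is to convert the documents to images. The code below uses aspose-words (for txt and Word to image) and pdfbox (for PDF to image), wrapped in a single utility class that converts txt, Word, and PDF files to images.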

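First, add the following two dependencies to the project's pom file:

    <dependency>
        <groupId>com.luhuiguo</groupId>
        <artifactId>aspose-words</artifactId>
        <version>23.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>2.0.4</version>
    </dependency>

I. Converting files to images and saving them locally

1. Converting a Word file to images

    public static void wordToImage(String wordPath, String imagePath) throws Exception {
        Document doc = new Document(wordPath);
        File file = new File(wordPath);
        String filename = file.getName();
        // Output prefix: <imagePath>/<file name without extension>
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        for (int i = 0; i < doc.getPageCount(); i++) {
            // Extract one page at a time and save it as a PNG named <prefix><page number>.png
            Document extractedPage = doc.extractPages(i, 1);
            String path = pathPre + (i + 1) + ".png";
            extractedPage.save(path, SaveFormat.PNG);
        }
    }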

Verification:

    public static void main(String[] args) throws Exception {
        FileConvertUtil.wordToImage("D:\\书籍\\电子书\\其它\\《山海经》异兽图.doc", "D:\\test\\word");
    }

Result: (screenshot not preserved)
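One side effect of naming pages `<prefix>1.png`, `<prefix>2.png`, ... is that a plain alphabetical listing puts page 10 before page 2. The sketch below shows one possible way to zero-pad the page number so that file names sort in page order. `PageNames`, `baseName`, and `pageFileName` are hypothetical names introduced for illustration; they are not part of the original utility class, and the `-` separator is likewise an assumption.

```java
import java.io.File;

/** Hypothetical helper, not part of the original FileConvertUtil. */
final class PageNames {
    private PageNames() {}

    /** Strips the last extension ("report.doc" -> "report"); names without a dot are returned unchanged. */
    static String baseName(String filename) {
        int dot = filename.lastIndexOf('.');
        return dot > 0 ? filename.substring(0, dot) : filename;
    }

    /** Builds "<base>-<page>.png", padding the page number to the number of digits in pageCount. */
    static String pageFileName(String sourcePath, int pageNumber, int pageCount) {
        int width = String.valueOf(pageCount).length();
        String base = baseName(new File(sourcePath).getName());
        return String.format("%s-%0" + width + "d.png", base, pageNumber);
    }

    public static void main(String[] args) {
        // Page 7 of a 669-page document
        System.out.println(pageFileName("report.doc", 7, 669)); // prints report-007.png
    }
}
```

In `wordToImage`, the line that builds `path` could call such a helper instead of concatenating `(i + 1)` directly, if sorted output matters for the preview page.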

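2. Converting a txt file to images (same as Word)

    public static void txtToImage(String txtPath, String imagePath) throws Exception {
        // aspose-words can load plain text as a Document, so the Word path is reused
        wordToImage(txtPath, imagePath);
    }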

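Verification:

    public static void main(String[] args) throws Exception {
        // The source post repeats the Word example here; the paths below are hypothetical placeholders.
        FileConvertUtil.txtToImage("D:\\test\\sample.txt", "D:\\test\\txt");
    }

Result: (screenshot not preserved)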

3. Converting a PDF file to images

    public static void pdfToImage(String pdfPath, String imagePath) throws Exception {
        File file = new File(pdfPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        // try-with-resources closes the document even if rendering or writing fails
        try (PDDocument doc = PDDocument.load(file)) {
            PDFRenderer renderer = new PDFRenderer(doc);
            for (int i = 0; i < doc.getNumberOfPages(); i++) {
                // 144 DPI is twice the PDF base resolution of 72 units per inch
                BufferedImage image = renderer.renderImageWithDPI(i, 144);
                String pathname = pathPre + (i + 1) + ".png";
                ImageIO.write(image, "PNG", new File(pathname));
            }
        }
    }

Verification:

    public static void main(String[] args) throws Exception {
        FileConvertUtil.pdfToImage("D:\\书籍\\电子书\\其它\\自然哲学的数学原理.pdf", "D:\\test\\pdf");
    }

Result: (screenshot not preserved)
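The value 144 passed to `renderImageWithDPI` determines how large each PNG is. PDF page sizes are measured in points at 72 per inch, so rendering at a given DPI scales the page by `dpi / 72`. The following is a rough sketch of that arithmetic only. `DpiMath` is a hypothetical helper, not a PDFBox class, and the rounding (floor, with a minimum of one pixel) is an assumption about how the final pixel size is derived.

```java
/** Hypothetical helper illustrating DPI arithmetic; not part of PDFBox or the original utility class. */
final class DpiMath {
    static final double POINTS_PER_INCH = 72.0;

    private DpiMath() {}

    /** Scale factor applied to page dimensions when rendering at the given DPI. */
    static double scale(double dpi) {
        return dpi / POINTS_PER_INCH;
    }

    /** Approximate pixel length for a page dimension given in points (rounding rule is an assumption). */
    static int pixels(double points, double dpi) {
        return (int) Math.max(Math.floor(points * scale(dpi)), 1);
    }

    public static void main(String[] args) {
        // An A4 page is roughly 595 x 842 points
        System.out.println(pixels(595, 144) + " x " + pixels(842, 144)); // prints 1190 x 1684
    }
}
```

Doubling the DPI quadruples the pixel count per page, which matters both for rendering time and for the in-memory image size on long documents.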

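4. Supporting multiple file types in one method

    public static void fileToImage(String sourceFilePath, String imagePath) throws Exception {
        // Lower-case the extension so that ".PDF" or ".Docx" are recognized; no dot means no extension
        int dot = sourceFilePath.lastIndexOf(".");
        String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
        switch (ext) {
            case ".doc":
            case ".docx":
                wordToImage(sourceFilePath, imagePath);
                break;
            case ".pdf":
                pdfToImage(sourceFilePath, imagePath);
                break;
            case ".txt":
                txtToImage(sourceFilePath, imagePath);
                break;
            default:
                System.out.println("Unsupported file format");
        }
    }

II. Using multithreading to speed up writing files locally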

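While converting Newton's 669-page masterpiece "Mathematical Principles of Natural Philosophy" (自然哲学的数学原理.pdf), I found that the run took a long time: 140,281 ms. An I/O-intensive operation like this can be sped up with multithreading, so I wrote a multithreaded version.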

Time taken by the synchronous export of 自然哲学的数学原理.pdf: (screenshot not preserved)

The optimized code is as follows:

    public static void pdfToImageAsync(String pdfPath, String imagePath) throws Exception {
        long old = System.currentTimeMillis();
        File file = new File(pdfPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        PDDocument doc = PDDocument.load(file);
        // Caution: the PDFBox FAQ says a single PDDocument should be accessed by one thread at a time.
        // This version shares one document across threads, as in the original post; verify the output.
        PDFRenderer renderer = new PDFRenderer(doc);
        int pageCount = doc.getNumberOfPages();
        // One worker thread per available core
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < pageCount; i++) {
            int finalI = i; // lambdas can only capture effectively final variables
            executorService.submit(() -> {
                try {
                    BufferedImage image = renderer.renderImageWithDPI(finalI, 144);
                    String pathname = pathPre + (finalI + 1) + ".png";
                    ImageIO.write(image, "PNG", new File(pathname));
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
        }
        // Stop accepting new tasks, then block until every page has been written
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        doc.close();
        long now = System.currentTimeMillis();
        System.out.println("pdfToImage (multithreaded) finished, elapsed: " + (now - old) + "ms");
    }

Time taken by the multithreaded export of 自然哲学的数学原理.pdf: (screenshot not preserved)
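The submit-then-wait pattern used in `pdfToImageAsync` can also be written with `ExecutorService.invokeAll`, which blocks until every task has finished and makes failures visible to the caller instead of only printing a stack trace. The sketch below uses only the JDK. `PageTasks` and `runPerPage` are hypothetical names, and the per-page work is passed in as a function so that the example stays independent of pdfbox and aspose-words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

/** Hypothetical helper, not part of the original FileConvertUtil. */
final class PageTasks {
    private PageTasks() {}

    /** Runs task.apply(pageIndex) for indexes 0..pageCount-1 on a fixed pool; results come back in page order. */
    static <T> List<T> runPerPage(int pageCount, int threads, IntFunction<T> task) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<T>> jobs = new ArrayList<>();
            for (int i = 0; i < pageCount; i++) {
                final int page = i;
                jobs.add(() -> task.apply(page));
            }
            List<T> results = new ArrayList<>();
            // invokeAll waits for all jobs and returns futures in submission order
            for (Future<T> future : pool.invokeAll(jobs)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("Interrupted while waiting for page tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A page task failed", e.getCause());
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        List<String> names = runPerPage(5, cores, i -> "page-" + (i + 1));
        System.out.println(names); // prints [page-1, page-2, page-3, page-4, page-5]
    }
}
```

In the conversion methods, the function passed in would render and write one page and could return the output path. Whether the underlying document object tolerates concurrent access is a separate question that depends on the library.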

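This run took only 24,045 ms, roughly one-sixth of the original 140,281 ms, a substantial improvement. Besides PDF, the Word and txt conversions can be given the same multithreaded treatment: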

    // Convert Word to images (multithreaded)
    public static void wordToImageAsync(String wordPath, String imagePath) throws Exception {
        Document doc = new Document(wordPath);
        File file = new File(wordPath);
        String filename = file.getName();
        String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
        int numCores = Runtime.getRuntime().availableProcessors();
        ExecutorService executorService = Executors.newFixedThreadPool(numCores);
        for (int i = 0; i < doc.getPageCount(); i++) {
            int finalI = i;
            executorService.submit(() -> {
                try {
                    Document extractedPage = doc.extractPages(finalI, 1);
                    String path = pathPre + (finalI + 1) + ".png";
                    extractedPage.save(path, SaveFormat.PNG);
                } catch (Exception ex) {
                    ex.printStackTrace();
                }
            });
        }
        // Wait for all pages before returning, and let the pool threads exit
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    }

    // Convert txt to images (multithreaded)
    public static void txtToImageAsync(String txtPath, String imagePath) throws Exception {
        wordToImageAsync(txtPath, imagePath);
    }

III. Converting files to image streams

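Sometimes the converted images do not need to be written to local files at all. Instead they need to be returned to the caller or uploaded to an image server. In that case each page is written to an in-memory stream and returned as a byte array, which is easy to transfer. Example code: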

1. Converting a Word file to an image stream

    public static List<byte[]> wordToImageStream(String wordPath) throws Exception {
        Document doc = new Document(wordPath);
        List<byte[]> list = new ArrayList<>();
        for (int i = 0; i < doc.getPageCount(); i++) {
            try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                Document extractedPage = doc.extractPages(i, 1);
                extractedPage.save(outputStream, SaveFormat.PNG);
                list.add(outputStream.toByteArray()); // one PNG per page
            }
        }
        return list;
    }

2. Converting a txt file to an image stream

    public static List<byte[]> txtToImageStream(String txtPath) throws Exception {
        return wordToImageStream(txtPath);
    }

3. Converting a PDF file to an image stream

    public static List<byte[]> pdfToImageStream(String pdfPath) throws Exception {
        File file = new File(pdfPath);
        List<byte[]> list = new ArrayList<>();
        try (PDDocument doc = PDDocument.load(file)) {
            PDFRenderer renderer = new PDFRenderer(doc);
            for (int i = 0; i < doc.getNumberOfPages(); i++) {
                try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                    BufferedImage image = renderer.renderImageWithDPI(i, 144);
                    ImageIO.write(image, "PNG", outputStream);
                    list.add(outputStream.toByteArray());
                }
            }
        }
        return list;
    }

4. Converting multiple file types to image streams

    public static List<byte[]> fileToImageStream(String sourceFilePath) throws Exception {
        int dot = sourceFilePath.lastIndexOf(".");
        String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
        switch (ext) {
            case ".doc":
            case ".docx":
                return wordToImageStream(sourceFilePath);
            case ".pdf":
                return pdfToImageStream(sourceFilePath);
            case ".txt":
                return txtToImageStream(sourceFilePath);
            default:
                System.out.println("Unsupported file format");
        }
        return null; // callers must handle null for unsupported formats
    }

Finally, the complete utility class:

    package com.fhey.service.common.utils.file;

    import com.aspose.words.Document;
    import com.aspose.words.SaveFormat;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.rendering.PDFRenderer;

    import javax.imageio.ImageIO;
    import java.awt.image.BufferedImage;
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class FileConvertUtil {

        // Convert a file to images, dispatching on the file extension
        public static void fileToImage(String sourceFilePath, String imagePath) throws Exception {
            int dot = sourceFilePath.lastIndexOf(".");
            String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
            switch (ext) {
                case ".doc":
                case ".docx":
                    wordToImage(sourceFilePath, imagePath);
                    break;
                case ".pdf":
                    pdfToImage(sourceFilePath, imagePath);
                    break;
                case ".txt":
                    txtToImage(sourceFilePath, imagePath);
                    break;
                default:
                    System.out.println("Unsupported file format");
            }
        }

        // Convert PDF to images
        public static void pdfToImage(String pdfPath, String imagePath) throws Exception {
            File file = new File(pdfPath);
            String filename = file.getName();
            String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
            try (PDDocument doc = PDDocument.load(file)) {
                PDFRenderer renderer = new PDFRenderer(doc);
                for (int i = 0; i < doc.getNumberOfPages(); i++) {
                    BufferedImage image = renderer.renderImageWithDPI(i, 144); // 2x the 72-per-inch PDF base
                    String pathname = pathPre + (i + 1) + ".png";
                    ImageIO.write(image, "PNG", new File(pathname));
                }
            }
        }

        // Convert txt to images
        public static void txtToImage(String txtPath, String imagePath) throws Exception {
            wordToImage(txtPath, imagePath);
        }

        // Convert Word to images
        public static void wordToImage(String wordPath, String imagePath) throws Exception {
            Document doc = new Document(wordPath);
            File file = new File(wordPath);
            String filename = file.getName();
            String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
            for (int i = 0; i < doc.getPageCount(); i++) {
                Document extractedPage = doc.extractPages(i, 1);
                String path = pathPre + (i + 1) + ".png";
                extractedPage.save(path, SaveFormat.PNG);
            }
        }

        // Convert PDF to images (multithreaded)
        public static void pdfToImageAsync(String pdfPath, String imagePath) throws Exception {
            long old = System.currentTimeMillis();
            File file = new File(pdfPath);
            String filename = file.getName();
            String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
            PDDocument doc = PDDocument.load(file);
            // Caution: the PDFBox FAQ says a single PDDocument should be accessed by one thread at a time
            PDFRenderer renderer = new PDFRenderer(doc);
            int pageCount = doc.getNumberOfPages();
            int numCores = Runtime.getRuntime().availableProcessors();
            ExecutorService executorService = Executors.newFixedThreadPool(numCores);
            for (int i = 0; i < pageCount; i++) {
                int finalI = i;
                executorService.submit(() -> {
                    try {
                        BufferedImage image = renderer.renderImageWithDPI(finalI, 144);
                        String pathname = pathPre + (finalI + 1) + ".png";
                        ImageIO.write(image, "PNG", new File(pathname));
                    } catch (Exception ex) {
                        ex.printStackTrace();
                    }
                });
            }
            executorService.shutdown();
            executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
            doc.close();
            long now = System.currentTimeMillis();
            System.out.println("pdfToImage (multithreaded) finished, elapsed: " + (now - old) + "ms");
        }

        // Convert Word to images (multithreaded)
        public static void wordToImageAsync(String wordPath, String imagePath) throws Exception {
            Document doc = new Document(wordPath);
            File file = new File(wordPath);
            String filename = file.getName();
            String pathPre = imagePath + File.separator + filename.substring(0, filename.lastIndexOf("."));
            int numCores = Runtime.getRuntime().availableProcessors();
            ExecutorService executorService = Executors.newFixedThreadPool(numCores);
            for (int i = 0; i < doc.getPageCount(); i++) {
                int finalI = i;
                executorService.submit(() -> {
                    try {
                        Document extractedPage = doc.extractPages(finalI, 1);
                        String path = pathPre + (finalI + 1) + ".png";
                        extractedPage.save(path, SaveFormat.PNG);
                    } catch (Exception ex) {
                        ex.printStackTrace();
                    }
                });
            }
            // Wait for all pages before returning, and let the pool threads exit
            executorService.shutdown();
            executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        }

        // Convert txt to images (multithreaded)
        public static void txtToImageAsync(String txtPath, String imagePath) throws Exception {
            wordToImageAsync(txtPath, imagePath);
        }

        // Convert a file to image streams, dispatching on the file extension
        public static List<byte[]> fileToImageStream(String sourceFilePath) throws Exception {
            int dot = sourceFilePath.lastIndexOf(".");
            String ext = dot < 0 ? "" : sourceFilePath.substring(dot).toLowerCase();
            switch (ext) {
                case ".doc":
                case ".docx":
                    return wordToImageStream(sourceFilePath);
                case ".pdf":
                    return pdfToImageStream(sourceFilePath);
                case ".txt":
                    return txtToImageStream(sourceFilePath);
                default:
                    System.out.println("Unsupported file format");
            }
            return null;
        }

        // Convert PDF to image streams
        public static List<byte[]> pdfToImageStream(String pdfPath) throws Exception {
            File file = new File(pdfPath);
            List<byte[]> list = new ArrayList<>();
            try (PDDocument doc = PDDocument.load(file)) {
                PDFRenderer renderer = new PDFRenderer(doc);
                for (int i = 0; i < doc.getNumberOfPages(); i++) {
                    try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                        BufferedImage image = renderer.renderImageWithDPI(i, 144);
                        ImageIO.write(image, "PNG", outputStream);
                        list.add(outputStream.toByteArray());
                    }
                }
            }
            return list;
        }

        // Convert Word to image streams
        public static List<byte[]> wordToImageStream(String wordPath) throws Exception {
            Document doc = new Document(wordPath);
            List<byte[]> list = new ArrayList<>();
            for (int i = 0; i < doc.getPageCount(); i++) {
                try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
                    Document extractedPage = doc.extractPages(i, 1);
                    extractedPage.save(outputStream, SaveFormat.PNG);
                    list.add(outputStream.toByteArray());
                }
            }
            return list;
        }

        // Convert txt to image streams
        public static List<byte[]> txtToImageStream(String txtPath) throws Exception {
            return wordToImageStream(txtPath);
        }
    }
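The stream methods return one PNG byte array per page, but the post stops short of showing how those bytes might reach a browser. One simple option, sketched below, is to embed each page as a Base64 `data:` URI in an `<img>` tag. `PreviewHtml`, `toDataUri`, and `toHtml` are hypothetical names that are not part of the original utility class, and this is only one of several possible delivery approaches (serving each page from its own URL is another).

```java
import java.util.Base64;
import java.util.List;

/** Hypothetical helper, not part of the original FileConvertUtil. */
final class PreviewHtml {
    private PreviewHtml() {}

    /** Encodes PNG bytes as a data URI that can be used directly as an <img> src. */
    static String toDataUri(byte[] png) {
        return "data:image/png;base64," + Base64.getEncoder().encodeToString(png);
    }

    /** Builds a minimal HTML page with one <img> per page image, in list order. */
    static String toHtml(List<byte[]> pages) {
        StringBuilder html = new StringBuilder("<!DOCTYPE html><html><body>\n");
        for (int i = 0; i < pages.size(); i++) {
            html.append("<img alt=\"page ").append(i + 1).append("\" src=\"")
                .append(toDataUri(pages.get(i))).append("\">\n");
        }
        return html.append("</body></html>").toString();
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for a real PNG returned by fileToImageStream(...)
        System.out.println(toDataUri(new byte[] {1, 2, 3})); // prints data:image/png;base64,AQID
    }
}
```

Base64 inflates the payload by about a third, so for long documents such as the 669-page example, sending pages individually or on demand is likely to be the more practical choice.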

