
Fixing garbled Chinese text when converting PPT to PDF in Java

July 22, 2019 | 移动技术网 IT编程

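The PPT-to-PDF conversion works by rendering each slide to an image and then building the PDF from those images. Along the way there is one problem: for both .ppt and .pptx files, Chinese text comes out garbled, rendered as empty boxes. For .ppt files a fix is easy to find online: set all text to one uniform font. For .pptx files, a slide that uses a single Chinese font renders fine, but a slide that mixes 微软雅黑 (Microsoft YaHei) and 宋体 (SimSun) ends up with some of the Chinese text as boxes. My suspicion is that POI only reads the first font when it processes the slide, which would explain the garbling when several Chinese fonts are present.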
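
As a rough illustration of the slide-to-image step described above, the sketch below prepares the scaled, white-filled canvas that a slide would be drawn onto. The class name `SlideCanvas`, the method `blankCanvas`, and the 720x540 page size are assumptions made for this example; it uses only the JDK and does not call POI.

```java
import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

public class SlideCanvas {
    // Hypothetical helper: creates an image of pageSize scaled by zoom, filled with white.
    static BufferedImage blankCanvas(Dimension pageSize, double zoom) {
        int width = (int) Math.ceil(pageSize.width * zoom);
        int height = (int) Math.ceil(pageSize.height * zoom);
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = img.createGraphics();
        try {
            // Scale the graphics context so drawing can use slide coordinates
            graphics.setTransform(AffineTransform.getScaleInstance(zoom, zoom));
            graphics.setPaint(Color.WHITE);
            graphics.fill(new Rectangle2D.Float(0, 0, pageSize.width, pageSize.height));
            // A real conversion would draw the slide onto graphics at this point
        } finally {
            graphics.dispose();
        }
        return img;
    }

    public static void main(String[] args) {
        BufferedImage img = blankCanvas(new Dimension(720, 540), 2);
        System.out.println(img.getWidth() + "x" + img.getHeight());
    }
}
```

A larger zoom gives sharper text in the resulting PDF at the cost of bigger images and more memory per slide.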

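I searched Baidu and Google for a long time. I saw a mention that someone had reported this as a bug on the Apache site, and that the reply was that it is a font problem. I think POI could probably handle this itself by reading each run's original font and setting it as the current font, though that would likely cost a fair amount of performance. I suspect many people have, like me, spent a lot of time looking for a fix, and there is almost nothing ready-made online. I worked through it step by step and eventually found a solution. I will not cover the .ppt format, since fixes for it can be found online; for the .pptx format I could not find one.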
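
Since the reply mentioned above points at fonts, it may be worth checking whether the font you plan to force onto every run is visible to the JVM on the machine doing the conversion. The following is a minimal sketch using only the JDK. The class `FontCheck` and its method names are hypothetical, and the candidate font names are examples rather than a list from the original investigation. Both the Chinese and English names are tried because the JVM reports family names localized for its default locale.

```java
import java.awt.Font;
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class FontCheck {
    // Font family names the JVM can see on this machine
    static Set<String> installedFamilies() {
        String[] names = GraphicsEnvironment.getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();
        return new LinkedHashSet<String>(Arrays.asList(names));
    }

    // Returns the first candidate contained in installed, or null if none is
    static String firstInstalled(Set<String> installed, String... candidates) {
        for (String candidate : candidates) {
            if (installed.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    // True if the font Java resolves for this family name has a glyph for every character
    static boolean canRender(String family, String sample) {
        return new Font(family, Font.PLAIN, 12).canDisplayUpTo(sample) == -1;
    }

    public static void main(String[] args) {
        String family = firstInstalled(installedFamilies(), "宋体", "SimSun", "微软雅黑");
        if (family == null) {
            System.out.println("No candidate font is installed");
        } else {
            System.out.println(family + " can render the sample: " + canRender(family, "中文"));
        }
    }
}
```

If none of the candidates is installed (common on Linux servers), setting the font name alone is unlikely to help, because there are no glyphs to draw with.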

The .pptx slide rendered to an image before the fix:

The .pptx slide rendered to an image after the fix:

Solution:
Read every shape and set its text to one uniform font. The code I found online did not work; my modified version is as follows:

for (XSLFShape shape : slide[i].getShapes()) {
    if (shape instanceof XSLFTextShape) {
        XSLFTextShape txtShape = (XSLFTextShape) shape;
        System.out.println("txtShape" + (i + 1) + ":" + txtShape.getShapeName());
        System.out.println("text:" + txtShape.getText());

        // Force every text run in this shape to the same font
        for (XSLFTextParagraph textPara : txtShape.getTextParagraphs()) {
            List<XSLFTextRun> textRunList = textPara.getTextRuns();
            for (XSLFTextRun textRun : textRunList) {
                textRun.setFontFamily("宋体");
            }
        }
    }
}

The complete code follows (apart from my fix above, most of it comes from Stack Overflow):

import java.awt.Color;
import java.awt.Dimension;
import java.awt.Graphics2D;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.List;

import org.apache.poi.hslf.model.Slide;
import org.apache.poi.hslf.model.TextRun;
import org.apache.poi.hslf.usermodel.RichTextRun;
import org.apache.poi.hslf.usermodel.SlideShow;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFSlide;
import org.apache.poi.xslf.usermodel.XSLFTextParagraph;
import org.apache.poi.xslf.usermodel.XSLFTextRun;
import org.apache.poi.xslf.usermodel.XSLFTextShape;

import com.itextpdf.text.Document;
import com.itextpdf.text.Image;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfPCell;
import com.itextpdf.text.pdf.PdfPTable;
import com.itextpdf.text.pdf.PdfWriter;

// Written against Apache POI 3.9 and iText 5.5.7 (see the Maven configuration);
// later POI releases changed parts of this API.
public static void convertPptToPdf(String sourcePath, String destinationPath, String fileType) throws Exception {
    FileInputStream inputStream = new FileInputStream(sourcePath);
    double zoom = 2; // render slides at 2x so the images stay sharp
    AffineTransform at = new AffineTransform();
    at.setToScale(zoom, zoom);
    Document pdfDocument = new Document();
    PdfWriter.getInstance(pdfDocument, new FileOutputStream(destinationPath));
    PdfPTable table = new PdfPTable(1);
    Dimension pgSize = null;
    Image slideImage = null;
    BufferedImage img = null;
    if (fileType.equalsIgnoreCase(".ppt")) {
        SlideShow ppt = new SlideShow(inputStream);
        inputStream.close();
        pgSize = ppt.getPageSize();
        Slide[] slide = ppt.getSlides();
        // Set the page size before opening the document so the first page uses it
        pdfDocument.setPageSize(new Rectangle((float) pgSize.getWidth(), (float) pgSize.getHeight()));
        pdfDocument.open();
        for (int i = 0; i < slide.length; i++) {
            // Force every rich text run on the slide to one font
            TextRun[] tRuns = slide[i].getTextRuns();
            for (int k = 0; k < tRuns.length; k++) {
                RichTextRun[] rtRuns = tRuns[k].getRichTextRuns();
                for (int l = 0; l < rtRuns.length; l++) {
                    rtRuns[l].setFontIndex(1);
                    rtRuns[l].setFontName("宋体");
                }
            }

            // Draw the slide onto a white, scaled canvas
            img = new BufferedImage((int) Math.ceil(pgSize.width * zoom), (int) Math.ceil(pgSize.height * zoom), BufferedImage.TYPE_INT_RGB);
            Graphics2D graphics = img.createGraphics();
            graphics.setTransform(at);
            graphics.setPaint(Color.white);
            graphics.fill(new Rectangle2D.Float(0, 0, pgSize.width, pgSize.height));
            slide[i].draw(graphics);
            graphics.dispose();
            slideImage = Image.getInstance(img, null);
            table.addCell(new PdfPCell(slideImage, true));
        }
    } else if (fileType.equalsIgnoreCase(".pptx")) {
        XMLSlideShow ppt = new XMLSlideShow(inputStream);
        inputStream.close();
        pgSize = ppt.getPageSize();
        XSLFSlide[] slide = ppt.getSlides();
        pdfDocument.setPageSize(new Rectangle((float) pgSize.getWidth(), (float) pgSize.getHeight()));
        pdfDocument.open();

        for (int i = 0; i < slide.length; i++) {
            // The fix: force every text run in every text shape to one font
            for (XSLFShape shape : slide[i].getShapes()) {
                if (shape instanceof XSLFTextShape) {
                    XSLFTextShape txtShape = (XSLFTextShape) shape;
                    for (XSLFTextParagraph textPara : txtShape.getTextParagraphs()) {
                        List<XSLFTextRun> textRunList = textPara.getTextRuns();
                        for (XSLFTextRun textRun : textRunList) {
                            textRun.setFontFamily("宋体");
                        }
                    }
                }
            }
            img = new BufferedImage((int) Math.ceil(pgSize.width * zoom), (int) Math.ceil(pgSize.height * zoom), BufferedImage.TYPE_INT_RGB);
            Graphics2D graphics = img.createGraphics();
            graphics.setTransform(at);
            graphics.setPaint(Color.white);
            graphics.fill(new Rectangle2D.Float(0, 0, pgSize.width, pgSize.height));
            slide[i].draw(graphics);
            graphics.dispose();

            // Uncomment to dump each slide image for debugging:
            // FileOutputStream out = new FileOutputStream("src/main/resources/test" + i + ".jpg");
            // javax.imageio.ImageIO.write(img, "jpg", out);

            slideImage = Image.getInstance(img, null);
            table.addCell(new PdfPCell(slideImage, true));
        }
    } else {
        inputStream.close();
        throw new IllegalArgumentException("Unsupported file type: " + fileType);
    }
    pdfDocument.add(table);
    pdfDocument.close(); // closing the document also closes the PdfWriter
    System.out.println("PowerPoint file converted to PDF successfully");
}
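
The conversion method above expects `fileType` to be the extension with its leading dot (`.ppt` or `.pptx`). One way to derive that value from the source path is sketched below. The class `FileTypes` and the method `fileTypeOf` are hypothetical names introduced for this example and are not part of the original code.

```java
import java.util.Locale;

public class FileTypes {
    // Hypothetical helper: returns the lower-cased extension including the dot,
    // or an empty string when the file name has no extension.
    static String fileTypeOf(String path) {
        int separator = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        if (dot <= separator) {
            // No dot at all, or the last dot belongs to a directory name
            return "";
        }
        return path.substring(dot).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String source = "slides/Report.PPTX";
        System.out.println(fileTypeOf(source));
        // Intended use with the conversion method:
        // convertPptToPdf(source, "slides/Report.pdf", fileTypeOf(source));
    }
}
```

Deriving the type from the path keeps the two arguments from drifting apart, for example when a .pptx file is accidentally passed with `.ppt`.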

Maven configuration:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <!-- <version>3.13</version> -->
    <version>3.9</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <!-- <version>3.10-final</version> -->
    <version>3.9</version>
</dependency>

<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>itextpdf</artifactId>
    <version>5.5.7</version>
</dependency>

<dependency>
    <groupId>com.itextpdf.tool</groupId>
    <artifactId>xmlworker</artifactId>
    <version>5.5.7</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <!-- <version>3.12</version> -->
    <version>3.9</version>
</dependency>

That is the fix I wanted to share for garbled Chinese text when converting PPT to PDF in Java. I hope it helps.
