pdf2htmlEX



What is pdf2htmlEX?

pdf2htmlEX is software that converts PDF files to HTML files.

License

GPLv3

ChangeLog

Bugs

Installation

See https://github.com/coolwanglu/pdf2htmlEX/wiki/Building for build instructions.

Windows

MXE (M cross environment)

pdf2htmlEX Windows Version

MinGW

Install Poppler as described in "Poppler - TeX Wiki". (If you are going to build pdf2htmlEX, build Poppler by generating its Makefile with configure and then install it.)

Install pango.

$ curl --insecure -R -L -O https://download.gnome.org/sources/pango/1.36/pango-1.36.8.tar.xz
$ tar xvf pango-1.36.8.tar.xz
$ pushd pango-1.36.8
$ ./configure --prefix=/mingw
$ make
$ make install
$ popd

Install PlibC.

$ curl -R -L -O http://download.sourceforge.net/plibc/plibc-0.1.7-src.tar.gz
$ tar xvf plibc-0.1.7-src.tar.gz
$ pushd PlibC-0.1.7
$ ./configure --prefix=/mingw
$ make
$ make install
$ popd
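
The pango and PlibC installs above follow the same pattern: download, unpack, configure with a prefix, make, make install. The following is a minimal sketch of that pattern as a shell function. The name build_from_tarball and the RUN variable are hypothetical, introduced only for illustration, and a subshell with cd is used in place of pushd/popd.

```sh
# Hypothetical helper: fetch a source tarball and install it under a prefix.
# Set RUN=echo to print the commands instead of executing them.
build_from_tarball() {
  url=$1                # download URL of the tarball
  srcdir=$2             # directory name the tarball unpacks to
  prefix=${3:-/mingw}   # install prefix, defaulting to the one used above
  tarball=${url##*/}    # file name is the last component of the URL
  ${RUN:-} curl -R -L -O "$url" &&
  ${RUN:-} tar xvf "$tarball" &&
  (
    # The subshell keeps the caller's working directory unchanged.
    ${RUN:-} cd "$srcdir" &&
    ${RUN:-} ./configure --prefix="$prefix" &&
    ${RUN:-} make &&
    ${RUN:-} make install
  )
}

# Example:
#   build_from_tarball http://download.sourceforge.net/plibc/plibc-0.1.7-src.tar.gz PlibC-0.1.7
```

Running it with RUN=echo first shows the commands that would be executed, which can be useful when a tarball unpacks to an unexpected directory name.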

Install FontForge.

$ curl --insecure -R -L -o fontforge-fontforge.tar.gz https://github.com/fontforge/fontforge/tarball/master
$ tar xvf fontforge-fontforge.tar.gz
$ pushd fontforge-fontforge*
$ ./autogen.sh
Preparing the fontforge build system...please wait

Found GNU Autoconf version 2.68
Found GNU Automake version 1.12.4
Found GNU Libtool version 2.4

Automatically preparing build ... done

The fontforge build system is now prepared.  To build here, run:
  ./configure
  make

$ ./configure --prefix=/mingw

Summary of optional features:

  real (floating pt) double
  programs           yes
  native scripting   yes
  python scripting   no
  python extension   no
  freetype debugger  no
  capslock for alt   no
  raw points mode    no
  tile path          no
  gb12345 encoding   no

Summary of optional dependencies:

  cairo              yes        http://www.cairographics.org/
  giflib             no         http://giflib.sourceforge.net/
  libjpeg            yes        http://en.wikipedia.org/wiki/Libjpeg
  libpng             yes        http://www.libpng.org/
  libtiff            yes        http://en.wikipedia.org/wiki/Libtiff
  libxml             yes        http://www.xmlsoft.org/
  libspiro           no         http://libspiro.sourceforge.net/
  libuninameslist    no         https://github.com/fontforge/libuninameslist
  libunicodenames    no         https://bitbucket.org/sortsmill/libunicodenames
  zeromq             no         http://www.zeromq.org/
  libreadline        no         http://www.gnu.org/software/readline
  X Window System    no

$ make
$ make install
$ popd

Install ttfautohint.

$ curl -R -L -O http://download.sourceforge.net/freetype/ttfautohint-1.3.tar.gz
$ tar xvf ttfautohint-1.3.tar.gz
$ pushd ttfautohint-1.3
$ ./configure --prefix=/mingw
$ make
../libtool: line 6013: cd: C:QtQt5.9.15.9.1mingw53_32lib: No such file or directory
libtool: link: cannot determine absolute directory name of `C:QtQt5.9.15.9.1mingw53_32lib'

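If the error above occurs, edit ttfautohint-1.3/frontend/Makefile and change the QT_LIBS line so that the -L path uses forward slashes instead of backslashes. The original line is shown commented out, followed by the corrected line:

#QT_LIBS = -lglu32 -lopengl32 -lgdi32 -luser32 -lmingw32 -lqtmain -LC:\Qt\Qt5.9.1\5.9.1\mingw53_32\lib -lQt5Gui -lQt5Core -lQt5Widgets
QT_LIBS = -lglu32 -lopengl32 -lgdi32 -luser32 -lmingw32 -lqtmain -LC:/Qt/Qt5.9.1/5.9.1/mingw53_32/lib -lQt5Gui -lQt5Core -lQt5Widgets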
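
The backslash-to-slash substitution in QT_LIBS can also be scripted. The following is a minimal sketch, assuming a POSIX sed is available in the MSYS shell. The function name fix_qt_libs is hypothetical and not part of ttfautohint.

```sh
# Hypothetical helper: rewrite backslashes as forward slashes,
# but only on lines that start with QT_LIBS. Reads stdin, writes stdout.
fix_qt_libs() {
  sed '/^QT_LIBS/ s|\\|/|g'
}

# Example (writes a corrected copy next to the original):
#   fix_qt_libs < frontend/Makefile > frontend/Makefile.new
```

Writing to a separate file first makes it easy to diff the result against the original before replacing it.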

$ make install
$ popd

Install pdf2htmlEX.

$ curl --insecure -R -L -o coolwanglu-pdf2htmlEX.tar.gz https://github.com/coolwanglu/pdf2htmlEX/tarball/master
$ tar xvf coolwanglu-pdf2htmlEX.tar.gz
$ cd coolwanglu-pdf2htmlEX*

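Add the following to src/pdf2htmlEX.cc:

#if defined(_WIN32)
#include <windows.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
/* Minimal mkdtemp() substitute for Windows.
   Caution: the result is written back into tempbuf, so the caller's buffer
   must be large enough to hold the Windows temp path plus the base name. */
char *mkdtemp(char *tempbuf) {
  int rand_value = 0;
  char* tempbase = NULL;
  char tempbasebuf[MAX_PATH] = "";
  size_t len = strlen(tempbuf);

  /* The template must end in "XXXXXX". */
  if (len < 6 || strcmp(&tempbuf[len-6], "XXXXXX")) {
    errno = EINVAL;
    return NULL;
  }

  srand((unsigned)time(0));
  rand_value = (int)((rand() / ((double)RAND_MAX+1.0)) * 1e6);
  /* Keep only the last path component of the template. */
  tempbase = strrchr(tempbuf, '/');
  tempbase = tempbase ? tempbase+1 : tempbuf;
  strcpy(tempbasebuf, tempbase);
  /* Replace the six X characters with a zero-padded six-digit number. */
  sprintf(&tempbasebuf[strlen(tempbasebuf)-6], "%06d", rand_value);
  ::GetTempPath(MAX_PATH, tempbuf);
  strcat(tempbuf, tempbasebuf);
  /* Report failure instead of returning a path that was never created. */
  if (!::CreateDirectory(tempbuf, NULL)) {
    return NULL;
  }
  return tempbuf;
}
#endif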

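Add the following to src/util/path.cc:

#if defined(_WIN32)
#include <windows.h>
#include <errno.h>
#include <sys/types.h>
/* mkdir() substitute for Windows; the mode argument is ignored. */
int mkdir(const char *pathname, mode_t mode) {
  DWORD attr = ::GetFileAttributes(pathname);
  /* Test the directory bit, since other attribute bits may also be set. */
  if (attr != INVALID_FILE_ATTRIBUTES && (attr & FILE_ATTRIBUTE_DIRECTORY)) {
    errno = EEXIST;
    return -1;
  }
  return ::CreateDirectory(pathname, NULL) ? 0 : -1;
}
#endif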

Add the following to src/util/ffw.c:

#if defined(_WIN32)
#undef printf
#undef vfprintf
#endif
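
The CMake configuration step looks up poppler>=0.20.0 and libfontforge>=2.0.0 through pkg-config, so it may save time to confirm beforehand that both are visible. The following is a minimal sketch using pkg-config's --atleast-version option. The function name have_at_least and the PKG_CONFIG override variable are hypothetical.

```sh
# Hypothetical helper: succeed only if pkg-config reports that the module
# is installed with at least the given version.
# PKG_CONFIG may be set to point at a specific pkg-config executable.
have_at_least() {
  module=$1
  version=$2
  "${PKG_CONFIG:-pkg-config}" --atleast-version="$version" "$module"
}

# Example:
#   have_at_least poppler 0.20.0 && have_at_least libfontforge 2.0.0 ||
#     echo "a dependency is missing or too old"
```

If a check fails even though the library is installed, the .pc file is probably outside pkg-config's search path.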

$ mkdir build
$ cd build
$ cmake .. -G "MSYS Makefiles" -DCMAKE_INSTALL_PREFIX=/mingw -DENABLE_SVG=ON
-- The C compiler identification is GNU 4.9.3
-- The CXX compiler identification is GNU 4.9.3
-- Check for working C compiler: C:/MinGW/bin/gcc.exe
-- Check for working C compiler: C:/MinGW/bin/gcc.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: C:/MinGW/bin/g++.exe
-- Check for working CXX compiler: C:/MinGW/bin/g++.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found PkgConfig: C:/MinGW/bin/pkg-config.exe (found version "0.28")
-- checking for module 'poppler>=0.20.0'
--   found poppler, version 0.42.0
-- checking for module 'libfontforge>=2.0.0'
--   found libfontforge, version 2.0.0
-- Performing Test CXX0X_SUPPORT
-- Performing Test CXX0X_SUPPORT - Success
-- Configuring done
-- Generating done
-- Build files have been written to: C:/coolwanglu-pdf2htmlEX-???????/build
$ make
[  3%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/pdf2htmlEX.cc.obj
[  7%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/draw.cc.obj
[ 11%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/general.cc.obj
[ 15%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/image.cc.obj
[ 19%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/font.cc.obj
[ 23%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/link.cc.obj
[ 26%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/outline.cc.obj
[ 30%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/state.cc.obj
[ 34%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLRenderer/text.cc.obj
[ 38%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/BackgroundRenderer/SplashBackgroundRenderer.cc.obj
[ 42%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/BackgroundRenderer/CairoBackgroundRenderer.cc.obj
[ 46%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/util/const.cc.obj
[ 50%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/util/encoding.cc.obj
[ 53%] Building C object CMakeFiles/pdf2htmlEX.dir/src/util/ffw.c.obj
[ 57%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/util/math.cc.obj
[ 61%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/util/misc.cc.obj
[ 65%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/util/path.cc.obj
[ 69%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/util/unicode.cc.obj
[ 73%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/ArgParser.cc.obj
[ 76%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/Base64Stream.cc.obj
[ 80%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/Color.cc.obj
[ 84%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLTextLine.cc.obj
[ 88%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/HTMLTextPage.cc.obj
[ 92%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/Preprocessor.cc.obj
[ 96%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/StringFormatter.cc.obj
[100%] Building CXX object CMakeFiles/pdf2htmlEX.dir/src/TmpFiles.cc.obj
Linking CXX executable pdf2htmlEX.exe
[100%] Built target pdf2htmlEX
collect2.exe: error: ld returned 1 exit status

If the error above occurs, find the following line in CMakeFiles/pdf2htmlEX.dir/build.make:

	/C/MinGW/bin/g++.exe   -Wall --std=c++0x -O2 -DNDEBUG    -Wl,--whole-archive CMakeFiles/pdf2htmlEX.dir/objects.a -Wl,--no-whole-archive  -o pdf2htmlEX.exe -Wl,--out-implib,libpdf2htmlEX.dll.a -Wl,--major-image-version,0,--minor-image-version,0  -L/C/MinGW/lib -lpoppler -lfontforge -lgunicode -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32

and edit it so that the link succeeds. Appending

-liconv -lintl -lz -lltdl -ljpeg -lxml2 -lgutils -lpng16 -ltiff -lglib-2.0 -lgio-2.0

to the end of the line made the link succeed. The edited line, which is a single line in the file, reads:

	/C/MinGW/bin/g++.exe   -Wall --std=c++0x -O2 -DNDEBUG    -Wl,--whole-archive CMakeFiles/pdf2htmlEX.dir/objects.a -Wl,--no-whole-archive  -o pdf2htmlEX.exe -Wl,--out-implib,libpdf2htmlEX.dll.a -Wl,--major-image-version,0,--minor-image-version,0  -L/C/MinGW/lib -lpoppler -lfontforge -lgunicode -lkernel32 -luser32 -lgdi32 -lwinspool -lshell32 -lole32 -loleaut32 -luuid -lcomdlg32 -ladvapi32 -liconv -lintl -lz -lltdl -ljpeg -lxml2 -lgutils -lpng16 -ltiff -lglib-2.0 -lgio-2.0
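
Editing the generated build.make by hand has to be repeated whenever CMake regenerates it, so the append step can be scripted. The following is a minimal sketch, assuming the link command is the only line in build.make that contains "-o pdf2htmlEX.exe". The names EXTRA_LIBS and append_link_libs are hypothetical.

```sh
# Extra libraries, copied from the list given in the text.
EXTRA_LIBS='-liconv -lintl -lz -lltdl -ljpeg -lxml2 -lgutils -lpng16 -ltiff -lglib-2.0 -lgio-2.0'

# Hypothetical helper: append EXTRA_LIBS to every line that links
# pdf2htmlEX.exe. Reads stdin, writes stdout; other lines pass through.
append_link_libs() {
  sed "/-o pdf2htmlEX\\.exe/ s|\$| $EXTRA_LIBS|"
}

# Example (writes a patched copy for inspection):
#   append_link_libs < CMakeFiles/pdf2htmlEX.dir/build.make > build.make.new
```

The filter is not idempotent: applying it twice appends the libraries twice, so it should be run against a freshly generated file.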
$ make install
[100%] Built target pdf2htmlEX
Install the project...
-- Install configuration: "Release"
-- Installing: C:/MinGW/bin/pdf2htmlEX.exe
-- Installing: C:/MinGW/share/pdf2htmlEX/base.css
-- Installing: C:/MinGW/share/pdf2htmlEX/fancy.css
-- Installing: C:/MinGW/share/pdf2htmlEX/jquery.js
-- Installing: C:/MinGW/share/pdf2htmlEX/pdf2htmlEX.js
-- Installing: C:/MinGW/share/pdf2htmlEX/manifest
-- Installing: C:/MinGW/share/man/man1/pdf2htmlEX.1
$ pdf2htmlEX --version
pdf2htmlEX version 0.9

pdf2htmlEX also runs on Windows once the code has been patched appropriately.
If running the program results in a Segmentation fault, debug it with gdb.
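
One way to get a backtrace for such a crash is to run the program under gdb in batch mode. The following is a minimal sketch, assuming gdb is on the PATH. The function name backtrace_of and the GDB override variable are hypothetical, and input.pdf is a placeholder file name.

```sh
# Hypothetical helper: run a command under gdb in batch mode and print a
# backtrace when it stops. GDB may be set to override the debugger executable.
backtrace_of() {
  "${GDB:-gdb}" -batch -ex run -ex bt --args "$@"
}

# Example:
#   backtrace_of pdf2htmlEX input.pdf
```

The backtrace is more readable when pdf2htmlEX and its libraries are built with debug information.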

Cygwin

pdf2htmlEX appears to work without problems on Cygwin.

macOS

Homebrew

$ brew install pdf2htmlex

MacPorts

Linux

Arch Linux

Linux Mint

$ sudo apt install software-properties-common
$ sudo apt-add-repository ppa:coolwanglu/pdf2htmlex
$ sudo apt update
$ sudo apt install pdf2htmlex

CentOS

Usage

Usage: pdf2htmlEX [Options] <input.pdf> [<output.html>]
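
As an illustration of the usage line above, the following is a minimal sketch that converts every PDF in a directory, passing NAME.html as the output name for each NAME.pdf. The function name convert_all and the PDF2HTMLEX override variable are hypothetical, and no options are used beyond the two positional arguments shown in the usage line.

```sh
# Hypothetical helper: run pdf2htmlEX on every *.pdf file in a directory.
# PDF2HTMLEX may be set to override the converter executable.
convert_all() {
  dir=$1
  for pdf in "$dir"/*.pdf; do
    # When nothing matches, the glob stays literal; skip it.
    [ -e "$pdf" ] || continue
    "${PDF2HTMLEX:-pdf2htmlEX}" "$pdf" "$(basename "$pdf" .pdf).html"
  done
}

# Example:
#   convert_all ./docs
```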

Related links


Last-modified: 2018-03-11 (Sun) 13:48:15