New in docx2txt 1.4 (May 16, 2014)
- New feature:
- Added configuration variable config_unzip_opts. This removes the dependency on the unzip program and allows users to use other unzipping programs such as 7z, pkzipc, and winzip.
- Updates:
- Fixed list numbering.
- Improved list/paragraph indentation and corresponding code.
- Updated README with brief guidance on how this utility can be used to recover text from a corrupted docx file.
New in docx2txt 1.3 (Apr 8, 2014)
- New features:
- Added support for handling lists (bullet, decimal, letter, roman) along with (attempt at) indentation.
- Updates:
- Added configuration variable config_twipsPerChar.
- Removed configuration variables: config_listIndent, config_exp_extra_deEscape.
- Text output omits deleted text. This matters when changes are being tracked in the docx document.
- Text output omits non-document_text content marked by wp/wp14 tags.
New in docx2txt 1.2 (Jan 16, 2012)
- New features:
- Perl script usage is extended to accept a docx file from standard input. It also works with input/output redirection now. Please refer to the documentation for more information.
- Script files and the configuration file can be installed in separate directories on (non-Windows) systems using the Makefile for installation.
- The Linux Makefile also attempts to set the system configuration directory in the installed Perl script to the desired directory.
- User-specific and system-wide configuration files can be maintained separately, even on Windows.
- Updates:
- "-h" has to be given as the first argument to Perl script to get usage help.
- Added new configuration variable "config_tempDir".
- The configuration file is now looked for uniformly, in this order: the current directory, the user configuration directory (APPDATA on Windows, HOME on non-Windows), and the system configuration directory (same location as the script files on Windows; /etc or as set during installation on non-Windows systems).
- Documentation has been updated with usage examples and information on how the text content of a .docx file can be viewed directly in the Vim and Emacs editors.
- Improved handling of special (non-text) characters, along with support for more non-text characters like fractions.
- Fixed Bug #3463033: added ' and " to docx-specific escape character conversions.
- Fixed the incorrect code that had been committed during the earlier nullDevice fix for Cygwin.
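
The configuration lookup order introduced in 1.2 can be sketched as follows. This is only an illustration in Python, not the Perl script's actual code: the file name docx2txt.config, the function names, and the /etc default are assumptions made for the example.

```python
import os

# Assumed configuration file name, used here for illustration only.
CONFIG_NAME = "docx2txt.config"


def config_search_dirs(is_windows, env, script_dir, system_dir="/etc"):
    """Return candidate directories in the order described in the 1.2 notes.

    1. current directory
    2. user configuration directory (APPDATA on Windows, HOME elsewhere)
    3. system configuration directory (script location on Windows,
       /etc or the directory chosen at install time elsewhere)
    """
    user_dir = env.get("APPDATA" if is_windows else "HOME")
    sys_dir = script_dir if is_windows else system_dir
    # Skip entries that are unset (for example, HOME missing from env).
    return [d for d in (".", user_dir, sys_dir) if d]


def find_config(dirs, exists=os.path.isfile):
    """Return the first existing configuration file path, or None."""
    for d in dirs:
        candidate = os.path.join(d, CONFIG_NAME)
        if exists(candidate):
            return candidate
    return None
```

With this sketch, a file in the current directory takes precedence over a user-level file, which in turn takes precedence over the system-wide one.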
New in docx2txt 1.1 (Dec 12, 2011)
- New features:
- Added a check for existence of unzip command.
- Configuration file is looked for in HOME directory as well.
- Updates:
- Configuration variables now begin with config_ .
- Fixed bugs #3003903, #3082018 and #3082035.
- Fixed nulldevice for Cygwin.
- Superscripted cross-references are placed within now.
New in docx2txt 1.0.0 (Oct 6, 2009)
- New features:
- Input argument can also be a directory holding the unzipped content of .docx file.
- Windows wrapper script, and support for using CakeCmd command line unzipper.
- Configuration file support for easy control over settings.
- Windows installation script.
- Updates:
- Hyperlink is not displayed if the hyperlink and the hyperlinked text are the same, even though the user has enabled hyperlink display.
- Improved handling of short line justification, capturing many cases that were missed by the earlier approach.
- Path names containing spaces are now handled.
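
The hyperlink display rule from the 1.0.0 updates can be illustrated with a short Python sketch. The function name, the flag name, and the bracketed output format are hypothetical and chosen only for this example; the actual script may format links differently.

```python
def render_link(text, url, show_hyperlink=True):
    """Return the text to emit for a hyperlinked run.

    The URL is appended only when hyperlink display is enabled and the
    URL differs from the visible text, so identical pairs are not repeated.
    """
    if not show_hyperlink or text == url:
        return text
    # Hypothetical output format, for illustration only.
    return f"{text} [HYPERLINK: {url}]"
```

For example, a link whose visible text is already the URL is emitted once, while a link with distinct text carries the URL after it.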
New in docx2txt 0.4 (Sep 7, 2009)
- user can control display of hyperlink along with linked text.
- TOC related cleanup. TOC was not addressed so far.
- many new character conversions (check the script code for details).
- character conversion mappings are now organised in a tabular form.
- currency characters are converted to their respective full currency names.
- code tweaks to speed up the conversion process.
New in docx2txt 0.3 (Aug 22, 2009)
- docx2txt.pl invocation has been changed a little.
- user involvement during installation is reduced.
- some suggestions on how Windows users can use this tool.