Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.
Pandoc can read markdown and (subsets of) reStructuredText, HTML, and LaTeX, and it can write markdown, reStructuredText, DocBook XML, groff man, HTML, LaTeX, ConTeXt, RTF, and S5 HTML slide shows.
Key features of Pandoc:
· Modular design, using separate writers and readers for each supported format.
· A real markdown parser, not based on regex substitutions. More accurate and much faster than Markdown.pl.
· Also parses (subsets of) reStructuredText, LaTeX, and HTML.
· Multiple output formats: HTML, DocBook XML, LaTeX, ConTeXt, reStructuredText, Markdown, RTF, groff man pages, S5 slide shows.
· Unicode support.
· Optional "smart" quotes, dashes, and ellipses.
· Automatically generated tables of contents.
· Support for displaying math in HTML.
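To illustrate the modular reader/writer design mentioned above, here is a minimal sketch in Python. It is not Pandoc's code or API: the block model, the function names (read_markdown, write_html, write_latex, convert) and the two registries are hypothetical, and the reader handles only a tiny markdown subset. The point is that every reader produces the same intermediate structure, so any writer can consume it.

```python
import re
from html import escape

# Hypothetical, heavily simplified document model: a list of blocks,
# each either ("header", level, text) or ("para", text).

def read_markdown(src):
    """Reader: parse ATX headers and blank-line-separated paragraphs."""
    blocks = []
    for chunk in src.strip().split("\n\n"):
        text = " ".join(line.strip() for line in chunk.splitlines())
        if text.startswith("#"):
            level = len(text) - len(text.lstrip("#"))
            blocks.append(("header", level, text[level:].strip()))
        else:
            blocks.append(("para", text))
    return blocks

def write_html(blocks):
    """Writer: render the block list as HTML."""
    out = []
    for block in blocks:
        if block[0] == "header":
            _, level, text = block
            out.append(f"<h{level}>{escape(text)}</h{level}>")
        else:
            out.append(f"<p>{escape(block[1])}</p>")
    return "\n".join(out)

SECTION_COMMANDS = {1: "section", 2: "subsection", 3: "subsubsection"}

def escape_latex(text):
    # Minimal escaping only; backslash, ~ and ^ are not handled here.
    return re.sub(r"([&%$#_{}])", r"\\\1", text)

def write_latex(blocks):
    """Writer: render the same block list as LaTeX."""
    out = []
    for block in blocks:
        if block[0] == "header":
            _, level, text = block
            cmd = SECTION_COMMANDS.get(level, "paragraph")
            out.append(f"\\{cmd}{{{escape_latex(text)}}}")
        else:
            out.append(escape_latex(block[1]))
    return "\n\n".join(out)

# Adding a format means registering one more reader or writer.
READERS = {"markdown": read_markdown}
WRITERS = {"html": write_html, "latex": write_latex}

def convert(src, from_fmt, to_fmt):
    return WRITERS[to_fmt](READERS[from_fmt](src))
```

With this layout, supporting N input formats and M output formats takes N readers and M writers rather than N times M dedicated converters.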
Extensions to markdown syntax:
· Document metadata (title, author, date).
· Footnotes, tables, and definition lists.
· Superscripts, subscripts, and strikeout.
· Inline LaTeX math and LaTeX commands.
· Markdown inside HTML blocks.
· Enhanced ordered lists: start number and numbering style are significant.
· Compatibility mode to turn off syntax extensions and emulate Markdown.pl.
Convenient wrapper scripts:
· html2markdown makes it easy to produce a markdown version of any web page.
· markdown2pdf converts markdown to PDF in one step.
· hsmarkdown is a drop-in replacement for Markdown.pl.
· Multi-platform: runs on Windows, Mac OS X, Linux, Unix.
Requirements:
· GHC
What's New in This Release:
· Added DocBook reader (with contributions from Mauro Bieg).
· Fixed bug in fromEntities. The previous version would turn "hi & low you know;" into "hi &".
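The fromEntities fix concerns text where an ampersand and a later semicolon are unrelated. The following Python sketch is not Pandoc's implementation; the function name decode_entities and the regular expression are assumptions made for illustration. It shows one way to avoid that class of bug: only treat "&...;" as an entity when the part in between is a numeric code or a known name with no spaces.

```python
import re
from html.entities import name2codepoint

# "&", then a decimal code, a hex code, or an alphanumeric name, then ";".
# Spaces are not allowed, so "& low you know;" never matches.
ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

def decode_entities(text):
    def replace(match):
        body = match.group(1)
        try:
            if body.startswith(("#x", "#X")):
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            return match.group(0)  # code point out of range: leave as-is
        if body in name2codepoint:
            return chr(name2codepoint[body])
        return match.group(0)  # unknown name: leave as-is
    return ENTITY.sub(replace, text)
```

Input that merely resembles an entity is passed through unchanged, so no text between a stray "&" and a later ";" is lost.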
HTML reader:
· Don’t skip nonbreaking spaces. Previously a paragraph containing just a nonbreaking space would be rendered as an empty paragraph. Thanks to Paul Vorbach for pointing out the bug.
· Support and in tables. Closes #486.
Markdown reader:
· Don’t recognize references inside delimited code blocks.
· Allow list items to begin with lists.
LaTeX reader:
· Handle \bgroup, \egroup, \begingroup, \endgroup.
· Control sequences can’t be followed by a letter. This fixes a bug where \begingroup was parsed as \begin followed by group.
· Parse ‘dimension’ arguments to unknown commands, e.g. \parindent0pt.
· Make \label and \ref sensitive to --parse-raw. If --parse-raw is selected, these will be parsed as raw LaTeX inlines, rather than bracketed text.
· Don’t crash on unknown block commands (like \vspace{10pt}) inside...
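The control-sequence fix in the LaTeX reader notes above comes down to a "longest run of letters" rule. Here is a small Python sketch of that rule; it is an illustration, not Pandoc's parser, and the name control_sequences is hypothetical. A control word is a backslash followed by as many letters as possible, so \begingroup can never be split into \begin plus "group". A control symbol is a backslash followed by a single non-letter.

```python
import re

# Control word: backslash + longest run of letters.
# Control symbol: backslash + exactly one non-letter character.
CONTROL_SEQ = re.compile(r"\\([A-Za-z]+|[^A-Za-z])")

def control_sequences(src):
    """Return every control sequence found in src, in order."""
    return [m.group(0) for m in CONTROL_SEQ.finditer(src)]
```

Because the letter run stops at the first non-letter, the same rule leaves the "0pt" in \parindent0pt available to be read as a dimension argument.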