CCExtractor Changelog

New in CCExtractor 0.69 (Jun 3, 2014)

  • A few patches from Christopher Small, including proper support for multiple multicast clients listening on the same port.
  • GUI: Fixed teletext preview.
  • GUI: Added a small indicator of data being received when reading from UDP.
  • GUI: Added UTF-8 support to preview Window (used for teletext).
  • Fixes in Makefile and build script: compilation on Linux and OS X failed if another libpng was found on the system.
  • WTV support directly in CCExtractor (no need for wtvccdump any more).
  • Started refactoring and clean-up.
  • Fix: MPEG clock rollover (happens roughly every 26 hours) caused a time discontinuity.
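
The rollover mentioned in the last entry comes from the MPEG presentation timestamp being a 33-bit counter that ticks at 90 kHz, so it wraps after about 26.5 hours. The following is a minimal sketch of one common way to hide the wrap from later stages; it is not CCExtractor's actual code, and the name unwrap_pts is hypothetical.

```python
PTS_BITS = 33                 # MPEG PTS is a 33-bit counter
PTS_WRAP = 1 << PTS_BITS      # value at which the counter wraps to 0
PTS_HZ = 90_000               # PTS ticks per second


def unwrap_pts(prev_unwrapped, raw_pts):
    """Extend a raw 33-bit PTS so that it stays close to the previous value.

    prev_unwrapped is the last extended timestamp (or None for the first one).
    The raw value is interpreted as the candidate nearest to prev_unwrapped,
    so a wrap shows up as a small forward step instead of a huge jump back.
    """
    if prev_unwrapped is None:
        return raw_pts
    # Difference modulo the wrap value, mapped to a signed range.
    delta = (raw_pts - prev_unwrapped) % PTS_WRAP
    if delta >= PTS_WRAP // 2:
        delta -= PTS_WRAP
    return prev_unwrapped + delta


# Hours between two rollovers of the counter.
ROLLOVER_HOURS = PTS_WRAP / PTS_HZ / 3600
```

With this approach a timestamp of 50 that follows PTS_WRAP - 100 is treated as 150 ticks later, not as a jump back of 26 hours.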

New in CCExtractor 0.67 (Oct 10, 2013)

  • Padding bytes were being discarded early in the process in 0.66, which is convenient for debugging, but it messes with timing in .raw, which depends on padding. Fixed.
  • MythTV's branch had a fixed-size buffer that was sometimes too small. Made dynamic.
  • Better support for PAT changing mid stream.
  • Removed quotes in Start in .smi (format fix).
  • Added multicast support (Chris Small)
  • Added ability to select IP address to bind in UDP (Chris Small)
  • Fixes in -unixts and -delay for teletext.
  • Added -autodash : When two people are talking, add a dash as needed (this is based on subtitle position). Only in .srt and with -trim. Quite experimental, feedback appreciated.
  • Added -latin1 to select Latin 1 as encoding. Default is now UTF-8 (-utf8 still exists but it's not needed).
  • Added -ru1, which emulates a (non-existing in real life) 1 line roll-up mode.

New in CCExtractor 0.66 (Jul 5, 2013)

  • Fixed bug in auto detection code that triggered a message about the file being out of sync.
  • Added -investigate_packets. The PMT is used to select the most promising elementary stream to get captions from. Sometimes captions are where you least expect them, so -datapid allows you to select an elementary stream manually, in case the CC location is not obvious from the PMT contents. To assist in looking for the right stream, the parameter "-investigate_packets" will have CCExtractor look inside each stream, looking for CC markers, and report the streams that are likely to contain CC data even if it can't be determined from their PMT entry.
  • Added -datastreamtype to manually select a stream based on its type instead of its PID. Useful if your recording program always hides the captions under the same stream type.
  • Added -streamtype so if an elementary stream is selected manually for processing the streamtype can be selected too. This can be needed if you process, for example, a stream that is declared as "private MPEG" in the PMT, so CCExtractor can't tell what it is. Usually you'll want -streamtype 2 (MPEG video) or -streamtype 6 (MPEG private data).
  • PMT content listing improved; it now shows the stream type for more types.
  • Fixes in roll-up: the cursor was being moved to column 1 if a RU2, RU3 or RU4 was received even if already in roll-up mode.
  • Added -autoprogram. If a multiprogram TS is processed and -autoprogram is used, CCExtractor will analyze all PMTs and use the first program that has a suitable data stream.
  • Timed transcript (ttxt) now also exports the caption mode (roll-up, paint-on, etc.) next to each line, as it's useful to detect things like commercials.
  • Content Advisory information from XDS is now decoded if it's transmitted in "US TV parental guidelines" or "MPA". Other encodings, such as Canada's, are not supported yet due to lack of samples.
  • Copy Management information from XDS is now decoded.
  • Added -xds. If present and export format is timed transcript (only), XDS information will be saved to file (same file as the transcript, with XDS being clearly marked). Note that for now all XDS data is exported even if it doesn't change, so the transcript file will be significantly larger.
  • Added some PaintOn support, at least enough to prevent it from breaking things when the other modes are used.
  • Removed afd_data() warning. AFD doesn't carry any caption-related data. AFD is still detected in code in case we want to do something with it later.
  • Ported latest changes from Petr Kutalek's telxcc. Current version is 2.4.4.
  • In teletext mode when exporting to transcript (not .srt), an effort is made to detect and merge line duplicates. This is done by using the Levenshtein distance, which is the number of changes required to convert one string to another. To simplify things, strings are compared up to the length of the shortest one. There are 3 parameters that can be used to tweak the thresholds:
      • -deblev: Enable debug so the calculated distance for each pair of strings is displayed. The output includes both strings, the calculated distance, the maximum allowed distance, and whether the strings are ultimately considered equivalent or not, i.e. whether the calculated distance is less than or equal to the maximum allowed.
      • -levdistmincnt value: Minimum distance we always allow regardless of the length of the strings. Default 2. This means that if the calculated distance is 0, 1 or 2, we consider the strings to be equivalent.
      • -levdistmaxpct value: Maximum distance we allow, as a percentage of the shortest string length. Default 10%. For example, consider a comparison of one string of 30 characters and one of 60 characters. We want to determine whether the first 30 characters of the longer string are more or less the same as the shorter string, i.e. whether the longer string is the shorter one plus new characters and maybe some corrections. Since the shorter string is 30 characters and the default percentage is 10%, we would allow a distance of up to 3 between the first 30 characters.
  • Added -lf : Use UNIX line terminator (LF) instead of Windows (CRLF).
  • Added -noautotimeref: Prevent UTC reference from being auto set from the stream data.
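
The duplicate detection described for teletext transcripts in this release can be sketched as follows. This is an illustration under stated assumptions, not the code CCExtractor uses: the function names are hypothetical, and combining the two thresholds by taking the larger one (with the percentage rounded down) is an assumption inferred from the description of -levdistmincnt and -levdistmaxpct.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def lines_equivalent(s1, s2, min_cnt=2, max_pct=10):
    """Return (equivalent, distance, allowed) for two transcript lines.

    Both strings are truncated to the length of the shorter one before
    comparing. min_cnt and max_pct mirror the documented defaults of
    -levdistmincnt and -levdistmaxpct; taking the larger of the two
    limits is an assumption.
    """
    n = min(len(s1), len(s2))
    dist = levenshtein(s1[:n], s2[:n])
    allowed = max(min_cnt, n * max_pct // 100)
    return dist <= allowed, dist, allowed
```

For a 30-character line compared with a 60-character line, the allowed distance works out to 3, which matches the example given for -levdistmaxpct.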

New in CCExtractor 0.60 (Jan 25, 2012)

  • Add: MP4 support, using GPAC (a media library).
  • Fix: The Windows version was writing text files with double \r.
  • Fix: Closed caption blocks with no data could cause a crash.
  • Fix: -noru (to generate files without duplicate lines in roll-up) was broken, with entire lines missing.
  • Fix: bin format not working as input.

New in CCExtractor 0.58 (Aug 22, 2011)

  • Implemented new PTS-based mode to order the caption information of AVC/H.264 data streams. The old pic_order_cnt_lsb-based method is still available via the -poc or --usepicorder command switches.
  • Removed a couple of those annoying "Impossible!" error messages that appear when processing some (possibly broken, unsure) files.
  • Added -nots / --notypesettings to prevent italics and underline codes from being displayed.
  • Note to those not liking the paragraph symbol being used for the music note: Submit a VALID replacement in latin-1.
  • Added preliminary support for multiple program TS files. The parameter --program-number (or -pn) will let you choose which program number to process. If no number is passed and the TS file contains more than one, CCExtractor will display a list of found programs and terminate.
  • Added support (basic, because I only received one sample) for some Hauppauge cards that save CC data in their own format. Use the parameter -haup to enable it (CCExtractor will display a notice if it thinks that it's processing a Hauppauge capture anyway).
  • Fixed bug in roll-up.
  • More AVC work, now TS files from echostar that provided garbled output are processed OK.
  • Updated Windows GUI.

New in CCExtractor 0.57 (Dec 16, 2010)

  • Bugfixes in the Windows version. Some debug code was unintentionally left in the released version.

New in CCExtractor 0.55 (Aug 10, 2009)

  • Replace pattern matching code with improved parser for MPEG-2 elementary streams.
  • Fix parsing of ReplayTV 5000 captions.
  • Add ability to decode SCTE 20 encoded captions.
  • Make decoding of TS files more error tolerant.
  • Start implementation of EIA-708 decoding (not active yet).
  • Add -gt / --goptime switch to use GOP timing instead of PTS timing.
  • Start implementation of AVC/H.264 decoding (not active yet).
  • Fixed: The basic problem is that when 24fps movie film gets converted to 30fps NTSC, every 4th frame is repeated. Some pics have 3 fields of CC data, where the field 3 CC data belongs to the same channel as field 1. The following pics have the fields reversed because of the odd number of fields. I used top_field_first to tell when the channels are reversed. See Table 6-1 of SCTE 20. [Paul Fernquist]
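
The field ordering described in the last entry can be illustrated with a small model of 3:2 pulldown. This is only a sketch of the general technique, not CCExtractor's code: the function names are hypothetical, and it assumes the common pattern in which every other coded picture carries a repeated first field.

```python
def pulldown_pictures(num_film_frames, first_top=True):
    """Model 3:2 pulldown as a list of (field_count, top_field_first).

    Assumption: pictures alternate between 3 fields (first field repeated)
    and 2 fields. A picture with an odd number of fields flips the field
    order of the pictures that follow it.
    """
    pictures = []
    tff = first_top
    for i in range(num_film_frames):
        field_count = 3 if i % 2 == 0 else 2
        pictures.append((field_count, tff))
        if field_count == 3:
            tff = not tff
    return pictures


def field_parities(field_count, top_field_first):
    """Return the parity of each field of one picture, in display order.

    In a 3-field picture the third field repeats the first, so it has the
    same parity (and its CC data belongs to the same channel) as field 1.
    """
    first, second = ("top", "bottom") if top_field_first else ("bottom", "top")
    return [first, second, first][:field_count]
```

Four film frames produce ten fields in this model, which is how 24 frames per second become roughly 60 fields per second, and the top_field_first value tells a decoder which parity the first field of each picture has.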

New in CCExtractor 0.54 (Apr 17, 2009)

  • Add -nosync and -fullbin switches for debugging purposes.
  • Remove -lg (--largegops) switch.
  • Improve synchronization of captions for source files with jumps in their time information or gaps in the caption information.
  • [R. Abarca] Changed Mac script, it now compiles/links everything from the /src directory.
  • It's now possible to have CCExtractor add credits automatically.
  • Added a feature to add start and end messages (for credits). See help screen for details.

New in CCExtractor 0.53 (Feb 24, 2009)

  • Force generated RCWT files to have the same length as source file.
  • Fix documentation for -startat / -endat switches.
  • Make -startat / -endat work with all output formats.
  • Fix sync check for raw/rcwt files.
  • Improve timing of dvr-ms NTSC captions.
  • Add -in=bin switch to read CCExtractor's own binary format.
  • Fix problem with short input files (smaller than 1 MB).
  • Clean up regular and debug output.
  • Add --no_progress_bar switch to help readability of redirected output.
  • Add -out=bin switch to write RCWT data.
  • Remove -bo/--bufferoutput switch and functionality.
  • [Volker] Added new generic binary format (RCWT for Raw Captions With Time). This new format allows one file to contain all the available closed caption data instead of just one stream.
  • Added --no_progress_bar to disable status information (mostly used when debugging, as the progress information is annoying in the middle of debug logs).
  • The Windows GUI was reported to freeze in some conditions. Fixed.
  • The Windows GUI now targets .NET 2.0 instead of 3.5. This allows Windows 2000 to run it (there is no .NET 3.5 for Windows 2000), as requested by a couple of key users.

New in CCExtractor 0.52 (Dec 22, 2008)

  • Removed -autopad and -goppad, no longer needed.
  • In preparation for a new binary format we have renamed the current .bin to .raw. Raw files have only CC data (with no header, timing, etc.).
  • The input file format (when forced) is now specified with -in=format, such as -in=ts, -in=raw, -in=ps ... The old switches (-ts, -ps, etc.) still work. The only exception is -bin, which has been removed (reserved for the new binary format). Use -in=raw to process a raw file.
  • Removed -d, which used a DVD format when producing a raw file. This has been merged into a new output type "dvdraw". So now instead of using -raw -d as before, use -out=dvdraw if you need this.
  • Removed --noff
  • Added gui_mode_reports for frontend communications.
  • Windows GUI rewritten. Source code now included, too.
  • [Volker] Dish Network clean-up

New in CCExtractor 0.50 (Dec 13, 2008)

  • [Volker] Fix in DVR-MS NTSC timing
  • [Volker] More clean-up
  • Minor fixes

New in CCExtractor 0.49 (Dec 10, 2008)

  • [Volker] Major MPEG parser rework. Code much cleaner now.
  • Some stations transmit broken roll-up captions, and for some reason don't send CRs but RUs... Added work-around code to make captions readable.
  • Started work on EIA-708 (DTV). Right now you can add -debug-708 to get a dump of the 708 data. An actually useful decoder will come soon.
  • Some of the changes MIGHT HAVE BROKEN MythTV's code. I don't use MythTV myself so I rely on other people's samples and reports. If MythTV is broken please let me know.
  • Added new debug options.
  • Other minor bugfixes and changes.

New in CCExtractor 0.46 (Nov 25, 2008)

  • Added support for live streaming: ccextractor can now process files that are still being recorded.
  • [Volker] Added a new DVR-MS loop - this is completely new, DVR-MS specific code, so we no longer use the generic MPEG code for DVR-MS. DVR-MS should (or will be eventually at least) be as reliable as TS.
  • Note: For now, it's only ATSC recordings, not NTSC (analog) recordings.

New in CCExtractor 0.45 (Nov 15, 2008)

  • Added autodetection of DVR-MS files.
  • Added -asf to force DVR-MS mode.
  • Added some specific support for DVR-MS files. This format used to work correctly in 0.34 (pure luck) but the MPEG code rework broke it. It should work as it used to.
  • Updated Windows GUI to support the new options.
  • Added -lg / --largegops. From the help screen: Each Group-of-Picture comes with timing information. When this info is too separate (for example because there are a lot of frames in a GOP) ccextractor may prefer not to use GOP timing. Use this option if you need ccextractor to use GOP timing in large GOPs.

New in CCExtractor 0.44 (Sep 11, 2008)

  • Added an option to the GUI to process individual files in batch, i.e. call ccextractor once per file. Use it if you want to process several unrelated files in one go.
  • Added an option to prevent duplicate lines in roll-up captions.
  • Several minor bugfixes.
  • Updated the GUI to add the new options.

New in CCExtractor 0.43 (Jun 21, 2008)

  • Fixed a bug in the read loop (no less) that caused some files to fail when reading without buffering (which is the default in the Linux build).
  • Several improvements in the GUI, such as saving current options as default.

New in CCExtractor 0.41 (Jun 16, 2008)

  • A semi-decent MPEG parser (instead of a pattern scanner).
  • A transcript mode. No time codes, no repeated lines in roll-up caption.
  • Tivo support.
  • Several bugfixes.

New in CCExtractor 0.40 (May 21, 2008)

  • A semi-decent MPEG parser (instead of a pattern scanner).
  • A transcript mode. No time codes, no repeated lines in roll-up caption.
  • Tivo support.
  • Several bugfixes.

New in CCExtractor 0.39 (May 12, 2008)

  • A semi-decent MPEG parser (instead of a pattern scanner).
  • A transcript mode. No time codes, no repeated lines in roll-up caption.
  • Tivo support.
  • Several bugfixes.