Softpedia
 


    Methabot 1.7.0 - Changelog


    What's new in Methabot 1.7.0:

    November 5th, 2009

    · Support for converting between character encodings through libiconv
    · New parser utf8conv for converting almost any character encoding to UTF-8
    · New parser entityconv, converts HTML entities such as &auml; to the corresponding UTF-8 character
    · The configuration system has been moved to a separate library, libmethaconfig
    · Various improvements to the configuration loader, such as dynamically adding and changing classes and scopes
    · Lots of memory usage optimizations and cleanup fixes
    · The documentation available in the wiki has been copied to a texinfo file; from now on, all documentation will be put in this texinfo file and made available as a manual both online and offline
    · Support for filetype attributes. Parsers can now set custom data that will be associated with a parsed file. Attributes' primary area of use is when you are connected to a Methanol system and want to store meta-data about a URL.
    · New Javascript function set_attribute() for setting attributes for the current URL
    · API support for custom status, error/warning and target reporter functions
    · lmetha_global_setopt() is no longer available, replaced with lmetha_setopt() options
    · SpiderMonkey-1.8.0 support added
    · New global Javascript function exec()
    · New built-in handler function writefile
    · libmetha no longer depends on libev, but instead uses pipes and epoll() for inter-thread communication and waiting for events on sockets
    · Added internal counters useful for keeping statistics
    · New filetype option 'ignore_host'
    · The --external option, when set to false, can no longer be circumvented using an HTTP redirect
    · Support for CURIE (why not?) in the built-in HTML parser added
    · Bugfix, a syntax error would in some rare cases occur when parsing integer values in configuration files
    · Bugfix in the configuration file parser when reading flag values
    · Bugfix, when Javascript filetype parsers did not return a value, it was treated as a string, "undefined", and used as a relative URL
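
The two conversion steps named in the list above (re-encoding a page to UTF-8 and replacing HTML entities) can be illustrated with a minimal sketch. This is not Methabot's code: it uses Python's standard library in place of libiconv, and the function names to_utf8 and convert_entities are hypothetical, chosen only for this example.

```python
import html


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode raw bytes from source_encoding and re-encode them as UTF-8.

    Hypothetical helper: stands in for what an iconv-based converter does.
    """
    return raw.decode(source_encoding).encode("utf-8")


def convert_entities(text: str) -> str:
    """Replace HTML entities such as &auml; with the characters they name."""
    return html.unescape(text)


# A Latin-1 encoded fragment that also contains an HTML entity.
latin1_page = "K&auml;se f\u00fcr alle".encode("iso-8859-1")

utf8_page = to_utf8(latin1_page, "iso-8859-1")
print(convert_entities(utf8_page.decode("utf-8")))
```

Running the two steps in this order (encoding first, entities second) means the entity replacement always works on text whose encoding is already known.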



    What's new in Methabot 1.6.0.1:

    February 24th, 2009

    · Bugfix, the depth limit was not applied correctly when external-peek was used
    · Memory usage cleanup fixes
    · The dynamic-url option is no longer set to lookup by default, since that setting slows down crawling significantly
    · Build system now creates and installs some header files that modules can use when linking
    · metha-config tool added
    · lmm_mysql moved outside of this package



    What's new in Methabot 1.5.0:

    January 16th, 2009

    · Support for reading the initial buffer from stdin
    · --type and --base-url command line options added, along with the initial_filetype option in configuration files
    · Cookies and DNS info are now properly shared between workers when running multithreaded
    · Added some example usage commands to --examples
    · Big improvements to the inter-thread communication, now faster and more organized
    · Added support for 'init' functions to scripts. Read more about init functions at http://bithack.se/projects/methabot/docs/e4x/init_functions.html
    · libmetha no longer freezes when doing multiple concurrent HTTP HEAD requests. The freezes were caused by a bug in libcurl, which is now fixed. Some workarounds have been added to libmetha to prevent the freezes from occurring with the defective libcurl versions as well.
    · Support for older libcurl versions 7.17.x and 7.16.x
    · New information is available in the "this" object of Javascript parsers: the content-type and the transfer status code. Read more at http://bithack.se/projects/methabot/docs/e4x/this.html
    · --verbose option replaced with --silent, since verbose mode is now default
    · Initial support for FTP crawling and the ftp_dir_url crawler option
    · Depth limiting is now crawler-specific
    · Added the command line options --crawler and --filetype
    · Support for extending and overriding already defined crawlers and filetypes
    · Support for the copy keyword in configuration files
    · Support for dynamically switching the active crawler. This lets you crawl different websites in completely different ways in one crawling session. Read more about crawler switching at http://bithack.se/projects/methabot/docs/crawler_switching.html
    · libev version upgrade to 3.51
    · The include directive in configuration files now makes sure the included configuration file hasn't already been loaded, to prevent include-loops and multiple filetype/crawler definitions.
    · Various SpiderMonkey garbage collection fixes, libmetha does not crash anymore when cleaning up after a multithreaded session
    · Added some extra information to the --info option
    · The 'external' option is now fixed and enabled again
    · New option --spread-workers
    · New libmetha API function lmetha_global_setopt() allows changing the global error/message/warning reporter
    · Added initial implementation of a test suite for developers
    · Better error reporting when loading configuration files
    · Bugfix when an HTTP server didn't return a Content-Type header after a HEAD request
    · Bugfix when sorting URLs after multiple HTTP HEAD requests
    · Bugfix in the html to xml converter when the HTML page did not have an tag
    · Bugfix, the extless-url option did not work
    · Bugfix, html to xml converter no longer chokes on byte-order marks or other text before the actual HTML
    · Bugfix, prevented libmetha from trying to access URLs of protocols that are not supported
    · Bugfix when shutting down after an error.
    · Bugfix, unresolvable URLs did not break out of the retry loop after three retries
    · Very experimental and unstable support for Win32, mainly intended for developers
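
The include-loop protection mentioned in the list above can be sketched as follows. This is only an illustration of the general technique (remembering which files have already been loaded), not Methabot's loader: the include "file" syntax, the load_config function and the read_file callback are assumptions made for this example.

```python
import os


def load_config(path, read_file, loaded=None):
    """Return the lines of path and of every file it includes, each file once.

    read_file is a hypothetical callback that returns the lines of a file;
    loaded is the set of normalized paths that have already been processed.
    """
    if loaded is None:
        loaded = set()
    key = os.path.normpath(path)
    if key in loaded:
        # Already loaded: skip it, which breaks include-loops and avoids
        # defining the same filetypes or crawlers twice.
        return []
    loaded.add(key)

    lines = []
    for line in read_file(path):
        line = line.strip()
        if line.startswith("include "):
            # Assumed syntax for this sketch: include "other.conf"
            target = line[len("include "):].strip().strip('"')
            lines.extend(load_config(target, read_file, loaded))
        elif line:
            lines.append(line)
    return lines


# Two in-memory "files" that include each other.
files = {
    "a.conf": ['include "b.conf"', "option_a = 1"],
    "b.conf": ['include "a.conf"', "option_b = 2"],
}
print(load_config("a.conf", files.__getitem__))
```

Because the set of loaded paths is shared across the recursion, the second attempt to load a.conf returns immediately instead of recursing forever.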

    New configuration files:
    · google.conf, to perform google searches
    · youtube.conf, youtube searching
    · meta.conf, prints meta information such as keywords and description about HTML pages
    · title.conf, prints the title of HTML pages
    · ftp.conf, for crawling FTP servers



    What's new in Methabot 1.4.0:

    December 28th, 2008

    · Completely new architectural design
    · Filetype parser scripting through Javascript/E4X
    · Multithreading is now a primary concept
    · HTTP HEAD requests are now done asynchronously in a separate thread using curl and libev
    · Support for "peeking" at external URLs
    · The Methabot Project has been split up into several subprojects. The primary one is the command line tool, which uses the web crawling library libmetha as its backend.
    · Initial work on the distributed web crawling system Methanol.



