RapidMiner Studio Changelog

What's new in RapidMiner Studio 10.2.0

Aug 17, 2023

Features:
Added Delete Amazon S3 Resource operator
Added user interaction after a project was cleaned up on AI Hub, with the following options:
Ignore it and keep project disconnected
Overwrite local version by clean check out from AI Hub
Archive local changes and then overwrite local version as above
Enhancements:
Migrated Generate ID and Split Data operators to the new Belt data core, future-proofing them and improving their speed.
Added a new setting in the preferences to control whether RapidMiner Studio should favour speed over memory footprint or vice versa. Change it to reduce the memory footprint at the cost of runtime if memory is critical. The setting can be found under System and is called Memory Management.
Added repository web action to go to deployment endpoints
Improved error recovery and error messages for date_parse_str function of the expression parser.
Trailing white spaces are no longer treated as errors in the expression parser.
Improved opening URL experience on certain Linux distributions which do not support triggering browsing programmatically
The Correlation Matrix operator now uses the new and improved subset selector
Further reduced start-up time of RapidMiner Studio:
Introduced lazy loading of operators
Improved utilization of operator signature cache
Introduced shallow plugin initialization
Bugfixes:
Fixed broken error messages in the Edit Expressions dialog that operators like Generate Attributes use to display the expression parser
Removed deprecated Stream Database operator (deprecated since version 7.5, six years ago)
Fixed bug in data splitting code that prevented empty partitions in some cases.
Fixed Synchronize Meta Data with Real Data not working even though it was selected. The selection is now remembered after restart.
When Synchronize Meta Data with Real Data is activated and the process has been run, Read operators like Read Excel and Read CSV remember the real metadata even if another operator is added to the process.
Fixed parameters stay above value and stay below value of Prescriptive Analytics operator
Fixed a possible concurrency issue when writing JSON IOObjects in parallel
Fixed potential access denied error for Read Azure Data Lake Storage Gen2 operator when reading larger files
Development:
Added com.rapidminer.repository.recent.RecentDataManager to allow global access to the recently used data sets. It comes with a listener mechanism and currently keeps track of data opened in the Results view, as well as used in the Interactive Decision Tree wizard.
Removed deprecated classes and methods pertaining to the old concept of managing Perspectives (including MainFrame#getPerspectives())
Added DeveloperTools#shouldDeveloperToolsBeShown() to allow for an easy way to check whether you want to offer developer tools of some capacity when appropriate
Fixed bug that caused TableMetaData#columns() to return a meta data sub-table with random column order
Fixed bug when registering IOObjects from operator signature
Plugins now properly also look up resources like icons from the default com/rapidminer/extension/resources path. The old additional lookup for com/rapidminer/resources is kept for compatibility reasons.
Deprecated: SwingTools#addIconStoragePath(String), it never worked

New in RapidMiner Studio 10.1.3 (Jun 19, 2023)

New in RapidMiner Studio 10.1.2 (Mar 23, 2023)

New in RapidMiner Studio 10.1.1 (Mar 23, 2023)

Features:
Enabled usage of RapidMiner Studio with Altair licenses
Introduced new operators which are powered by the new Belt data core. Existing processes continue to use the previous operator versions and therefore work as before.
Generate Attributes:
The expression parser can now access other rows via index, allowing for much more powerful expressions (e.g. Fibonacci, aggregations, etc.)
Added new lead/lag/cell_value/row_number functions
Added new time functions for time arithmetic
Improved date-time functions for a more consistent user experience across different time zones and locales.
Select Attributes:
The new column types available in the new data core are now available here. This is a feature not yet used, however it will be in the future.
Greatly improved the user interface for the selection to make it much more user-friendly
Removed the very rarely used filter types 'block type' and 'numeric value filter'
Set Role:
Roles can now be assigned more than once, e.g. selecting multiple label columns. This is a feature not yet used, however it will be in the future.
The new data core does not allow dynamic roles anymore, so only predefined roles are now available. The 'metadata' role can be used to mark special columns that should be ignored during operator calculations.
Advanced columns are now allowed outside operators:
The new advanced column types text, text-list, text-set and real-array can now be used in Belt Data Tables between operators
They can be filtered out by type with Select Attributes to use the data table in operators that still operate on Example Sets
For now, this feature will not be visible to you unless you install future extensions that make use of these new column types. This is just the foundation so new extensions can start making use of these new capabilities.
Enabled future-proof JSON serialization for all IOObjects
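The row-access functions listed under Generate Attributes above (lead, lag, row_number, and expressions such as Fibonacci that refer to other rows) can be pictured with a small sketch. This is plain Python, not the expression parser's syntax or API: the function names mirror the changelog, but the argument order, the 1-based row numbering, and the use of None for rows outside the table are assumptions borrowed from SQL-style window functions.

```python
from typing import List, Optional, Sequence


def lag(column: Sequence[float], offset: int = 1) -> List[Optional[float]]:
    """Value from `offset` rows earlier; None where no such row exists (assumption)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return [column[i - offset] if i - offset >= 0 else None
            for i in range(len(column))]


def lead(column: Sequence[float], offset: int = 1) -> List[Optional[float]]:
    """Value from `offset` rows later; None where no such row exists (assumption)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    n = len(column)
    return [column[i + offset] if i + offset < n else None for i in range(n)]


def row_number(column: Sequence[float]) -> List[int]:
    """1-based position of each row (the numbering base is an assumption)."""
    return list(range(1, len(column) + 1))


def fibonacci(n_rows: int) -> List[int]:
    """Each row is the sum of the two previous rows, seeded with 1, 1."""
    values: List[int] = []
    for i in range(n_rows):
        values.append(1 if i < 2 else values[i - 1] + values[i - 2])
    return values
```

The Fibonacci helper shows why index-based access matters: each new value depends on rows that were themselves computed by the same expression, which a purely row-local expression cannot express.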
Enhancements:
Removed parameter create view from all preprocessing operators and Apply Model. This parameter was virtually unused, broken, and not supported by the new internal data structure introduced a while ago.
Improved automatic storage of custom result visualizations in cases where you have more than one file with the same name (but of different types) in the same repository folder
Activated JSON serialization for all IOObjects (besides ExampleSets, Belt IOTables and IOObjectCollections, which all already have a new, special file format)
Old .ioo files can still be read
Non-converted IOObjects are still stored as .ioo
Works in local repositories and projects, but not in legacy repositories
Added an admin setting with the key rapidminer.disallow.decryption.storage, which affects how users interact with connections:
When set to true, any and all interactions with connections that need to decrypt values require a login (edit, create, move/copy, get metadata in a process)
If a project with disabled decryption storage is deleted from AI Hub, it cannot be recreated
Added support for OAuth for Salesforce connections
Support to manually configure the AI Hub frontend URL when connecting to a project (optional)
Added Setting to de-/activate automatic detection of remote changes for a project
Added Setting to de-/activate automatic detection of local changes for a project, to allow reducing filesystem access
Bugfixes:
Fixed UI issue when opening an operator chain (e.g. Subprocess operator) with a breakpoint
Fixed issues with the data import wizard not closing on completion
Studio is now respecting the max memory setting again when started via the .exe on Windows
Fixed date-time formatting when starting via the .exe on Windows
Fixed some UI scaling issues when starting via the .exe on Windows
Error dialog improvements when using AI Hub
Fixed some JSON serialization that was behaving incorrectly
Fixed bug in Remember operator that could lead to unexpected modification of the stored IOObject
Fixed a bug in Weight by Relief which led to an incorrectly empty weight table when the sample ratio was smaller than one
Fixed a bug in SVM (Evolutionary) that occurred for nominal labels, when the hold out set ratio was non-zero.
Fixed a bug in SVM (Evolutionary) that occurred when the hold out ratio led to empty training data.
Fixed a bug in the view for kernel models with infinite parameters.
Fixed possible NPE in process tool
Fixed issue in the Statistics tab of a data table failing when data contained infinite values in numerical columns
Fixed missing error message when trying to run a process execution on AI Hub from a project where the encryption key is not known anymore
Fixed Google Cloud Services connections using Service account for Google Drive access
Development:
Server client now exposes loaded and failed extensions on AI Hub (ServerClient#getLoadedExtensions() and ServerClient#getFailedExtensions())
Renamed ServerClient methods listVersionedRepositories and deleteVersionedRepository to listProjects and deleteProject
Added ServerClient methods for getProject and createProject to match the naming convention
Deprecated and forwarded old ServerClient methods getVersionedRepository and createVersionedRepository to getProject and createProject
Exposed JSON serialization for IOObjects in API of JsonIOObjectEntry

New in RapidMiner Studio 10.1.0 (Jan 25, 2023)

Features:
Enabled usage of RapidMiner Studio with Altair licenses
Introduced new operators which are powered by the new Belt data core. Existing processes continue to use the previous operator versions and therefore work as before.
Generate Attributes:
The expression parser can now access other rows via index, allowing for much more powerful expressions (e.g. Fibonacci, aggregations, etc.)
Added new lead/lag/cell_value/row_number functions
Added new time functions for time arithmetic
Improved date-time functions for a more consistent user experience across different time zones and locales.
Select Attributes:
The new column types available in the new data core are now available here. This is a feature not yet used, however it will be in the future.
Greatly improved the user interface for the selection to make it much more user-friendly
Removed the very rarely used filter types 'block type' and 'numeric value filter'
Set Role:
Roles can now be assigned more than once, e.g. selecting multiple label columns. This is a feature not yet used, however it will be in the future.
The new data core does not allow dynamic roles anymore, so only predefined roles are now available. The 'metadata' role can be used to mark special columns that should be ignored during operator calculations.
Advanced columns are now allowed outside operators:
The new advanced column types text, text-list, text-set and real-array can now be used in Belt Data Tables between operators
They can be filtered out by type with Select Attributes to use the data table in operators that still operate on Example Sets
For now, this feature will not be visible to you unless you install future extensions that make use of these new column types. This is just the foundation so new extensions can start making use of these new capabilities.
Enabled future-proof JSON serialization for all IOObjects
Enhancements:
Removed parameter create view from all preprocessing operators and Apply Model. This parameter was virtually unused, broken, and not supported by the new internal data structure introduced a while ago.
Improved automatic storage of custom result visualizations in cases where you have more than one file with the same name (but of different types) in the same repository folder
Activated JSON serialization for all IOObjects (besides ExampleSets, Belt IOTables and IOObjectCollections, which all already have a new, special file format)
Old .ioo files can still be read
Non-converted IOObjects are still stored as .ioo
Works in local repositories and projects, but not in legacy repositories
Added an admin setting with the key rapidminer.disallow.decryption.storage, which affects how users interact with connections:
When set to true, any and all interactions with connections that need to decrypt values require a login (edit, create, move/copy, get metadata in a process)
If a project with disabled decryption storage is deleted from AI Hub, it cannot be recreated
Support to manually configure the AI Hub frontend URL when connecting to a project (optional)
Added Setting to de-/activate automatic detection of remote changes for a project
Added Setting to de-/activate automatic detection of local changes for a project, to allow reducing filesystem access
Bugfixes:
Fixed UI issue when opening an operator chain (e.g. Subprocess operator) with a breakpoint
Fixed issues with the data import wizard not closing on completion
Studio is now respecting the max memory setting again when started via the .exe on Windows
Fixed date-time formatting when starting via the .exe on Windows
Fixed some UI scaling issues when starting via the .exe on Windows
Error dialog improvements when using AI Hub
Fixed some JSON serialization that was behaving incorrectly
Fixed bug in Remember operator that could lead to unexpected modification of the stored IOObject
Fixed a bug in Weight by Relief which led to an incorrectly empty weight table when the sample ratio was smaller than one
Fixed a bug in SVM (Evolutionary) that occurred for nominal labels, when the hold out set ratio was non-zero.
Fixed a bug in SVM (Evolutionary) that occurred when the hold out ratio led to empty training data.
Fixed a bug in the view for kernel models with infinite parameters.
Fixed possible NPE in process tools
Fixed Google Cloud Services connections using Service account for Google Drive access
Added support for OAuth for Salesforce connections
Development:
Server client now exposes loaded and failed extensions on AI Hub (ServerClient#getLoadedExtensions() and ServerClient#getFailedExtensions())
Renamed ServerClient methods listVersionedRepositories and deleteVersionedRepository to listProjects and deleteProject
Added ServerClient methods for getProject and createProject to match the naming convention
Deprecated and forwarded old ServerClient methods getVersionedRepository and createVersionedRepository to getProject and createProject
Exposed JSON serialization for IOObjects in API of JsonIOObjectEntry

New in RapidMiner Studio 10.0.0 (Nov 8, 2022)

Features:
RapidMiner Studio now finally uses Java 11 as opposed to Java 8!
AI Hub X now also uses Java 11, and as a consequence, RapidMiner Studio X cannot connect to AI Hub 9 or earlier! Both Studio and AI Hub need to be upgraded to version 10!
Windows & OS X users will get the updated Java runtime automatically, but Unix users (or anyone using the platform independent release) need to provide Java 11 manually for running Studio.
Some extensions might no longer work with Java 11 and require an update, please check the Marketplace for updates.
Visualizations: Added ability to sort results when using aggregations in all charts where it makes sense. Sorting can be ascending/descending either on the aggregated result value, or the aggregation column name.
Time Series: Added the Windowing Model as a preprocessing model for the Windowing operator.
The model can be used to apply the configured windowing operation on any data set (having the same columns) by using Apply Model operator
The model can be grouped together with other models using the Group Model operator.
Cloud Connectivity: Added Google Drive operators to read, write, delete and loop files, as well as create folders.
Connectivity: Added Snowflake as a first-class citizen for database connections
Enhancements:
Added preprocessing model to Pivot operator
Improved High-DPI scaling on Windows
The tooltip for date-time entries in the result view now shows the time-stamp in ISO format (including potential nanoseconds)
Copy&pasting data from date-time cells is now consistent with what is displayed in the precise tooltip
Added setting to disable repository indexing for searching altogether via the Enable repository search indexing setting. This can be used for very large repositories or ones behind a slow network drive or when a virus scanner is involved
Time Series: Added the parameter sort time series to all time series operators where an indices column is mandatory or optional
If selected the input time series is automatically sorted before the time series operation is applied. The output of original ports will also contain the sorted data set.
Time Series: Improved UserError for indices attributes which are not sorted or have non-unique values
Bugfixes:
Fixed a problem where collections with empty sub-collections might not be readable
Fixed problems with empty (sub-) collections not being readable
Fixed problems with repeatedly extracting collections because of an incorrectly set timestamp
Fixed the storage of the LFS & editable flags in the repositories.xml file for projects
Fixed issue that could cause an error when a better license was installed automatically
Fixed creation of new Google Cloud Services connection after the recent Google OAuth flow changes
Development:
RapidMiner Studio is now running with Java 11, as are all bundled extensions. Starting from this version, all extensions targeting RapidMiner X and beyond must be compatible with Java 11 as well!
We upgraded ALL libraries RapidMiner Studio uses to their latest available versions. This includes libraries where the version jump comes with API changes. Please thoroughly test your extensions to ensure they not only run with Java 11, but are also working as expected given all the library upgrades.
Added new (Belt-based) expression parser that replaces the old one. The new expression parser comes with an improved API for developers, it can handle the new Belt types and index-based functions.
Some index-based functions have already been added to the new expression parser: lead, lag and row_number
Date-time functions have been revised
Time functions have been added
The behavior of the and / or functions for missing values changed
Backported separation of Tools class to extract number formatting methods and make them available without core dependencies.
Upgraded JxBrowser to version 7.26 for HTML5-based visualizations. This should not affect extensions, unless they accessed the Browser creation directly.

New in RapidMiner Studio 9.10.11 (Aug 12, 2022)

New in RapidMiner Studio 9.10.10 (Jul 18, 2022)

New in RapidMiner Studio 9.10.8 (May 3, 2022)

New in RapidMiner Studio 9.10.7 (May 3, 2022)

New in RapidMiner Studio 9.10.6 (May 3, 2022)

Enhancements:
Fixed a memory & file leak when using large numbers of repeated JDBC connections
Visualizations: Added options to customize Wordcloud word orientations
Visualizations: Added Jamaica to the map collection
Updated postgres JDBC driver to version 42.3.2
Added skip inaccessible parameter for Loop Files to skip inaccessible files/directories, instead of a silent failure. If unchecked, the operator does not loop at all and will throw a proper error.
Stopping Loop Files is now always possible in a timely manner, even if you selected a directory with millions of files.
Updated H2 DB library due to security advisory
Added new parameter fitting error handling to the ARIMA Trainer operator.
In case of a fitting error during training, either a proper error is thrown or a fallback Default Forecast Model is provided.
Removed the meta data warning that the number of parameters is too large for the ARIMA Trainer operator.
Added new option to Amazon S3 connections that allows for much more flexible authentication schemes, like credential profiles and IAM roles.
Bugfixes:
Fixed character corruption issue with Read Database and Execute SQL when reading a query via a file from disk on certain operating systems
Fixed a memory leak when using database connections
Fixed a general file leak when using connections
Fixed a problem when creating dynamically suffixed attributes through the AttributeFactory in parallel
Fixed side effects for models when executing in parallel
Fixed an issue in projects that could sometimes cause Execute Process or Retrieve operators within parallel loops or similar setups to fail with an error message like "Cannot retrieve 'entry', it does not exist"
Fixed an issue that could sometimes cause Execute Process operators within parallel loops or similar setups to fail with an error message like "Cannot connect to the RapidMiner AI Hub repository '_LOCAL'" when running on an AI Hub legacy repository
Fixed a wrong error, which was thrown during Apply Forecast when a Multiply operator was used on the Holt-Winters model
Fixed calculation errors for Holt-Winters models with additive seasonality
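For readers unfamiliar with the additive-seasonality variant named in the Holt-Winters fix above, the textbook recurrences can be sketched as follows. This is the standard additive Holt-Winters method, not RapidMiner's implementation; the function name, the simple initialization from the first two seasons, and the parameter names are assumptions made for illustration.

```python
from typing import List, Sequence


def holt_winters_additive(y: Sequence[float], period: int,
                          alpha: float, beta: float, gamma: float,
                          horizon: int) -> List[float]:
    """Forecast `horizon` steps ahead with additive trend and additive seasonality."""
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")

    first, second = y[:period], y[period:2 * period]
    # Simple initialization (an assumption; other schemes exist).
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    season = [value - level for value in first]  # indexed by t % period

    for t in range(period, len(y)):
        s_old = season[t % period]
        prev_level, prev_trend = level, trend
        # Level: de-seasonalized observation blended with the previous projection.
        level = alpha * (y[t] - s_old) + (1 - alpha) * (prev_level + prev_trend)
        # Trend: change in level blended with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Season: observation minus previous projection, blended with the old index.
        season[t % period] = (gamma * (y[t] - prev_level - prev_trend)
                              + (1 - gamma) * s_old)

    n = len(y)
    # Seasonal components are added to the projection, never multiplied.
    return [level + h * trend + season[(n - 1 + h) % period]
            for h in range(1, horizon + 1)]
```

On a perfectly periodic series without trend, such as 10, 20, 30 repeated, the recurrences leave level, trend, and seasonal indices unchanged, so the forecast repeats the pattern.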

New in RapidMiner Studio 9.10.1 (Oct 25, 2021)

New in RapidMiner Studio 9.10.0 (Aug 16, 2021)

Features:
Added Function Fitting operator that can optimize parameters in a function of the attributes to fit the label. It can be used to create an optimal function to fit the data points in your data.
Bias Awareness: if the use of a specific column is more likely to add unwanted bias to your models, it is highlighted as such. This happens in various places such as in the Statistics view of data, the model simulator, in Turbo Prep, in Auto Model, during model training, in model annotations among others.
Enhancements:
The De-Normalization operator has a new parameter to also de-normalize predictions.
Based on attribute name: prediction(abc) tries to use de-normalization of abc if no explicit de-normalization available
The label (or other special attributes) can be included in normalization already in the normalize operator. The changes allow for multiple prediction attributes to be affected
Added date format parameter to Write CSV in case format date attributes is selected
Improved performance of Append operator
Handled yet another case of JDBC drivers ignoring the JDBC standard gracefully (here: Infor Data Lake DatabaseMetaData#getTypeInfo())
Introduced operator signatures to improve the startup of Studio
Signatures contain meta information that is used in operator registration, global search setup and documentation browser display
Signatures are persisted between starts for an improved startup time
Signature persistence can be configured or cleared with the setting System -> Local File Cache -> Keep Operator Signatures
Time Series: Enabled the usage of constant values for the replace types in the Equalize Numerical Indices and Equalize Time Stamps operators
The operators can now be used to fill gaps in non-equal data sets with constant values
Time Series: All Time Series operators (except for Multi Horizon Forecast, Multi Horizon Performance) now work with Belt IOTable (as in- and output)
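The name-based fallback described for the De-Normalization operator above (prediction(abc) reuses the de-normalization of abc when no explicit one is available) can be sketched like this. The sketch assumes a z-transformation with a stored mean and standard deviation per attribute; the dictionary layout and the function names are hypothetical and do not reflect the operator's internals.

```python
import re
from typing import Dict, Optional, Tuple

ZParams = Tuple[float, float]  # (mean, standard deviation); assumed z-transformation

_PREDICTION = re.compile(r"^prediction\((.+)\)$")


def find_params(name: str, params: Dict[str, ZParams]) -> Optional[ZParams]:
    """Explicit entry first; otherwise fall back from prediction(abc) to abc."""
    if name in params:
        return params[name]
    match = _PREDICTION.match(name)
    if match:
        return params.get(match.group(1))
    return None


def denormalize(row: Dict[str, float], params: Dict[str, ZParams]) -> Dict[str, float]:
    """Invert the z-transformation for every attribute with known parameters."""
    result = {}
    for name, value in row.items():
        found = find_params(name, params)
        if found is None:
            result[name] = value  # no parameters known: leave untouched
        else:
            mean, std = found
            result[name] = value * std + mean
    return result
```

With parameters stored only for abc, both abc and prediction(abc) are mapped back to the original scale, while unrelated attributes pass through unchanged.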
Bugfixes:
In rare instances, operator parameters did not get saved correctly if a default value was set for it. This e.g. affected date parameters used in extensions.
Generate Attributes max and min functions now always return a missing value if any of the values is missing.
Fixed missing operator help for Azure Blob Storage and Data Lake Storage operators
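The max/min behavior described in the Generate Attributes fix above, where any missing input makes the result missing, can be illustrated with a short sketch. It assumes that a missing value is represented as NaN; the function names are hypothetical.

```python
import math
from typing import Iterable


def strict_max(values: Iterable[float]) -> float:
    """Maximum of the values, or NaN (missing) if any value is missing."""
    items = list(values)
    if any(math.isnan(v) for v in items):
        return math.nan
    return max(items)  # raises ValueError on empty input, like the built-in


def strict_min(values: Iterable[float]) -> float:
    """Minimum of the values, or NaN (missing) if any value is missing."""
    items = list(values)
    if any(math.isnan(v) for v in items):
        return math.nan
    return min(items)
```

The explicit check matters because comparisons with NaN are always false, so a naive maximum can depend on the position of the missing value in the input.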

New in RapidMiner Studio 9.9.0 (Mar 26, 2021)

New Features:
Data is the central piece in any RapidMiner process. The way RapidMiner internally deals with data has fundamentally changed in this release with the new Data Core (codename Belt). Its new columnar table representation provides a quantum leap in processing speed and memory efficiency for RapidMiner processes. Multiple operators already use it internally and it becomes fully available now for extension developers to create fast and efficient operators.
Added a Set Positive Value operator for the new Data Core which can make nominal attributes binominal or change the positive value of binominal attributes
Enhancements:
Replaced the Rename by Example Values operator by a new and improved version
Replaced the Rename operator by a new one that can additionally handle a renaming dictionary
Replaced the Sort operator by one that can sort by multiple attributes (currently already part of the Operator Toolbox extension)
Improved the FP-Growth operator so that it only works with explicitly defined positive values (either via binominal attributes or the positive value parameter) for items in dummy coded columns
Improved memory consumption of Cross Validation in certain circumstances
The operators Read CSV and Read Excel were improved to use the new data core
Pivot now supports Least and Mode aggregations for numerical attributes as well
Annotate now adds the annotations to the meta data as well
Added warning when trying to run a process on an AI Hub with a lower feature version than the current Studio version
Added a reason when displaying incompatible extensions in the dialog after startup to show why an extension failed to load. Details available via tooltip.
Upgraded integrated Chromium to version 84
Improved some metadata transformation w.r.t. nominal value sets
The splashscreen no longer shows duplicate extension icons during startup if more than one copy of an extension is installed
Visualizations now also support Least and Mode aggregations for numerical attributes
Improved concurrent execution in some corner cases
Deprecated the Exchange Roles operator
Model viewer for Gradient Boosted Tree models now respects the Number format settings in Studio preferences
Auto Model uses new clustering algorithms which no longer require one-hot encoding on the data set and therefore reduce the memory footprint for data sets with nominal columns with many values. As a result, users can no longer specify the minimum number of clusters in the X-Means case (automatic determination of the optimal number of clusters). The minimum is now fixed at 2.
Time Series: Added the option to ignore invalid values to the Moving Average Filter operator: Invalid values (missing, positive and negative infinity) are now ignored when calculating the filtered value
This also results in valid values at the beginning and end of the filtered time series
As the Classic Decomposition and the Function and Seasonal Component Forecast are based on the Moving Average Filter, they now also have the "ignore invalid values" option
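The "ignore invalid values" behavior of the Moving Average Filter described above can be sketched as follows. The sketch assumes a centered window that is truncated at the series boundaries, which is one way to obtain valid values at the beginning and end; the operator's actual window handling and weighting may differ, and the function name is hypothetical.

```python
import math
from typing import List, Sequence


def moving_average_ignore_invalid(series: Sequence[float], half_width: int) -> List[float]:
    """Centered moving average that skips NaN and +/- infinity inside each window."""
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    n = len(series)
    filtered = []
    for i in range(n):
        lo = max(0, i - half_width)          # truncate window at the start
        hi = min(n, i + half_width + 1)      # truncate window at the end
        valid = [v for v in series[lo:hi] if math.isfinite(v)]
        # Only a window without any valid value yields a missing result.
        filtered.append(sum(valid) / len(valid) if valid else math.nan)
    return filtered
```

For the input 1, NaN, 3, infinity, 5 with a half width of 1, every window contains at least one finite value, so the filtered series has no missing entries.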
Bugfixes:
Fixed Data Table reading/writing when LFS light checkout is enabled
Fixed a problem where an uncaught exception could go through when using date/time attributes with values in the far future/past
Fixed an uncaught exception that could happen when the process run via Execute Process failed, the user opened it via the popup and ran it directly after fixing the problem
Fixed wrong attribute weights for Random Forest regression
Fixed error in Store operator when used after application of k-Means model
Fixed issue that Save dialogs did not accept any selection if a wildcard (.*) filter was provided (e.g. for Write Document)
Fixed Pivot meta data column names not matching the real data
Fixed missing text for the file restoring confirm dialog in projects
Fixed an issue that could cause Studio startup to silently fail
Fixed a possible error during startup w.r.t port preconditions on some operators
Fixed a bug that could cause project creation to not show an error and appear to do nothing
Removed check for preprocessing models in model deployments for custom models. This has been causing certain grouped models to fail if they contained models which have technically been not preprocessing models (e.g. PCA).
Time Series: Fixed a bug for the Lag operator, which caused original data to be changed at preceding ports as well
Time Series: Fixed some small errors in the description of two tutorial processes for Sliding Window Validation
Time Series: Fixed an error, which occurs in time-based windowing, when the end of the last window is equal to the last timestamp in the input data. This affects all windowing operators (Windowing, Process Windows, Forecast Validation, Sliding Window Validation).
Cloud Connectivity: File browser now adds the correct path separator character on Windows, and resolves macros properly for AWS, Azure, and Google Cloud file operators

New in RapidMiner Studio 9.8.1 (Dec 4, 2020)

New in RapidMiner Studio 9.8.0 (Oct 15, 2020)

New Features:
Utilize AI Hub 9.8 support for large files in Projects. Files with more than 10MB and stored ExampleSets are automatically handled to be versioned as expected, but stored more efficiently. This is backed by Git LFS, which means Python or R coders can continue to easily work with these projects as long as they have the Git LFS extension installed.
Time Series Windowing Update:
Added time based (window parameters are specified in time units) and custom windowing (start and stop values of the windows are provided by an additional example set) for all windowing operators (Windowing, Process Windows, Forecast Validation, Sliding Window Validation)
Added a few more parameters: expert settings (groups several expert parameters, which are not shown unless it is selected), windows defined (specifies from which point windows are defined), empty window handling
Changed the computation of the final model for the Forecast Validation and Sliding Window Validation operators to compute the model on a final window with the same size as the training windows and which ends at the last example of the input series
Time Series: Added new aggregation methods (median, maximum, minimum, standard deviation, variance) to Moving Average Filter
Cloud Connectivity:
Added connectivity to Azure Data Lake Storage Gen2:
Read Azure Data Lake Storage Gen2
Loop Azure Data Lake Storage Gen2
Write Azure Data Lake Storage Gen2
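The two windowing modes introduced in the Time Series Windowing Update above can be pictured with a short sketch: time-based windowing derives window bounds from sizes given in time units, while custom windowing takes the start and stop values from the caller. The half-open interval convention, the rule that only complete windows are produced, and all function names are assumptions for illustration, not the operators' documented behavior.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple

Bounds = Tuple[datetime, datetime]


def time_based_bounds(first: datetime, last: datetime,
                      window_size: timedelta, step_size: timedelta) -> List[Bounds]:
    """Window bounds in time units, starting at `first`; complete windows only."""
    if window_size <= timedelta(0) or step_size <= timedelta(0):
        raise ValueError("window_size and step_size must be positive")
    bounds = []
    start = first
    while start + window_size <= last:
        bounds.append((start, start + window_size))
        start += step_size
    return bounds


def rows_in_window(timestamps: Sequence[datetime],
                   start: datetime, stop: datetime) -> List[int]:
    """Indices of rows with start <= timestamp < stop (half-open, an assumption)."""
    return [i for i, t in enumerate(timestamps) if start <= t < stop]


# Hourly series from 00:00 to 06:00, windows of two hours, stepping by two hours.
hourly = [datetime(2020, 1, 1, h) for h in range(7)]
windows = time_based_bounds(hourly[0], hourly[-1],
                            timedelta(hours=2), timedelta(hours=2))
time_based_rows = [rows_in_window(hourly, a, b) for a, b in windows]

# Custom windowing: the caller supplies start and stop values directly.
custom_rows = rows_in_window(hourly, datetime(2020, 1, 1, 1), datetime(2020, 1, 1, 3))
```

A window may contain no rows at all when the data has gaps, which is the situation the empty window handling parameter addresses.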
Enhancements:
H2O:
New operator: K-Means (H2O), which implements K-Means clustering using the bundled H2O library. Key features include:
Estimate the optimal value of k, when a good initial guess is not available from the user
Built-in standardization and nominal encoding
Quick and memory efficient execution
Note: estimate k strongly favors low k values. Make sure to double-check that the results are in line with expectations.
Newly created repositories and projects are now by default stored in the current user's "Documents" folder. The location continues to be customizable on repository / project creation
When opening a process or RapidMiner file using "Open with..." RapidMiner Studio, the process will be loaded from the repository registered for the path. Process files that are not stored in a repository will be imported just like the menu item "Import Process" would
IOObject collections are now stored in a new, zip-based file format, ending with .collection
Incorporated a new library to better make use of system proxy settings if "system" is selected in the preferences, especially w.r.t. Windows and WPAD/PAC files. This will drastically improve the experience in complex corporate network setups
HTML5 safe mode is now way more performant
Upgraded Chromium binaries to version 79
Improved error message for remote repository creation (central AI Hub repository and projects) when the authentication is mismatched (user/password vs SSO)
Added Settings option to optimize internal file browser for mapped network drives
Time Series: Moved Moving Average Filter into the Transformation operator group and removed the obsolete Filter operator group
Time Series: Reordered the output ports of the Multi Label Performance and Multi Horizon Performance operators
Bugfixes:
Fixed wrong metadata after renaming in the new repositories and then creating a new entry with the previous name
Fixed rare issues that could cause problems when trying to view Visualizations on certain machines
Fixed Mixed Euclidean Distance for nominal values and Nominal Distance
A JNA library on the Windows PATH no longer results in an error
Fixed issue that could cause charts in the Deployments view to not show up.
Fixed problem that caused the legacy smtp password setting in the Preferences dialog to become broken when the dialog was saved more than once after changing the value. Note that this setting is not recommended anymore, use the new Send Mail connection instead.
Fixed a similar problem with the legacy connection UI encrypting passwords and tokens multiple times
Auto Model Results calculated on AI Hub can now be opened via Results view after the folder with all results has been moved/copied
Upgraded bundled JRE to 8u265
Deployments keep working now after the Server repository has been renamed
Fixed a problem where unsigned extensions could not make use of the new connection objects inside operators
Fixed potential IllegalArgumentException in Google Storage operators when running on Server
ExampleSets with huge nominal values can be retrieved again from the repository
Time Series: Fixed a bug in Equalize Time Stamps which caused an infinite loop in some cases when the calendar time was set to 'domain' and the input data consists of already partwise equidistant time stamps
Known issues:
H2O K-Means:
Apply Model does not work with cluster models produced with the K-Means (H2O) operator
Label and ID roles from the input dataset are lost if add_cluster_attribute is set to true

New in RapidMiner Studio 9.7.2 (Aug 5, 2020)

New in RapidMiner Studio 9.7.1 (Jun 25, 2020)

New in RapidMiner Studio 9.7.0 (Jun 5, 2020)

New Features:
Added versioned projects which are tied to RapidMiner Server. You can have as many versioned projects as you like, no limits! The versioning is backed by Git and can be accessed by any regular Git clients. This means sharing between Python/R coders and RapidMiner users has never been easier!
Added dialog to select which version of a file to keep in case of a conflict in the versioned projects while getting Snapshots from Server.
Versioning happens on a project level. As you can now have as many projects as you like, this is the most sensible behavior because most of the time many entries are interconnected in a project. Thus the entire state is saved and can be later restored, without having to worry about dependency versions.
Projects support ALL files you may have on your computer! You can put your .py scripts, your .md files, your .png files, your .pdf files, etc., all into a project. They will be neatly displayed in RapidMiner Studio.
Of course, all those files can be versioned together, so RapidMiner users and Python coders can share the same Git repository. The Python coders can even use their native Git client to do so, no magic required. This will make collaboration between RapidMiner users and Python coders easier than ever before!
Processes in versioned projects can also be run and scheduled on RapidMiner Server as they can for an existing Server central repository
All the files live locally on your computer, but are also shared via Git. This gives you the performance of a local repository when working with it during prototyping, but also allows for easy collaboration with your colleagues.
Added new panel "Snapshot History" which lets you browse the history of your versioned projects, as well as see the changes you've made since the latest snapshot. It can also be used to restore an earlier state of the project, view past versions of individual files, and to restore those past versions.
ExampleSets are now written to disk in a new file format: HDF5. This is a well-established format used e.g. by NASA to store large amounts of data. This also means that Python and RapidMiner Studio can exchange data via HDF5 files much more easily and faster than ever before.
Local repositories created with RapidMiner Studio 9.7 or later also support all files you may have on your computer (.py, .jpeg, .pdf, etc).
New operator Target Encoding which can remove nominal attributes with too many values and performs a target encoding (also known as mean encoding) on the remaining attributes
Auto Model: some processes (e.g. SVM, FLM, or weight calculations) now use the new Target Encoding instead of one-hot encoding which reduces memory usage and run times
Time Series: New operator Integrate to integrate time series with different methods (cumulative sum / left and right Riemann sum / trapezoidal rule)
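The integration methods named for the Integrate operator can be illustrated with a minimal Python sketch. This is not the operator's implementation; the function names are hypothetical, and returning a single total for the Riemann sums and the trapezoidal rule is an assumption made for brevity (the changelog does not describe the operator's output).

```python
from typing import List, Sequence


def cumulative_sum(values: Sequence[float]) -> List[float]:
    """Running total of the series values (ignores the index spacing)."""
    total = 0.0
    running = []
    for value in values:
        total += value
        running.append(total)
    return running


def left_riemann(index: Sequence[float], values: Sequence[float]) -> float:
    """Each interval is weighted with the value at its left edge."""
    return sum(values[i] * (index[i + 1] - index[i]) for i in range(len(index) - 1))


def right_riemann(index: Sequence[float], values: Sequence[float]) -> float:
    """Each interval is weighted with the value at its right edge."""
    return sum(values[i + 1] * (index[i + 1] - index[i]) for i in range(len(index) - 1))


def trapezoid(index: Sequence[float], values: Sequence[float]) -> float:
    """Each interval is weighted with the mean of its two edge values."""
    return sum(
        (values[i] + values[i + 1]) / 2.0 * (index[i + 1] - index[i])
        for i in range(len(index) - 1)
    )


# Example series: y = x^2 sampled at x = 0, 1, 2, 3 (exact integral is 9).
index = [0.0, 1.0, 2.0, 3.0]
values = [0.0, 1.0, 4.0, 9.0]
```

For this convex example the left sum underestimates and the right sum overestimates the exact integral, while the trapezoidal rule lies between the two.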
Enhancements:
It is now possible to have a folder with the same name as a data entry in the repository (might not work for some old repositories)
It is now possible to have a process and a data entry with the same name in the repository (might not work for some old repositories)
Replaced Send Mail operator with new version which supports file attachments
Improved memory usage for Aggregate and Pivot operators for nominal columns with potentially a lot of unused values
Improved dealing with whitespaces in repository entry names
Improved cleanup of temp files, to reduce disk space clutter when Studio runs for a long time, e.g. in a Server environment
Made log tables in Result View behave more like other results, adding more actions and a shortcut to the context menu
Process background images now use a relative path to the image if possible, instead of an absolute path. This only applies to background images set from now on; it does not work retroactively
For binominal attributes the Statistics tab shows the positive and the negative value
Renamed RapidMiner Server to RapidMiner AI Hub
The Process panel is now opened or moved into the foreground when a process is opened while in the Design view, to make it more obvious that something happened
Auto Model: remote executions on Server require the central repository as storage location
Turbo Prep: only local file based repositories can now be used as temporary repositories for the handover to Auto Model
Model Ops: only local repositories or central Server repositories can be used as storage locations for deployed models (also known as "deployment location")
Model Ops: keep unused and ID columns in the results after scoring
The operators Explain Predictions and Model Simulator now also support grouped models where arbitrary models have been grouped instead of only preprocessing models
The operator Explain Predictions now offers a parameter to limit the number of important features also for the "importances" output
Both local repositories and versioned projects (tied to RM Server) have been completely rebuilt to get rid of many old limitations. Benefits include:
Enhanced throughput and performance
Better meta data caching
Concurrent access support
Displaying all files (no matter what they are, e.g. Python scripts, images, ...)
Allowing different file types (e.g. data, processes) and folders to share the same name
Note: Your existing local repositories have (Legacy) after their name, indicating they still run on the old technology and still have some of the limitations! If you create a new local repository, it will have (Local) after its name and have all the capabilities listed above. You can copy your data over via Studio from the old repository to a new one to migrate.
Time Series:
Added options to use padding for Fast Fourier Transformation and calculate the frequency of the amplitude value.
Added the option to specify negative lags for the Lag operator
Added the option to specify a default lag for a set of attributes (selected by an attribute subset selector) to the Lag operator
Unfortunately, due to parameter key incompatibilities, the old version of the Lag operator is deprecated and a new version with the same name but a different operator key has been added.
H2O:
Updated H2O library to version 3.30.0.1.
Added monotonicity constraints to Gradient Boosted Trees
Added weights port to Deep Learning
Expanded whitelist of accepted expert parameters, now supports all parameters provided by H2O
Deep Learning and Logistic Regression now work with datasets that have nominal columns with only one value
Bugfixes:
Fixed an issue that could cause Studio startup to never complete
Made Studio startup more robust so that it quits the process instead of silently hanging on the splash screen forever
Fixed issue that could cause panels to sometimes not open if they had been closed previously in this session
Fixed an issue that caused CTAs to not work when HTML5 safe mode was enabled
Fixed an issue with back propagation of changes to performance vectors
Fixed a problem for JDBC drivers that do not implement a certain set of functionality by adding a fallback (e.g. SQLite writing)
Fixed potential cause for complete UI freeze when interacting with a CTA notification banner
Fixed an issue with process navigation and property panel if operator names contain HTML
Generate Multi-Label Data now works correctly in non-regression mode
Fixed memory leak caused by the Visualizations
Fixed rare issue where data sets could not be downsampled automatically if license limit was exceeded
Fixed an issue in Automatic Feature Engineering if all input features have been nominal in the feature selection case
Fixed "Edit Access Rights" dialog for Server repositories not getting the permissions correctly when using Enterprise SSO
Fixed an issue that caused Studio to lag and increase memory consumption when using the right-click "Insert operator" popup menu in the Process panel.
Fixed broken replacing when moving data entries to a different repository (the entry was duplicated instead)
Auto Model: remote executions now show a new submission screen which only allows resetting Auto Model to load the results, avoiding problems with multiple remote submissions within the same session
Auto Model: reordering the columns in the column selection table no longer leads to graphics problems
Time Series: Fixed a bug in Extract Peaks that caused all "_position" features to have an offset of 1 to the Example number
Known issues:
One Hot Encoding does not produce the desired results. This will be fixed with the next patch release.
Special notes:
Columns of type "Integer" that were previously stored as integers are now stored as their double representation. This of course means more range (~53 bit precision), but also means that values are no longer capped. This might have an impact when storing data to disk and rereading it.
Columns of type "Date" no longer store the milliseconds due to the new file format. This might have an impact on equality tests and matching when storing data to disk and rereading it.
Visualizations that have been created locally for data sets stored in repositories will not be found anymore after the update, causing the result visualization to reset to its default. If you have set up complex visualizations that you absolutely want to restore, you can follow these steps:
Open the data set in the Results view of RapidMiner Studio.
Navigate on your disk via your filesystem explorer into the "USER_HOME/.RapidMiner/internal cache/content mapper" folder. There you can find a folder structure matching your repository names and structure.
Find the exact path to the data set (e.g. "C:/Users/xyz/.RapidMiner/internal cache/content mapper/Local Repository/Charts/Demo/12. Pie")
You should see a very similar path right next to it, either ending in ".ioo" or ".rmhdf5table" (e.g. "C:/Users/xyz/.RapidMiner/internal cache/content mapper/Local Repository/Charts/Demo/12. Pie.ioo")
Go into the folder from step 3 (the one without the .ioo ending), and copy the "pc.json" file from it to the folder from step 4 (the one with the .ioo ending)
Close the data set in the Results view
Open it again. It should now have its configuration back!
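The manual copy in the steps above could also be scripted. The following Python sketch assumes the folder layout described in those steps; the function name `restore_chart_config` is hypothetical, and the script is an illustration rather than a supported tool.

```python
import shutil
from pathlib import Path


def restore_chart_config(old_entry_dir: Path) -> Path:
    """Copy pc.json from the old cache folder (step 3) into its sibling
    folder that ends in .ioo or .rmhdf5table (step 4).

    Returns the path of the copied file.
    """
    source = old_entry_dir / "pc.json"
    if not source.is_file():
        raise FileNotFoundError(f"No pc.json found in {old_entry_dir}")

    # The new folder sits right next to the old one and only adds a suffix.
    for suffix in (".ioo", ".rmhdf5table"):
        target_dir = old_entry_dir.with_name(old_entry_dir.name + suffix)
        if target_dir.is_dir():
            return Path(shutil.copy2(source, target_dir / "pc.json"))

    raise FileNotFoundError(f"No sibling folder with a known suffix next to {old_entry_dir}")
```

A call would pass the path found in step 3, for example `restore_chart_config(Path("C:/Users/xyz/.RapidMiner/internal cache/content mapper/Local Repository/Charts/Demo/12. Pie"))`, between opening the data set (step 1) and closing it again (step 6).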
Development:
The introduction of versioned projects (backed by Git) has forced a major redesign of the Repository API. Up until 9.7, a RepositoryLocation was represented by a string like "//RepositoryName/folder/test" and "test" was guaranteed to be unique. It was either a folder, a process, an ioobject (data) object, or a blob. This is no longer the case!
Since collaboration with Git can introduce naming conflicts which are not actually file-level conflicts (so Git is fine with them), we had to allow these "non-conflicts" into the Repository world as well.
Now a repository location that ends with "test" as the last path element can either depict a folder (RepositoryLocationType#FOLDER), or data (RepositoryLocationType#DATA_ENTRY). Sometimes this is unknown, which is also fine: RepositoryLocationType#UNKNOWN can be used in that case. However, it does not stop there. Since for Git, "test.rmp" and "test.ioo" are also perfectly fine, we had to go one step further and also allow that. Therefore, a RepositoryLocation now also has an expected DataEntry (sub-)type which is used to determine what specific type of a DataEntry to locate (a ProcessEntry, an IOObjectEntry, a ConnectionEntry, or a BinaryEntry).
You can even end up in the undesirable situation of having a "test.ioo" and a "test.rmhdf5table" (both IOObjects) in the same location. Because we cannot determine which IOObject a process should potentially use, these situations must be rectified by the user - the Retrieve operator will throw an error in that case! Looking at the data and renaming one of the entries will work fine, though. This scenario can only happen after a Git pull with the new versioned projects.
In other words, "test" can in our example now be a folder, a process, a data ioobject, a connection entry, or a binary entry. And they can all exist at the very same time in the very same folder. So be sure to specify in the new RepositoryLocationBuilder what exactly you want from the repository, or you may end up getting the first name match it finds, which may be of an unexpected type.
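The ambiguity described above can be illustrated with a small conceptual model. The Python sketch below is not the RapidMiner Repository API; all names (`LocationType`, `EntryKind`, `Entry`, `locate`) are hypothetical stand-ins that only mirror the idea of a location type plus an expected data entry subtype.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LocationType(Enum):
    FOLDER = "folder"
    DATA_ENTRY = "data_entry"
    UNKNOWN = "unknown"


class EntryKind(Enum):
    PROCESS = "process"
    IOOBJECT = "ioobject"
    CONNECTION = "connection"
    BINARY = "binary"


@dataclass(frozen=True)
class Entry:
    name: str
    location_type: LocationType
    kind: Optional[EntryKind] = None  # folders have no data subtype


def locate(entries: List[Entry], name: str,
           location_type: LocationType = LocationType.UNKNOWN,
           kind: Optional[EntryKind] = None) -> Optional[Entry]:
    """Return the first entry matching the name and any requested type."""
    for entry in entries:
        if entry.name != name:
            continue
        if location_type is not LocationType.UNKNOWN and entry.location_type is not location_type:
            continue
        if kind is not None and entry.kind is not kind:
            continue
        return entry
    return None


# Three entries share the name "test" in the same folder.
folder_content = [
    Entry("test", LocationType.FOLDER),
    Entry("test", LocationType.DATA_ENTRY, EntryKind.PROCESS),
    Entry("test", LocationType.DATA_ENTRY, EntryKind.IOOBJECT),
]
```

In this model, `locate(folder_content, "test")` returns the folder simply because it is listed first, whereas `locate(folder_content, "test", LocationType.DATA_ENTRY, EntryKind.PROCESS)` returns the process. This is the reason for stating the expected type explicitly.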
Repositories now distinguish between data and folders, and even between different data subtypes (process, ioobject, connection, binary entry) which means you can have a folder called "A" and e.g. a process called "A" at the same time. This has implications for a large number of APIs, most notably:
com.rapidminer.repository.Repository interface:
locateFolder(String) and locateData(String, Class) have been added and can be implemented, their default implementation points to the RepositoryManager()#locateFolder(String) and locateData(String, Class

New in RapidMiner Studio 9.6.0 (Jun 5, 2020)

New Features:
Added buttons for copying/pasting the active process to the process toolbar.
Equalize Time Series:
Added two new operators (Equalize Numerical Indices and Equalize Time Stamps) which provide the functionality to equalize input time series. The output time series will have new equidistant index values. The operators provide different possibilities to configure the number of examples, the start value and the stop value and the step size of the new index values. The corresponding values of the output time series are computed by using a Replace Missing Values (Series) operation.
Equalize Numerical Indices: Equalize numerical indices into equidistant numerical indices with a numerical step size.
Equalize Time Stamps: Equalize date-time indices into equidistant date-time indices. Either with an exact duration (with millisecond precision) as the step size, or with a period (multiple of days, weeks, months or years) as the step size.
Peak Transformations:
Added two new operators (Z-Score Peak Transformation and Highest Peak Transformation) which perform a peak detection and transformation on time series. They detect peaks in a time series and add an indicator peak series (with the values -1,0,1 as peak flag values) and a peaked series (original values if a peak was detected, missing for non-peak areas).
Z-Score Peak Transformation: performs the peak detection by calculating the local mean and standard deviation and identifies values as peaks when they have a large deviation to this local mean
Highest Peak Transformation: performs the peak detection by dividing the time series in different areas and checking if local minima and maxima are valid peaks or only noise effects.
Peak Feature Extraction:
New operator Extract Peaks which performs a peak detection (by utilizing one of the new Peak Transformation operators) and extracts features describing the peaks
Added optional custom endpoint parameter to Amazon S3 connections. This enables you to use an S3 API compatible storage service other than Amazon S3.
Deployments / Model Ops:
All custom prediction models are now supported in model ops, i.e. models created with the Design view, in addition to Auto Model models
Grouped models are now supported as well which allows combinations of preprocessing models with a prediction model
Model Simulator in Deployments now uses raw data columns as input and performs data prep on the fly
Added a setting that controls whether scores should be explained (about 100x faster without). New deployments will have this disabled by default, existing deployments enabled
Show if scores should be explained in overview table
Model Ops initialization now happens in the background and no longer blocks the UI start of RM if a remote location is not available
Some speed improvements for model ops (fewer objects are loaded from repos, which makes things a bit faster for remote deployments)
Model Simulator operator now also supports grouped models
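The idea behind the Z-Score Peak Transformation listed under Peak Transformations above (local mean, local standard deviation, and a deviation threshold) can be sketched as follows. This is a generic illustration, not the operator's algorithm: the function name, the use of the preceding `window` values as the local neighbourhood, and the default threshold are all assumptions.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence, Tuple


def z_score_peaks(series: Sequence[float], window: int = 5,
                  threshold: float = 3.0) -> Tuple[List[int], List[Optional[float]]]:
    """Return (peak flags, peaked series).

    Flags are 1 for a positive peak, -1 for a negative peak and 0 otherwise.
    The peaked series keeps the original value at peaks and None elsewhere.
    """
    flags: List[int] = []
    peaked: List[Optional[float]] = []
    for i, value in enumerate(series):
        history = series[max(0, i - window):i]  # assumed local neighbourhood
        flag = 0
        if len(history) >= 2:
            local_mean = mean(history)
            local_std = pstdev(history)
            if local_std > 0:
                z = (value - local_mean) / local_std
                if z > threshold:
                    flag = 1
                elif z < -threshold:
                    flag = -1
        flags.append(flag)
        peaked.append(value if flag != 0 else None)
    return flags, peaked


series = [1, 2, 1, 2, 1, 20, 1, 2, 1, 2, -30]
flags, peaked = z_score_peaks(series)
```

With this input only the values 20 and -30 deviate strongly enough from their local mean to be flagged.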
Enhancements:
Connections to external data sources like Cassandra or MongoDB are now properly re-used (within reason) and closed when a process is finished. This should lead to fewer connections to an external data source when using loop constructs, as well as properly closed connections after a process is finished.
Windows and OS X builds now ship with OpenJDK (version 8u232)
Added new timezone parameter to JDBC connections. Note: date handling in databases (and generally) is a tricky subject, and there are quite a few ways to make mistakes while doing so. Some databases/JDBC drivers also don't implement date handling properly. Last but not least, keep in mind that a date_time/date is a fixed point in time, but when it is displayed in a more human-readable format than "milliseconds since 01-01-1970 UTC", the display string converts that instant to your display timezone. So even if, for example, a date is the 13th of Jan in UTC, you may see the 14th of Jan when viewing it in Australia, due to the display timezone offset. The actual point in time (milliseconds since 01-01-1970 UTC) however would be identical. See documentation for further information.
When parsing a string to time with Nominal to Date, the associated timestamp now represents that time on the 1st of January, 1970 instead of 1st of February 1970
Added Default User-Agent setting to Preferences / System
Updated MariaDB JDBC driver
You can now see which Java version is being used when looking at the "About" dialog
Improved meta data warning in case the time series attribute selection of time series operators is empty
Added option to autodetect S3 region in Amazon S3 connections
Improved Google Cloud Services connection UI
File chooser icons on OS X are now also supporting HiDPI
When removing a repository, the repository.xml file now gets updated immediately
Visualizations: Tick interval input field now allows setting much larger values for datetime axes as it uses milliseconds as the unit to split the chunks
Updated the Step by Step In-Product Tutorial content
Added more search tags to various performance and aggregation operators
Improved error message when an error occurs during download/deserialization of data from a remote repository
Improved error message when SSL certificate was invalid when attempting to connect to a RM Server repository.
Improved logging when trying to connect to a RM Server and unusual exceptions occur, e.g. more details about why SSL connection failed, what the network problem is, etc.
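The point of the JDBC timezone note above, that the stored instant stays fixed while the displayed date depends on the display timezone, can be checked with a short Python sketch. The fixed UTC offsets used here are assumptions chosen for illustration and ignore daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def epoch_millis(moment: datetime) -> int:
    """Milliseconds since 01-01-1970 UTC for a timezone-aware datetime."""
    return int(moment.timestamp() * 1000)


def display_date(moment: datetime, utc_offset_hours: int) -> str:
    """Render the calendar date of the instant in a fixed-offset display timezone."""
    display_zone = timezone(timedelta(hours=utc_offset_hours))
    return moment.astimezone(display_zone).date().isoformat()


# One fixed instant: the evening of 13 Jan 2020 in UTC.
instant = datetime(2020, 1, 13, 20, 0, tzinfo=timezone.utc)

# The same instant converted to a display timezone ten hours ahead of UTC.
shifted = instant.astimezone(timezone(timedelta(hours=10)))
```

Here `display_date(instant, 0)` gives `2020-01-13` and `display_date(instant, 10)` gives `2020-01-14`, while `epoch_millis(instant)` and `epoch_millis(shifted)` are identical.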
Bugfixes:
Fixed issue that could cause Studio to stop starting and be stuck at the splash screen forever.
Fixed an issue where storing datasets in a database using the automatically created primary key was not possible.
Declare Missing Value no longer crashes if the expression mode is selected and the expression itself returns a missing value. Instead, it will evaluate to false and thus NOT set a missing value for that row.
Fixed models and other IOObjects coming from extensions not being identified correctly in Server repositories.
Fixed Auto Model not being able to use results of a Join operator in some cases.
Fixed broken properties when storing data tables in rare cases.
It is no longer possible to create RapidMiner Server repositories with an invalid name.
Filter Examples now correctly resolves all macros in parameters, including in custom filter attribute names.
Fixed error that could sometimes cause result tables not being able to move to Auto Model via the button in the Results tab.
Fixed an issue that caused Visualizations to not appear on certain Linux systems.
Fixed file chooser icons on OS X.
Fixed bug for scoring in Deployments: if column types are incompatible, they are actually dropped now (which was documented as such but did not happen)
Auto Model will now be restored if the user cancels a deployment by closing the deployment dialog
Other:
It is no longer possible to create legacy connections and other connections which have been replaced with the new repository connection objects in RapidMiner 9.3. Existing connections can still be edited and used, but this functionality will be removed eventually as well. Make sure to migrate existing legacy connections to repository connection objects! See documentation for reference.
Development:
Added caching for connections based on ConnectionAdapterHandler to reduce the connection count and make it possible to clean up connections once they are no longer needed (e.g. the process is finished).
GlobalSearch is no longer available in headless mode (aka command line, job container execution, etc)

New in RapidMiner Studio 9.5.1 (Nov 20, 2019)

New in RapidMiner Studio 9.5 (Nov 6, 2019)

New in RapidMiner Studio 9.4.1 (Sep 30, 2019)

New Automated Model Ops
Follow the fully automated data science path: prepare your data using Turbo Prep, create prediction models via Auto Model and finally put them into production with Model Ops.
Deploy the most promising models with one click and score new data via flexible web services or in the UI.
Track model performance on an intuitive dashboard and swap easily to the best performing one. Set up an email alert to get notified if a model outperforms the one in production.
Evaluate each model with respect to its financial impact instead of pure Data Science metrics.
Detect changes in data and their impact on model performance early to address problems.
Use our integrated dashboard to keep track of data drift and model performance.
New map visualizations:
Visualize geospatial data with the new map visualizations. You can choose from multiple map types with many different configuration options, as well as dozens of maps for geographic regions, continents, and countries. Available map types:
Choropleth maps: Used to display numeric values associated to regions (e.g. a country or a state) via a color gradient
Categorical maps: Used to visualize regions that belong to a number of distinct categories
Point maps: These maps offer latitude and longitude support and display a marker for each coordinate on the selected map
New charts:
Three new chart types have been added in addition to some tweaks and fixes to the existing charts:
Sunburst chart
Chord diagram
Parliament chart
Improved Auto Model:
Auto Model features several improvements under the hood as well as a few more visible enhancements:
All predictive processes generated by Auto Model are now much cleaner, well-structured, and much easier to understand.
Cost-sensitive learning has been added to show the costs / benefits in the validation result. This makes it possible to solve problems (e.g. fraud detection) that involve highly imbalanced data sets (e.g. credit card transaction data).
New data prep and modeling capabilities:
Several new operators have been added to ease and enhance data preparation and machine learning:
New operators Replace All Missings, Handle Unknown Values, One Hot Encoding and Append (Robust) to easily prepare data for modeling and scoring.
New operator Rescale Confidences (Logistic) to rescale confidences even for classification with more than two classes.
New operator Cost-Sensitive Scoring: Novel approach for cost-sensitive learning which works for more than two classes.
New operators Multi Label Modeling and Multi Label Performance to train and validate a combined model for multiple label columns in a single step.
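To give a feeling for cost-sensitive scoring with more than two classes, the sketch below applies the textbook minimum-expected-cost rule. This is a generic technique and not necessarily the approach used by the Cost-Sensitive Scoring operator; the function name, the cost values, and the class names are hypothetical.

```python
from typing import Dict, Tuple


def min_expected_cost_class(
    confidences: Dict[str, float],
    cost_matrix: Dict[str, Dict[str, float]],
) -> Tuple[str, Dict[str, float]]:
    """Pick the prediction with the lowest expected cost.

    confidences: model confidence per actual class.
    cost_matrix[predicted][actual]: cost of predicting `predicted`
    when the true class is `actual`.
    """
    expected = {
        predicted: sum(costs[actual] * confidences[actual] for actual in confidences)
        for predicted, costs in cost_matrix.items()
    }
    best = min(expected, key=expected.get)
    return best, expected


# Hypothetical three-class example: missing a fraud case is expensive.
confidences = {"ok": 0.7, "suspicious": 0.1, "fraud": 0.2}
cost_matrix = {
    "ok":         {"ok": 0.0, "suspicious": 2.0, "fraud": 10.0},
    "suspicious": {"ok": 1.0, "suspicious": 0.0, "fraud": 3.0},
    "fraud":      {"ok": 4.0, "suspicious": 1.0, "fraud": 0.0},
}
best, expected = min_expected_cost_class(confidences, cost_matrix)
```

Although "ok" has the highest confidence, the expected costs are about 2.2 for "ok", 1.3 for "suspicious" and 2.9 for "fraud", so the rule selects "suspicious".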
Enhanced time series forecasting:
New operators have been added for:
Forecasting multiple horizons of a time series with any machine learning model (Multi Horizon Forecast)
Validating performance of multi horizon forecasts (Multi Horizon Performance)
Sliding window validation for time series data science problems
Enhanced data source connection framework:
All RapidMiner-supported connectivity extensions on the Marketplace now use the new data source connection framework, which includes handling connections to
MongoDB
Cassandra
Splunk
Solr
Mozenda

New in RapidMiner Studio 9.1.0 (Dec 14, 2018)

New Features:
The Aggregate operator now offers a percentile function; the percentile can be changed in the aggregation attributes functions list. It is possible to use an integer like 75 or a floating point value like 80.5 here. A macro can also be used.
SSL certificates stored in .RapidMiner/cacert are now trusted on startup. See trust-certificates for more information.
Added support to open operator tutorial processes directly from the web.
Split the setting to keep operators connected upon disabling or deleting them into these settings:
Drop or bridge operator connections upon deletion
Drop, bridge or keep connections upon disabling
Enhancements:
The "Import Data" dialog for CSV files will try to guess the best matching date format and preselect date for attributes that contain mostly matching date entries
The "Import Data" dialog for Excel files now differentiates between date, time and datetime columns specified in Excel
Improved CSV import wizard to use the structure found in the header or starting row
Parse Numbers and the Data Import wizards now support exponents in numbers with a leading '+' for positive exponents, e.g. "5.9876E+7"
Improved Cross Validation error handling when the Performance port is not connected
The XML Panel no longer hides default values
Split thread settings in foreground and background threads (for the currently opened process and processes running in the background, respectively)
Updated bundled Java for Windows and OS X to version 8u181. This should fix right-click issues on OS X
Added support for aggregation functions for Pivot operator and improved performance
When moving operators in the Process view, connected operators will be rearranged and moved to the right if necessary
Bugfixes:
For large ExampleSets with more than ~71.5 million rows, the result table will compress the height of each row a bit to accommodate them. Data sets with more than ~86 million rows will only display the first ~86 million rows and show a warning that the rest is cut off.
Fixed an issue that could cause Studio to be stuck for up to ~2 minutes on start-up.
Fixed very rare process error when working with attribute weights.
X-Means item count of cluster model will now show the correct size.
Fixed an issue where (temporary) Access files could not be deleted in a RapidMiner process.
Development:
Added registerLanguage method to the I18N class, which allows adding new languages to the Settings->Preferences->Language selection. The i18n is picked up by providing resource bundles in the usual form, for example GUI_ja.properties and Error_ja.properties. If you want to get a list of not-yet-translated keys, add a file called translation_help.txt in your .RapidMiner folder. After you shut down Studio with your new language selected, it will write all keys for which it did not find a translation into that file. This should help you identify keys that you still need to translate.
Added the OperatorPortActionRegistry to add actions to operator ports.
Added identifier for last delivering port to the IOObject's userdata via IOObject.getUserData(DeliveringPortManager.LAST_DELIVERING_PORT)
Added support for parameter dependencies and hidden state to the settings dialog.

New in RapidMiner Studio 9.0.3 (Oct 4, 2018)

New in RapidMiner Studio 9.0.2 (Sep 5, 2018)

New in RapidMiner Studio 9.0.1 (Aug 14, 2018)

New in RapidMiner Studio 9.0.0 (Aug 8, 2018)

NEW FEATURES:
Added TurboPrep, your interactive data preparation in a data-centric UI
Added new Time Series functionality
Added support for Google Cloud Storage with Read Google Storage, Write Google Storage, and Loop Google Storage operators. They work similarly to their existing Amazon S3 and Azure Blob Storage counterparts.
Added new online repositories which contain up-to-date help content. These contents are used by our online educational materials.
Added concatenation function to Generate Aggregation
Added a new "admin configuration" feature (documentation here):
Operator Blacklisting
Extension Whitelisting
Telemetry
Studio Settings
ENHANCEMENTS:
Global Search results can now be navigated by keyboard
Operators can now be renamed by double-clicking on their name (indicated by a text cursor)
Improved operator renaming visuals when zoomed in/out of the process
Process panel in Design view can no longer be closed
Updated behavior for Result History panel outside of Result view
Uncloseable panels no longer have close buttons
Updated import wizards for Read CSV and Read Excel operators to make them consistent with the Add Data repository action
Added Remove All Breakpoints entry to Edit menu and right click context menus
A warning is shown for correlation matrices that could not be calculated
Improved the guessing for type of Quotes during CSV import
Improved the guessing on decimal separator in CSV import
Twitter operators now correctly warn about the rate limit when it is exceeded instead of throwing a generic error
Hyperlinks in process notes are now clickable and open the default browser
Repository actions that need write access are now grayed out when a read-only entry is selected
Inserting an operator via Global Search will now correctly grant focus to the Process panel, so you can immediately use the keyboard to manipulate the operator
Added workaround for a bug in the Amazon Redshift JDBC driver so that it can be used now
Saving a process in a read-only repository now offers the SaveAs dialog instead
Repository location chooser (for opening and for saving) no longer sometimes appears as a separate instance of RM Studio in the operating system taskbar
BUG FIXES:
Clicking on a selected operator no longer sometimes selects an operator behind it
Fixed process panel sometimes being opened in other views
Fixed an issue where icons did not show up on Retina displays
Updated vulnerable libraries
Fixed potential UI freeze during the Import Data process
A rare error concerning parallel loops in combination with Generate Attributes was fixed
Fixed an issue that caused RapidMiner Studio to always start in fullscreen mode on Mac OS X
Fixed results view not showing the latest result as the active tab
DEVELOPMENT:
Added callback hook for DataImportWizardBuilder. The caller can use the callback to determine what should happen after the user has concluded the data import.

New in RapidMiner Studio 8.2.1 (Jul 6, 2018)

New Features:
Added possibility to disconnect from RapidMiner Server repositories
Enhancements:
Edit Access Rights dialog is now read-only if the user does not have enough permissions to make changes
The Generate Weight Stratification operator now warns about mismatching data
Updated tutorial process for Loop Attributes
Bug fixes:
Fixed broken preview when using the Guess value types or Reload data buttons in the Import Configuration Wizard of the Read Excel and Read CSV operators, after manually changing the attribute selection or an attribute role.
Fixed a metadata problem with the Singular Value Decomposition operator showing the wrong type of preprocessing model.
Fixed a bug causing Aggregate to concatenate the same value multiple times even though only distinct was set.
It is no longer possible to toggle breakpoints if Process panel is not visible.
Write CSV is no longer writing Integer values as floating points.
Updated mode aggregation function of Aggregate to take missing values into account.
Remember can now be used in every iteration of a parallel operator, instead of only the last. No execution order is guaranteed.
The New Revision server repository action no longer blocks the UI.
Fixed bug preventing SVM Kernel Scatter Plot from displaying certain variables.
The macro command line argument -M now works as expected when passed to the rapidminer-batch.bat launcher.
Fixed rare bug that could occur when looking at a subprocess of a parallel operator while zoomed out and trying to run the process.
Fixed pass through port of the Correlation Matrix operator (returned a subset of the input for some data sets).
Fixed missing visual indicator in the top bar for the currently selected view when resizing RM Studio horizontally.
Fixed spelling error in Direct Marketing template.
Fixed spelling error for mikro/makro.
Fixed a problem using undo/redo during a tutorial.
Fixed a rare bug that might occur on restoring a process on startup.
Fixed an uncommon bug where views would break when switching between them too quickly.
Fixed bug making Apply Threshold use the wrong mapping.

New in RapidMiner Studio 8.2.0 (May 10, 2018)

Enhancements:
Double-click on an unconnected operator port will connect it to a matching output port of the process.
The menu View -> Show Panel is now scrollable.
Updated visualization of tutorial's next button to go to next tutorial or back to tutorial overview when reaching the end of a tutorial or a chapter respectively.
Removed search button from search bar and changed result dialog to open with one-click logic.
Creating a RapidMiner Server repository no longer stores the credentials automatically. However, if desired you can still do so by selecting the "Remember Password" checkbox when creating the repository.
Panels now always have proper tooltips.
Improved visualization of nested Operators.
Added a primary parameter mechanic to some operators: double-clicking an operator now opens the editor of its primary parameter. This also works for operators that have subprocesses; in that case, hold the Alt key while double-clicking to open the primary parameter.
Quickfixes now can be directly accessed after a process run fails from the error bubble.
Improved performance of FP Growth and added support for additional input formats.
The status bar (found at the very bottom of RM Studio) now more precisely displays possible actions when editing a process.
Pressing the arrow keys in the process panel when no operators are selected will now select the first operator.
Bug fixes:
Parallel operators now produce identical results when running in parallel and when running sequentially
Removed several sources for redundant undo steps
Fixed a bug that could lead to incomplete output of Execute Program
Fixed and improved on generic process runtime errors
Fixed erratic behaviour of EMClusterer
Date to Nominal no longer removes the role of the selected attribute
Fixed a bug where results from Data to Similarity Data could not be processed further
Fixed an issue that could result in the "Drag here" annotation being shown in the process all the time when using the Global Search
Fixed a bug that allowed operators to connect to themselves
Fixed Web Analytics template

New in RapidMiner Studio 8.1.3 (Apr 20, 2018)

New in RapidMiner Studio 8.1.1 (Mar 8, 2018)

New in RapidMiner Studio 8.1.0 (Feb 6, 2018)

Model Wizard and Explorer:
A new working mode for rapid creation, comparison, and exploration of new models. The Modeling Wizard will save you a lot of time in creating processes for multiple models.
Global search:
Find anything within your repository and the operator list using a central search engine: processes, models, operators, extensions… even your past actions! No need to search through your entire folder structure any more: everything is now at hand!
Security:
User passwords are now hidden and replaced by stars as they are typed.
Passwords are now kept encrypted in the .RapidMiner user folder.
Improved performance:
We have refactored a few operators, including Join, Correlation Matrix and K-Means, to drastically improve performance, with speed increases of up to 10x.
New Features:
Added Auto Model feature, a new working mode for rapid creation, comparison, and exploration of new models. It can be found as a new view at the top.
Added a powerful global search functionality which can be found in the top-right corner and activated via Ctrl+F shortcut. You can currently search for operators, repository contents, UI actions, and Marketplace content. See the documentation for more information if you are interested in more complex and powerful search queries (e.g. finding data/models that contain a specific attribute, or were last modified before a certain date, etc).
Enhancements:
New Process Templates upgraded to use the latest operator versions.
Read Excel now allows sheet selection by name.
Read CSV, Read XML and Read Excel have a new expert parameter read all values as polynominal, which allows the user to disable type guessing.
Hide passwords in the Password Manager dialog and store them with a stronger encryption.
Search Twitter and Get Twitter User Statuses now support 280-character tweets.
All Twitter operators moved from numerical to nominal attributes for user and status IDs.
Made the Views display at the top more dynamic on resizing to prevent squashed GUI elements for low(er) resolutions and to show more views for high(er) resolutions. To achieve this, both the Undo and Redo buttons for process editing were removed. You can still undo/redo via the top Edit menu, or by pressing Ctrl+Z/Ctrl+Y, or even via the new global search by searching for Undo or Redo.
Bug fixes:
Secured XML parsing against XXE vulnerability
Fixed a rare error when logging inside parallel operators
Fixed problem that caused Parse Numbers to fail if input was an empty value
Fixed a rare error when running Join, Replace Missing Values, or Add inside a parallel loop
Fixed handling of polynominal attributes in Apply Model when applying a Cluster Model
Updated Regularized/Linear/Quadratic Discriminant Analysis to avoid uncaught errors and give more information if an error occurs
Fixed uncaught Runtime Exception when using Loop Parameters and Optimize Parameters (Grid) with log_all_criteria
Fixed issues with duplicated or missing entries, as well as missing groups in the Manage Connections dialog
Refreshing folders in a RapidMiner Server repository no longer blocks the entire Studio interface
Renaming entries in a RapidMiner Server repository no longer blocks the entire Studio interface
Pressing Ctrl-A in an empty process no longer makes the process parameters disappear
Hotkeys for view switches now work properly from all views
Upgraded MSSQL JDBC driver to version 4.2
Upgraded PostgreSQL JDBC driver to version 42.2.1
Development:
The Global Search feature is highly flexible and open to extensions - look at com.rapidminer.search.GlobalSearchable and com.rapidminer.gui.search.GlobalSearchableGUIProvider to get started!
Unsigned 3rd party extensions can now call ParameterService#setParameterValue(String, String) without causing a SecurityException
Please note: We have accumulated lots of outdated code over the years. Anything that is annotated with @Deprecated will be removed at some point in the future. Removal will start with RapidMiner Studio 9.0, so please prepare your extensions by not using any deprecated code anymore. JavaDoc will help guide you to replacement classes/interfaces/methods.

New in RapidMiner Studio 8.0.1 (Jan 15, 2018)

New in RapidMiner Studio 8.0.0 (Dec 4, 2017)

New in RapidMiner Studio 7.6.1 (Sep 6, 2017)

New in RapidMiner Studio 7.6.0 (Sep 6, 2017)

New features:
Sending notification emails can now be configured in the preferences to make use of all modern connection security and authentication mechanisms like TLS 1.2 + PFS
Enhancements:
The sender of notification emails can now be configured in the preferences
Licenses are now valid for the full last day until midnight
Improved handling of infeasible parameter values for Self-Organizing Map
Changed default sampling type parameter for Validation operators to automatic
Write Message now has a parameter option to append to existing files instead of overwriting them
Logistic Regression and Generalized Linear Model learners now have a threshold output where they deliver a threshold value optimized for maximal F-measure
Improved handling of missing and infinite values for Normalize
Improved handling of missing or broken compatibility numbers in the process xml
Made behavior of add as label parameter consistent for all cluster operators
Improved checks for empty example sets in cluster operators
Improved shown capabilities for cluster operators and added quick fixes for inconsistent parameter selection
Reduced some internal logging by moving it behind the debug flag which can be activated in the preferences
Updated Java for Windows and Mac OS X to version 8u141
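The threshold output listed above for Logistic Regression and Generalized Linear Model is described only as "optimized for maximal F-measure". As a rough illustration of that idea, and not of how the operators compute it, the following sketch tries every observed confidence as a cut-off and keeps the one with the highest F-measure. The function names and the sample data are assumptions made for this example.

```python
def f_measure(labels, confidences, threshold):
    """F-measure (F1) when predicting positive for confidence >= threshold."""
    pairs = list(zip(labels, confidences))
    tp = sum(1 for y, c in pairs if c >= threshold and y)
    fp = sum(1 for y, c in pairs if c >= threshold and not y)
    fn = sum(1 for y, c in pairs if c < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(labels, confidences):
    """Brute-force search: try each distinct confidence as the cut-off."""
    candidates = sorted(set(confidences))
    return max(candidates, key=lambda t: f_measure(labels, confidences, t))


# Hypothetical data: True marks the positive class.
labels = [True, True, False, True, False, False]
confidences = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
print(best_threshold(labels, confidences))  # 0.4
```

With this data the best cut-off is 0.4 rather than the conventional 0.5, because lowering the threshold recovers the third positive example at the cost of only one false positive.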
Bug fixes:
Fixed reproducibility of results when concurrent operators (e.g. Loops) are involved.
Changing the default connection timeout setting in the preferences now takes effect immediately.
Sending notification emails now uses the default connection timeout.
Fixed metadata of Flatten Clustering.
Fixed behavior of Loop Parameter inside parallel loops.
Removed unnecessary warning for clustering operators with nominal input data
Generate Weights (LPR) and Local Polynomial Regression now provide additional kernel parameters for the numerical measure KernelEuclideanDistance instead of failing
Fixed the Gradient Boosted Trees renderer; it no longer shows wrong edge labels and incorrect value sets
Logistic Regression, Generalized Linear Model, Gradient Boosted Trees and Deep Learning operators no longer crash the software if certain temporary folder permissions are missing
Logistic Regression and Generalized Linear Model learners now use 0.5 as the threshold, like other binominal learners
Fixed behavior of Loop Attributes when only one attribute is selected for parallel execution
Fixed Average for Performance inputs that contain AUC
Fixed side-effects of Apply Threshold in other branches of the process
Fixed rare crash in Create Association Rules under certain parameter configurations

New in RapidMiner Studio 7.5.3 (Jun 30, 2017)

New in RapidMiner Studio 7.5.1 (May 10, 2017)

New in RapidMiner Studio 7.5.0 (May 3, 2017)

New features:
The first iteration of the new data core, which manages data sets in a much more efficient way, has arrived! This results in both better performance and less memory usage for the vast majority of operators.
Added support for Microsoft Azure Blob Storage with Read Azure Blob Storage, Write Azure Blob Storage, and Loop Azure Blob Storage operators. They work exactly like their existing Amazon S3 counterparts.
Added support for Amazon Key Management Service (AWS KMS) for all Amazon S3 operators. You can now optionally add an encryption key id to your Amazon S3 connection to decrypt/encrypt files when working with Amazon S3.
Added a new mechanism to provide help, advice messages, and even important announcements to the user.
Enhancements:
Completely revised result graph interaction, presentation, and visualization (e.g. decision trees, clusters, etc.).
It is now possible to highlight the path to a node of a decision tree in the Results view.
Cluster nodes in the Results view are now scaled according to their relative size.
Undo and redo functionality is now much more intuitive when working with the process canvas. It will now not only restore the process state, but also restore canvas location, operator selection, and the zoom level.
Navigating up and down through subprocesses in the UI is now more user friendly. When entering a subprocess and later going back up, you will see the same part of the process you were looking at before entering the subprocess.
Remove Duplicates now features a new output port called duplicates which returns the examples identified as duplicates.
Fixed memory leaks for Handle Exception, Select Subprocess, and Branch.
Execute Script now caches the parsed scripts for significantly faster execution, especially inside Loop operators or other highly concurrent environments. General performance of script execution has also been improved. Also added operator tags and added a default example script to make usage of the operator easier. Last but not least, error messages now include the causing stacktrace for easier debugging.
Improved AutoMLP performance.
Loading context data shows progress now.
Added new global process macro: %{process_start} which captures the timestamp when a process was started.
It is now possible to close result tabs with the same shortcut as in your web browser: Ctrl+W (Command+W on OS X)
Added new tutorials for RapidMiner Server and RapidMiner Radoop.
Added some more usable date and datetime format defaults to choose from when importing data.
Added folder buildingblocks in the .RapidMiner directory which will also be searched for .buildingblock files on startup.
The dialog letting you know about an available RapidMiner Studio update now also displays the version number of the update.
Bug fixes:
Fixed a bug making all parallel Loop operators incredibly resource hungry when running hundreds of thousands of iterations
Error bubbles indicating the source of an error in the process now work correctly in nested loops again
Removed empty confidence columns when applying the model from Linear Discriminant Analysis, Quadratic Discriminant Analysis, Regularized Discriminant Analysis, Single Rule Induction, Subgroup Discovery
Regularized Discriminant Analysis no longer ignores the alpha parameter
The median for Aggregate now takes the midpoint of the two middle values in case of an even number of values
Fixed error that made operators which use a connection (e.g. Read Salesforce) unusable after importing a process
Fixed layout of marketplace search link in operator panel
Fixed broken dialog title for package download error
Fixed broken configurable entries due to unnecessary escaping
Fixed delay when trying to view decision trees in the Results view
Fixed major memory leak for Loop, Loop Values, Loop Attributes, and Loop Files
Fixed some operator parameter help tooltips being cut off
Fixed behaviour of Fast Large Margin if learned with bias (parameter)
Fixed pdf/svg image export of the scatter matrix chart
Fixed some spelling errors
Fixed Linear Regression calculation in case use bias is not selected
Fixed confidences of Ada Boost in border cases
Logistic Regression and Generalized Linear Model no longer allow p-value calculation without adding intercept
Fixed problem when trying to delete extensions of which more than one version was installed
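The median change for Aggregate listed above follows the usual definition: with an even number of values, the median is the midpoint of the two middle values. A minimal sketch of that rule, using a standalone helper written for this example (it is not RapidMiner code):

```python
def median(values):
    """Return the median; for an even count, average the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: midpoint of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 3, 5, 7]))  # 4.0
print(median([5, 1, 3]))     # 3
```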
Developers:
Concurrency API introduced with 7.4.0 is now available for unsigned extensions
Notes:
Changes to Fast Large Margin might affect behaviour of models learned with prior versions of RapidMiner. If you have an existing Fast Large Margin model which was learned using bias, we suggest you learn the model again with this release to ensure correct predictions.

New in RapidMiner Studio 7.4.0 (Feb 14, 2017)

New features:
Processes can now be executed in the background of Studio while you work on a different process in the user interface. This feature is only available for users with a Large license.
New parallelized Loop operator.
New parallelized Loop Values operator.
New parallelized Loop Attributes operator.
New parallelized Loop Files operator.
Repository entries can now be sorted by date.
Users with Large licenses can now grant additional permissions to unsigned extensions.
Enhancements:
Added a few new templates which can be used as a starting point when creating a new process.
Improved performance of Polynomial Regression.
Improved performance of Linear Regression.
Improved error message in case a selected input attribute for an operator is of the wrong type.
Improved operator progress for Generate Massive Data and several segmentation operators.
Improved performance of LibSVM and Fast Large Margin when sparse input data is not in sparse data format.
Small performance improvements for several operators that read parameters unnecessarily often.
Performance improvement for operators that iterate over all attributes.
Optimize by Generation (Evolutionary Aggregation) no longer shows unnecessary popup.
Repository entry sorting by name now ignores capitalization.
Users with Large licenses can now grant additional permissions to unsigned extensions via a new setting in the Start-up tab in the preferences.
The Log table in the results panel now also uses the new UI look and feel.
Bug fixes:
Fixed useless cipher error when starting Studio for the very first time.
Fixed swapped title in models of Linear Discriminant Analysis and Quadratic Discriminant Analysis.
Fixed side-effects of application of preprocessing models in other branches of the process.
Fixed side-effects of Impute Missing Values in other branches of the process.
Fixed wrong behavior when dismissing confirmation dialog asking for interruption of currently running process.
Fixed Delete File not being able to handle relative paths.
Meta data calculation of Generate Nominal Data can no longer cause freezing.
Optimize by Generation (Evolutionary Aggregation) no longer does one iteration too many.
Fixed Number of threads setting having no effect for Decision Tree and Random Forest if it was set to 1 and then increased again.
Fixed rare error that could occur when displaying a grouped model in the results view.
Developers:
Added a temporary API for operators which should run in a parallelized fashion. Use the com.rapidminer.studio.concurrency.internal.ConcurrencyExecutionServiceProvider to access it.
Notes:
The existing Read SAS operator has been deprecated. There is a new SAS connector extension available on the Marketplace which provides an up-to-date replacement of the operator.
Removed the compatibility level 7.1.1 of the operators Normalize, Replace Missing Values, Replace Infinite Values, Add Noise. These operators will no longer affect other branches of the process even for processes created with compatibility level 7.1.0 or below.

New in RapidMiner Studio 7.3.1 (Dec 14, 2016)

New in RapidMiner Studio 7.3.0 (Nov 7, 2016)

Enhancements:
New parallel Cross Validation operator replaces X-Validation, Batch X-Validation, and X-Prediction.
Operator search now also searches for matching Marketplace extensions
Greatly improved Proxy UI and logic
Logistic Regression, Generalized Linear Model and Gradient Boosted Trees now return Attribute Weights output as well
Added reproducible parameter to Logistic Regression, Generalized Linear Model and Gradient Boosted Trees. If checked, the result is guaranteed to be the same, because the parallelization level is fixed.
Improved sorting for repository entries.
Performance improvement for Rule Induction and Perceptron operators.
Improved high DPI support.
Improved operator progress for Apply Model and Logistic Regression (SVM).
Improved welcome dialog layout.
Bug fixes:
Fixed NullPointerException in Logistic Regression and Generalized Linear Model with compute p-values on and solver set to AUTO on an input with a large number of nominal values
Changed the default of the max_w2 parameter of Deep Learning to 10, as the operator help describes; it also became a non-advanced parameter
Fixed some minor tutorial inconsistencies
If there is a security error, Logistic Regression, Generalized Linear Model, Gradient Boosted Trees and Deep Learning operators can recover without Studio / Server restart
Input data rebalancing in Logistic Regression, Generalized Linear Model, Gradient Boosted Trees and Deep Learning now depends on the (configurable) number of threads rather than the number of cores
Logistic Regression, Generalized Linear Model, Gradient Boosted Trees and Deep Learning operators are now loaded even if javafx package is missing from the Java Runtime Environment
Fixed multiple problems with the GSP operator
Operator progress now vanishes if operator is successfully stopped
Fixed operator progress animation being stuck sometimes
Fixed import excel data UI issues on Mac OS X
In-Hadoop scoring of Logistic Regression, Generalized Linear Model, Gradient Boosted Trees and Deep Learning models in RapidMiner Radoop no longer logs a message for each row, which significantly improves performance
Development:
Added a centralized API for data table creation: From now on a new ExampleSet should be created via an ExampleSetBuilder provided by the ExampleSets class instead of using MemoryExampleTable
Tweaked project structure for the open source core. This does not affect the functionality of RapidMiner Studio.

New in RapidMiner Studio 7.2.3 (Oct 11, 2016)

New in RapidMiner Studio 6.5.001 (Sep 21, 2015)

New in RapidMiner Studio 6.5.000 (Sep 21, 2015)

Expression engine now offers clearer interface, simpler syntax, and significant performance gains
Improved pre-flight check and runtime error messages
Hive Connector
Enhancements:
Completely overhauled problem and error notifications when running processes
All Learner Models will show an error rather than log a warning when applied on incompatible data
Repositories are now sorted by type and name
Improved churn template when using custom data
Improved performance when navigating RapidMiner Server repositories over a slow connection
Execute Process nesting depth is now limited to prevent endless loops; the maximum depth can be tweaked in the preferences
Added Netezza 7.0 JDBC support
Added a new "Move into new Subprocess" action that allows moving a group of selected operators into a Subprocess operator
Standard dialogs now support hyperlinks in the description
API: ParameterTypeText is now able to handle template text that is shown in the TextPropertyDialog if no text is set
API: Removed SassyReader and kdb dependencies, increased SLF4J API dependency to version 1.7.12
Bug fixes:
BUGFIX: Fixed possible startup problems when the _JAVA_OPTIONS environment variable is set
BUGFIX: Fixed rare cases of Studio becoming unresponsive because dialogs opened behind other dialogs
BUGFIX: When opening a process from the Server Processes view, confirmation is now required before an unsaved process is discarded
BUGFIX: Fixed rare problem when trying to save preferences
BUGFIX: Fixed some copy and paste problems of process notes
BUGFIX: Fixed Generate Data performance when selecting gaussian mixture clusters as the target function
BUGFIX: Fixed several problems when both Process and XML views were open and visible at the same time
BUGFIX: "Sample (Bootstrapping)" now duplicates examples when upsampling data
BUGFIX: Averaging of Performance Vectors can now handle additional or fewer classes after the first iteration
BUGFIX: Aggregate operator now supports non-alphanumerical attribute names for grouping
BUGFIX: Execution order is now up-to-date even if process validation has not finished
BUGFIX: Fixed computation of binary classification criteria (performance) for remapped binominal labels
BUGFIX: Decision Tree and Random Forest can now handle an unbounded number of different label values
BUGFIX: 'Principal Components Analysis', 'Generalized Hebbian Algorithm', 'Independent Component Analysis' or 'Principal Component Analysis (Kernel)' in combination with Apply Model no longer modify the original example set
BUGFIX: Decision Tree(rule) model edge labels now correctly display dates instead of Unix timestamps in the Results perspective
BUGFIX: Read Access and Write Access now work with 64-bit Java and Java 8
BUGFIX: Log operator no longer silently fails if duplicate column names have been entered
BUGFIX: Fixed rare case where the Chart view in the Results perspective was broken
BUGFIX: Fixed rare case where the date format field vanished in data import dialogs
BUGFIX: Context data is no longer loaded when the input port is not connected
BUGFIX: Generate Attributes no longer forgets roles in metadata if an attribute is overwritten
BUGFIX: Read Excel, Read CSV, and Read XML can now be stopped
BUGFIX: Metadata of Execute Process operators is no longer calculated if an endless process loop is suspected
BUGFIX: Loop Files operator now shows an error message if the directory is invalid or the user has insufficient privileges
BUGFIX: Fixed an error that occurred in Write Database with an empty JNDI name
BUGFIX: Fixed problems with reconnecting operators after the 'Replace Operator' action
BUGFIX: Fixed displayed number of combinations for integer parameters in Optimize Parameters (Grid)
BUGFIX: Fixed jumping to correct subprocess when clicking on the cause of a failed process in the error dialog
BUGFIX: Generate Attributes can now be stopped
BUGFIX: Fixed a bug that occurred when trying to install a non-existent extension via one-click installation
BUGFIX: Fixed reading of XLSX files with cells that contain mixed font formats
BUGFIX: A maximum of 100 attributes is now shown in regex dialogs to prevent GUI freezes
BUGFIX: Fixed a rare bug that occurred while refreshing a remote repository with a remote database
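The "Sample (Bootstrapping)" fix listed above concerns upsampling: bootstrapping draws examples with replacement, so a sample larger than the input must contain duplicates. The following sketch illustrates the general technique only; the function name and the seed parameter are assumptions for this example and do not reflect the operator's implementation.

```python
import random


def bootstrap_sample(examples, sample_size, seed=None):
    """Draw sample_size examples with replacement from examples."""
    rng = random.Random(seed)  # a local generator keeps the draw reproducible
    return rng.choices(examples, k=sample_size)


examples = ["a", "b", "c", "d", "e"]
# Upsampling: 8 draws from 5 examples necessarily repeats some of them.
sample = bootstrap_sample(examples, 8, seed=42)
print(len(sample), len(set(sample)))
```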

New in RapidMiner Studio 6.4.000 (Sep 21, 2015)

A new method of workflow annotation:
Collaboration among stakeholders is key for analytics initiatives and projects. With the new workflow annotation capabilities of RapidMiner Studio, you can now annotate RapidMiner processes using stickers on the Process view canvas. These stickers can be freely placed and re-sized anywhere on the canvas, including attached to individual operators.
With this tool, you can easily and visually document whole analytic processes, highlighted parts of a process, or individual steps within a process--as you build. These capabilities greatly improve collaboration among users as well as ease and streamline the maintenance and auditing of analytic processes. The new workflow annotation feature replaces the old process and operator commenting functionality. Any existing process or operator comment is automatically converted into workflow annotations when loading a process.
New extensions:
Improved R integration: RapidMiner Studio 6.4 features an improved integration of the well-adopted statistical programming language R. The integration focuses on providing the core functionality needed when combining RapidMiner with R. Now, you can execute R code from within a RapidMiner process, passing data to R and passing the result of the R code execution back to RapidMiner after executing the R script. The integration has been completely revised, resulting in not only an easier installation and configuration in RapidMiner Studio and RapidMiner Server, but also in a more stable and secure integration solution. The R integration is delivered as a new extension called R Scripting, which supersedes the earlier R Extension.
Python integration: Analogous to the R integration, RapidMiner Studio 6.4 introduces integration with the data scientist-friendly Python programming language. You can now easily integrate Python code into your RapidMiner processes. As with R, data can be passed seamlessly from RapidMiner to Python, where it can be manipulated and used for model building or charting; Python results can then be transferred back and made available in RapidMiner.
Splunk connector: RapidMiner now provides native connectivity to Splunk, a platform for storing, searching, monitoring, and analyzing machine-generated data. With the RapidMiner Studio 6.4 connector operator, you can now build Splunk data ingestion into RapidMiner processes for deeper analysis.
Extension development kit: RapidMiner Studio 6.4 makes it much simpler to develop new extensions. First, we provide an extension template on Github that users can easily clone. Using Gradle as a modern build tool, we then provide scaffolding capabilities to quickly create a new extension stub. Also provided is documentation on how to: extend RapidMiner, implement specific operators, make use of RapidMiner's data structures, and more.
One-click extension installation: With RapidMiner Studio 6.4, you can install extensions directly from the RapidMiner Marketplace website with a single click. Each Extension page displays a button that, when clicked, starts up RapidMiner Studio and then the automatic extension installation.
New Mac version of RapidMiner Studio:
The RapidMiner Studio 6.4 Mac download contains an installer app that significantly eases and accelerates Mac installation. RapidMiner Studio now feels and behaves like a native Mac application.
Enhancements:
Improved Process history view
Connections to RapidMiner Server no longer require equal license editions for Studio and Server. For example, professional-level RapidMiner Studio can now connect to Enterprise-level RapidMiner Server.
Improved visual feedback for port and connection interactions in the Process view
Drastically improved Process view performance
Cleaned up right-click context menu in the Process view
RapidMiner Server connections are now editable in RapidMiner Studio
Breakpoints in subprocesses are now indicated in the top right corner of the Process view
Dragging multiple repository entries into a process is now possible
Updated keyboard shortcuts and mouse handling improve Mac user experience
Ctrl + Backspace is now available for text inputs and deletes an entire word instead of a single character
On opening, problem display only occurs if a critical problem was detected
In Select Attribute operators, numeric conditions now ignore blank spaces
Improved error message shown when class weights are specified for classes that do not exist
Added display of release platform to the About screen
Unmanaged extensions are now also loaded from ~/.RapidMiner/extensions if not specified otherwise in Preferences
All sample processes have been updated and improved to be compatible with the current version
Added new sampling type of automatic to the X-Validation operator
Operator search only expands groups with hits inside
Operator search is case sensitive when search term starts with an upper case letter
API: Added draw decorator and event hooks for the Process view. See ProcessRendererView#addDrawDecorator() and ProcessRendererView#addEventDecorator().
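The operator search rule listed above (case sensitive only when the search term starts with an upper case letter) can be sketched as follows. This assumes plain substring matching, which the changelog does not confirm, and the function name is made up for this example.

```python
def matches(search_term, operator_name):
    """Match case-sensitively only if the term starts with an upper case letter."""
    if search_term[:1].isupper():
        return search_term in operator_name
    return search_term.lower() in operator_name.lower()


print(matches("rate", "Generate Attributes"))  # True
print(matches("Rate", "Generate Attributes"))  # False
print(matches("Read", "Read CSV"))             # True
```

Under this rule a lower case term stays permissive, while a capitalized term narrows the results to names that contain it with exactly that capitalization.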
Bug fixes:
BUGFIX: Safemode dialog on startup is no longer sometimes hidden behind other windows
BUGFIX: Update Database now closes database connections after finishing
BUGFIX: Restarting after activating a license with more memory now correctly increases available memory on Windows
BUGFIX: A more meaningful error message is displayed when an invalid numeric condition is entered as a parameter
BUGFIX: Adding new database drivers via the Manage Database Drivers dialog no longer requires a restart
BUGFIX: Fixed rare error that could prevent the Manage Database Connections dialog from opening
BUGFIX: Fixed broken parameter help content for some operator parameters
BUGFIX: Calculation of a SOM-plot can now be cancelled
BUGFIX: It is no longer possible to drag operators out of the Process view
BUGFIX: Fixed rare error that could occur during automatic operator port connection
BUGFIX: Scrolling speed in the Process view is increased
BUGFIX: Fixed duplicate entry error in the History view
BUGFIX: Fixed Guess Types operator which occasionally took only the last numerical value into account
BUGFIX: A more meaningful error message is displayed when using Add generated primary keys for writing to MSSQL databases
BUGFIX: Fixed broken Execute Process operator help
BUGFIX: Disabled zoom functionality in Histogram Charts
BUGFIX: A more meaningful error message is displayed when using the Hyper Hyper operator with invalid input
BUGFIX: Principal Component Analysis operator works when applied on special attributes with missing values
BUGFIX: Fixed Read Excel operator encoding errors on Windows 8.1
BUGFIX: In Excel import wizard, wrong-typed values are parsed as missing instead of causing an error
BUGFIX: Removed unused parameter attribute type from Discretize by User Specification operator
BUGFIX: Fixed some broken templates and sample processes
BUGFIX: Clustering models now work with special attributes that contain missing values
BUGFIX: K-Medoids operator now always uses the selected measure type
BUGFIX: Fixed rare cases of broken standard coefficients for Linear Regression operator
BUGFIX: Right-clicking an operator now selects it before opening the popup menu (Linux/Mac)
BUGFIX: When installing extensions from Marketplace, dependencies are only added if not yet installed
BUGFIX: Marketplace dialogs now always open in the correct order
BUGFIX: The date functions of Generate Attributes operator now add correct metadata for new attributes
BUGFIX: Operator text parameter dialogs (e.g., the SQL query dialog) can now be closed by pressing Ctrl + Enter
BUGFIX: The log level of the Log view is now correctly restored on each start

New in RapidMiner Studio 6.3.000 (Sep 21, 2015)

Improved Startup and Onboarding:
RapidMiner Studio 6.3 greatly improves the first-time startup and onboarding experience. Manual installation of a license key has become obsolete. Now, simply log on to RapidMiner.com and either a trial license or any commercial license associated with your user account is automatically installed. After license installation, a new onboarding dialog recommends next steps, helping you start using RapidMiner quickly and effectively.
Wisdom of Crowds: New and Improved Recommenders:
With the Wisdom of Crowds features, RapidMiner users can get help designing and implementing analytical workflows and building predictive models. These features offer next-step recommendations based on the knowledge and best practices of other RapidMiner users. RapidMiner Studio 6.3 provides the following enhancements:
Context-aware operator recommender: The operator recommender, first introduced in RapidMiner 6.1, helps you design by recommending operators to add to your process. Initially the feature recommended operators based on the complete process; now, the recommender evaluates the current subprocess selection for more granular assistance. For example, recommendations differ significantly when you are looking at the top-level process and when you have drilled into a subprocess (e.g., an X-Validation operator). By considering the context, the operator recommender provides much more focused and accurate recommendations.
Parameter recommender: The new parameter recommender helps you configure operators and set the parameters of a selected operator. The tool not only provides recommendations on which parameters to change, it also suggests appropriate values to select for those parameters.
Improved Excel Import:
RapidMiner Studio 6.3 dramatically improves one of the most widely used RapidMiner features — the import of Excel files. Previously, due to suboptimal parsing of XML-based Excel files (Excel 2007 and above), an Excel import caused excessive memory consumption, and reading large files took quite some time. RapidMiner Studio 6.3 reduces memory consumption overhead (by up to 30x in some test cases) and speeds file reading (up to 5x faster) by moving away from the formerly used library and implementing the necessary functionality within RapidMiner itself.
Version Control:
A new version management feature allows you to start new revisions of a process while keeping them in parallel with older versions. Available with RapidMiner Server 2.3 running with RapidMiner Studio 6.3, processes can now be rolled back and forth between old and new revisions.
Other Changes and Bugfixes:
Progress dialog no longer opens when saving the process to a remote location
The file chooser dialog for 'Read Excel' now defaults to .xlsx and .xls files
'Write Excel' format is now XLSX instead of XLS
The operator 'Execute Process' now shows a button to open the selected process in the parameter view
Parameter help is shown in a tool tip window when hovering over the information symbol
Histogram Charts now use date instead of numerical axis in case more than one date attribute is selected
Added Netezza JDBC support
The Application Wizard is now called Accelerator
BUGFIX: The 'Read Salesforce' operator can now handle relationship queries
BUGFIX: Operator recommendations now always appear when creating a new process or switching to the Design perspective
BUGFIX: Fixed process recovery encoding problem on Windows which could break umlauts and other symbols
BUGFIX: Fixed row deletion error in 'Edit Parameter List' dialog
BUGFIX: The recent analysis list in Home perspective no longer extends below the visible area of the monitor
BUGFIX: Naive Bayes now handles dates correctly
BUGFIX: SVM models can now only be applied on ExampleSets with the same attributes
BUGFIX: Stratified sampling with a defined local random seed now produces the same output on every system
BUGFIX: The Surface 3D chart now limits the number of data points (to ensure good performance)
BUGFIX: The chart for distribution model attributes limits the number of nominal values (to ensure good performance)
BUGFIX: Fixed a validation error that occurred when choosing an inverted set of attributes
BUGFIX: Fixed a validation error that occurred when 'Execute Process' referenced the operator's process
BUGFIX: Fixed local repository not being created in some special cases when starting for the first time
BUGFIX: Tooltips now work with modal dialogs after being focused via F3
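
The stratified sampling fix listed above is about reproducibility: the same local random seed should select the same examples everywhere. The Python sketch below illustrates the general idea only and is not RapidMiner's algorithm. The function stratified_sample, its parameters, and the example data are all hypothetical. It draws from a private seeded generator and visits the classes in a fixed order, so the result depends only on the input and the seed.

```python
import random
from collections import defaultdict


def stratified_sample(examples, label_of, ratio, seed):
    """Draw `ratio` of each class using a local, seeded generator.

    Hypothetical helper for illustration; not RapidMiner code.
    """
    rng = random.Random(seed)  # local generator, global random state untouched
    by_label = defaultdict(list)
    for example in examples:
        by_label[label_of(example)].append(example)

    sample = []
    for label in sorted(by_label):  # fixed class order, independent of hashing
        group = by_label[label]
        k = round(len(group) * ratio)
        sample.extend(rng.sample(group, k))
    return sample


# Made-up data: 60 "yes" examples and 40 "no" examples
examples = [(i, "yes") for i in range(60)] + [(i, "no") for i in range(60, 100)]

first = stratified_sample(examples, lambda e: e[1], ratio=0.5, seed=1992)
second = stratified_sample(examples, lambda e: e[1], ratio=0.5, seed=1992)
print(first == second)
```

Both calls use the same seed, so the two samples are expected to be identical, and each keeps the 60/40 class proportion of the input.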

New in RapidMiner Studio 6.2.000 (Sep 21, 2015)

Added operators 'Publish to App' and 'Recall from App' and a new view 'App Objects' for RapidMiner Server App manipulations
Resizing the attribute name column in the Statistics view of process results is now possible
New processes can now be saved via save button or ctrl+s
Improved error messages for broken custom filters in the 'Filter Examples' operator
Improved error message when selecting special attributes in an operator despite special attributes not being included
Show Git revision of RapidMiner Studio release in About window
Improved speed and behavior of 'Decision Tree' and 'Random Forest' operators
BUGFIX: Fixed problems with single parameter selection for several Java implementations
BUGFIX: Fixed opening of stored results via the result history
BUGFIX: Operator port tooltips should no longer cover the port
BUGFIX: Charts should now display 'Missing' instead of '1.1.1970' for missing values in date attributes
BUGFIX: 'Update Database' should throw a more reasonable error message in case the database user lacks permission
BUGFIX: 'Neural Net' operator works again when applied on special attributes with missing values
BUGFIX: 'Neural Net' can no longer be applied on incompatible data
BUGFIX: The expression parser function round() now returns a missing value instead of 0 when applied on a missing value
BUGFIX: 'Sample (Bootstrapping)' operator now throws a reasonable error message in case the input example set is empty
BUGFIX: Moving colors in the color scheme dialog of Advanced Charts does not save duplicates anymore
BUGFIX: Fixed a bug which occurred when an optional password field was left empty
BUGFIX: Fixed overwriting an already existing file in Import Binary File Wizard
BUGFIX: Fixed a UI problem that occurred when a Collection with empty ExampleSets was displayed
BUGFIX: Fixed operator tree display in log view which is shown in case of a process error
API: Introduced AbstractConfigurator which deprecates the Configurator class. The AbstractConfigurator improves parameter dependency handling for Configurables
API: removed Encog dependency and all deprecated classes that used Encog
API: Added capability to allow parallel processing inside operators
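
The round() fix listed above can be illustrated with a small Python sketch. It assumes that a missing numerical value is modeled as NaN; the name round_value is hypothetical and this is not the expression parser's code. As background, Java's Math.round(Double.NaN) returns 0, which is one plausible source of the old behavior, although the changelog does not confirm this.

```python
import math

MISSING = math.nan  # assumption: a missing numerical value is modeled as NaN


def round_value(x):
    """Round half up, but pass a missing value through unchanged.

    Hypothetical helper for illustration; not RapidMiner code.
    """
    if math.isnan(x):
        return MISSING  # propagate "missing" instead of turning it into 0
    return float(math.floor(x + 0.5))  # mimics Java's Math.round for finite input


print(round_value(2.5))
print(round_value(MISSING))
```

The explicit NaN check comes first because a plain rounding call either turns NaN into a number (as in Java) or fails outright (Python's built-in round raises ValueError for NaN).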

New in RapidMiner Studio 6.1.000 (Sep 21, 2015)

Overhauled Repositories view: Now multiple elements can be selected, copied, moved and deleted at the same time
Completely revised preferences dialog to make customization of RapidMiner Studio more accessible
Drastically sped up Log view for larger logs
Improved startup code to reduce launch problems. On Win32 versions, memory settings are now based on the actual free memory at startup. Also added a property in the 'System' tab of the preferences where the maximum amount of memory for RM Studio can be configured
Improved SQL editor dialog responsiveness
It is now possible to ignore meta data for the 'Filter Examples' GUI
'Weight by' operators: The default value of the parameter normalize weights is now false
BUGFIX: Results containing missing values are sorted correctly
BUGFIX: Update Database now throws a meaningful error when the input example set contains no attributes
BUGFIX: Improved error message when applying a PCA model to incompatible data
BUGFIX: More meaningful error message when a mandatory attribute is not selected
BUGFIX: Loop/Optimize parameters are no longer dismissed if selection changes
BUGFIX: Distribution Models can no longer be applied to subsets of the training set or to sets with the same name but a different type
BUGFIX: Log Operator now uses modern UI to show the result
BUGFIX: Fixed Linear Regression matrix calculation corner cases which could lead to missing values for standard error, t-stat, and p-value
BUGFIX: Fixed an issue that caused Top Down Clustering to fail
BUGFIX: Replace (Dictionary) maps each value only once
BUGFIX: Fixed an issue that sometimes caused data in the results perspective to be shown with a null source
API: Added support for parameter dependencies in the Configurable framework (see Configurator#getParameterHandler())
API: Added operator parameter type which can display a file chooser for arbitrary remote file systems (see ParameterTypeRemoteFile)
API: Added greater control over preferences internationalization and layout (see SettingsDialog)

New in RapidMiner Studio 6.0.008 (Sep 21, 2015)

New in RapidMiner Studio 6.0.007 (Sep 21, 2015)

New in RapidMiner Studio 6.0.006 (Sep 21, 2015)

Improved copy and paste functionality of the process editor
Added new logging mechanism which can also be used by extensions to display their own logs in the default log view
Added parameter to Parse Numbers operator to show an error message or use missing values if a value can't be parsed
On lower screen resolutions smaller plot preview icons will be used
Aggregate operator throws an error when the example set does not contain attributes selected by the parameter "group by"
Improved the ability to stop the process while executing a Join operator
Ports of disabled operators are now highlighted to indicate that interaction is possible
Loop/Optimize Parameters GUI now automatically selects newly added parameter
Refreshing a repository folder is now possible regardless of whether a folder or a data entry is selected
New chart type added: Web
Improved tooltip behavior
Improved resizing of subprocesses
Added parameter to Loop/Optimize Parameters which specifies how errors occurring in the inner process should be handled
Switching perspectives now remembers focused tabs and the position of all scroll bars
BUGFIX: Fixed problem with prepared statements in Read Database and Execute SQL operators
BUGFIX: Data readers will no longer automatically choose binominal as the value type to avoid import failures
BUGFIX: Saving a process can no longer freeze the user interface
BUGFIX: Storing/Reading models in XML representation works again when executing the process on RapidMiner Server
BUGFIX: Pasting process xml into the process view directly no longer messes up the layout and the connections
BUGFIX: Execute Process: Number of ports shown by operator matches ports used by embedded process.
BUGFIX: Weighting Operators which require a label attribute now throw an error if no label is present
BUGFIX: Superset and Union operators now fail with a better error message if the special attributes do not match
BUGFIX: macro() can now be used in the expression condition at Branch
BUGFIX: Loop Repository: using the parent folder name as filtered string does not throw an error anymore
BUGFIX: The Cumulative Variance plot for the PCA now displays the correct values
BUGFIX: Excel operators show a human-readable error if the wrong sheet is selected
BUGFIX: Aggregate now detects DATE_TIME in MetaData
BUGFIX: Predefined operator macros are working again
BUGFIX: Data import operators of extensions are no longer sometimes displayed as disabled for some licenses
BUGFIX: Use correct file filter for Loop Zip-File Entries file chooser
BUGFIX: Read and Update Database operators can now be stopped
BUGFIX: Generate Macro will no longer add unnecessary zeros to the end of numbers
BUGFIX: Reduced logging at Generate Function Set if NaN was generated
BUGFIX: Operators which provide a subset selection now show an error if selected attributes are not present
BUGFIX: Correct display of operator status when starting a process
BUGFIX: Catch errors when trying to parse empty strings to numbers
BUGFIX: Remember/Recall operators now use a more sensible default for the io object type
BUGFIX: Fixed endless loop in Logistic Regression
BUGFIX: Generate Data can now be stopped
BUGFIX: Import wizards now ignore the check for duplicate names regarding columns that are disabled
BUGFIX: Linear, Quadratic and Regularized Discriminant Analysis can now be stopped
BUGFIX: K-Means, Linear Regression and SVM now ignore missing values in special attributes, except for the label
BUGFIX: Generate Nominal Data operator can now be stopped
BUGFIX: The arrange operators function no longer adds horizontal space between operators unnecessarily
BUGFIX: Fixed Filter Examples operator failing on date filters for dates before 1970
BUGFIX: The Split operator correctly outputs missing values if the input value was missing
BUGFIX: The Replace (Dictionary) operator now displays a meaningful error message if the to or from parameters are left undefined
BUGFIX: The displayed error, when using an invalid expression in the Branch operator, now contains a link to the operator
BUGFIX: Fixed a rare error while loading extensions on startup
BUGFIX: RapidMiner remembers all tabs that are visible and keeps them focused between perspective switches
BUGFIX: Tooltips in New Operator Dialog are now correctly formatted
BUGFIX: The Loop Repository operator now shows an error when the selected repository location does not exist
BUGFIX: Building Block Numerical X-Validation now defaults to shuffled sampling
BUGFIX: Improved error handling when pasting an unsupported file into the process editor
BUGFIX: More meaningful error message when a wrong attribute is selected in some operators

New in RapidMiner Studio 6.0.005 (Sep 21, 2015)

Improved copy and paste functionality of the process editor
Added new logging mechanism which can also be used by extensions to display their own logs in the default log view
Added parameter to Parse Numbers operator to show an error message or use missing values if a value can't be parsed
On lower screen resolutions smaller plot preview icons will be used
Aggregate operator throws an error when the example set does not contain attributes selected by the parameter "group by"
Improved the ability to stop the process while executing a Join operator
Ports of disabled operators are now highlighted to indicate that interaction is possible
Loop/Optimize Parameters GUI now automatically selects newly added parameter
Refreshing a repository folder is now possible regardless of whether a folder or a data entry is selected
New chart type added: Web
Improved tooltip behavior
Improved resizing of subprocesses
Added parameter to Loop/Optimize Parameters which specifies how errors occurring in the inner process should be handled
Switching perspectives now remembers focused tabs and the position of all scroll bars
BUGFIX: Data readers will no longer automatically choose binominal as the value type to avoid import failures
BUGFIX: Saving a process can no longer freeze the user interface
BUGFIX: Storing/Reading models in XML representation works again when executing the process on RapidMiner Server
BUGFIX: Pasting process xml into the process view directly no longer messes up the layout and the connections
BUGFIX: Execute Process: Number of ports shown by operator matches ports used by embedded process.
BUGFIX: Weighting Operators which require a label attribute now throw an error if no label is present
BUGFIX: Superset and Union operators now fail with a better error message if the special attributes do not match
BUGFIX: macro() can now be used in the expression condition at Branch
BUGFIX: Loop Repository: using the parent folder name as filtered string does not throw an error anymore
BUGFIX: The Cumulative Variance plot for the PCA now displays the correct values
BUGFIX: Excel operators show a human-readable error if the wrong sheet is selected
BUGFIX: Aggregate now detects DATE_TIME in MetaData
BUGFIX: Predefined operator macros are working again
BUGFIX: Data import operators of extensions are no longer sometimes displayed as disabled for some licenses
BUGFIX: Use correct file filter for Loop Zip-File Entries file chooser
BUGFIX: Read and Update Database operators can now be stopped
BUGFIX: Generate Macro will no longer add unnecessary zeros to the end of numbers
BUGFIX: Reduced logging at Generate Function Set if NaN was generated
BUGFIX: Operators which provide a subset selection now show an error if selected attributes are not present
BUGFIX: Correct display of operator status when starting a process
BUGFIX: Catch errors when trying to parse empty strings to numbers
BUGFIX: Remember/Recall operators now use a more sensible default for the io object type
BUGFIX: Fixed endless loop in Logistic Regression
BUGFIX: Generate Data can now be stopped
BUGFIX: Import wizards now ignore the check for duplicate names regarding columns that are disabled
BUGFIX: Linear, Quadratic and Regularized Discriminant Analysis can now be stopped
BUGFIX: K-Means, Linear Regression and SVM now ignore missing values in special attributes, except for the label
BUGFIX: Generate Nominal Data operator can now be stopped
BUGFIX: The arrange operators function no longer adds horizontal space between operators unnecessarily
BUGFIX: Fixed Filter Examples operator failing on date filters for dates before 1970
BUGFIX: The Split operator correctly outputs missing values if the input value was missing
BUGFIX: The Replace (Dictionary) operator now displays a meaningful error message if the to or from parameters are left undefined
BUGFIX: The displayed error, when using an invalid expression in the Branch operator, now contains a link to the operator
BUGFIX: Fixed a rare error while loading extensions on startup
BUGFIX: RapidMiner remembers all tabs that are visible and keeps them focused between perspective switches
BUGFIX: Tooltips in New Operator Dialog are now correctly formatted
BUGFIX: The Loop Repository operator now shows an error when the selected repository location does not exist
BUGFIX: Building Block Numerical X-Validation now defaults to shuffled sampling
BUGFIX: Improved error handling when pasting an unsupported file into the process editor
BUGFIX: More meaningful error message when a wrong attribute is selected in some operators

New in RapidMiner Studio 6.0.003 (Sep 21, 2015)

Added new dialog to create and manage various connections
Tasks (shown in the lower right corner) should no longer unintentionally block each other
Process result display creation should be much faster now
Added attribute statistics when hovering over a table header in the example set result view.
New order for special attributes in data and meta data result view
Execute SQL dialog now has syntax highlight and content assist (ctrl+space)
Extensions can now declare more than one dependency
Added 'unmatched example set' output port to Filter Examples operator which outputs all examples that did not match the specified condition
Added parameter to De-Normalize operator to control handling of missing attributes
Added parameter to Execute Process that controls whether the process should fail when you define a macro that is not defined in the context of the embedded process
Added GUI parameter rapidminer.gui.plotter.default.maximum which defines the maximum size of an example set for which a default plot will be created
BUGFIX: Vote operator should be functional again
BUGFIX: Excel 2007 import no longer fails when the sheet contains nominal formula values
BUGFIX: Custom filters for the Filter Examples operator should no longer crash when selecting the 'matches' filter on empty input
BUGFIX: FindThreshold operator now throws error if the confidence role has the wrong name or does not exist
BUGFIX: Fixed bug preventing storage of Lift charts in the repository
BUGFIX: Fixed bug in expression parser which did not remove faulty expressions, leading to errors in later runs
BUGFIX: Fixed bug that prevented the usage of global process-related macros
BUGFIX: Loop Repositories operator can now be stopped
BUGFIX: Fixed recent processes being sometimes cut off in the Welcome perspective
BUGFIX: Fixed wrong default file extension for directory and file parameters
BUGFIX: Fixed rearranging of operators in subprocesses
BUGFIX: Fixed bug when creating charts for an empty example set
BUGFIX: Optimize Parameters Operator now interrupts with an understandable explanation when no performance values were delivered
BUGFIX: Fixed error with password fields when the password is less than 4 characters long
BUGFIX: Vector Linear Regression now checks for missing values
BUGFIX: Fixed scrolling when moving operators outside of visible area
BUGFIX: Support Vector Machine(LibSVM) can now be stopped
BUGFIX: Fix result of Join operator with only missing values in ID nominal attribute
BUGFIX: Decision Tree operators no longer fail with a cryptic error message when the label attribute contains missing values
BUGFIX: Generate Macro no longer proceeds if an error occurred during macro generation
BUGFIX: Using undefined macros as operator parameters now causes an error when executing the process
BUGFIX: Applying a k-NN model can now be stopped
BUGFIX: Logistic Regression (Evolutionary) can now be stopped
BUGFIX: NominalToNumerical can now be stopped
BUGFIX: Optimize Parameters (Evolutionary) can now be stopped
BUGFIX: Polynomial Regression can now be stopped
BUGFIX: Remove Duplicates operator can now be stopped
BUGFIX: Self-Organizing Map operator can now be stopped
BUGFIX: Support Vector Machine (Evolutionary) can now be stopped
BUGFIX: In most cases programs executed with Execute Program operator can now be stopped properly
BUGFIX: The chart selection menu in the results perspective should no longer appear in strange locations

New in RapidMiner Studio 5.2.008 (Jul 10, 2012)

New in RapidMiner Studio 5.2.003 (Mar 27, 2012)

New in RapidMiner Studio 5.2.002 (Mar 7, 2012)

New in RapidMiner Studio 5.2.001 (Feb 24, 2012)

New in RapidMiner Studio 5.2.000 (Feb 2, 2012)

New in RapidMiner Studio 5.1.016 (Jan 5, 2012)

New in RapidMiner Studio 5.1.015 (Dec 21, 2011)

New in RapidMiner Studio 5.1.006 (Mar 31, 2011)

New in RapidMiner Studio 5.1.000 (Dec 16, 2010)

New in RapidMiner Studio 5.0.010 (Aug 9, 2010)

New in RapidMiner Studio 4.5 (Jul 21, 2009)

New in RapidMiner Studio 4.4 (Mar 16, 2009)

New operators:
ExampleSetSuperset
ExampleSetUnion
MacroConstruction
CumulateSeries
FastLargeMargin
Split
Construction2Names
NeuralNetSimple
Parameters will now be adapted according to an operator rename, for example the settings of operators like the ProcessLog or the parameter optimization operators are automatically corrected to the new operator names
Graphs like the similarity graph display the strengths of the edges now by their color
Added new tree layout algorithm for the decision trees preventing most overlapping, the old tighter version is available as layout type "Tree (Tight)"
Decision trees now show the subtree size as tool tip for the inner nodes, the edges are now darker for larger subtrees and brighter for smaller ones
Decision trees are now learned faster due to internal optimizations in the split example set handling
Tables like the (meta) data view now support a new context menu for common table operations like column sorting or row / column selection
The "New Operator" dialog now also supports full text search in the description texts of the operators
RapidMiner now stores all parameter values in the process files including the default values which ensures a better compatibility with future versions. The XML tab, however, only shows the values differing from the default
Plugins can now define a class com.rapidminer.PluginInit providing a method "initPlugin()" which will be invoked during plugin initialization
Univariate and multivariate series windowing operators now also support nominal attributes and even mixed types in cases where the series is represented by the examples (rows) of the data set
The range statistics of nominal attributes in the meta data view now show the values with the highest and lowest occurrence counts, sort the values according to the counts, and display only an excerpt of the occurring values if there is a large number of different values
List of recent files is now directly saved after opening a new process and not only during shutdown
Changes in the process setup are now allowed even during process runtime, e.g. when waiting at a breakpoint
NaiveBayes can now handle new nominal values during the model application phase
Deprecated operators are now rendered with a gray color in the new operator tab and dialog
Updated to the latest version of Weka (as of February 26th, 2009)
Updated to the latest version of Joone, optimized some of the neural network default parameters
Added some new sample processes to the sample directory as well as to the tutorial
ExampleFilter and most important discretization parameters are no longer expert parameters
ArffExampleSource now reports an error when an attribute name contains a space that is not quoted
New binominal classification performance measures:
positive predictive value
negative predictive value
psep
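
For orientation, the three measures listed above can be computed from the counts of a binominal confusion matrix. The Python sketch below is an illustration only: the function name and the example counts are hypothetical, and psep is taken as PPV + NPV - 1, the commonly cited definition, which is assumed here rather than taken from RapidMiner's source.

```python
def binominal_measures(tp, fp, tn, fn):
    """Return (ppv, npv, psep) from confusion-matrix counts.

    Hypothetical helper for illustration; not RapidMiner code.
    """
    ppv = tp / (tp + fp)  # share of predicted positives that are truly positive
    npv = tn / (tn + fn)  # share of predicted negatives that are truly negative
    psep = ppv + npv - 1  # assumed definition: PPV + NPV - 1
    return ppv, npv, psep


# Made-up counts: 40 true positives, 10 false positives,
# 45 true negatives, 5 false negatives
ppv, npv, psep = binominal_measures(tp=40, fp=10, tn=45, fn=5)
print(ppv, npv, psep)
```

The sketch does not guard against empty prediction classes; if no example is predicted positive (or negative), the corresponding denominator is zero.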
Implementation details:
SplittedExampleSet has been improved leading to faster data access times for operators like cross validation or decision tree learning
Bugfixes:
Fixed bug in the accuracy criterion for the revised decision tree learner
Fixed bug in parameter list of ValueSubgroupIterator
Fixed bug in ExceptionHandling which sometimes led to doubled outputs
Fixed bug in ProcessBranch which sometimes led to doubled outputs
ViewAttributes did not add min and max statistics so that those statistics were not calculated on data table views
Fixed bug in Windows GUI start script (linebreak)
Fixed bug for surface 3D plot where x and y were replaced by each other
Fixed paths to icons for building blocks
Fixed issue with ROC plots in cases where several points with same confidence occurred
Fixed potential thread deadlock during the filling of the plotter list
Fixed bug for distance weighted vote and k = 1 in NearestNeighbors
Fixed a bug in ChiSquaredWeighting for mixed-type data sets where the number of bins was smaller than the maximum number of nominal values
The default global random seed in the preferences dialog was not allowed to be set to -1
The property keys of the preferences dialog were editable
Fixed bug in PolynomialRegression
Range normalization now delivers maximum value for constant attributes
Weighted precision and recall no longer deliver NaN if a class did not occur

New in RapidMiner Studio 4.3 (Nov 24, 2008)

New operators:
AccessExampleSource
Example2AttributePivoting
Attribute2ExamplePivoting
PolynomialRegression
Similarity2ExampleSet
ExampleSet2SimilarityExampleSet
Nominal2String
String2Nominal
Date2Numerical
Real2Integer
Numerical2Real
Nominal2Numerical
Numerical2Binominal
Numerical2Polynominal
AbsoluteDiscretization
ConditionedFeatureGeneration
AttributeAggregation
SupportVectorCounter
MutualInformationMatrix
GaussFeatureConstructionOperator
ProductGenerationOperator
AbsoluteValues
MovingAverage
ExponentialSmoothing
SeriesMissingValueReplenishment
DifferentiateSeries
IndexSeries
FillDataGaps
EnsureMonotonicity
WindowExamples2ModelingData
WindowExamples2OriginalData
ProcessLog2AttributeWeights
Mapping
Substring
Trim
Replace
AddValue
MergeValues
AttributeConstruction
ValueIterator
IOStorer
IORetriever
SQLExecution
ClearProcessLog
ProcessLog2ExampleSet
Data2Performance
Data2Log
Macro2Log
DataMacroDefinition
LiftParetoChart
Deprecated Operators:
Nominal2Numeric (please use Nominal2Numerical instead)
Numeric2Binominal (please use Numerical2Binominal instead)
Numeric2Polynominal (please use Numerical2Polynominal instead)
LinearCombination (please use AttributeAggregation instead)
AttributeValueMapper (please use Mapping instead)
AttributeValueSubstring (please use Substring instead)
AddNominalValue (please use AddValue instead)
MergeNominalValues (please use MergeValues instead)
New implementation of clusterings for more efficient computing and memory usage:
Reimplemented or adapted operators:
AgglomerativeClustering
ClusterModel2ExampleSet
DBScanClustering
ExampleSet2ClusterModel
FlattenClusterModel
KMeans
KMedoids
KernelKMeans
RandomFlatClustering
SupportVectorClustering
TopDownClustering
ClusterModelWriter
ClusterModelReader
TransitionMatrix
Removed operators:
AgglomerativeFlatClustering, use AgglomerativeClustering and FlattenClusterModel instead
BregmanHardClustering, use KMeans with BregmanDivergences instead
ExampleSet2ClusterConstraintList
MPCKMeans
TopDownRandomClustering, use TopDownClustering with RandomFlatClustering as inner learner
UPGMAClustering, use AgglomerativeClustering with average link instead
SimilarityComparator
The new AttributeConstruction operator supports infix written formulas, a simple format for constants and new calculation methods
Better support for special characters in process XML
Macros are now also supported in parameter lists and for numerical parameters
Added new overwriting mode to the DatabaseExampleSetWriter named "first overwrite, then append"
Replaced "append" parameter in ExampleSetWriter by the new overwriting modes "none", "overwrite", "append", and "first overwrite, then append"
ExampleFilter can now use regular expressions for the values of the nominal attribute value filtering
New Plotter: Pareto Chart
New Plotter: Series Multiple
New Plotter: Scatter Multiple
The old scatter plotter has been divided into a new Scatter plot and the new Scatter Multiple plot
Most plotters now support panning during zooming by pressing the Ctrl Key while dragging the mouse
The file chooser in the modern look and feel now always remembers the last directory from which a file was chosen as an additional default bookmark (on the left)
Changed the order in which models are added to the grouped model (ModelGrouper), i.e. the last created model will now be added last
The wizards of the database reading and writing operators are now initialized with the last settings
The feature selection and feature weighting operators are now based on double arrays which should lead to smaller memory footprints
Added new performance measures: sensitivity, specificity, Youden index, relative error lenient, relative error strict
The CachedDatabaseExampleSource operator has now a more appropriate wizard
The plotters now provide consistent colors for classes
Improved the names of the features of the (multi-)variate windowing operators
Multivariate windowing now also supports a name for the label column in addition to the index
Multivariate windowing can now also be applied without the creation of a label and even with horizon 0
Improved the graph and plotter panel for long column / item names, long names are now displayed in a short fashion and the full name is shown as tool tip
DecisionTree now supports a new parameter min_size_for_split
Added new process branch conditions: attribute_available, min_examples, max_examples, min_attributes, max_attributes.
The viewers for symmetrical matrices like correlations etc. now always show the values of the first column
Improved the range names of discretized data
Added selection of criterion to AssociationRulesGenerator, also improved the visualization of association rules by adding a selector for the criterion used for the minimum value slider
Added new option for Normalization: one can now choose between z-transformation, range transformation, and the new proportional transformation via category selection.
LinearRegression is now also applicable on binominal classification tasks
Added support for logging only the top-k or bottom-k objects with the ProcessLog operator
Improved the parameter optimization / iteration dialog: small numbers are no longer cut off, GUI is more consistent, dialog now uses icons
Improved the CachedDatabaseExampleSource operator and database handling: now arbitrary tables are accepted and primary keys (index) and / or mapping tables are automatically handled
Integrated the latest version of the JFreeChart library
A dialog now informs users if any unknown parameters were part of the process during loading
A SimpleVoteModel now supports the output of textual results
(Multivariate) Windowing on example based input representations now keeps the input id attribute
Added writing of intermediate weights for GeneticAlgorithm (feature selection) and EvolutionaryWeighting (feature weighting), both operators now also support the initialization with attribute weights (e.g. from the last run)
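The new sensitivity, specificity, and Youden index measures listed above follow the standard textbook definitions for binary classification. The sketch below is illustrative only: the function name and the confusion-matrix arguments are assumptions and not part of RapidMiner's API.

```python
def binary_measures(tp, fn, tn, fp):
    """Return (sensitivity, specificity, Youden index) from confusion-matrix counts.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    sensitivity = tp / (tp + fn)  # share of actual positives predicted positive
    specificity = tn / (tn + fp)  # share of actual negatives predicted negative
    youden = sensitivity + specificity - 1  # Youden's J statistic
    return sensitivity, specificity, youden


if __name__ == "__main__":
    sens, spec, j = binary_measures(tp=40, fn=10, tn=45, fp=5)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} youden={j:.2f}")
```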
Implementation Details:
Moved AnovaMatrix(Operator) into the package com.rapidminer.operator.visualization.dependencies
Moved all attributes based matrix operators (correlation, covariance etc.) into the new package com.rapidminer.operator.visualization.dependencies
Moved aggregation functions into package com.rapidminer.tools.math.function.aggregation
Bugfixes:
Processes now only write the information logged during the run, not global information such as that collected from the GUI. Hence, the logging will also no longer directly overwrite old log files right after loading
Switch workspace and initial workspace selection now prevent the selection of the RapidMiner main directory and all subdirectories in order to prevent a recursive copy
Switched weight "direction" for corpus based weighting
Fixed bug in evolutionary parameter optimization in combination with logging
Fixed bug in Wizard for ExampleSource preventing the correct guess of value types (were always nominal)
Fixed error in nominal re-mapping for cases where the nominal values of training and test set did not match
Fixed jittering bug in Histogram plots causing the bins to drop out of the plotter
Fixed minor bug in ExampleSetWriter which caused the ExampleSource operator to state a warning
Fixed bug if special characters were part of the process XML
DistributionModel is updatable now
AttributeValueSubstring ignores missing values and is able to extract single characters now
Fixed a GUI error only occurring in Java 6 Update 10
Fixed bug in FeatureSubsetIteration where the specified maximum number of features was not used
Fixed bug in PerformanceVector writing from the result dialog (Save button) which led to large data files and long runtimes until the data was actually saved
Fixed bug in uninstaller which under certain circumstances also removed non-RapidMiner files in the installation directory

New in RapidMiner Studio 4.1 (May 19, 2008)

New operators
New 64 bit version for Windows x64 OS now provided; other 64 bit systems are supported by using a 64 bit Java version
Parameter optimization operators now provide a nicer wizard dialog for setting the parameters
All GUI elements now provide longer descriptions for operators
SplitChain and AbsoluteSplitChain were moved from the postprocessing into the meta group
Meta group was restructured and two subgroups (control and other) were added
Fixed a memory leak in the result history which was affecting the GUI for multiple processes if they were performed in a single sequence
SOMDimensionalityReduction and SVDReduction are now able to create a preprocessing model
BruteForce and GeneticAlgorithm feature selection now support a minimum and maximum number of features and also the selection of an exact number of features
RapidMiner now offers two different look and feels: modern (recommended) and classic
Improved comment tab so that it already registers and saves new text directly after it was typed (instead of changing the tab)
DataStatistics (IOObject) now shows the standard deviation like in the GUI instead of the variance
Made the ExampleSource wizard more robust: output files identical to the input file are no longer allowed
Series Plotter no longer scales the axis ranges in a way that zero must be contained
All SVM and other hyperplane models now support the visualization of a sortable data table for the coefficients (weights)
An error message now indicates if XML entities are used for operator names, which is not allowed
Anova calculator now allows value editing in table and the specification of the significance level
Meta data views can now be correctly sorted according to sum or unknown value columns
MissingValueImputation: added warnings in the case that not all values could be imputed, improved attribute ordering (ascending and descending sorting, sort by number of missing values), added log messages
Naive Bayes distribution model now uses the same class coloring for both numerical and nominal distributions
Latest available Weka version integrated (as of 2008/05/09)
The AttributeParser no longer supports batch generations
The ClusterModel reader is now able to read both compressed and uncompressed files
PCA and GHA now use global covariance matrix calculation
LibSVMLearner now provides the correct range for the nu parameter
Fixed bug in AttributeParser which prevented the correct calculation for nested generations or cases where the generation is divided into several operators
Fixed bug in value type guessing for numerical columns with missing values
Fixed bug in ExampleSetTranspose for missing values in nominal attributes
Fixed bug in DatabaseExampleSource Wizards for user-defined URLs
Parameter lists are now cloned correctly
Fixed bug for quoted input files occurring in some cases where the quoted string was part of the line before
Fixed a bug for learning with example weights with the JMySVM learner
Fixed a NPE if empty example sets were used as input for feature selection operators
Fixed wrong normalization for confidences predicted by distribution models (e.g. NaiveBayes)
AttributeEditor and ExampleSource wizard did not regard the decimal point character (and quotes)
The value type guessing operators did not take a possible decimal point character different from '.' into account
Fixed tool tip for z-transform in Normalization operator: changed "variance" to "standard deviation"
Fixed locale for Ok / Cancel dialogs to US locale like the rest of RapidMiner
Fixed bug in operator tree which caused the reconstruction of the expansion state to be faulty in some cases
Fixed statistics copy bug introduced in 4.1beta2 for predicted label statistics
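The Normalization tool tip fix above concerns the divisor of the z-transform: values are divided by the standard deviation, not the variance. A minimal sketch of that definition follows. The function name is hypothetical, and the use of the population standard deviation is an assumption, since the changelog does not state which estimator RapidMiner uses.

```python
from statistics import mean, pstdev


def z_transform(values):
    """Center values on their mean and scale by the standard deviation.

    Hypothetical helper for illustration. Assumption: population standard
    deviation; a sample standard deviation would give slightly different
    results on small inputs.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    print(z_transform([2, 4, 4, 4, 5, 5, 7, 9]))
```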

New in RapidMiner Studio 4.1 Beta 2 (Feb 19, 2008)

New operators:
ProcessBranch
FileEcho
ExchangeAttributeRoles
ChangeAttributeRole
SeriesPrediction
Deprecated operators: ChangeAttributeType (use ChangeAttributeRole instead)
New version of chart plotting library
New plotter: Series
Removed the numerical sample sizes for the tree and rule learners
Introduced different shapes for plotter points
Use bigger strokes for plotter lines
Added max_items parameter for FPGrowth
Changed default mode for view creation of preprocessing models
Added signum generator for manual feature generation and for generation with YAGGA2
Relief can now handle missing values
Changed default data representation back to double because the number of rounding errors was otherwise too high for larger data ranges
Introduced AttributeDescriptions and AttributeTransformations in order to lower large memory consumptions due to clones and to avoid re-wrappings for new views on the example set view stack
Removed clone of mappings for clones of nominal attributes
Changed DataRow methods from package private to protected
ConditionedExampleSets no longer support dynamical conditions
Changed default data representation back to "double"
The visualization of integers and the nominal statistics calculation are now based on longs instead of integers
Fixed MAJOR bug introduced in 4.1beta in example sets / views which occurred after a new view was created on top of a split example set (e.g. in a cross validation) and then hid the partition
Fixed some problems (due to too many cloned objects, see above) which caused much more memory usage in 4.1beta
Fixed bug in PredictionTrendAccuracy calculation
Fixed wrong linefeeds in unix start scripts
Fixed bug in aggregation function selection of the chart plotters
Fixed ID handling bug for example sets (views) which prevented the correct application of Id-based operators like the ExampleSetJoin operator
Fixed bug in table index assignment of view attributes
Fixed bug in SortedExampleSet
Fixed bug in some plotters based on JMathplot
Removed remapId() call in IdUtils which increased the runtime of some clustering schemes (especially DBScan and SupportVectorClustering)
Fixed bug in RuleLearner for nominal attributes
Fixed bug for (operator / parameter) pair parameter values for the parameter iteration and optimization operators
Fixed wrong name for continuous attributes in C45 loader
ConditionedExampleSet caused some problems if the base attributes for conditions were removed after the filtering
Fixed a bug in getNominalValue(Attribute) of Example which delivered the first nominal value instead of missing values
File filters now accept lower and upper case extensions
Fixed wrong colors after sorting a column of the ANOVA matrix
Removed unnecessary statistics registration in nominal attributes consuming unused memory and runtime
Fixed rounding error in the stepwise parameter operators
Removed data representation type query during first startup since rounding errors are often too high
AbsoluteSampling produced sample with duplicates