| scikit-multiflow | |
|---|---|
| Original authors | Jacob Montiel, Jesse Read, Albert Bifet, Talel Abdessalem |
| Developers | The scikit-multiflow development team and the open research community |
| Initial release | January 2018 |
| Stable release | 0.5.3 |
| Written in | Python, Cython |
| Operating system | Linux, macOS, Windows |
| Type | Library for machine learning |
| License | BSD 3-clause license |
| Website | scikit-multiflow |
| Repository | https://github.com/scikit-multiflow/scikit-multiflow |
scikit-multiflow (also known as skmultiflow) is a free and open-source machine learning library for multi-output/multi-label and stream data, written in Python.[3]
Overview
scikit-multiflow makes it easy to design and run experiments and to extend existing stream learning algorithms.[3] It features a collection of classification, regression, concept drift detection and anomaly detection algorithms. It also includes a set of data stream generators and evaluators. scikit-multiflow is designed to interoperate with Python's numerical and scientific libraries NumPy and SciPy, and is compatible with Jupyter Notebooks.
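The interplay of a stream generator, an incremental learner and an evaluator can be illustrated with a test-then-train (prequential) loop, a common evaluation scheme in stream learning. The following is a minimal sketch in plain Python and does not use the scikit-multiflow API: `toy_stream`, `MajorityClassLearner` and `prequential_accuracy` are hypothetical names introduced only for illustration.

```python
import random
from collections import Counter


def toy_stream(n_samples, seed=0):
    """Hypothetical generator: yields one (x, y) pair at a time."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = rng.random()
        y = int(x > 0.3)  # class 1 is the majority by construction
        yield x, y


class MajorityClassLearner:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        if not self.counts:
            return 0  # arbitrary default before any training
        return self.counts.most_common(1)[0][0]

    def partial_fit(self, x, y):
        # Update the model with a single labelled sample.
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: each sample is used for testing first, then for training."""
    correct = total = 0
    for x, y in stream:
        correct += int(learner.predict(x) == y)  # test on the unseen sample
        learner.partial_fit(x, y)                # then learn from it
        total += 1
    return correct / total if total else 0.0


accuracy = prequential_accuracy(toy_stream(1000), MajorityClassLearner())
print(f"prequential accuracy: {accuracy:.3f}")
```

Because every sample is predicted before the learner sees its label, the accuracy is measured on unseen data without setting aside a separate test set, which suits unbounded streams.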
Implementation
The scikit-multiflow library is implemented under open research principles and is currently distributed under the BSD 3-clause license. scikit-multiflow is mainly written in Python, and some core elements are written in Cython for performance. scikit-multiflow integrates with other Python libraries such as Matplotlib for plotting, scikit-learn for incremental learning methods[4] compatible with the stream learning setting, Pandas for data manipulation, NumPy and SciPy.
Components
scikit-multiflow is composed of the following sub-packages:
- anomaly_detection: anomaly detection methods.
- data: data stream methods including methods for batch-to-stream conversion and generators.
- drift_detection: methods for concept drift detection.
- evaluation: evaluation methods for stream learning.
- lazy: methods in which generalisation of the training data is delayed until a query is received, i.e., neighbours-based methods such as kNN.
- meta: meta learning (also known as ensemble) methods.
- neural_networks: methods based on neural networks.
- prototype: prototype-based learning methods.
- rules: rule-based learning methods.
- transform: data transformation methods.
- trees: tree-based methods, e.g. Hoeffding trees which are a type of decision tree for data streams.
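As background for the tree-based methods, Hoeffding trees decide when to split a leaf by using the Hoeffding bound: after n observations of a variable with range R, the observed mean is within ε = sqrt(R² ln(1/δ) / (2n)) of the true mean with probability 1 − δ. The sketch below shows that test in plain Python. It is not taken from the library's code, and the names `hoeffding_bound` and `should_split` are hypothetical.

```python
import math


def hoeffding_bound(value_range, delta, n_samples):
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2 * n))."""
    return math.sqrt(
        value_range ** 2 * math.log(1.0 / delta) / (2.0 * n_samples)
    )


def should_split(best_gain, second_best_gain, value_range, delta, n_samples):
    """Split when the gap between the two best candidate splits exceeds epsilon."""
    epsilon = hoeffding_bound(value_range, delta, n_samples)
    return (best_gain - second_best_gain) > epsilon


# The bound shrinks as more samples arrive, so a fixed gap between the two
# best attributes eventually becomes statistically significant.
for n in (100, 1000, 10000):
    print(n, should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n_samples=n))
```

A leaf therefore only needs to keep sufficient statistics for its candidate splits rather than the samples themselves, which is what makes this kind of decision tree suitable for data streams.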
History
scikit-multiflow started as a collaboration between researchers at Télécom Paris (Institut Polytechnique de Paris[5]) and École Polytechnique. Development is currently carried out by the University of Waikato, Télécom Paris, École Polytechnique and the open research community.
References
- "scikit-multiflow Version 0.5.3".
- "scikit-multiflow 0.5.3". Python Package Index.
- Montiel, Jacob; Read, Jesse; Bifet, Albert; Abdessalem, Talel (2018). "Scikit-Multiflow: A Multi-output Streaming Framework". Journal of Machine Learning Research. 19 (72): 1–5. ISSN 1533-7928.
- "scikit-learn — Incremental learning". scikit-learn.org. Retrieved 2020-04-08.
- "Institut Polytechnique de Paris". Retrieved 2020-04-08.
- Bifet, Albert; Holmes, Geoff; Kirkby, Richard; Pfahringer, Bernhard (2010). "MOA: Massive Online Analysis". Journal of Machine Learning Research. 11 (52): 1601–1604. ISSN 1533-7928.
- Read, Jesse; Reutemann, Peter; Pfahringer, Bernhard; Holmes, Geoff (2016). "MEKA: A Multi-label/Multi-target Extension to WEKA". Journal of Machine Learning Research. 17 (21): 1–5. ISSN 1533-7928.