About Scikit Learn:
Scikit Learn is one of the most popular machine learning libraries for the Python programming language. It provides both supervised and unsupervised learning algorithms.
Scikit Learn is built on NumPy and SciPy, uses matplotlib for plotting, and works well alongside pandas.
Add-on modules for SciPy were called SciKits (SciPy Toolkits). The module that provides learning algorithms became known as Scikit Learn.
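As a rough illustration of the two families of algorithms mentioned above, the sketch below fits one supervised model and one unsupervised model on the built-in iris data set. The choice of KNeighborsClassifier and KMeans, the parameter values, and the variable names are assumptions made for this example, not recommendations from the text.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in data set: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Supervised learning: the labels y are used during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of correct test predictions
print("Test accuracy:", accuracy)

# Unsupervised learning: only X is used, no labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)
print("First ten cluster assignments:", cluster_labels[:10])
```

Both estimators follow the same pattern: create the object, call fit, then predict or score. Most Scikit Learn estimators share this interface.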
Versions of Scikit Learn:
The first version of Scikit Learn, beta v0.1, was released in January 2010. At the time of writing, the latest stable version was Scikit Learn 0.23.2, released in August 2020.
The Scikit Learn versions so far are:
- Scikit Learn 0.24.0 in the development stage
- Scikit Learn 0.23.2 on 3rd August 2020
- Scikit Learn 0.23.1 on 18th May 2020
- Scikit Learn 0.23.0 on 12th May 2020
- Scikit Learn 0.22.0 in December 2019
- Scikit Learn 0.21.0 in May 2019
- Scikit Learn 0.20.0 in September 2018
- Scikit Learn 0.19.0 in July 2017
- Scikit Learn 0.18.0 in September 2016
- Scikit Learn 0.17.0 in November 2015
- Scikit Learn 0.16.0 in March 2015
- Scikit Learn 0.15.0 in July 2014
- Scikit Learn 0.14 in August 2013
Scikit Learn is written mainly in Python, and some of its algorithms are written in Cython. It uses NumPy for high-performance linear algebra and array operations.
How to install and check the version of Scikit Learn:
You can install Scikit Learn in different ways, depending on your needs:
- Install the latest official release. This is the best option for most users, because it provides a stable, pre-built version.
- Install the version provided by your operating system or Python distribution. The main disadvantage of this approach is that the packaged version may not be the latest stable release.
- Build the package from source. This is a good choice for users who need the latest features.
Download the latest 64-bit version of Python from https://www.python.org and install it.
Then run:
$ pip install -U scikit-learn
To check which version you have installed and where Scikit Learn is installed, use:
$ python -m pip show scikit-learn
To see which packages are installed in the active virtual environment, use:
$ python -m pip freeze
To print the versions of Scikit Learn and its dependencies, use:
$ python -c "import sklearn; sklearn.show_versions()"
For more details about the installation process on different operating systems, visit: https://scikit-learn.org/stable/install.html#install-official-release
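The installed version can also be checked from inside Python. The following is a minimal sketch, assuming Scikit Learn is already installed. The 0.23 threshold and the variable names are only illustrative.

```python
import re

import sklearn

# Version string of the installed package, for example "0.23.2".
installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)

# Extract the leading major and minor numbers and compare them
# against an example minimum version (0.23 is an assumption here).
match = re.match(r"(\d+)\.(\d+)", installed_version)
major, minor = int(match.group(1)), int(match.group(2))
meets_minimum = (major, minor) >= (0, 23)
print("At least 0.23:", meets_minimum)
```

A check like this can be useful at the top of a script that relies on features from a specific release.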
Features of scikit learn:
Version 0.23.0 introduced many major features, and version 0.24 adds some more advanced ones. These versions require Python 3.6 or newer.
The features are:
- New regression models
- Stable and faster KMeans estimator
- Improved histogram-based gradient boosting estimator
- Improved Lasso and ElasticNet
- Interactive HTML visualization of pipelines and estimators
- Loading built-in data sets as pandas DataFrames
- Drop selected categories during one-hot encoding
- Centers of Gaussian Blobs
- Inverse transformation of the imputed values
- Mean absolute percentage error
- The optional color bar in confusion matrix plot
- Data sets
- Feature extraction
- Feature selection
- Parameter tuning
- Cross-validation and prediction by using the function cross_val_predict
- Supervised models
- Unsupervised models
- Dimensionality reduction
- Ensemble methods
- Clone estimator
- Identify estimators as Classifiers or Regressors by using the functions is_classifier and is_regressor
- Select columns with make_column_selector
- Plotting the decision tree: visualize a decision tree by using the function plot_tree
- Fetch data sets from OpenML by using the function fetch_openml
- Learning curve
- Select important features by using the SelectFromModel class
- Function transformers
- Target data type determination by using the function type_of_target
- Add dummy features by using the function add_dummy_feature
- Iterative imputer
- Hyperparameter tuning using random search
- Loading text files by using the function load_files
- New plotting API
- Stacked generalization
- Gradient boosting missing value support
- KNN based missing value imputation
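Several of the utilities named in the list above can be tried in a few lines. The sketch below uses clone, is_classifier, is_regressor, cross_val_predict, SelectFromModel, type_of_target, add_dummy_feature and KNNImputer. The iris data set, the estimators, the parameter values and the small array with a missing value are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.base import clone, is_classifier, is_regressor
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import KNNImputer
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.preprocessing import add_dummy_feature
from sklearn.utils.multiclass import type_of_target

X, y = load_iris(return_X_y=True)

# Target data type determination: iris has three integer class labels.
target_kind = type_of_target(y)
print("Target type:", target_kind)

# Identify estimators as classifiers or regressors.
logreg = LogisticRegression(max_iter=1000)
logreg_is_classifier = is_classifier(logreg)
linreg_is_regressor = is_regressor(LinearRegression())

# Clone estimator: a new, unfitted estimator with the same parameters.
logreg_copy = clone(logreg)

# Cross-validated predictions: each sample is predicted by a model
# that did not see that sample during fitting.
cv_predictions = cross_val_predict(logreg, X, y, cv=5)
print("Cross-validated accuracy:", (cv_predictions == y).mean())

# Select important features using the importances of a fitted model.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0)
)
X_selected = selector.fit_transform(X, y)
print("Features kept:", X_selected.shape[1], "of", X.shape[1])

# Add a dummy feature: a constant column of ones is prepended.
X_with_dummy = add_dummy_feature(X)

# KNN-based missing value imputation on a tiny hand-made array.
X_missing = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, 6.0]])
X_imputed = KNNImputer(n_neighbors=2).fit_transform(X_missing)
print("Imputed array:")
print(X_imputed)
```

Features such as plot_tree, make_column_selector and loading data sets as DataFrames need matplotlib or pandas to be installed, so they are left out of this sketch.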