Modeling using Python

General comments

R equivalents

Scikit-learn (sklearn)

Statsmodels (statsmodels)

  • Classical statistical techniques with inference
    • ANOVA, linear mixed models (LMM), generalized linear models (GLM), hypothesis testing, etc.
    • Regularization (elastic net, ridge, LASSO)
    • Rich family of GLM distributions
  • Uses R-like formulas to describe models
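
A minimal sketch of the R-like formula interface, assuming `statsmodels`, `pandas`, and `numpy` are installed. The data frame, its column names (`x`, `group`, `y`), and the coefficients used to simulate it are made up for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (hypothetical): y depends linearly on x and on a two-level factor.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "x": rng.normal(size=n),
    "group": rng.choice(["a", "b"], size=n),
})
df["y"] = (
    1.0
    + 2.0 * df["x"]
    + 0.5 * (df["group"] == "b")
    + rng.normal(scale=0.1, size=n)
)

# The formula reads like R: response ~ predictors, with C() marking a categorical term.
model = smf.ols("y ~ x + C(group)", data=df).fit()

# Inference comes with the fit: coefficients, p-values, confidence intervals.
print(model.params)
print(model.pvalues)
print(model.conf_int())
```

The fitted result also exposes `model.summary()`, which prints a regression table similar to the one R shows for `summary(lm(...))`.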

Scipy stats module (scipy.stats)

  • Implements some basic statistical functions:
    • Distributions
    • Estimators
    • Hypothesis tests
    • Transformations
    • Gaussian KDE
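
The categories above can be sketched with one call each. This is only an illustration on simulated samples, assuming `scipy` and `numpy` are installed; the sample sizes and parameters are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=1.0, size=300)
b = rng.normal(loc=1.0, scale=1.0, size=300)

# Distributions: every distribution object exposes pdf/cdf/ppf/rvs.
p = stats.norm.cdf(1.96)

# Estimators: maximum-likelihood fit of a normal distribution to the sample.
loc_hat, scale_hat = stats.norm.fit(a)

# Hypothesis tests: two-sample t-test for equal means.
t_stat, p_value = stats.ttest_ind(a, b)

# Transformations: Box-Cox (requires strictly positive data).
skewed = rng.lognormal(size=300)
transformed, lam = stats.boxcox(skewed)

# Gaussian KDE: a smooth density estimate evaluated on a small grid.
kde = stats.gaussian_kde(a)
density = kde(np.linspace(-3, 3, 5))

print(p, loc_hat, scale_hat, p_value, lam, density)
```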

Categorical Data

Logistic Regression

Other GLM

Ridge Classifier (Ridge regression on -1/+1 responses)
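
The parenthetical can be checked directly. The sketch below assumes scikit-learn's `RidgeClassifier` and `Ridge` with their default settings and a binary problem; the synthetic data set is generated only for illustration. It fits a ridge regression on labels recoded to -1/+1 and compares the sign of its predictions with the classifier's output.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge, RidgeClassifier

# Synthetic binary problem (hypothetical data); labels are 0/1.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Classifier: internally recodes the two classes as -1/+1 and fits a ridge regression.
clf = RidgeClassifier(alpha=1.0).fit(X, y)

# Manual version: recode the labels ourselves and fit a plain ridge regression.
y_pm = 2 * y - 1
reg = Ridge(alpha=1.0).fit(X, y_pm)

# The regression output plays the role of the decision function;
# its sign picks the class.
manual_labels = (reg.predict(X) > 0).astype(int)

print(np.array_equal(manual_labels, clf.predict(X)))
print(np.allclose(reg.predict(X), clf.decision_function(X)))
```

With more than two classes the classifier fits one such regression per class, so this two-class comparison does not carry over unchanged.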

Discriminant analysis

Ensemble and Tree-based Methods

Gaussian Process

Naive Bayes

K-Nearest-Neighbors

Neural Networks

Support Vector Machines

Multiclass and Multilabel Data

Numerical Data

Linear Regression, ANOVA and Linear Mixed Models

GLM

Kernel Linear Regression

Ensemble and Tree-based Methods

Gaussian Process

K-Nearest-Neighbors

Neural Networks

Support Vector Machines

Unsupervised Learning

Clustering

Gaussian Mixture Model

Dimensionality Reduction
