scipy: Scientific Computing
What it is
SciPy builds on NumPy and provides algorithms for statistics (scipy.stats), optimization (scipy.optimize), integration (scipy.integrate), signal processing (scipy.signal), sparse matrices (scipy.sparse), and linear algebra (scipy.linalg). It is the de facto standard Python library for scientific and engineering computation.
Install
pip install scipy
Quick example: statistics
from scipy.stats import norm
print(f"PDF at 0: {norm.pdf(0):.4f}")
print(f"CDF at 1.96: {norm.cdf(1.96):.4f}")
print(f"PPF(0.975): {norm.ppf(0.975):.4f}")
Output:
PDF at 0: 0.3989
CDF at 1.96: 0.9750
PPF(0.975): 1.9600
When / why to use it
- Hypothesis testing (t-test, chi-squared, ANOVA, Mann-Whitney).
- Curve fitting and nonlinear least squares.
- Numerical integration of functions or ODE systems.
- Signal filtering and frequency analysis (FFT, Butterworth filters).
- Sparse matrix operations for large graphs or finite-element problems.
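As a small illustration of the numerical integration item above, the sketch below integrates a function with `quad` and solves a one-variable ODE with `solve_ivp`. The integrand, the ODE, and the tolerances are arbitrary choices made for this example because their exact answers are known; they are not taken from the text above.

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

# Definite integral of exp(-x^2) over the whole real line.
# The exact value is sqrt(pi), which makes the result easy to check.
value, abs_err = quad(lambda x: np.exp(-x**2), -np.inf, np.inf)

# Initial value problem dy/dt = -2y with y(0) = 1.
# The exact solution is y(t) = exp(-2t).
sol = solve_ivp(lambda t, y: -2.0 * y, t_span=(0.0, 1.0), y0=[1.0],
                rtol=1e-8, atol=1e-10)
y_end = sol.y[0, -1]  # solution value at t = 1

print(f"quad:      {value:.6f}  (sqrt(pi) = {np.sqrt(np.pi):.6f})")
print(f"solve_ivp: {y_end:.6f}  (exp(-2)  = {np.exp(-2.0):.6f})")
```

`quad` returns both the estimate and an estimate of its absolute error, so it is worth inspecting `abs_err` rather than discarding it.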
Common pitfalls
[!WARNING] "Use NumPy first": many operations people reach to SciPy for (e.g. matrix multiplication, basic statistics) are already in NumPy. Import SciPy only when you need a specialized algorithm.
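[!WARNING] Frozen vs. unfrozen distributions:
norm.pdf(0) uses the standard normal (loc=0, scale=1). Pass loc and scale to work with a different normal distribution, either on each call, norm.pdf(0, loc=5, scale=2), or by freezing the parameters first: norm(loc=5, scale=2).pdf(0).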
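The two calling styles in this warning can be compared side by side. This is a minimal sketch; the variable names are illustrative only.

```python
from scipy.stats import norm

# Unfrozen: loc and scale are passed on every call.
unfrozen = norm.pdf(0, loc=5, scale=2)

# Frozen: loc and scale are fixed once, then the object is reused.
dist = norm(loc=5, scale=2)
frozen = dist.pdf(0)

print(f"unfrozen={unfrozen:.6f}, frozen={frozen:.6f}")
print(f"mean={dist.mean():.1f}, std={dist.std():.1f}")
```

Freezing tends to be convenient when the same distribution is evaluated many times, since the parameters cannot be forgotten on a later call.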
[!TIP]
scipy.stats.describe(data) returns the number of observations, min/max, mean, variance, skewness, and kurtosis in one call.
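A short sketch of `describe` on a made-up array (the data values are chosen only for illustration):

```python
import numpy as np
from scipy.stats import describe

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# describe returns a named tuple; fields are accessed by attribute.
summary = describe(data)

print("n        =", summary.nobs)
print("min, max =", summary.minmax)
print("mean     =", summary.mean)
# The variance uses ddof=1 (sample variance) by default.
print("variance =", summary.variance)
print("skewness =", summary.skewness)
print("kurtosis =", summary.kurtosis)
```

Note that the default variance here differs from `np.var`, which uses `ddof=0` unless told otherwise.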
Hypothesis testing
from scipy.stats import ttest_ind, mannwhitneyu
import numpy as np
rng = np.random.default_rng(42)
group_a = rng.normal(loc=5.0, scale=1.0, size=30)
group_b = rng.normal(loc=5.5, scale=1.0, size=30)
t_stat, p_value = ttest_ind(group_a, group_b)
print(f"t={t_stat:.3f}, p={p_value:.4f}")
print("Significant at 0.05" if p_value < 0.05 else "Not significant")
Output:
t=-1.782, p=0.0800
Not significant
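The example imports `mannwhitneyu` without calling it. The sketch below shows how it could be applied to the same two groups, together with Welch's variant of the t-test. Adding Welch's test is a choice made for this sketch, not something the example above prescribes, and no output is shown because the values depend on the generated samples.

```python
import numpy as np
from scipy.stats import mannwhitneyu, ttest_ind

rng = np.random.default_rng(42)
group_a = rng.normal(loc=5.0, scale=1.0, size=30)
group_b = rng.normal(loc=5.5, scale=1.0, size=30)

# Welch's t-test: does not assume the two groups share a variance.
t_welch, p_welch = ttest_ind(group_a, group_b, equal_var=False)

# Mann-Whitney U test: rank-based, so it does not assume normality.
u_stat, p_mwu = mannwhitneyu(group_a, group_b, alternative="two-sided")

print(f"Welch:        t={t_welch:.3f}, p={p_welch:.4f}")
print(f"Mann-Whitney: U={u_stat:.1f}, p={p_mwu:.4f}")
```

When the samples may be non-normal or contain outliers, the rank-based test is often the safer default; with equal group sizes, Welch's statistic coincides with the pooled t statistic and only the degrees of freedom differ.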
Richer example: curve fitting
from scipy.optimize import curve_fit
import numpy as np
def exponential_decay(x, a, b):
    return a * np.exp(-b * x)
x_data = np.array([0, 1, 2, 3, 4, 5])
y_data = np.array([5.0, 3.1, 1.9, 1.2, 0.8, 0.5])
popt, pcov = curve_fit(exponential_decay, x_data, y_data, p0=[5, 0.5])
a, b = popt
print(f"Fitted: a={a:.3f}, b={b:.3f}")
print(f"Predicted y(6) = {exponential_decay(6, a, b):.3f}")
Output:
Fitted: a=4.942, b=0.476
Predicted y(6) = 0.311
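The second return value of `curve_fit`, `pcov`, is the estimated covariance matrix of the fitted parameters. A common way to turn it into one-standard-deviation uncertainties is to take the square root of its diagonal, sketched below on the same data. The name `perr` is only a convention used here, and no output is shown because the exact digits were not verified.

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential_decay(x, a, b):
    return a * np.exp(-b * x)

x_data = np.array([0, 1, 2, 3, 4, 5])
y_data = np.array([5.0, 3.1, 1.9, 1.2, 0.8, 0.5])

popt, pcov = curve_fit(exponential_decay, x_data, y_data, p0=[5, 0.5])

# One-standard-deviation uncertainty of each parameter,
# taken from the diagonal of the covariance matrix.
perr = np.sqrt(np.diag(pcov))

for name, value, err in zip(("a", "b"), popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

A large uncertainty relative to the parameter value is usually a sign that the model is poorly constrained by the data or that the initial guess `p0` was far off.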
Useful submodules at a glance
| Module | Example use |
|---|---|
| scipy.stats | Distributions, tests: ttest_ind, chi2_contingency, pearsonr |
| scipy.optimize | Root finding: fsolve; minimization: minimize; curve fit: curve_fit |
| scipy.integrate | Numerical integration: quad; ODE solver: solve_ivp |
| scipy.signal | Butterworth filter: butter + sosfilt; find_peaks |
| scipy.linalg | solve, eig, svd, cholesky (always built with BLAS/LAPACK, so it can be faster than np.linalg depending on how NumPy was installed) |
| scipy.sparse | CSR/CSC sparse matrices; scipy.sparse.linalg.spsolve |
| scipy.interpolate | interp1d, CubicSpline, griddata |
| scipy.ndimage | Image morphology, Gaussian filter, label connected regions |
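To make the scipy.signal row concrete, here is a minimal sketch of the butter + sosfilt combination. The sampling rate, the two tone frequencies, the filter order, and the cutoff are all assumptions picked for this example.

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 1000.0                      # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1.0 / fs)  # one second of samples

# A 5 Hz tone contaminated by a 200 Hz tone.
noisy = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)

# 4th-order low-pass Butterworth filter with a 30 Hz cutoff,
# in second-order sections for numerical stability.
sos = butter(4, 30, btype="low", fs=fs, output="sos")
filtered = sosfilt(sos, noisy)

# With 1000 samples at 1000 Hz, FFT bin k corresponds to k Hz.
before = np.abs(np.fft.rfft(noisy))[200]
after = np.abs(np.fft.rfft(filtered))[200]
print(f"200 Hz magnitude: before={before:.1f}, after={after:.3f}")
```

`sosfilt` is a causal filter and therefore delays the signal slightly; `scipy.signal.sosfiltfilt` is the zero-phase alternative when the whole signal is available up front.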