Neural Networks & Automatic Differentiation ============================================ Neural networks provide powerful interpolation capabilities with exact automatic differentiation. Unlike traditional methods that approximate derivatives numerically, neural networks can compute exact gradients through backpropagation, making them ideal for optimization and machine learning applications. 🧠 **Core Concepts** ------------------- **Automatic Differentiation (AutoDiff)** computes derivatives by applying the chain rule systematically to elementary operations. This provides machine-precision accuracy for derivatives, unlike numerical approximation methods. **Universal Function Approximation**: Neural networks can approximate any continuous function to arbitrary precision given sufficient capacity, making them extremely versatile interpolators. **Backpropagation**: The algorithm that efficiently computes gradients by propagating error signals backward through the network. 🔧 **Neural Network Interpolator** --------------------------------- The ``NeuralNetworkInterpolator`` combines the universal pydelt API with deep learning backends: .. code-block:: python from pydelt.interpolation import NeuralNetworkInterpolator import numpy as np # Create neural network interpolator nn_interp = NeuralNetworkInterpolator( hidden_layers=[64, 32, 16], # Network architecture activation='relu', # Activation function learning_rate=0.001, # Optimizer learning rate epochs=1000, # Training iterations backend='pytorch' # 'pytorch' or 'tensorflow' ) **Key Parameters**: - ``hidden_layers``: List of hidden layer sizes [64, 32] creates 2 hidden layers - ``activation``: 'relu', 'tanh', 'sigmoid', 'swish', 'gelu' - ``learning_rate``: Adam optimizer learning rate (0.001-0.01 typical) - ``epochs``: Training iterations (500-5000 depending on complexity) - ``backend``: 'pytorch' (default) or 'tensorflow' 🎯 **Example 1: Nonlinear Function Approximation** ------------------------------------------------- **Classic Example: Runge Function** The Runge function is notoriously difficult for polynomial interpolation but neural networks handle it well: .. code-block:: python import numpy as np import matplotlib.pyplot as plt from pydelt.interpolation import NeuralNetworkInterpolator, SplineInterpolator # Runge function: f(x) = 1 / (1 + 25x²) def runge_function(x): return 1 / (1 + 25 * x**2) def runge_derivative(x): return -50 * x / (1 + 25 * x**2)**2 # Training data (sparse sampling) x_train = np.linspace(-1, 1, 15) y_train = runge_function(x_train) # Neural network interpolator nn_interp = NeuralNetworkInterpolator( hidden_layers=[128, 64, 32], activation='tanh', # Good for smooth functions learning_rate=0.005, epochs=2000 ) nn_interp.fit(x_train, y_train) # Compare with spline spline = SplineInterpolator(smoothing=0.0) spline.fit(x_train, y_train) # Evaluation points x_test = np.linspace(-1, 1, 200) y_true = runge_function(x_test) dy_true = runge_derivative(x_test) # Predictions y_nn = nn_interp.predict(x_test) y_spline = spline.predict(x_test) # Derivatives (automatic vs numerical) nn_deriv_func = nn_interp.differentiate(order=1) spline_deriv_func = spline.differentiate(order=1) dy_nn = nn_deriv_func(x_test) dy_spline = spline_deriv_func(x_test) # Error analysis nn_func_error = np.sqrt(np.mean((y_nn - y_true)**2)) nn_deriv_error = np.sqrt(np.mean((dy_nn - dy_true)**2)) spline_func_error = np.sqrt(np.mean((y_spline - y_true)**2)) spline_deriv_error = np.sqrt(np.mean((dy_spline - dy_true)**2)) print("Function Approximation Errors:") print(f"Neural Network: {nn_func_error:.6f}") print(f"Spline: {spline_func_error:.6f}") print("\nDerivative Errors:") print(f"Neural Network: {nn_deriv_error:.6f}") print(f"Spline: {spline_deriv_error:.6f}") **Expected Results**: Neural networks typically achieve 10-100x better accuracy than splines for the Runge function, especially near the boundaries where polynomial methods struggle. 🌊 **Example 2: Fluid Dynamics - Velocity Field** ------------------------------------------------ **Application**: Reconstructing velocity fields from particle tracking data in fluid mechanics. .. code-block:: python # Simulate 2D fluid flow around a cylinder (potential flow) def potential_flow_velocity(x, y, U_inf=1.0, R=0.5): """Velocity field around a cylinder in cross-flow""" r_sq = x**2 + y**2 # Avoid singularity at origin r_sq = np.maximum(r_sq, 1e-10) # Velocity components for flow around cylinder u = U_inf * (1 - R**2 * (x**2 - y**2) / r_sq**2) v = U_inf * (-R**2 * 2 * x * y / r_sq**2) return u, v # Generate training data (sparse particle tracking) np.random.seed(42) n_particles = 200 x_particles = np.random.uniform(-2, 2, n_particles) y_particles = np.random.uniform(-2, 2, n_particles) # Remove particles inside cylinder mask = (x_particles**2 + y_particles**2) > 0.6**2 x_particles = x_particles[mask] y_particles = y_particles[mask] # Get velocity components u_true, v_true = potential_flow_velocity(x_particles, y_particles) # Add measurement noise u_measured = u_true + 0.05 * np.random.randn(len(u_true)) v_measured = v_true + 0.05 * np.random.randn(len(v_true)) # Prepare input data (x,y positions) and output data (u,v velocities) input_data = np.column_stack([x_particles, y_particles]) output_data = np.column_stack([u_measured, v_measured]) # Neural network for vector-valued function nn_flow = NeuralNetworkInterpolator( hidden_layers=[128, 128, 64], activation='swish', # Good for fluid dynamics learning_rate=0.002, epochs=3000 ) nn_flow.fit(input_data, output_data) # Create evaluation grid x_grid = np.linspace(-2, 2, 50) y_grid = np.linspace(-2, 2, 50) X, Y = np.meshgrid(x_grid, y_grid) # Remove points inside cylinder mask_grid = (X**2 + Y**2) > 0.6**2 x_eval = X[mask_grid] y_eval = Y[mask_grid] eval_points = np.column_stack([x_eval, y_eval]) # Predict velocity field velocity_pred = nn_flow.predict(eval_points) u_pred = velocity_pred[:, 0] v_pred = velocity_pred[:, 1] # Compute derivatives for vorticity analysis # ∂u/∂y and ∂v/∂x for vorticity ω = ∂v/∂x - ∂u/∂y deriv_func = nn_flow.differentiate(order=1) derivatives = deriv_func(eval_points) print(f"Reconstructed {len(eval_points)} velocity vectors from {len(x_particles)} measurements") print(f"Velocity field ready for vorticity and strain rate analysis") 📊 **Example 3: Time Series with Complex Dynamics** -------------------------------------------------- **Application**: Chaotic time series analysis (Lorenz attractor). .. code-block:: python from scipy.integrate import odeint # Lorenz system parameters def lorenz_system(state, t, sigma=10, rho=28, beta=8/3): x, y, z = state dxdt = sigma * (y - x) dydt = x * (rho - z) - y dzdt = x * y - beta * z return [dxdt, dydt, dzdt] # Generate Lorenz attractor data t = np.linspace(0, 20, 2000) initial_state = [1.0, 1.0, 1.0] trajectory = odeint(lorenz_system, initial_state, t) # Extract x-component time series x_series = trajectory[:, 0] # Subsample for training (simulate sparse measurements) indices = np.arange(0, len(t), 10) # Every 10th point t_train = t[indices] x_train = x_series[indices] # Add measurement noise x_noisy = x_train + 0.5 * np.random.randn(len(x_train)) # Neural network interpolator nn_chaos = NeuralNetworkInterpolator( hidden_layers=[256, 128, 64, 32], # Deep network for complex dynamics activation='gelu', # Good for chaotic systems learning_rate=0.001, epochs=4000 ) nn_chaos.fit(t_train, x_noisy) # Predict full time series x_pred = nn_chaos.predict(t) # Compute instantaneous rate of change rate_func = nn_chaos.differentiate(order=1) dx_dt_pred = rate_func(t) # True derivative from Lorenz equations dx_dt_true = 10 * (trajectory[:, 1] - trajectory[:, 0]) # Analysis reconstruction_error = np.sqrt(np.mean((x_pred - x_series)**2)) derivative_error = np.sqrt(np.mean((dx_dt_pred - dx_dt_true)**2)) print(f"Time series reconstruction error: {reconstruction_error:.3f}") print(f"Derivative reconstruction error: {derivative_error:.3f}") # Phase space reconstruction quality correlation = np.corrcoef(x_pred, x_series)[0, 1] print(f"Correlation with true attractor: {correlation:.4f}") ⚡ **Advantages of Neural Networks** ---------------------------------- **1. Exact Derivatives** - Automatic differentiation provides machine-precision gradients - No numerical approximation errors - Consistent accuracy across all derivative orders **2. Universal Approximation** - Can represent any continuous function - Handles highly nonlinear relationships - Scales to high-dimensional problems **3. Noise Robustness** - Implicit regularization through architecture - Dropout and batch normalization for stability - Learns underlying patterns despite measurement noise **4. Scalability** - GPU acceleration for large datasets - Batch processing for efficiency - Parallel computation of derivatives 🔧 **Advanced Configuration** ---------------------------- **Custom Architecture Design**: .. code-block:: python # Deep network for complex functions complex_nn = NeuralNetworkInterpolator( hidden_layers=[512, 256, 128, 64, 32], activation='swish', learning_rate=0.0005, epochs=5000, batch_size=64, dropout_rate=0.1 ) # Wide network for high-frequency components wide_nn = NeuralNetworkInterpolator( hidden_layers=[1024, 1024], activation='relu', learning_rate=0.002, epochs=2000 ) **Training Monitoring**: .. code-block:: python # Enable training progress tracking nn_interp = NeuralNetworkInterpolator( hidden_layers=[128, 64], epochs=1000, verbose=True, # Print training progress early_stopping=True, # Stop when validation loss plateaus validation_split=0.2 # Use 20% of data for validation ) **Backend Selection**: .. code-block:: python # PyTorch backend (default, recommended) nn_pytorch = NeuralNetworkInterpolator(backend='pytorch') # TensorFlow backend nn_tensorflow = NeuralNetworkInterpolator(backend='tensorflow') ⚠️ **Limitations & Considerations** ---------------------------------- **Computational Cost**: - Training time scales with network size and data complexity - GPU recommended for large networks (>1000 parameters) - Memory usage grows with batch size and network depth **Hyperparameter Sensitivity**: - Learning rate requires tuning (too high: instability, too low: slow convergence) - Architecture choice affects approximation quality - Overfitting possible with insufficient data **Reproducibility**: - Random initialization affects results - Set random seeds for reproducible results: .. code-block:: python import torch import numpy as np # Set seeds for reproducibility torch.manual_seed(42) np.random.seed(42) nn_interp = NeuralNetworkInterpolator(...) 🎓 **Best Practices** -------------------- **Architecture Guidelines**: 1. **Start simple**: Begin with [64, 32] hidden layers 2. **Go deeper for complexity**: Add layers for highly nonlinear functions 3. **Go wider for detail**: Increase layer sizes for high-frequency components 4. **Use appropriate activations**: 'relu' (general), 'tanh' (smooth), 'swish' (modern) **Training Tips**: 1. **Monitor convergence**: Use validation split to track overfitting 2. **Adjust learning rate**: Decrease if training is unstable 3. **Early stopping**: Prevent overfitting with patience parameter 4. **Data normalization**: Scale inputs to [-1, 1] or [0, 1] range **Derivative Accuracy**: - Neural network derivatives are exact (no approximation error) - Accuracy depends on function approximation quality - Higher-order derivatives may amplify approximation errors 🔗 **Next Steps** ---------------- Neural networks excel at univariate and simple multivariate problems. For advanced multivariate calculus operations, continue to: - **Multivariate Calculus**: Gradients, Jacobians, and Hessians for vector-valued functions - **Stochastic Computing**: Probabilistic neural networks with uncertainty quantification The automatic differentiation capabilities of neural networks become especially powerful when combined with multivariate operations and stochastic link functions.