Physics-Informed Neural Networks (PINNs): Theory & applications
Physics-informed neural networks represent a groundbreaking fusion of deep learning and scientific computing, enabling machines to learn from both data and the fundamental laws of physics. Unlike traditional neural networks that rely solely on data, PINNs incorporate physical constraints directly into the learning process, making them particularly powerful for solving complex problems in engineering, physics, and applied mathematics.

This innovative approach addresses a critical limitation of conventional machine learning: the tendency to produce predictions that violate known physical laws. By embedding differential equations and boundary conditions into the training objective, physics-informed neural networks ensure that their predictions remain physically consistent while maintaining the flexibility and generalization capabilities of deep learning.
1. Understanding physics-informed neural networks
What are PINNs?
Physics-informed neural networks are deep learning models that incorporate physical laws, typically expressed as partial differential equations (PDEs), directly into the training process. The key innovation lies in the loss function, which combines data-driven terms with physics-based penalties that enforce compliance with governing equations.
At their core, PINNs are universal function approximators that learn solutions to PDEs by minimizing a composite loss function. This loss function typically consists of three components:
$$\mathcal{L}_{\text{total}}
= \mathcal{L}_{\text{data}}
+ \lambda_{\text{PDE}} \, \mathcal{L}_{\text{PDE}}
+ \lambda_{\text{BC}} \, \mathcal{L}_{\text{BC}}$$
where \(\mathcal{L}_{\text{data}}\) measures the fit to observed data, \(\mathcal{L}_{\text{PDE}}\) enforces the governing physical equations, and \(\mathcal{L}_{\text{BC}}\) ensures boundary and initial conditions are satisfied. The parameters \(\lambda_{\text{PDE}}\) and \(\lambda_{\text{BC}}\) balance these competing objectives.
The mathematical foundation
Consider a general PDE of the form:
$$ \mathcal{F}[u(x,t); \lambda] = 0, \quad x \in \Omega, \quad t \in [0,T] $$
where \(u(x,t)\) represents the solution field, \(\mathcal{F}\) is a differential operator, and \(\lambda\) denotes physical parameters. A physics-informed neural network approximates \(u(x,t)\) with a neural network \(u_\theta(x,t)\), where \(\theta\) represents the network parameters.
The physics-informed loss for the PDE is computed by automatic differentiation:
$$\mathcal{L}_{\text{PDE}} = \frac{1}{N_{\text{PDE}}} \sum_{i=1}^{N_{\text{PDE}}}
\left| \mathcal{F}\big[u_\theta(x_i, t_i); \lambda \big] \right|^2 $$
This formulation allows the network to learn solutions without requiring large amounts of training data, as the PDE residual itself supplies an effectively unlimited source of training signal through collocation points sampled within the domain.
Key advantages over traditional methods
Physics-informed neural networks offer several compelling advantages compared to conventional numerical methods and pure data-driven approaches:
Mesh-free computation: Unlike finite element or finite difference methods, PINNs don’t require structured grids, making them ideal for complex geometries and high-dimensional problems.
Data efficiency: By leveraging physical laws, PINNs can learn accurate solutions from sparse or noisy data, addressing a major limitation of traditional machine learning.
Inverse problem solving: PINNs naturally handle inverse problems, simultaneously learning unknown parameters in the governing equations alongside the solution field.
Continuous representations: Neural networks provide continuous, differentiable approximations that can be evaluated at arbitrary points without interpolation errors.
2. The architecture and implementation of PINNs
Network architecture design
The architecture of a physics-informed neural network typically consists of fully connected layers with smooth activation functions. The choice of activation is crucial because automatic differentiation is used extensively to compute derivatives for the PDE loss.
Common activation functions include:
- tanh: \(\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}\), smooth and bounded, widely used in PINNs
- sine: \(\sin(x)\), a periodic activation useful for problems with periodic solutions
- softplus: \(\log(1 + e^x)\), a smooth approximation of ReLU
A typical PINN architecture for a 2D problem might have 8-10 hidden layers with 20-50 neurons per layer. The input layer takes spatial and temporal coordinates \((x, y, t)\), and the output layer produces the solution field \(u\).
Implementation example
Here’s a practical implementation of a PINN for solving the one-dimensional heat equation:
```python
import numpy as np
import torch
import torch.nn as nn

class PINN(nn.Module):
    def __init__(self, layers):
        super(PINN, self).__init__()
        self.layers = nn.ModuleList()
        # Build the fully connected network
        for i in range(len(layers) - 1):
            self.layers.append(nn.Linear(layers[i], layers[i+1]))

    def forward(self, x):
        for i in range(len(self.layers) - 1):
            x = torch.tanh(self.layers[i](x))
        x = self.layers[-1](x)  # No activation on output layer
        return x

# Heat equation: u_t = alpha * u_xx
class HeatPINN:
    def __init__(self, layers, alpha=0.01):
        self.model = PINN(layers)
        self.alpha = alpha
        self.optimizer = torch.optim.Adam(self.model.parameters(), lr=1e-3)

    def pde_loss(self, x, t):
        # Enable gradient computation with respect to the inputs
        x.requires_grad = True
        t.requires_grad = True
        # Concatenate inputs
        inputs = torch.cat([x, t], dim=1)
        u = self.model(inputs)
        # Compute derivatives using automatic differentiation
        u_t = torch.autograd.grad(u, t, grad_outputs=torch.ones_like(u),
                                  create_graph=True)[0]
        u_x = torch.autograd.grad(u, x, grad_outputs=torch.ones_like(u),
                                  create_graph=True)[0]
        u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x),
                                   create_graph=True)[0]
        # PDE residual: u_t - alpha * u_xx = 0
        pde_residual = u_t - self.alpha * u_xx
        return torch.mean(pde_residual**2)

    def boundary_loss(self, x_bc, t_bc, u_bc):
        inputs = torch.cat([x_bc, t_bc], dim=1)
        u_pred = self.model(inputs)
        return torch.mean((u_pred - u_bc)**2)

    def train_step(self, x_pde, t_pde, x_bc, t_bc, u_bc):
        self.optimizer.zero_grad()
        # Compute losses
        loss_pde = self.pde_loss(x_pde, t_pde)
        loss_bc = self.boundary_loss(x_bc, t_bc, u_bc)
        # Total loss: weight boundary/initial conditions more heavily
        loss = loss_pde + 10.0 * loss_bc
        # Backpropagation
        loss.backward()
        self.optimizer.step()
        return loss.item(), loss_pde.item(), loss_bc.item()

# Example usage
layers = [2, 32, 32, 32, 1]  # Input: (x, t), Output: u
pinn = HeatPINN(layers, alpha=0.01)

# Generate collocation points for the PDE residual
N_pde = 10000
x_pde = torch.rand(N_pde, 1) * 2.0 - 1.0  # x in [-1, 1]
t_pde = torch.rand(N_pde, 1)              # t in [0, 1]

# Initial condition u(x, 0) = sin(pi * x), enforced through boundary_loss
N_bc = 100
x_bc = torch.rand(N_bc, 1) * 2.0 - 1.0
t_bc = torch.zeros(N_bc, 1)
u_bc = torch.sin(np.pi * x_bc)

# Training loop
for epoch in range(5000):
    loss, loss_pde, loss_bc = pinn.train_step(x_pde, t_pde, x_bc, t_bc, u_bc)
    if epoch % 500 == 0:
        print(f"Epoch {epoch}: Loss = {loss:.6f}, PDE = {loss_pde:.6f}, BC = {loss_bc:.6f}")
```
Training strategies and optimization
Training physics-informed neural networks presents unique challenges compared to standard deep learning tasks. The multi-objective nature of the loss function requires careful tuning and specialized optimization strategies.
Loss balancing: The relative magnitudes of different loss components can vary by orders of magnitude. Adaptive weighting schemes help maintain balanced gradients during training (a sketch of one such scheme follows this list).
Curriculum learning: Starting with simpler problems or fewer collocation points and gradually increasing complexity can improve convergence.
Multi-stage training: Alternating between optimizing data fit and physics consistency can lead to better final solutions.
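As a concrete illustration of the loss-balancing point, here is a minimal sketch of one common gradient-based weighting heuristic, in the spirit of the learning-rate annealing schemes proposed in the PINN literature. The function name `update_bc_weight`, the moving-average factor, and the exact statistics used are illustrative choices, not a fixed recipe:

```python
import torch

def update_bc_weight(model, loss_pde, loss_bc, lambda_bc, beta=0.9):
    # Compare gradient magnitudes of the two loss terms with respect to the
    # network parameters, then nudge lambda_bc so the boundary term is not
    # drowned out by the PDE residual term during training.
    grads_pde = torch.autograd.grad(loss_pde, model.parameters(),
                                    retain_graph=True, allow_unused=True)
    grads_bc = torch.autograd.grad(loss_bc, model.parameters(),
                                   retain_graph=True, allow_unused=True)
    max_pde = max(g.abs().max() for g in grads_pde if g is not None)
    mean_bc = torch.stack([g.abs().mean() for g in grads_bc
                           if g is not None]).mean()
    lambda_hat = (max_pde / (mean_bc + 1e-8)).item()
    # Exponential moving average keeps the weight from oscillating
    return beta * lambda_bc + (1 - beta) * lambda_hat
```

In practice such an update would be called every few hundred iterations inside the training loop, with the returned value replacing the fixed weight (the `10.0` in the earlier example).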
3. Neural ordinary differential equations and their connection to PINNs
Understanding neural ODEs
Neural ordinary differential equations represent another paradigm in scientific machine learning where the dynamics of hidden states are modeled as continuous-time ODEs. While PINNs embed known physics into the loss function, neural ODE architectures treat the network itself as a dynamical system.
A neural ODE defines the hidden state evolution as:
$$ \frac{dh(t)}{dt} = f_\theta(h(t), t) $$
where \(h(t)\) is the hidden state at time \(t\), and \(f_\theta\) is a neural network parameterized by \(\theta\). The output is obtained by solving this ODE:
$$ h(t_1) = h(t_0) + \int_{t_0}^{t_1} f_\theta(h(t), t) dt $$
Bridging PINNs and neural ODEs
The connection between physics-informed neural networks and neural ordinary differential equations lies in their treatment of continuous dynamics. Both frameworks recognize that many physical phenomena evolve continuously in time, and discrete-time models may introduce artifacts or limitations.
PINNs with temporal components can be viewed as learning solutions to ODEs or time-dependent PDEs, while neural ODE models learn the vector field that generates the dynamics. When the governing equations are known, PINNs are typically preferred. When the dynamics must be learned from data alone, neural ordinary differential equations offer more flexibility.
Practical implementation of neural ODEs
Here’s an example implementing a neural ODE for a simple dynamical system. A small fixed-step Runge-Kutta integrator is written directly in PyTorch so that gradients can flow through the solver back to the network during training:
```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    def __init__(self, hidden_dim=32):
        super(ODEFunc, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 2)
        )

    def forward(self, t, y):
        # y is the state, t is time (unused here: autonomous system)
        return self.net(y)

class NeuralODE(nn.Module):
    def __init__(self, ode_func):
        super(NeuralODE, self).__init__()
        self.ode_func = ode_func

    def forward(self, y0, t_span):
        # y0: initial condition, shape (1, 2)
        # t_span: 1D tensor of time points at which to evaluate the state
        # A fixed-step RK4 integrator in pure PyTorch keeps the computation
        # differentiable, so loss.backward() reaches the ode_func parameters
        # (an external solver such as scipy's solve_ivp would break the graph).
        ys = [y0]
        y = y0
        for i in range(len(t_span) - 1):
            t, dt = t_span[i], t_span[i + 1] - t_span[i]
            k1 = self.ode_func(t, y)
            k2 = self.ode_func(t + dt / 2, y + dt / 2 * k1)
            k3 = self.ode_func(t + dt / 2, y + dt / 2 * k2)
            k4 = self.ode_func(t + dt, y + dt * k3)
            y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            ys.append(y)
        return torch.cat(ys, dim=0)  # shape (len(t_span), 2)

# Example: learning a decaying spiral
ode_func = ODEFunc(hidden_dim=32)
neural_ode = NeuralODE(ode_func)

# Generate training data (true spiral trajectory)
t_data = torch.linspace(0, 10, 100)
y_true = torch.stack([
    torch.cos(t_data) * torch.exp(-0.1 * t_data),
    torch.sin(t_data) * torch.exp(-0.1 * t_data)
], dim=1)

# Training
optimizer = torch.optim.Adam(ode_func.parameters(), lr=1e-3)
y0 = torch.tensor([[1.0, 0.0]])

for epoch in range(1000):
    optimizer.zero_grad()
    y_pred = neural_ode(y0, t_data)           # forward pass through the solver
    loss = torch.mean((y_pred - y_true)**2)   # trajectory mismatch
    loss.backward()                           # gradients flow through RK4 steps
    optimizer.step()
    if epoch % 100 == 0:
        print(f"Epoch {epoch}: Loss = {loss.item():.6f}")
```
4. Advanced topics: Neural operators and Fourier neural operators
The neural operator framework
While PINNs learn solutions for specific initial and boundary conditions, neural operators take a more ambitious approach: learning mappings between infinite-dimensional function spaces. A neural operator learns to map any input function (e.g., initial conditions) to the corresponding solution function.
Mathematically, a neural operator \(\mathcal{G}_\theta\) learns the mapping:
$$ \mathcal{G}_\theta: \mathcal{A} \rightarrow \mathcal{U} $$
where \(\mathcal{A}\) is the space of input functions (parameters, forcing terms, or initial conditions) and \(\mathcal{U}\) is the space of solution functions. This is particularly powerful for parametric PDEs where you need solutions for many different configurations.
Fourier neural operator architecture
The Fourier neural operator (FNO) is a breakthrough architecture in the neural operator family. It leverages the Fourier transform to efficiently capture global dependencies in the solution field, making it particularly effective for solving PDEs across different resolutions.
The key innovation is the Fourier layer, which operates in spectral space:
$$ (Kv)(x) = \mathcal{F}^{-1}(R \cdot \mathcal{F}(v))(x) $$
where \(\mathcal{F}\) denotes the Fourier transform, \(R\) is a learnable weight matrix in frequency space, and \(v\) is the input function. This allows the network to learn correlations across the entire spatial domain efficiently.
FNO implementation for PDE solving
Here’s a simplified implementation of a Fourier neural operator layer, together with a small 1D FNO that stacks three of them:
```python
import numpy as np
import torch
import torch.nn as nn
import torch.fft as fft

class SpectralConv1d(nn.Module):
    def __init__(self, in_channels, out_channels, modes):
        super(SpectralConv1d, self).__init__()
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.modes = modes  # Number of Fourier modes to keep
        # Learnable weights in Fourier space (complex-valued)
        scale = 1.0 / (in_channels * out_channels)
        self.weights = nn.Parameter(
            scale * torch.rand(in_channels, out_channels, modes,
                               dtype=torch.cfloat)
        )

    def forward(self, x):
        # x shape: (batch, channels, spatial_points)
        batch_size = x.shape[0]
        # Compute FFT along the spatial dimension
        x_ft = fft.rfft(x, dim=-1)
        # Multiply the retained Fourier modes by the learned weights
        out_ft = torch.zeros(batch_size, self.out_channels,
                             x_ft.shape[-1], dtype=torch.cfloat,
                             device=x.device)
        out_ft[:, :, :self.modes] = torch.einsum(
            "bix,iox->box",
            x_ft[:, :, :self.modes],
            self.weights
        )
        # Inverse FFT back to physical space
        x_reconstructed = fft.irfft(out_ft, n=x.shape[-1], dim=-1)
        return x_reconstructed

class FNO1d(nn.Module):
    def __init__(self, modes, width):
        super(FNO1d, self).__init__()
        self.modes = modes
        self.width = width
        # Lifting: map input to a higher-dimensional channel space
        self.fc0 = nn.Linear(2, width)  # Input: (x, a(x))
        # Fourier layers
        self.conv0 = SpectralConv1d(width, width, modes)
        self.conv1 = SpectralConv1d(width, width, modes)
        self.conv2 = SpectralConv1d(width, width, modes)
        # Local linear transformations
        self.w0 = nn.Linear(width, width)
        self.w1 = nn.Linear(width, width)
        self.w2 = nn.Linear(width, width)
        # Projection: map back to the output dimension
        self.fc1 = nn.Linear(width, 128)
        self.fc2 = nn.Linear(128, 1)

    def forward(self, x):
        # x: (batch, spatial_points, 2)
        x = self.fc0(x)
        x = x.permute(0, 2, 1)  # (batch, width, spatial_points)
        # Fourier layers with pointwise linear skip connections
        x1 = self.conv0(x)
        x2 = self.w0(x.permute(0, 2, 1))
        x = torch.relu(x1 + x2.permute(0, 2, 1))
        x1 = self.conv1(x)
        x2 = self.w1(x.permute(0, 2, 1))
        x = torch.relu(x1 + x2.permute(0, 2, 1))
        x1 = self.conv2(x)
        x2 = self.w2(x.permute(0, 2, 1))
        x = x1 + x2.permute(0, 2, 1)
        x = x.permute(0, 2, 1)  # (batch, spatial_points, width)
        x = self.fc1(x)
        x = torch.relu(x)
        x = self.fc2(x)
        return x

# Example usage
model = FNO1d(modes=16, width=64)
x_grid = torch.linspace(0, 1, 128).unsqueeze(0).unsqueeze(-1)  # (1, 128, 1)
input_function = torch.sin(2 * np.pi * x_grid)                 # a(x) = sin(2*pi*x)
input_data = torch.cat([x_grid, input_function], dim=-1)       # (1, 128, 2)
output = model(input_data)
print(f"Output shape: {output.shape}")
```
Comparing approaches: PINNs vs neural operators
Physics-informed neural networks and neural operators serve different purposes in scientific machine learning:
PINNs excel when: You need high-accuracy solutions for specific problems, have limited data, want to incorporate known physics directly, or need to solve inverse problems with parameter discovery.
Neural operators excel when: You need to solve the same PDE family many times with different parameters, require real-time inference, work with multi-scale problems, or need resolution-invariant predictions.
The choice between these approaches depends on your specific application. For one-off high-accuracy solutions, PINNs are typically preferred. For surrogate modeling where speed is critical, Fourier neural operators and other neural operator architectures shine.
5. Real-world applications and use cases
Fluid dynamics and computational fluid mechanics
Physics-informed neural networks have shown remarkable success in fluid dynamics problems, where traditional computational fluid dynamics (CFD) methods can be prohibitively expensive. PINNs can solve the Navier-Stokes equations, which govern fluid flow:
$$ \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\frac{1}{\rho}\nabla p + \nu \nabla^2 \mathbf{u} $$
$$ \nabla \cdot \mathbf{u} = 0 $$
where \(\mathbf{u}\) is velocity, \(p\) is pressure, \(\rho\) is density, and \(\nu\) is kinematic viscosity.
Applications include predicting flow around aircraft wings, optimizing cardiovascular flows for medical applications, and modeling ocean currents. The ability to incorporate sparse sensor data while respecting physical constraints makes PINNs particularly valuable for these applications.
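To make the physics loss concrete, the following is a minimal sketch of how the 2D incompressible Navier-Stokes residuals above could be assembled with automatic differentiation. It assumes a network `model` that maps \((x, y, t)\) to \((u, v, p)\); the function name and the default parameter values are illustrative only, not part of any library API:

```python
import torch

def navier_stokes_residuals(model, x, y, t, rho=1.0, nu=0.01):
    # x, y, t: column tensors of collocation coordinates
    x.requires_grad_(True); y.requires_grad_(True); t.requires_grad_(True)
    out = model(torch.cat([x, y, t], dim=1))
    u, v, p = out[:, 0:1], out[:, 1:2], out[:, 2:3]

    def grad(f, var):
        return torch.autograd.grad(f, var, grad_outputs=torch.ones_like(f),
                                   create_graph=True)[0]

    u_t, u_x, u_y = grad(u, t), grad(u, x), grad(u, y)
    v_t, v_x, v_y = grad(v, t), grad(v, x), grad(v, y)
    p_x, p_y = grad(p, x), grad(p, y)
    u_xx, u_yy = grad(u_x, x), grad(u_y, y)
    v_xx, v_yy = grad(v_x, x), grad(v_y, y)

    # Momentum residuals (x and y components)
    r_u = u_t + u * u_x + v * u_y + p_x / rho - nu * (u_xx + u_yy)
    r_v = v_t + u * v_x + v * v_y + p_y / rho - nu * (v_xx + v_yy)
    # Continuity (incompressibility) residual
    r_div = u_x + v_y
    return r_u, r_v, r_div
```

The mean squares of these three residuals would be added to the data and boundary terms of the composite loss, just as in the heat-equation example.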
Heat transfer and thermal management
Heat conduction problems are classical applications for physics-informed neural networks. The heat equation:
$$ \frac{\partial u}{\partial t} = \alpha \nabla^2 u $$
appears in diverse contexts from electronics cooling to climate modeling. PINNs can efficiently solve inverse heat transfer problems, such as identifying material properties or heat source locations from temperature measurements.
In electronics thermal management, PINNs help design cooling systems by predicting temperature distributions across circuit boards. The mesh-free nature of PINNs makes them ideal for complex geometries typical in modern electronics.
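A hedged sketch of the inverse-problem setup mentioned above: the thermal diffusivity \(\alpha\) is treated as a trainable parameter and recovered jointly with the temperature field from measurements. It reuses the `PINN` network class from section 2; the class name `InverseHeatPINN` and the log-parameterization are illustrative choices made here for the example:

```python
import torch

class InverseHeatPINN:
    def __init__(self, layers):
        self.model = PINN(layers)  # network class from the earlier example
        # Log-parameterization keeps the learned diffusivity positive
        self.log_alpha = torch.nn.Parameter(torch.tensor(0.0))
        self.optimizer = torch.optim.Adam(
            list(self.model.parameters()) + [self.log_alpha], lr=1e-3)

    def pde_loss(self, x, t):
        x.requires_grad_(True); t.requires_grad_(True)
        u = self.model(torch.cat([x, t], dim=1))
        ones = torch.ones_like(u)
        u_t = torch.autograd.grad(u, t, grad_outputs=ones, create_graph=True)[0]
        u_x = torch.autograd.grad(u, x, grad_outputs=ones, create_graph=True)[0]
        u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x),
                                   create_graph=True)[0]
        alpha = torch.exp(self.log_alpha)  # current estimate of diffusivity
        return torch.mean((u_t - alpha * u_xx) ** 2)

    def data_loss(self, x_obs, t_obs, u_obs):
        # Fit to measured temperatures at sensor locations
        u_pred = self.model(torch.cat([x_obs, t_obs], dim=1))
        return torch.mean((u_pred - u_obs) ** 2)
```

During training, the total loss would combine `pde_loss` on collocation points with `data_loss` on the measured temperatures, exactly as in the forward example; the optimizer then updates the network weights and `log_alpha` together.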
Seismic imaging and geophysics
The geosciences have embraced scientific machine learning for seismic wave propagation and subsurface imaging. PINNs solve the wave equation:
$$ \frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u $$
where \(c\) is the wave speed, which varies with subsurface properties. By incorporating seismic sensor data, PINNs can reconstruct subsurface structures and identify oil and gas reservoirs.
This application demonstrates PINNs’ strength in inverse problems: simultaneously learning the wave field and unknown subsurface parameters from sparse surface measurements.
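To illustrate, below is a minimal sketch of the acoustic wave-equation residual in one spatial dimension, showing how the second-order time derivative is obtained from two successive autograd calls. The network `model` mapping \((x, t)\) to \(u\) and the constant wave speed are simplifying assumptions for brevity; realistic seismic problems are 2D or 3D with spatially varying \(c\):

```python
import torch

def wave_residual(model, x, t, c=1.0):
    # Residual of u_tt = c^2 * u_xx at collocation points (x, t)
    x.requires_grad_(True); t.requires_grad_(True)
    u = model(torch.cat([x, t], dim=1))
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, grad_outputs=ones, create_graph=True)[0]
    u_tt = torch.autograd.grad(u_t, t, grad_outputs=torch.ones_like(u_t),
                               create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, grad_outputs=ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, grad_outputs=torch.ones_like(u_x),
                               create_graph=True)[0]
    # For the inverse (velocity-estimation) problem, c itself can be
    # made a trainable parameter, as in the heat-equation sketch above.
    return u_tt - c**2 * u_xx
```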
Biomechanics and medical applications
Medical applications of physics-informed neural networks include modeling blood flow in arteries, predicting tumor growth, and simulating drug diffusion in tissues. These problems often involve complex geometries, patient-specific parameters, and limited measurement data, all areas where PINNs excel.
For cardiovascular modeling, PINNs can predict blood pressure and flow patterns from non-invasive measurements, potentially improving diagnosis of vascular diseases. The ability to incorporate medical imaging data while ensuring physical consistency is particularly valuable.
Materials science and structural mechanics
In structural engineering and materials science, PINNs solve elasticity equations and predict material behavior under stress. Applications range from optimizing building designs to predicting material failure.
The ability to handle inverse problems makes PINNs valuable for material characterization: determining material properties from mechanical tests. This is particularly useful for new composite materials where properties aren’t well established.
6. Challenges and future directions
Current limitations
Despite their promise, physics-informed neural networks face several challenges that active research seeks to address:
Training difficulties: The multi-objective loss function can lead to optimization challenges, with different terms competing during training. Balancing physics constraints with data fitting remains an art requiring problem-specific tuning.
Computational cost: While PINNs avoid meshing, training can be computationally expensive, especially for high-dimensional problems or stiff PDEs. Each training iteration requires computing multiple derivatives through automatic differentiation.
Accuracy for complex problems: For highly nonlinear or chaotic systems, PINNs may struggle to achieve the accuracy of specialized numerical methods. Turbulent flows and shock waves present particular challenges.
Limited theoretical guarantees: Unlike traditional numerical methods with well-established convergence theory, PINNs lack strong theoretical guarantees about solution accuracy and uniqueness.
Emerging research directions
The field of scientific machine learning continues to evolve rapidly, with several promising directions:
Adaptive sampling: Developing intelligent strategies to select collocation points where the PDE residual is large, improving efficiency and accuracy (a minimal sketch follows this list).
Architecture innovations: Exploring specialized architectures like Fourier neural operators and attention-based models for improved performance on specific problem classes.
Multi-fidelity learning: Combining low-fidelity numerical simulations with sparse high-fidelity data to train more accurate models efficiently.
Uncertainty quantification: Incorporating Bayesian approaches and ensemble methods to quantify prediction uncertainty, crucial for safety-critical applications.
Domain decomposition: Developing methods to partition large problems into smaller subproblems that can be solved independently and combined, enabling scaling to massive systems.
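For the adaptive sampling direction above, a residual-based scheme might look like the sketch below. It assumes a `residual_fn` that returns pointwise PDE residuals (for example, \(u_t - \alpha u_{xx}\) evaluated with autograd, as in the heat-equation example); the candidate-pool sizes and the domain bounds are arbitrary illustrative values:

```python
import torch

def adaptive_collocation(residual_fn, n_candidates=50000, n_keep=5000):
    # Draw a large candidate pool over the domain used earlier:
    # x in [-1, 1], t in [0, 1]
    x = torch.rand(n_candidates, 1) * 2.0 - 1.0
    t = torch.rand(n_candidates, 1)
    # Rank candidates by residual magnitude and keep the points where
    # the current network violates the physics the most
    r = residual_fn(x, t).abs().flatten().detach()
    idx = torch.topk(r, n_keep).indices
    return x[idx].detach(), t[idx].detach()
```

Resampling with such a routine every few thousand iterations concentrates training effort where the physics loss is worst, which is the basic idea behind most adaptive collocation strategies.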
Integration with traditional methods
Rather than replacing classical numerical methods, the future likely involves hybrid approaches that leverage the strengths of both paradigms. PINNs could provide rapid initial guesses for traditional solvers, handle complex geometries while finite elements handle simple regions, or serve as reduced-order models for large-scale simulations.
The integration of neural operators with traditional PDE solvers offers particularly exciting possibilities. Fourier neural operator models trained on simulation data can provide real-time predictions for design optimization, while traditional methods provide training data and accuracy verification.
7. Conclusion
Physics-informed neural networks represent a paradigm shift in how we approach computational science and engineering problems. By seamlessly integrating data-driven learning with fundamental physical laws, PINNs offer a powerful framework that addresses limitations of both traditional numerical methods and pure machine learning approaches. Their ability to work with sparse data, handle complex geometries without meshing, and solve inverse problems makes them invaluable across diverse applications from fluid dynamics to medical imaging.
As the field of scientific machine learning continues to mature, physics-informed neural networks, alongside related approaches like neural ordinary differential equations and Fourier neural operators, are poised to become standard tools in the computational scientist’s toolkit. While challenges remain in training stability, computational efficiency, and theoretical guarantees, ongoing research is rapidly addressing these limitations. The future of computational science lies not in replacing traditional methods, but in intelligently combining physics-based modeling with modern deep learning to solve problems that were previously intractable.