This video was made using PyVista; an implementation can be found in Example_1_multiplot_optim. It verifies that the solution of the approximation, the SUPG parameters (weights), the local loss values and the gradient are actually updated in the loop, and thus that all components work as expected. Not all of the windows plotted here will be used in later examinations, since, for example, the weights depend linearly on the gradients.
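A minimal, self-contained sketch of such a multiplot update loop, using a stand-in plane mesh and dummy data (the project's actual grid and update logic differ):

import numpy as np
import pyvista as pv
# stand-in for the FE plotting grid; four windows as in the video
base = pv.Plane(i_resolution=10, j_resolution=10)
fields = ['solution', 'weights', 'loss', 'gradient']
grids = {name: base.copy() for name in fields}
pl = pv.Plotter(shape=(2, 2))
for k, name in enumerate(fields):
    grids[name].cell_data[name] = np.zeros(grids[name].n_cells)
    pl.subplot(k // 2, k % 2)
    pl.add_mesh(grids[name], scalars=name, show_edges=True)
    pl.add_text(name)
pl.show(interactive_update=True, auto_close=False)
for step in range(100):
    for name in fields:
        # dummy update; the real loop writes the current FE arrays here
        grids[name].cell_data[name][:] = np.random.rand(base.n_cells)
    pl.update()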
Problem:
The convection-diffusion problem considered here is of the form $-\varepsilon\Delta u + \mathbf{b}\cdot\nabla u + cu = f$ on the unit square with Dirichlet boundary conditions, where $\mathbf{b} = (b_1, b_2)$.
The initial choice of weights is $0.1$ on all cells.
The loss function is not evaluated on the Dirichlet boundary.
Observations:
1. The weights were initialized a bit too high, which can be seen in the first window: the solution is flatter than it should be. Once the loop starts, the weights in the region of the layer are reduced, which leads to steeper gradients in the solution.
2. Since there is no boundary evaluation, the clearly visible errors at the boundary do not contribute to the local loss, which remains zero there. The optimization loop nevertheless still mainly alters the values at the layers near the boundary.
3. The gradient vanishes everywhere except close to the boundary, so the weights away from the boundary do not change. Taking point 1 into consideration, it is unlikely that the choice of weights in these areas is optimal. We will introduce a regularization term later to mitigate this, and possibly experiment with other perturbation methods in the loss function.
4. Spurious oscillations can still be seen at the boundary. It remains to check whether including boundary evaluation makes a difference.
Video demo: SGD-optim loop
2. Evaluation of the boundary
This is a follow-up on the questions arising from the observations above. Mainly, we want to check whether enabling boundary evaluation can resolve the oscillations in the layers. Here, a PyTorch SGD optimization loop is implemented which is equivalent to the gradient descent from Example 1, as shown in Test 2. Two supg.data objects are initialized with the same data, except that one has boundary_eval turned on and the other does not.
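A minimal sketch of this setup (the keyword name boundary_eval is taken from the text; the exact signature of supg.data is assumed), together with the plain-SGD update that makes the loop equivalent to gradient descent:

import torch
from supg.sp_problems import dat1
from supg import supg
# two identical problem objects, differing only in the boundary-evaluation flag
sd_off = supg.data(*dat1, boundary_eval=False)
sd_on = supg.data(*dat1, boundary_eval=True)
# torch.optim.SGD without momentum performs the plain update w <- w - lr * grad
w = torch.full((100,), 0.1, requires_grad=True)  # 100 cells assumed
opt = torch.optim.SGD([w], lr=1e-2)
for _ in range(10):
    opt.zero_grad()
    loss = (w ** 2).sum()  # placeholder for the assembled local loss
    loss.backward()
    opt.step()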
Observations
Boundary evaluation leads to drastically different results in the solution. We can see more smearing, and the prediction overshoots while trying to compensate for the large loss term in the boundary layer.
Conjecture
If we have inner layers, as in example problem 4, removing boundary evaluation might not yield satisfactory results, because the local loss term in the layer might dominate the optimization algorithm. Excluding this area via a custom meshtag and a UFL measure might mitigate the problem; see the sketch below. To be tested in the next release.
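One way to sketch such a restricted measure in dolfinx (the layer position x = 0.5 and the strip width are hypothetical):

import numpy as np
import ufl
import dolfinx.mesh as dmesh
from supg.sp_problems import domain
tdim = domain.topology.dim
num_cells = domain.topology.index_map(tdim).size_local + domain.topology.index_map(tdim).num_ghosts
# mark cells outside a strip around the assumed inner layer at x = 0.5
def outside_layer(x):
    return np.abs(x[0] - 0.5) > 0.1
cells = dmesh.locate_entities(domain, tdim, outside_layer)
values = np.zeros(num_cells, dtype=np.int32)
values[cells] = 1
cell_tags = dmesh.meshtags(domain, tdim, np.arange(num_cells, dtype=np.int32), values)
# dx(1) integrates only over the cells away from the layer
dx = ufl.Measure('dx', domain=domain, subdomain_data=cell_tags)
# the local loss form would then use dx(1) instead of the plain dx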
Video demo: Boundary_eval compare
3. Fully connected network vs. gradient descent
Here we used the same example as above and trained a fully connected neural network for 1,500 iterations with randomly initialized weights. After that, another supg.data object is initialized with the weights from the last forward pass. The first object continues to be optimized by the neural network, the second by standard gradient descent.
Implementation details
The loss term of the neural network has an added regularization term, which is simply an MSE loss with target value $0$ for each weight.
The neural network takes tabulated input data (eps, b_1, b_2, c, f), averaged over all cells, and runs it through a pooling layer with 5 outputs. In this case, these are just the values of the functions in the problem data, since they are constant.
The data is then passed through two fully connected hidden layers with 16 nodes each.
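A hypothetical PyTorch sketch of this 5-16-16-100 architecture (the output size of one weight per cell is inferred from the video title below; all names are illustrative):

import torch
import torch.nn as nn

class WeightNet(nn.Module):
    def __init__(self, n_cells=100):
        super().__init__()
        self.hidden = nn.Sequential(
            nn.Linear(5, 16), nn.ReLU(),
            nn.Linear(16, 16), nn.ReLU(),
            nn.Linear(16, n_cells),
        )
    def forward(self, cell_data):
        # 'pooling layer with 5 outputs': average (eps, b_1, b_2, c, f) over all cells
        x = cell_data.mean(dim=0)
        return self.hidden(x)

model = WeightNet()
cell_data = torch.rand(100, 5)  # dummy tabulated problem data, one row per cell
weights = model(cell_data)      # one SUPG weight per cell
# regularization as described above: MSE against target 0 for each weight
reg_loss = nn.functional.mse_loss(weights, torch.zeros_like(weights))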
In this iteration, the neural network is able to handle the gap in the initial solution. It is noteworthy, however, that this gap was caused by initializing the network with random weights, without pre- or post-processing.
Repeating the code from Example 2 that was used to create this video yields noticeably different results each run.
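If reproducible runs are desired, fixing the random seeds before initializing the network should remove this variability (a standard remedy; the example does not currently do this):

import numpy as np
import torch
torch.manual_seed(0)  # fixes the random weight initialization
np.random.seed(0)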
Video demo: 5-16-16-100 NN vs SGD
4. Arranging DOFs in 2 dimensions
In order to reshape the one-dimensional DOF array into a two-dimensional array, the points need to be arranged by their geometric position. As seen here, the grid on the unit_square is arranged in a diagonal pattern.
from supg.sp_problems import dat1
from supg import supg
import numpy as np
sd = supg.data(*dat1)
# DOF coordinates of the weight space; keep only the x- and y-columns
pts = sd.Yh.tabulate_dof_coordinates()[:, :2]
# sort by descending y first, then ascending x, to get a row-wise grid order
order = np.lexsort((pts[:, 0], -pts[:, 1]))
print(pts[order].reshape(10, -1, 2))  # coordinates arranged as a 2D grid
print(order.reshape(-1, 10))          # the corresponding DOF indices
This can also be seen by plotting a simple DG0 function.
from supg.sp_problems import domain, stg
import dolfinx.fem as fem
import numpy as np
import pyvista as pv
# DG0 space: one DOF per cell
Ch = fem.functionspace(domain, ('DG', 0))
cell_scalar_fn = fem.Function(Ch)
# assign each cell its own DOF index to visualize the cell ordering
cell_scalar_fn.x.array[:] = np.arange(len(cell_scalar_fn.x.array))
cell_scalar_fn.x.scatter_forward()
# attach the cell data to the plotting grid and show it
stg.add_data(cell_scalar_fn, point=False)
p = pv.Plotter()
p.add_mesh(stg.grid, show_edges=True)
p.camera_position = 'xy'
p.show()