Collision
This program addresses the computation of 2-2, 2-3, and 3-2 collision terms in different systems:
where the 2-2 collision term is:
2-3 collision term is:
3-2 collision term is:
To understand the background and principles of the application of the program you can refer to the following article
Towards a full solution of the relativistic Boltzmann equation for quark-gluon matter on GPUs
Advantages of the program
- Solving high-dimensional integrals by Monte Carlo methods;
- Conservation of particle number by “symmetric sampling” method;
- Reducing computation time with CUDA_Atomic calculations;
- Parallelizing programs with CUDA.
Structure of the program
Since the program structures in the three collision processes are basically the same, only the 2-2 collision process is described in detail, while the differences in the 2-3, 3-2 collision program are labeled.
- Init
- collision_type_for_all_species()
- Main
- Collision_term()
- collision_term_kernel()
- collision_term_at_specific_point()
- accumulate_collision_term()
- Collision_term()
- other_calculation
- Amplitude_square()
- H_fun()
- C1_fun()
- C2_fun()
- threeD_to_oneD()
The running logic of the whole program is shown below:
collision_term_kernel()
This function performs the parallelization of the program and contains two main functions:
1、Corresponding the block indexes of the parallel computation to the momentum space and the position space;
2、Computing the momentum and energy of the particles.
- The program corresponds the vector, positional indexes to the ‘thread’, and the 6 dimensions become 1 dimension;
i_grid = cuda.grid(1)
if i_grid < total_grid:
# convert one-d index into six-d
ipz = i_grid%npz
ipz_rest = i_grid//npz
ipy = ipz_rest%npy
ipy_rest = ipz_rest//npy
ipx = ipy_rest%npx
ipx_rest = ipy_rest//npx
iz = ipx_rest%nz
iz_rest = ipx_rest//nz
iy = iz_rest%ny
iy_rest = iz_rest//ny
ix = iy_rest%nx
- Calculate the momentum and energy of different particles:
for p_type in range(num_of_particle_types):
px = ((ipx+0.5)*dpx[p_type] - half_px[p_type])/(npx**i_level)
py = ((ipy+0.5)*dpy[p_type] - half_py[p_type])/(npy**i_level)
pz = ((ipz+0.5)*dpz[p_type] - half_pz[p_type])/(npz**i_level)
mp_squared = masses[p_type]**2
p0 = math.sqrt(mp_squared*c**2+px**2+py**2+pz**2)
collision_term_at_specific_point()
This function consists of two main functions:
1、Calculate int_vloume;
2、Calculate k1z1, k1z2 using Monte Carlo method;
-
Calculate int_volume:
2-2 Collision:
if collision_type[i_collision] < 1000:
flavor0 = flavor[i_collision,0]
flavor1 = flavor[i_collision,1]
flavor2 = flavor[i_collision,2]
int_volume = 2**5*half_px[flavor0]*half_py[flavor0]*half_px[flavor2]*half_py[flavor2]\
*half_pz[flavor2]/(num_samples*4)
m1 = masses[flavor0]
m2 = masses[flavor1]
m3 = masses[flavor2]
2-3 Collision:
if collision_type[i_collision] < 1000:
int_volume = 2**8*half_px[flavor[i_collision,0]]*half_py[flavor[i_collision,0]]\
*half_px[flavor[i_collision,2]]*half_py[flavor[i_collision,2]]*half_pz[flavor[i_collision,2]]\
*half_px[flavor[i_collision,3]]*half_py[flavor[i_collision,3]]*half_pz[flavor[i_collision,3]]\
/(num_samples*5)
3-2 Collision:
if collision_type[i_collision] < 1000:
int_volume = 2**8*half_px[flavor[i_collision,3]]*half_py[flavor[i_collision,3]]\
*half_px[flavor[i_collision,2]]*half_py[flavor[i_collision,2]]*half_pz[flavor[i_collision,2]]\
*half_px[flavor[i_collision,0]]*half_py[flavor[i_collision,0]]*half_pz[flavor[i_collision,0]]\
/(num_samples*5)
-
Calculation of
k1z1
,k1z2
using Monte Carlo for the purpose of calculatingJa
:2-2 Collision:
for i_sample in range(num_samples):
k1x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor0]
k1y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor0]
k3x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor2]
k3y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor2]
k3z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor2]
k30 = math.sqrt(m3**2*c**2+k3x**2+k3y**2+k3z**2)
C1 = C1_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,p0,px,py,pz,c)
C2 = C2_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,p0,px,py,pz,c)
H = H_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,p0,px,py,pz,c)
if H>1e-13 and abs(C2)>1e-13:
k1z1, k1z2 = (C1+math.sqrt(H))/C2, (C1-math.sqrt(H))/C2
**2-3 Collision: **
for i_sample in range(num_samples):
# masses of the particles
m1 = masses[flavor[i_collision,0]]
m2 = masses[flavor[i_collision,1]]
m3 = masses[flavor[i_collision,2]]
m4 = masses[flavor[i_collision,3]]
mp = masses[flavor[i_collision,4]]
# randomly generate k1x, k1y, k3x, k3y, k3z
# the momentum values need to be sampled according to the specific particle types
k1x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,0]]
k1y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,0]]
k3x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,2]]
k3y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,2]]
k3z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,2]]
k4x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,3]]
k4y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,3]]
k4z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,3]]
k30 = math.sqrt(m3**2*c**2+k3x**2+k3y**2+k3z**2)
k40 = math.sqrt(m4**2*c**2+k4x**2+k4y**2+k4z**2)
# solving for k1z via energy conservation
# k1z has two values
C1 = C1_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,m4,k40,k4x,k4y,k4z,p0,px,py,pz,c)
C2 = C2_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,m4,k40,k4x,k4y,k4z,p0,px,py,pz,c)
H = H_fun(m1,k1x,k1y,m2,k30,k3x,k3y,k3z,m4,k40,k4x,k4y,k4z,p0,px,py,pz,c)
# take the value if H>1e-13 and abs(C2)>1e-13
if H>1e-13 and abs(C2)>1e-13:
k1z1, k1z2 = (C1+math.sqrt(H))/C2, (C1-math.sqrt(H))/C2
**3-2 Collision: **
for i_sample in range(num_samples):
# masses of the particles
m1 = masses[flavor[i_collision,0]]
m2 = masses[flavor[i_collision,1]]
m3 = masses[flavor[i_collision,2]]
m4 = masses[flavor[i_collision,3]]
mp = masses[flavor[i_collision,4]]
# randomly generate k4x, k4y, k3x, k3y, k3z and k1x, k1y, k1z
# the momentum values need to be sampled according to the specific particle types
k1x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,0]]
k1y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,0]]
k1z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,0]]
k3x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,2]]
k3y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,2]]
k3z = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_pz[flavor[i_collision,2]]
k4x = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_px[flavor[i_collision,3]]
k4y = (xoroshiro128p_uniform_float64(rng_states, i_grid)*2-1)*half_py[flavor[i_collision,3]]
k30 = math.sqrt(m3**2*c**2+k3x**2+k3y**2+k3z**2)
k10 = math.sqrt(m1**2*c**2+k1x**2+k1y**2+k1z**2)
# solving for k1z via energy conservation
# k1z has two values
C1 = C1_fun(k10,k1x,k1y,k1z,m2,k30,k3x,k3y,k3z,m4,k4x,k4y,p0,px,py,pz,c)
C2 = C2_fun(k10,k1x,k1y,k1z,m2,k30,k3x,k3y,k3z,m4,k4x,k4y,p0,px,py,pz,c)
H = H_fun(k10,k1x,k1y,k1z,m2,k30,k3x,k3y,k3z,m4,k4x,k4y,p0,px,py,pz,c)
# take the value if H>1e-13 and abs(C2)>1e-13
if H>1e-13 and abs(C2)>1e-13:
k4z1, k4z2 = (C1+math.sqrt(H))/C2, (C1-math.sqrt(H))/C2
accumulate_collision_term()
This function mainly solves for the final parameters and contains four functions:
1、Calculation Ja、momentum_product;
2、Calculation f0、f1、f2、fp;
3、Calculation amplitude_square;
4、Calculation collision term.
- Calculation Ja and momentum_product
Here, we choose to express k2
as k3+p-k1
, which leads to a more stable integration. All computed values of k2
undergo a validation process to ensure they fall within the phase space defined by the particle. The specific calculation formula is depicted in the diagram below:
**2-2 Collision: **
Ja
momentum_product
if k1z > -half_pz[flavor0] and k1z < half_pz[flavor0]:
# obtain k2x, k2y, k2z via momentum conservation
k2x = k3x + px - k1x
k2y = k3y + py - k1y
k2z = k3z + pz - k1z
# k2x, k2y, k2z must also be in the momentum box range of particle 1
if (k2x > -half_px[flavor1] and k2x < half_px[flavor1] and
k2y > -half_py[flavor1] and k2y < half_py[flavor1] and
k2z > -half_pz[flavor1] and k2z < half_pz[flavor1]):
# energy for particle 1 and 2
k10, k20 = math.sqrt(m1_squared*c**2+k1x**2+k1y**2+k1z**2), math.sqrt(m2_squared*c**2+k2x**2+k2y**2+k2z**2)
# energy must be conserved
if abs(k10+k20-k30-p0) < 10**(-18):
# jacobin term
Ja = abs(k1z/k10 - (-k1z+k3z+pz)/k20)
# momentum product
momentum_product = k10*k20*k30*p0
**2-3 Collision: **
Ja
momentum_product
if k1z > -half_pz[flavor[i_collision,0]] and k1z < half_pz[flavor[i_collision,0]]:
# obtain k2x, k2y, k2z via momentum conservation
k2x = k3x + k4x + px - k1x
k2y = k3y + k4y + py - k1y
k2z = k3z + k4z + pz - k1z
# k2x, k2y, k2z must also be in the momentum box range of particle 1
if (k2x > -half_px[flavor[i_collision,1]] and k2x < half_px[flavor[i_collision,1]] and
k2y > -half_py[flavor[i_collision,1]] and k2y < half_py[flavor[i_collision,1]] and
k2z > -half_pz[flavor[i_collision,1]] and k2z < half_pz[flavor[i_collision,1]]):
# energy for particle 1 and 2
k10, k20 = math.sqrt(m1_squared*c**2+k1x**2+k1y**2+k1z**2), math.sqrt(m2_squared*c**2+k2x**2+k2y**2+k2z**2)
# energy must be conserved
if abs(k10+k20-k30-k40-p0) < 10**(-18):
# jacobin term
Ja = abs(k1z/k10 - (-k1z+k3z+k4z+pz)/k20)
# momentum product
momentum_product = k10*k20*k30*k40*p0
**3-2 Collision: **
Ja
momentum_product
if k4z > -half_pz[flavor[i_collision,3]] and k4z < half_pz[flavor[i_collision,3]]:
# obtain k2x, k2y, k2z via momentum conservation
k2x = k4x + px - k1x - k3x
k2y = k4y + py - k1y - k3y
k2z = k4z + pz - k1z - k3z
# k2x, k2y, k2z must also be in the momentum box range of particle 1
if (k2x > -half_px[flavor[i_collision,1]] and k2x < half_px[flavor[i_collision,1]] and
k2y > -half_py[flavor[i_collision,1]] and k2y < half_py[flavor[i_collision,1]] and
k2z > -half_pz[flavor[i_collision,1]] and k2z < half_pz[flavor[i_collision,1]]):
# energy for particle 4 and 2
k40, k20 = math.sqrt(m4_squared*c**2+k4x**2+k4y**2+k4z**2), math.sqrt(m2_squared*c**2+k2x**2+k2y**2+k2z**2)
# energy must be conserved
if abs(k10+k20+k30-k40-p0) < 10**(-18):
# jacobin term
Ja = abs(-k4z/k40 + (-k1z-k3z+k4z+pz)/k20)
# momentum product
momentum_product = k10*k20*k30*k40*p0
- Calculate f and F
The first step is to find the “location” of the particle according to the index.
if Ja>1e-5 and momentum_product > 10**(-19):
# for particle 0
ik1x, ik1y, ik1z = int((k1x + half_px[flavor0])//dpx[flavor0]), \
int((k1y + half_py[flavor0])//dpy[flavor0]), \
int((k1z + half_pz[flavor0])//dpz[flavor0])
# convert grid index in one dimension
i_phase_grid0 = threeD_to_oneD(ix,iy,iz,ik1x,ik1y,ik1z, \
nx, ny, nz, npx, npy, npz)
# for particle 1
ik2x, ik2y, ik2z = int((k2x + half_px[flavor1])//dpx[flavor1]), \
int((k2y + half_py[flavor1])//dpy[flavor1]), \
int((k2z + half_pz[flavor1])//dpz[flavor1])
# convert grid index in one dimension
i_phase_grid1 = threeD_to_oneD(ix,iy,iz,ik2x,ik2y,ik2z, \
nx, ny, nz, npx, npy, npz)
# for particle 2
ik3x, ik3y, ik3z = int((k3x + half_px[flavor2])//dpx[flavor2]), \
int((k3y + half_py[flavor2])//dpy[flavor2]), \
int((k3z + half_pz[flavor2])//dpz[flavor2])
# convert grid index in one dimension
i_phase_grid2 = threeD_to_oneD(ix,iy,iz,ik3x,ik3y,ik3z, \
nx, ny, nz, npx, npy, npz)
Secondly, according to the index to find the corresponding particles of f and F and the difference
**2-2 Collision: **
f = cuda.local.array(shape=4, dtype=float64)
f[0] = f_x_p_t0level[flavor0, i_phase_grid0]
f[1] = f_x_p_t0level[flavor1, i_phase_grid1]
f[2] = f_x_p_t0level[flavor2, i_phase_grid2]
f[3] = f_x_p_tilevel[p_type, i_grid]
# feed values of distribution function and its quantum correction
tf = cuda.local.array(shape=4, dtype=float64)
# particle_type: 0, 1, 2 for classical, fermi and Bosonic
for i_particle in range(4):
i_flavor = flavor[i_collision,i_particle]
if particle_type[i_flavor] == 1:
tf[i_particle] = 1 - (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
elif particle_type[i_flavor] == 2:
tf[i_particle] = 1 + (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
else:
tf[i_particle] = 1
distribution_terms = f[0]*f[1]*tf[2]*tf[3]*amplitude_square/degeneracy[flavor[i_collision,0]]/degeneracy[flavor[i_collision,1]] - tf[0]*tf[1]*f[2]*f[3]*amplitude_square/degeneracy[flavor[i_collision,2]]/degeneracy[flavor[i_collision,3]]
**2-3 Collision: **
f = cuda.local.array(shape=5, dtype=float64)
f[0] = f_x_p_t0level[flavor[i_collision,0], i_phase_grid0]
f[1] = f_x_p_t0level[flavor[i_collision,1], i_phase_grid1]
f[2] = f_x_p_t0level[flavor[i_collision,2], i_phase_grid2]
f[3] = f_x_p_t0level[flavor[i_collision,3], i_phase_grid3]
f[4] = f_x_p_tilevel[p_type, i_grid]
# feed values of distribution function and its quantum correction
tf = cuda.local.array(shape=5, dtype=float64)
# particle_type: 0, 1, 2 for classical, fermi and Bosonic
for i_particle in range(5):
i_flavor = flavor[i_collision,i_particle]
if particle_type[i_flavor] == 1:
tf[i_particle] = 1 - (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
elif particle_type[i_flavor] == 2:
tf[i_particle] = 1 + (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
else:
tf[i_particle] = 1
distribution_terms = (1/(2*math.pi*hbar)**3)*f[0]*f[1]*tf[2]*tf[3]*tf[4]*amplitude_square/degeneracy[flavor[i_collision,0]]/degeneracy[flavor[i_collision,1]] - tf[0]*tf[1]*f[2]*f[3]*f[4]*amplitude_square/degeneracy[flavor[i_collision,2]]/degeneracy[flavor[i_collision,3]]/degeneracy[flavor[i_collision,4]]
**3-2 Collision: **
f = cuda.local.array(shape=5, dtype=float64)
f[0] = f_x_p_t0level[flavor[i_collision,0], i_phase_grid0]
f[1] = f_x_p_t0level[flavor[i_collision,1], i_phase_grid1]
f[2] = f_x_p_t0level[flavor[i_collision,2], i_phase_grid2]
f[3] = f_x_p_t0level[flavor[i_collision,3], i_phase_grid3]
f[4] = f_x_p_tilevel[p_type, i_grid]
# feed values of distribution function and its quantum correction
tf = cuda.local.array(shape=5, dtype=float64)
# particle_type: 0, 1, 2 for classical, fermi and Bosonic
for i_particle in range(5):
i_flavor = flavor[i_collision,i_particle]
if particle_type[i_flavor] == 1:
tf[i_particle] = 1 - (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
elif particle_type[i_flavor] == 2:
tf[i_particle] = 1 + (2*math.pi*hbar)**3*f[i_particle]/degeneracy[i_flavor]
else:
tf[i_particle] = 1
distribution_terms = f[0]*f[1]*f[2]*tf[3]*tf[4]*amplitude_square/degeneracy[flavor[i_collision,0]]/degeneracy[flavor[i_collision,1]]/degeneracy[flavor[i_collision,2]] - (1/(2*math.pi*hbar)**3)*tf[0]*tf[1]*tf[2]*f[3]*f[4]*amplitude_square/degeneracy[flavor[i_collision,3]]/degeneracy[flavor[i_collision,4]]
- Calculate amplitude_square;
For a specific and detailed description, please refer to amplitude_square, where the function is thoroughly explained.
- Calculate collision term
**2-2 Collision: **
The “symmetric sampling” method is used to ensure the conservation of particle number, the details of which can be found in the paper published by Prof. Jun-Jie Zhang. Towards a full solution of the relativistic Boltzmann equation for quark-gluon matter on GPUs
if flavor0==flavor1:
symmetry_factor = 0.5
else:
symmetry_factor = 1.0
# accumulate collision kernel
# some factors are compensated later
result = distribution_terms*symmetry_factor\
*hbar**2*c/Ja*int_volume/(momentum_product*64*math.pi**2)
cuda.atomic.add(collision_term0level, \
(flavor0, i_phase_grid0), -result/dp[flavor0]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor1, i_phase_grid1), -result/dp[flavor1]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor2, i_phase_grid2), result/dp[flavor2]*dp[p_type])
cuda.atomic.add(collision_termilevel, (p_type, i_grid), result)
**2-3 Collision: **
if flavor[i_collision,0]==flavor[i_collision,1]:
symmetry_factor = 0.5
else:
symmetry_factor = 1.0
# accumulate collision kernel
# some factors are compensated later
result = distribution_terms*symmetry_factor*\
hbar**3*c*int_volume/(momentum_product*128*math.pi**2)/Ja
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,0], i_phase_grid0), -result/dp[flavor[i_collision,0]]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,1], i_phase_grid1), -result/dp[flavor[i_collision,1]]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,2], i_phase_grid2), result/dp[flavor[i_collision,2]]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,3], i_phase_grid3), result/dp[flavor[i_collision,3]]*dp[p_type])
cuda.atomic.add(collision_termilevel, (p_type, i_grid), result)
**3-2 Collision: **
if flavor[i_collision,0]==flavor[i_collision,1]==flavor[i_collision,2]:
symmetry_factor = 1/3
elif flavor[i_collision,0]==flavor[i_collision,1] or \
flavor[i_collision,1]==flavor[i_collision,2] or \
flavor[i_collision,0]==flavor[i_collision,2]:
symmetry_factor = 0.5
else:
symmetry_factor = 1.0
# accumulate collision kernel
# some factors are compensated later
result = distribution_terms*symmetry_factor*\
hbar**3*c*int_volume/(momentum_product*128*math.pi**2)/Ja
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,0], i_phase_grid0), -result/dp[flavor[i_collision,0]]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,1], i_phase_grid1), -result/dp[flavor[i_collision,1]]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,2], i_phase_grid2), -result/dp[flavor[i_collision,2]]*dp[p_type])
cuda.atomic.add(collision_term0level, \
(flavor[i_collision,3], i_phase_grid3), result/dp[flavor[i_collision,3]]*dp[p_type])
cuda.atomic.add(collision_termilevel, (p_type, i_grid), result)