parameter

fxm1
Posts: 9
Joined: Fri Jan 23, 2026 11:52 am

Post by fxm1 »

Dear Developers,
I am running Lumen 2.0.1 and noticed the following item in the new release notes: “Calculation of Xo response function in G-space: The implementation of the Xo response function has been re-written by taking advantage of lapack GEMM. The new implementation is fully GPU ported by using devxlib_xGEMM interface. Roughly a factor 2 speed-up is obtained and also a memory reduction of a factor 2 for the allocation of the G-space response function.”
I have also found what looks like a new option:
BZINDX_CPU= "" # [PARALLEL] CPUs for each role
BZINDX_ROLEs= "" # [PARALLEL] CPUs roles (k)
Do these parameters relate to the new Xo response function calculation? If not, how should we set the parameters related to the new feature?

Thanks!

Xiaomeng
Davide
Posts: 11
Joined: Mon Dec 29, 2025 8:50 am

Re: parameter

Post by Davide »

Dear Xiaomeng,
the two points are not related.

1 - The new implementation in Xo (thanks to Andrea Ferretti and co-workers) is used automatically, nothing to do on your side.

2 - X_{G,G'} can be further distributed in memory if you use the parallelization input variables for the response function with the "g" role, as below. Please note that the other roles duplicate X_{G,G'} in memory but are in general more efficient, especially "c", "v", and "k".

Code:

X_and_IO_ROLEs= "g.q.k.c.v"
X_and_IO_CPU="4.1.2.1.1"
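As a quick sanity check on a layout like the one above, here is a minimal Python sketch that pairs each role with its CPU count and multiplies the counts together. The helper names `parse_layout` and `total_tasks` are hypothetical and not part of Lumen. It also assumes that the product of the per-role counts is meant to match the number of MPI tasks of the run, so please check that against the documentation.

```python
from math import prod


def parse_layout(roles: str, cpus: str) -> dict[str, int]:
    """Pair dot-separated role names with dot-separated CPU counts.

    Hypothetical helper for illustration only, not part of Lumen.
    """
    role_list = roles.split(".")
    cpu_list = [int(n) for n in cpus.split(".")]
    if len(role_list) != len(cpu_list):
        raise ValueError("ROLEs and CPU strings must have the same number of fields")
    return dict(zip(role_list, cpu_list))


def total_tasks(layout: dict[str, int]) -> int:
    """Product of the per-role CPU counts.

    Assumption: this should equal the number of MPI tasks of the run.
    """
    return prod(layout.values())


layout = parse_layout("g.q.k.c.v", "4.1.2.1.1")
print(layout)               # role -> CPU count
print(total_tasks(layout))  # 4 * 1 * 2 * 1 * 1 = 8
```

With the values above, the "g" role gets 4 CPUs, so under this reading X_{G,G'} would be split over 4 tasks, while the 2 CPUs on "k" each hold a copy of their share.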
3 - Regardless of whether X_{G,G'} is memory distributed or not, the solution of its Dyson equation is usually performed with LAPACK, but it can also be performed with ScaLAPACK (this is completely independent of points 1 and 2).

Code:

X_nCPU_LinAlg_INV=4
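For context on point 3, here is a sketch of the equation being solved. It assumes the RPA form (no exchange-correlation kernel) and writes v for the Coulomb interaction. The symbol v and the explicit (q, ω) dependence are my notation, not taken from the input file or the release notes.

```latex
X_{G,G'}(q,\omega) = X^{0}_{G,G'}(q,\omega)
  + \sum_{G''} X^{0}_{G,G''}(q,\omega)\, v_{G''}(q)\, X_{G'',G'}(q,\omega)
\quad\Longrightarrow\quad
X = \left(1 - X^{0} v\right)^{-1} X^{0}
```

For each q and ω this is a dense matrix inversion in G-space, which is why a linear algebra library (serial or distributed) enters at this step.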
4 - The BZINDX variables, on the other hand, are not new. They refer to the memory distribution of the variable qindx_B, used in the bse runlevel. This variable is usually duplicated in memory. You can activate its distribution as follows (this might be needed if you have very dense k-meshes):

Code:

BZINDX_CPU= "8" # [PARALLEL] CPUs for each role
BZINDX_ROLEs= "k" # [PARALLEL] CPUs roles (k)
The implementation has recently been improved. See this point in the release notes:
"Fixed performance issue in qindx_B distribution in BSE (switched from HDF5 I/O to MPI WinCreate)"
Best,
D.
Davide Sangalli, PhD
Piazza Leonardo Da Vinci, 32, 20133 – Milano
CNR, Istituto di Struttura della Materia (ISM)
https://sites.google.com/view/davidesangalli