Test suite when VASP is compiled using Intel 2020.1.217
-
- Newbie
- Posts: 19
- Joined: Wed Nov 06, 2019 3:12 pm
Test suite when VASP is compiled using Intel 2020.1.217
I have found that the 2018 toolchain has numerical issues with instruction sets newer than AVX2. I am using Intel 2020.1.217 for this build. However, VASP built with this toolchain fails the tests in the VASP test suite that involve the Andersen thermostat. The failing tests are listed below.
andersen_nve_constrain_fixed
andersen_nve_constrain_fixed_MDALGO=11
andersen_nve_constrain_fixed_MDALGO=11_RPR
andersen_nve_constrain_fixed_RPR
andersen_nve_fixed
andersen_nve_fixed_MDALGO=11
andersen_nve_fixed_MDALGO=11_RPR
andersen_nve_fixed_RPR
andersen_nvt_fixed
andersen_nvt_fixed_MDALGO=11
andersen_nvt_fixed_MDALGO=11_RPR
andersen_nvt_fixed_RPR
John Low
Argonne National Laboratory
-
- Global Moderator
- Posts: 492
- Joined: Mon Nov 04, 2019 12:41 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
This post was originally made on a different thread:
forum/viewtopic.php?f=4&t=17952
This is a different question so I made a new thread.
Could you please post the log file with the exact error that you got in this run?
-
- Newbie
- Posts: 10
- Joined: Wed Nov 20, 2019 10:24 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
I have also come across this problem in vasp.6.2.0 (and vasp.6.1.2). It is due to the use of -xHost in the standard makefiles.
The attached two outputs were run on a large Intel Skylake cluster at the Australian National University.
They used Intel compilers and libraries 2020.0.166, which is one of the 'verified' toolchains.
The standard makefile.include template, makefile.include.linux_intel_omp, was used.
The output labelled 'fail' was from a program built with -xHost, as in the supplied files.
The output labelled 'correct' was built without -xHost.
The example job is andersen_nve_constrain_fixed from the test suite, but all jobs with andersen+fixed fail the same way with -xHost and work without it.
If you want something more specific, they fail if AVX512 instructions are requested.
-
- Global Moderator
- Posts: 12
- Joined: Wed Nov 06, 2019 8:44 am
Re: Test suite when VASP is compiled using Intel 2020.1.217
Hi John, Hi Roger,
I can confirm that this is an issue that is triggered by requesting AVX512 instructions (-xCORE-AVX512 or -xHost on an applicable host), and disappears when limiting things to AVX2.
We have not seen this before because only recently we acquired an AVX512 capable machine (a Cascade Lake Xeon).
I reproduced this with the Intel 19.1.2.254 compilers (which means some 2020 version of Parallel Studio --- confusing).
I have not checked whether it is solved in the new oneAPI distros, but will try to do so ASAP.
Just in case you are interested: I traced the problem to a few completely innocuous lines:
Code: Select all
diff --git a/src/mymath.F b/src/mymath.F
index f46d930b..919fda62 100644
--- a/src/mymath.F
+++ b/src/mymath.F
@@ -1449,9 +1449,7 @@
       Ltxyz=1
       DO i=1,T_INFO%NIONS
         DO j=1,3
-          IF (.NOT. T_INFO%LSFOR(j,i)) THEN
-            Ltxyz(j)=0
-          ENDIF
+          IF (.NOT. T_INFO%LSFOR(j,i) .AND. Ltxyz(j)==1) Ltxyz(j)=0
         ENDDO
       ENDDO
The change shown above solves the problem .. go figure :-)
I will incorporate this change in the upcoming release, just in case it is still a problem with the current Intel oneAPI compilers & tools.
Cheers!
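The two forms of the loop in the patch to src/mymath.F quoted in this post appear to be logically equivalent: the extra `Ltxyz(j)==1` test only skips a store that would write 0 over a value that is already 0. The following is a minimal Python sketch of that claim, not VASP code. It assumes `LSFOR` is a table of three logical flags per ion, and the function names are made up for the illustration. It compares both forms over every flag table for one to three ions.

```python
from itertools import product

def ltxyz_original(lsfor):
    """Loop before the patch: clear flag j whenever an ion has LSFOR(j, i) false."""
    ltxyz = [1, 1, 1]
    for ion_flags in lsfor:          # one tuple of three logical flags per ion
        for j in range(3):
            if not ion_flags[j]:
                ltxyz[j] = 0
    return ltxyz

def ltxyz_patched(lsfor):
    """Loop after the patch: same test, but only store if the flag is still 1."""
    ltxyz = [1, 1, 1]
    for ion_flags in lsfor:
        for j in range(3):
            if not ion_flags[j] and ltxyz[j] == 1:
                ltxyz[j] = 0
    return ltxyz

def all_flag_tables(n_ions):
    """Every possible table of logical flags for n_ions ions (hypothetical helper)."""
    per_ion = list(product([False, True], repeat=3))
    return product(per_ion, repeat=n_ions)

# Exhaustively compare the two forms for 1, 2 and 3 ions.
mismatches = [
    table
    for n_ions in range(1, 4)
    for table in all_flag_tables(n_ions)
    if ltxyz_original(table) != ltxyz_patched(table)
]
print("tables where the two loops disagree:", len(mismatches))
```

If the two forms agree on every input, as this check suggests, the different test results with AVX512 would come from the machine code the compiler generates for the loop rather than from a change in program logic. That reading is consistent with the lines being described as innocuous, but it is an inference, not something the thread states.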
-
- Newbie
- Posts: 19
- Joined: Wed Nov 06, 2019 3:12 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
Martin and Roger,
I have added Martin's patch to VASP6.2.0 and tested it on the Skylake clusters at eagle.nrel.gov. This build passed all the tests in the VASP testsuite.
"Intel(R) MPI Library for Linux* OS, Version 2019 Update 7 Build 20200312 (id: 5dc2dd3e9)" and "ifort version 19.1.1.217" were used in this build.
I used the attached makefile.include which includes the compiler flag "-fp-model precise" to avoid other numerical errors.
Thank you Martin for the patch!
John
-
- Newbie
- Posts: 19
- Joined: Wed Nov 06, 2019 3:12 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
The patch for AVX-512 helped with vasp built for KNLs (MIC-AVX512), but there are still a few issues with SCAN on the KNLs.
I have attached the makefile.include I used and the results from the testsuite on a KNL.
If anyone is interested in helping me with this issue but does not have access to KNLs, I might be able to arrange access to KNLs at lcrc.anl.gov.
John J. Low
Argonne National Laboratory.
-
- Global Moderator
- Posts: 492
- Joined: Mon Nov 04, 2019 12:41 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
Sorry for my delayed answer.
Thanks for your report.
We are currently looking into this issue and will let you know as soon as we've found something.
-
- Global Moderator
- Posts: 492
- Joined: Mon Nov 04, 2019 12:41 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
Ok, we (@mmarsman and I) looked more carefully at your makefile.include
The reason for the discrepancy is probably the flag "-DnoAugXCmeta" and not the MIC-AVX512 architecture.
This "-DnoAugXCmeta" tag falls in the category of "Deprecated/Not-recommended":
wiki/index.php/Precompiler_flags#Deprec ... ecommended
We added some further explanation in the wiki for the reason it should not be used.
"This option was added to compute the metaGGA contributions from the non-augmented pseudo density (instead of the augmented density). There is a condition concerning the behavior of the von-Weizsäcker kinetic energy density (second derivative of the charge density) and the kinetic energy density computed from the orbitals ingrained into TPSS and revTPSS. This condition can be strongly violated when one augments the charge density. For TPSS and revTPSS, the functionals can become unstable in those cases. SCAN and its derivatives (RSCAN, R2SCAN, etc.) do not assume the aforementioned condition to be met and remain stable for the augmented density as well, so this option should not be used, as it will negatively affect the final results."
Could you try recompiling the code without this flag and let us know if the test suite passes?
-
- Newbie
- Posts: 19
- Joined: Wed Nov 06, 2019 3:12 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
Henrique,
Thanks for the tip on the -DnoAugXCmeta flag. I have built vasp 6.2.1 with that flag removed from my makefile.include file and it passes all the tests in the vasp test suite!
I have attached a tar archive with my makefile.include, the testing scripts and results from the test suite.
Note that one of the tests (bulk_BN_SCAN+rVV10) failed when run with 8 MPI processes and 8 OMP threads per MPI process, but passed when run with 4 MPI processes and two OMP threads each.
Sorry for the delay in following up on this!
Thanks for the help!
John Low
Argonne National Laboratory
-
- Full Member
- Posts: 189
- Joined: Tue Oct 13, 2020 11:32 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
john_low1 wrote: Wed Aug 11, 2021 3:43 pm
Henrique,
Thanks for the tip on the -DnoAugXCmeta flag. I have built vasp 6.2.1 with that flag removed from my makefile.include file and it passes all the tests in the vasp test suite!
I have attached a tar archive with my makefile.include, the testing scripts and results from the test suite.
Note that one of the tests (bulk_BN_SCAN+rVV10) failed when run with 8 MPI processes with 8 OMP threads for each MPI process. But did pass when run with 4 MPI processes with two OMP threads each.
I have built vasp 6.3.0 with the latest Intel oneAPI Base and HPC toolkits. In my validation, the bulk_BN_SCAN+rVV10 test runs successfully both with 8 MPI processes and 8 OMP threads per MPI process and with 4 MPI processes and 2 OMP threads each. I have attached a tar archive which includes my makefile.include, the testing scripts and results from the test suite.
Best regards,
Hongsheng Zhao
-
- Full Member
- Posts: 189
- Joined: Tue Oct 13, 2020 11:32 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
I tried the bulk_BN_SCAN+rVV10 example with different combinations of nranks and nthrds and found that the run times can differ considerably. With nranks=16 nthrds=16 the test was very time-consuming, and I terminated it before it finished. The timings for these tests are summarized below:
Code: Select all
nranks=4 nthrds=2
real 0m13.734s
user 1m21.187s
sys 0m4.244s
nranks=8 nthrds=8
real 0m12.930s
user 8m31.540s
sys 0m30.591s
nranks=16 nthrds=16
^C
So a natural question is: what combination of nranks and nthrds is optimal for a specific computational task? Is there a rule of thumb?
Regards,
HZ
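One rough way to read the timings in this post is to compare the CPU time consumed (user + sys) with the wall-clock time (real) and with the number of cores requested (nranks × nthrds). The Python sketch below does that arithmetic for the two completed runs. The helper names are made up for this illustration. It also assumes one core per thread, and that user + sys measures CPU time consumed, which can include threads spinning while they wait and so is not the same as useful work.

```python
def parse_time(stamp):
    """Convert a time stamp such as '1m21.187s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return 60 * int(minutes) + float(seconds)

# Timings copied from the post; the nranks=16 nthrds=16 run was aborted.
runs = [
    {"nranks": 4, "nthrds": 2, "real": "0m13.734s", "user": "1m21.187s", "sys": "0m4.244s"},
    {"nranks": 8, "nthrds": 8, "real": "0m12.930s", "user": "8m31.540s", "sys": "0m30.591s"},
]

def summarize(run):
    """Return (cores requested, CPU seconds, average busy cores, busy fraction)."""
    cores = run["nranks"] * run["nthrds"]      # assumes one core per thread
    cpu = parse_time(run["user"]) + parse_time(run["sys"])
    busy = cpu / parse_time(run["real"])       # average number of cores kept busy
    return cores, cpu, busy, busy / cores

for run in runs:
    cores, cpu, busy, fraction = summarize(run)
    print(f"nranks={run['nranks']} nthrds={run['nthrds']}: "
          f"{cores} cores, {cpu:.1f} CPU s, {busy:.1f} busy cores ({fraction:.0%})")
```

On these numbers the 8 × 8 run finishes only about 6% sooner in wall-clock time than the 4 × 2 run while consuming more than six times the CPU time. For this small test the extra ranks and threads therefore buy very little. A single test case is not enough to derive a general rule from.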
-
- Full Member
- Posts: 189
- Joined: Tue Oct 13, 2020 11:32 pm
Re: Test suite when VASP is compiled using Intel 2020.1.217
"So a natural question is: what combination of nranks and nthrds is optimal for a specific computational task? Is there a rule of thumb?"
I've tried to discuss this question here and got some useful advice. Interested users can refer to that discussion for some relevant clues.
Yours,
HZ