[WIP] Fortran bindings
Created Fortran Bindings for quicksched master, including a few extra functions required to avoid declaring the qsched structure inside fortran (because of pthread difficulty).
Tested with the OpenMP version (I don't think I have a pthread setup to test with).
Makefile.am in the fortran_examples folder is ignored.
The hope is to test DL_POLY with this at some point soon, but it should be generally easy(ish) to use in FORTRAN codes.
Do we want any other examples? I'm loath to implement anything particularly complex, but currently I have not tested: qsched_adduse, qsched_addlock, qsched_free, qsched_reset, qsched_ensure, qsched_res_own.
New functions: f_qsched_create and f_qsched_destroy, which control the creation of the quicksched object.
Will quickly fix the memory leak (no call to qsched_free) in the super simple example.
Reassigned to @nnrw56
Added 1 commit:
- 99a739bf - Added the missing qsched_free call to the fortran test
Added 1 commit:
- b16508de - fixed copyright stuff
Added 1 commit:
- ea3779e7 - getting thread ID is now just a simple call that doesn't require the user to pas…
Added 1 commit:
- ee7273b6 - Fixed a bug with pthread IDs and made the scheduler threadsafe using our gnu99 n…
Added 1 commit:
- 69434db3 - Added various qsched_..._none values to fortran bindings
    lock_init( &s->lock );

    }

    struct qsched * f_qsched_create()
    {
        struct qsched *s;
        s = (struct qsched *) malloc(sizeof(struct qsched));
        return s;
    }
    void f_qsched_destroy( struct qsched *s)
    {
        free(s);
    }

I originally kept these separate from qsched_init and qsched_free to mirror C more closely, where you can do

    struct qsched s; qsched_init(&s, ...); ... qsched_free(&s); qsched_init(&s, ...); ... qsched_free(&s);

If f_qsched_destroy also called qsched_free, I felt it was more likely to lead to crashes where people call qsched_free and then f_qsched_destroy, but I'm not sure which is better.
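To make the trade-off above concrete, here is a minimal sketch of the split lifecycle (allocate the handle, then init/free as often as needed, then release the handle). Everything named `demo_*` is a hypothetical stand-in and not part of quicksched; the real `struct qsched`, `qsched_init` and `qsched_free` are not used, and `calloc` replaces the `malloc` in the diff only so that the stand-in's `initialised` flag starts at zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct qsched, to keep the sketch self-contained. */
struct demo_sched {
    int *tasks;       /* owned by demo_init/demo_free, not by create/destroy */
    int  size;
    int  initialised;
};

/* Plays the role of f_qsched_create: allocate the handle only. */
struct demo_sched *demo_create(void) {
    return calloc(1, sizeof(struct demo_sched));
}

/* Stand-in for qsched_init: allocate the internal buffers. */
int demo_init(struct demo_sched *s, int size) {
    assert(s != NULL && !s->initialised);
    s->tasks = malloc(sizeof(int) * size);
    if (s->tasks == NULL) return -1;
    s->size = size;
    s->initialised = 1;
    return 0;
}

/* Stand-in for qsched_free: release the internal buffers, keep the handle. */
void demo_free(struct demo_sched *s) {
    if (s == NULL || !s->initialised) return;
    free(s->tasks);
    s->tasks = NULL;
    s->size = 0;
    s->initialised = 0;
}

/* Plays the role of f_qsched_destroy: release the handle only. */
void demo_destroy(struct demo_sched *s) {
    free(s);
}
```

With this split, a caller can run `demo_init` and `demo_free` repeatedly on one handle and call `demo_destroy` once at the end. The alternative under discussion, where destroy also frees the internals, would need a guard like the `initialised` flag here to stay safe when the caller has already freed them.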
Added 1 commit:
- 39024352 - Missed the module file for including when using make install
@nnrw56 I remembered what I needed you to check, if possible: why the makefiles for the fortran_example folder aren't working.
Added 1 commit:
- 644009c6 - Attempt to fix up the Makefile.am to make building more reliable
I have not managed to get any performance from this yet. My attempt to use this as part of DL_POLY results in the task code (i.e. the functions called inside the runner function) performing X times slower than in serial, where X is (roughly) the number of threads active, and I have no reasonable explanation for the behaviour I see. Profilers just tell me "this function is slower", as does wrapping the function in omp_get_wtime() calls. The scheduler itself seems to behave sensibly (according to the quicksched timers, 24 threads result in no more than 3-4x longer spent in the scheduler), so I am completely stuck with tuning DL_POLY for this at the moment.
I'll try to come up with a testcase for the example folder which we can actually do a runtime test in and try to work out if there is something weird going on - maybe the bh test is fairly easy to do in FORTRAN but I'm not sure yet.
The only thing that I can imagine is that the FORTRAN code is pinning its main thread, and all its child threads, on a single core. Do you have a way of checking what physical core your threads are running on to check this?
If this is the case, there are two possible routes to solve it:
- Some flag to the FORTRAN compiler that lets it use all the threads?
- Setting the preferred CPU when creating the pthreads in QuickSched, much as we do in SWIFT?
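As a rough illustration of both the check and the second route, the sketch below uses the Linux/glibc affinity calls to count how many logical CPUs the calling thread may run on, report the CPU it is currently on, and pin it to a chosen CPU. It assumes Linux with glibc (`_GNU_SOURCE` must be defined before any include); the helper names are made up for this example and are not taken from QuickSched or SWIFT.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Number of logical CPUs the calling thread is allowed to run on, or -1. */
int allowed_cpu_count(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0) return -1;
    return CPU_COUNT(&set);
}

/* Lowest-numbered CPU in the calling thread's affinity mask, or -1. */
int first_allowed_cpu(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set)) return cpu;
    return -1;
}

/* Logical CPU the calling thread is executing on right now, or -1. */
int current_cpu(void) {
    return sched_getcpu();
}

/* Pin the calling thread to a single logical CPU; returns 0 on success. */
int pin_self_to_cpu(int cpu) {
    cpu_set_t set;
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

If every worker thread reports an `allowed_cpu_count()` of 1 and the same `current_cpu()`, the threads are sharing one core. Because these helpers act on `pthread_self()`, they can in principle be called from inside any thread on Linux, including worker threads created by an OpenMP runtime.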
I'm using the OpenMP version (the pthread version would be incorrect, as the code relies on !$omp threadprivate inside the runner function in FORTRAN), and in theory I'm using all the cores according to KMP_AFFINITY=verbose,scatter. I have also tried OMP_PROC_BIND. But it definitely looks like what you would expect an affinity issue to look like. The threads should be created once (which is inside the FORTRAN code), pinned to whichever core, but then should be fine to be used wherever.
A non-hierarchical example is probably easiest. QR could be possible, but switching everything from pointers to indices is annoying, and I want to be pointer-free to check performance (there are also some issues with FORTRAN/C interoperability and pointers if you aren't careful at the moment). I could do something stupid like a "block-based" particle method, where I create "cells" of particles (at random) and compute the N^2 interactions between particles.
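A minimal serial sketch of what such a block-based test could look like, written in C for brevity rather than FORTRAN. The cell layout, the 1D positions, the toy linear pair term and all function names are assumptions made for illustration. In a scheduler-driven version, each `interact_self` or `interact_pair` call would presumably become one task, with the cells it touches as its resources.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cell of particles; 1D positions keep the sketch short. */
struct cell {
    int count;
    double *x;   /* particle positions */
    double *a;   /* accumulated interaction per particle */
};

/* Small linear congruential generator so runs are reproducible. */
static double next_uniform(unsigned int *state) {
    *state = *state * 1664525u + 1013904223u;
    return (double)(*state >> 8) / 16777216.0;
}

/* Allocate a cell and scatter its particles at random positions. */
int cell_init(struct cell *c, int count, unsigned int seed) {
    c->count = count;
    c->x = malloc(sizeof(double) * count);
    c->a = calloc(count, sizeof(double));
    if (c->x == NULL || c->a == NULL) { free(c->x); free(c->a); return -1; }
    for (int i = 0; i < count; i++) c->x[i] = next_uniform(&seed);
    return 0;
}

void cell_clean(struct cell *c) { free(c->x); free(c->a); }

/* All pairs inside one cell (each pair visited once). */
void interact_self(struct cell *c) {
    for (int i = 0; i < c->count; i++)
        for (int j = i + 1; j < c->count; j++) {
            double f = c->x[j] - c->x[i];   /* toy pair term */
            c->a[i] += f;
            c->a[j] -= f;
        }
}

/* All pairs between two different cells. */
void interact_pair(struct cell *ci, struct cell *cj) {
    for (int i = 0; i < ci->count; i++)
        for (int j = 0; j < cj->count; j++) {
            double f = cj->x[j] - ci->x[i];
            ci->a[i] += f;
            cj->a[j] -= f;
        }
}

/* Sum of accumulated values; equal and opposite pair terms should cancel. */
double cell_sum(const struct cell *c) {
    double s = 0.0;
    for (int i = 0; i < c->count; i++) s += c->a[i];
    return s;
}
```

Because every pair term is added to one particle and subtracted from the other, the total over all cells should be zero up to rounding, which gives a cheap correctness check to compare serial and threaded runs against.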
OK, created a super naive n^2 interaction test and it scales at 90% to 24 cores. It's available in b7e3c77f and it should work fine (accidentally in the wrong branch, but we can sort that later).