
GH Action to build and test #10

Open
wants to merge 10 commits into
base: master

dpo
Collaborator

@dpo dpo commented Oct 9, 2022

No description provided.

@dpo dpo force-pushed the gh-action branch 5 times, most recently from b95bd35 to ae6870c Compare October 9, 2022 15:25
@dpo
Collaborator Author

dpo commented Oct 9, 2022

@nimgould Here is a GitHub Action to automatically build a standalone GALAHAD for Linux and macOS. By clicking on the "Details" link below 👇 ("standalone / ubuntu" or "standalone / macos"), you can see the build steps and the details of the build. Currently, both fail because MA27 is not present.

If we merge this pull request and you submit your future changes in the form of pull requests, this action will run automatically and you'll be able to assess directly whether the builds succeed.

ps: the fact that the installation process is interactive is a real obstacle here. (It's the same in CUTEst & Co.) We should really think of modernizing the install script (at the very least). The process is brittle because the "user selections" change each time you add a platform, a compiler, or some other interactive option.

@dpo dpo force-pushed the gh-action branch 2 times, most recently from 2bce766 to 92728bb Compare October 9, 2022 17:43
@dpo dpo force-pushed the gh-action branch 4 times, most recently from 8011b3a to 5b22c73 Compare October 25, 2022 03:46
@dpo
Collaborator Author

dpo commented Oct 25, 2022

@nimgould A few updates here:

  1. I managed to build GALAHAD on Linux by installing libhwloc-dev with apt-get. I haven't gotten Homebrew hwloc to work (I think because of a glibc version mismatch but need to investigate more). This will be an issue for Homebrew users on Linux. However, there are issues in the unit tests:

    • In the C interface unit tests, there are some nonzero statuses. Is that a problem? I was expecting status = 0;
    • LANCELOT-B segfaults;
    • Tests related to the SIF interface for LANCELOT shouldn't be run (I did not install the SIF interface).
  2. GALAHAD still doesn't install on macOS because of the SSIDS issue.

@nimgould
Contributor

  1. I would expect some nonzero status values if the requested linear solvers are not present. I have modified both sls and ils to try to detect this, but this is ongoing. We need to keep watching, and also to test with HSL present to see whether these disappear.
  2. I need to find out why. I will check here.
  3. Agreed. I need to check for this. It is on my list.
@dpo dpo force-pushed the gh-action branch 2 times, most recently from f409a42 to ccbe396 Compare October 26, 2022 18:10
@dpo
Collaborator Author

dpo commented Oct 26, 2022

@nimgould I updated the CI to the latest ARCHDefs and GALAHAD. We see the shared libs for macOS fail to build. I'll have a look when I can.

@dpo dpo force-pushed the gh-action branch 2 times, most recently from b1eabe1 to 18aeb74 Compare November 7, 2022 17:38
@dpo
Collaborator Author

dpo commented Nov 7, 2022

@nimgould I refreshed this with the latest changes in master. There are compilation issues on Linux starting here:
https://github.com/ralna/GALAHAD/actions/runs/3412660790/jobs/5678455681#step:7:393

Build error on macOS apparently due to incorrect flags: https://github.com/ralna/GALAHAD/actions/runs/3412660790/jobs/5678455792#step:7:709

@dpo
Collaborator Author

dpo commented Nov 10, 2022

Hi @nimgould. I rebased this on top of the latest changes in master. On Linux, the unit tests are still trying to run the SIF interface to LANCELOT:

 Testing SIF interface to LANCELOT B
/bin/sh: 1: sdgal: not found
make[1]: *** [/home/runner/work/GALAHAD/GALAHAD/src/lancelot/makemaster:208: test_cutest_single] Error 127
make: *** [/home/runner/work/GALAHAD/GALAHAD/src/makemaster:421: tests_single] Error 2
Error: Process completed with exit code 2.

On macOS, still lots of undefined symbols when building the shared libs: https://github.com/ralna/GALAHAD/actions/runs/3439387926/jobs/5736620985#step:7:710

I can probably help with that. Has the shared library building process stabilized on Linux?

@nimgould
Contributor

I believe that this is because you are still using the makefile stanza "tests" rather than "test" in the workflow file. "test" just performs the comprehensive stand-alone tests, while "tests" does this plus the sif/cutest tests.

Yes, please, for the shared libs. I think you need to check with @amontoison

@amontoison
Member

amontoison commented Nov 11, 2022

For the shared libraries, I need your help @nimgould.
For the Julia interface, I checked what we need and the conclusion is that we only need the shared library libgalahad_c.

We can't use the "glue" workaround here. For information, that hack only works on Linux (even if you just want the Python interface).
The last issue that I have is #19 and I can't continue without a fix.

Nick, if we can't compile with -Wl,--undefined, it doesn't work on Windows or Mac.
Windows doesn't allow incomplete shared libraries and we need to tune the options of Mac compilers to allow undefined symbols.
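
To make the platform difference concrete, here is a minimal sketch of how a build script might pick the flag that tolerates undefined symbols when creating a shared library. The function name `undefined_ok_flags` is hypothetical and not part of GALAHAD. It relies on two facts: the macOS linker accepts `-undefined dynamic_lookup`, and GNU ld on Linux allows undefined symbols in shared libraries as long as `--no-undefined` is not passed.

```shell
# Hypothetical helper, not part of GALAHAD: print the linker flags that let a
# shared library be created while some symbols are still undefined.
undefined_ok_flags() {
  platform=${1:-$(uname -s)}
  case "$platform" in
    Darwin)
      # The macOS linker rejects undefined symbols unless their lookup is
      # deferred to load time.
      echo "-Wl,-undefined,dynamic_lookup" ;;
    Linux)
      # GNU ld accepts undefined symbols in shared libraries by default
      # (provided --no-undefined is not passed), so no flag is needed.
      echo "" ;;
    *)
      # For example Windows, where a DLL must be fully resolved at link time.
      echo "no flag allows undefined symbols on platform: $platform" >&2
      return 1 ;;
  esac
}

undefined_ok_flags Darwin   # prints -Wl,-undefined,dynamic_lookup
```

On Windows there is no equivalent switch, which is why the sketch returns an error there rather than guessing.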

@dpo dpo force-pushed the gh-action branch 5 times, most recently from 7eb4caa to 2ccaf88 Compare November 11, 2022 20:04
@dpo
Collaborator Author

dpo commented Nov 18, 2022

Ok, so the CI is now at a point where

  • both linux versions fail in the unit tests when using SSIDS with single precision GALAHAD;
  • both macOS versions fail when building the dynamic libraries.

My next task is to look into those dynamic libraries.

@nimgould
Contributor

SPRAL can take either Metis. HSL needs metis 4, MUMPS and Pastix need metis 5. I presume that we can simply have a libmetis4.a/so and libmetis.a/so (for v5). I believe that the functions in 4 and 5 can coexist as they have different signatures. I shall rename the dummy metis4 in galahad to be galahad_metis4.a to make this clear.

@nimgould
Contributor

nimgould commented Nov 18, 2022

More broadly, I now think we should provide two variables METIS4 and METIS5 in every compiler.* file in ARCHDEFS, setting as defaults

METIS4='-lgalahad_metis4'
METIS5=

This will then permeate into the $GALAHAD/makefiles/* on install, and will provide users opportunities subsequently to replace these with real versions of metis4 and metis5 if they are available. Incidentally, metis5 (and ParMETIS 4) is available as part of the Linux distributions, as is mumps with or without (p)scotch orderings. Is that true on Macs as well? I have always downloaded and compiled metis 4 as described in the README.external file in galahad (same address as provided by @dpo above).

What do you think @dpo @jfowkes @amontoison ? If you agree, someone will have to make a global change to the archdefs files, and we need to coordinate the change with those for galahad so that they stay in sync
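
As a rough illustration of the proposal, the sketch below treats the two settings as ordinary shell variables with the suggested defaults and joins them into one list of link flags. This assumes that the compiler.* settings behave like plain shell assignments; the helper `metis_link_flags` and the replacement values `-lmetis4` and `-lmetis` are illustrative only.

```shell
# Assumption: the settings behave like plain shell assignments.
# Defaults as proposed above: GALAHAD's dummy METIS 4, and no METIS 5.
: "${METIS4:=-lgalahad_metis4}"
: "${METIS5:=}"

# Hypothetical helper: join the non-empty settings into one list of flags.
metis_link_flags() {
  set -- ${METIS4:+"$METIS4"} ${METIS5:+"$METIS5"}
  printf '%s\n' "$*"
}

# A user with real libraries would override the defaults, for example:
( METIS4=-lmetis4; METIS5=-lmetis; metis_link_flags )   # prints -lmetis4 -lmetis
```

Leaving METIS5 empty by default means the link line is unchanged for users who have nothing to substitute.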

@nimgould
Contributor

I have turned off the build of ssids in the single precision case for the time being. I had a look at the c++ and cuda parts that would need to be auto-sed-matically translated, but I do not feel confident enough in my C (or at all in my C++) to be sure that I got this right. Maybe one day.

@jfowkes
Collaborator

jfowkes commented Nov 18, 2022

Happy to have a look at this with you at some point, but no promises we'll be able to make it work.

@nimgould
Contributor

I am afraid that ssids in double precision is as leaky as a paper bag in a storm. Even on the small comprehensive test program, valgrind reports

==4076300== LEAK SUMMARY:
==4076300== definitely lost: 72 bytes in 3 blocks
==4076300== indirectly lost: 10,357,512 bytes in 43 blocks
==4076300== possibly lost: 16,777,248 bytes in 2 blocks
==4076300== still reachable: 456 bytes in 3 blocks
==4076300== suppressed: 0 bytes in 0 blocks

all of which relate to ssids c++ creates. Not good

@nimgould
Contributor

Thank you @jfowkes, but not high priority, I fear.

@jfowkes
Collaborator

jfowkes commented Nov 18, 2022

Yes unfortunately we have several open bugs in SPRAL relating to memory leaks, but no idea how to fix them...

@nimgould
Contributor

Fortunately once we switch to a robust solver (such as dsytrf/s), the rest of GALAHAD is now leak free (excepting the leaks of valgrind instructions and a few c kernel functions). That's where my two weeks have gone! On to mumps next.

@dpo
Collaborator Author

dpo commented Nov 18, 2022

Incidentally, metis5 (and ParMETIS 4) is available as part of the Linux distributions, as is mumps with or without (p)scotch orderings. Is that true on Macs as well?

None of that is available on Macs by default in any official way. Precompiled binaries are usually available from Homebrew. METIS 5 is available but more obscure packages like MUMPS and ParMETIS are not officially available (I maintain the "taps" in my "free" time).

I would be very reluctant to rely on Linux package managers as they are distribution dependent. Unless you're on Ubuntu 22, you're stuck with buggy MUMPS 5.2 from 5 years ago. What about the 2000 other linux distros? Though it's not perfect, I've found that Homebrew is my preferred package manager on linux too.

@amontoison
Member

amontoison commented Nov 18, 2022

@dpo, I think we should provide precompiled binaries of GALAHAD with Yggdrasil and rely on it even if the user is not using Julia.
It will probably be enough for 95% of the users.

@amontoison
Member

amontoison commented Nov 18, 2022

@nimgould @jfowkes
We forgot to share with you that precompiled binaries of SPRAL are available online here:
https://github.com/JuliaBinaryWrappers/SPRAL_jll.jl
https://github.com/JuliaBinaryWrappers/SPRAL_jll.jl/releases/tag/SPRAL-v0.1.0%2B0

It means that a user can just download an archive of SPRAL (and its dependencies) for their platform and it works directly.

Every time we recompile SPRAL with BinaryBuilder / Yggdrasil, new archives are automatically uploaded.

@dpo
Collaborator Author

dpo commented Nov 19, 2022

Here's the latest on the shared libraries for macOS. Firstly, I have to ask gfortran to compile with -O2 in order to work around a bug in the linker that should be fixed in the next release of the command-line tools. I'll open a pull request to ARCHDefs.

Secondly, if I install GALAHAD without the shared libraries, and then call create_one_shared (from which I removed --no-undefined), I get the following undefined symbols

❯ make -f /Users/dpo/dev/ralna/GALAHAD/makefiles/mac64.osx.gfo create_one_shared
cd /Users/dpo/dev/ralna/GALAHAD/objects/mac64.osx.gfo/double; CC=gcc-12 FORTRAN=gfortran-12 OPTIMIZATION=-O2 \
                   SHARED=-shared DLEXT=dylib LOADALL=-all_load \
                   LOADNONE=-noall_load \
                   sh /Users/dpo/dev/ralna/GALAHAD/bin/create_one_shared
 creating single GALAHAD shared library in
  /Users/dpo/dev/ralna/GALAHAD/objects/mac64.osx.gfo/double/shared
 creating libgalahad_all.dylib
gfortran-12 -O2 -shared -o shared/libgalahad_all.dylib -Wl,-all_load libgalahad.a libgalahad_hsl.a libgalahad_spral.a             libgalahad_mkl_pardiso.a libgalahad_pardiso.a libgalahad_wsmp.a             libgalahad_pastix.a libgalahad_mumps.a libgalahad_umfpack.a             libgalahad_metis.a libgalahad_lapack.a libgalahad_blas.a             libgalahad_cutest_dummy.a libgalahad_hsl_c.a libgalahad_c.a -Wl,-noall_load -lstdc++ -lgomp
ld: warning: option -noall_load is obsolete and being ignored
Undefined symbols for architecture arm64:
  "___hsl_ma77_double_MOD_ma77_alter_double", referenced from:
      ___galahad_sls_double_MOD_sls_alter_d in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_analyse_double", referenced from:
      ___galahad_sls_double_MOD_sls_analyse in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_enquire_indef_double", referenced from:
      ___galahad_sls_double_MOD_sls_enquire in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_enquire_posdef_double", referenced from:
      ___galahad_sls_double_MOD_sls_enquire in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_factor_double", referenced from:
      ___galahad_sls_double_MOD_sls_factorize in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_finalise_double", referenced from:
      ___galahad_sls_double_MOD_sls_terminate in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_factorize in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_analyse in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_input_reals_double", referenced from:
      ___galahad_sls_double_MOD_sls_factorize in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_input_vars_double", referenced from:
      ___galahad_sls_double_MOD_sls_analyse in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_open_double", referenced from:
      ___galahad_sls_double_MOD_sls_analyse in libgalahad.a(sls.o)
      ___hsl_ma77_double_iface_MOD_ma77_open_main in libgalahad_hsl_c.a(hsl_ma77d_ciface.o)
  "___hsl_ma77_double_MOD_ma77_restart_double", referenced from:
      ___galahad_sls_double_MOD_sls_factorize in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_scale_double", referenced from:
      ___galahad_sls_double_MOD_sls_factorize in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_solve_double", referenced from:
      ___galahad_sls_double_MOD_sls_solve_multiple_rhs.constprop.0 in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_solve_one_rhs.constprop.0 in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_sparse_forward_solve in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_part_solve in libgalahad.a(sls.o)
  "___hsl_ma77_double_MOD_ma77_solve_fredholm_double", referenced from:
      ___galahad_sls_double_MOD_sls_fredholm_alternative in libgalahad.a(sls.o)
  "___hsl_ma86_double_MOD_ma86_analyse_double", referenced from:
      ___galahad_sls_double_MOD_sls_analyse in libgalahad.a(sls.o)
  "___hsl_ma86_double_MOD_ma86_factor_double", referenced from:
      ___galahad_sls_double_MOD_sls_factorize in libgalahad.a(sls.o)
  "___hsl_ma86_double_MOD_ma86_finalise_double", referenced from:
      ___galahad_sls_double_MOD_sls_terminate in libgalahad.a(sls.o)
  "___hsl_ma86_double_MOD_ma86_solve_mult_double", referenced from:
      ___galahad_sls_double_MOD_sls_solve_multiple_rhs.constprop.0 in libgalahad.a(sls.o)
  "___hsl_ma86_double_MOD_ma86_solve_one_double", referenced from:
      ___galahad_sls_double_MOD_sls_solve_one_rhs.constprop.0 in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_sparse_forward_solve in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_part_solve in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_alter_double", referenced from:
      ___galahad_sls_double_MOD_sls_alter_d in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_analyse_double", referenced from:
      ___galahad_sls_double_MOD_sls_analyse in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_enquire_indef_double", referenced from:
      ___galahad_sls_double_MOD_sls_enquire in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_enquire_posdef_double", referenced from:
      ___galahad_sls_double_MOD_sls_enquire in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_factor_double", referenced from:
      ___galahad_sls_double_MOD_sls_factorize in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_finalise_double", referenced from:
      ___galahad_sls_double_MOD_sls_terminate in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_solve_double", referenced from:
      ___galahad_sls_double_MOD_sls_solve_multiple_rhs.constprop.0 in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_solve_fredholm_double", referenced from:
      ___galahad_sls_double_MOD_sls_fredholm_alternative in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_solve_one_double", referenced from:
      ___galahad_sls_double_MOD_sls_solve_one_rhs.constprop.0 in libgalahad.a(sls.o)
      ___galahad_sls_double_MOD_sls_part_solve in libgalahad.a(sls.o)
  "___hsl_ma97_double_MOD_ma97_sparse_fwd_solve_double", referenced from:
      ___galahad_sls_double_MOD_sls_sparse_forward_solve in libgalahad.a(sls.o)
  "_hwloc_get_nbobjs_by_depth", referenced from:
      __ZNK5spral11hw_topology13HwlocTopology14get_numa_nodesEv in libgalahad_spral.a(guess_topology.o)
  "_hwloc_get_obj_by_depth", referenced from:
      __ZNK5spral11hw_topology13HwlocTopology14get_numa_nodesEv in libgalahad_spral.a(guess_topology.o)
  "_hwloc_get_type_depth", referenced from:
      __ZNK5spral11hw_topology13HwlocTopology14get_numa_nodesEv in libgalahad_spral.a(guess_topology.o)
  "_hwloc_topology_destroy", referenced from:
      _spral_hw_topology_guess in libgalahad_spral.a(guess_topology.o)
  "_hwloc_topology_init", referenced from:
      _spral_hw_topology_guess in libgalahad_spral.a(guess_topology.o)
  "_hwloc_topology_load", referenced from:
      _spral_hw_topology_guess in libgalahad_spral.a(guess_topology.o)
  "_hwloc_topology_set_type_filter", referenced from:
      _spral_hw_topology_guess in libgalahad_spral.a(guess_topology.o)
ld: symbol(s) not found for architecture arm64
collect2: error: ld returned 1 exit status
make: *** [create_one_shared] Error 1

I can resolve the hwloc symbols by adding -L/opt/homebrew/opt/hwloc/lib -lhwloc to create_one_shared, but the missing HSL symbols remain.
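
Rather than hard-coding the Homebrew path, the flags could be looked up at build time. The following is a sketch only: the helper `hwloc_ldflags` is hypothetical, and it assumes that either `brew --prefix hwloc` or `pkg-config --libs hwloc` can locate the library, falling back to a bare `-lhwloc` otherwise.

```shell
# Hypothetical helper: guess the link flags for hwloc.
hwloc_ldflags() {
  if command -v brew >/dev/null 2>&1 &&
     prefix=$(brew --prefix hwloc 2>/dev/null) && [ -n "$prefix" ]; then
    # Homebrew reports the installation prefix of the formula; the library
    # lives in its lib subdirectory.
    printf '%s\n' "-L$prefix/lib -lhwloc"
  elif command -v pkg-config >/dev/null 2>&1 &&
       flags=$(pkg-config --libs hwloc 2>/dev/null); then
    # pkg-config prints the flags recorded when hwloc was installed.
    printf '%s\n' "$flags"
  else
    # Last resort: rely on the default library search path.
    printf '%s\n' "-lhwloc"
  fi
}

hwloc_ldflags
```

The output could then be appended to the final link command in create_one_shared; where exactly it belongs in that script is not shown here.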

@amontoison
Member

amontoison commented Nov 19, 2022

Thanks to Clang.jl, I explicitly listed the dependencies of each GALAHAD package with a C interface here, in case it helps.
If one day we want to update the build system, we will have the dependency tree.
HSL packages are used almost everywhere!

@nimgould
Contributor

I really don't understand what the mac loader is doing. All of the dependencies should be in libgalahad_hsl.a. Could you try an ar t on libgalahad_hsl.a (in your /Users/dpo/dev/ralna/GALAHAD/objects/mac64.osx.gfo/double)? Here I get

% ar t libgalahad_hsl.a
dummy_hsl.o
ma61d.o
ma27d.o
mc71d.o
ma57d.o
mc21d.o
mc34d.o
mc47d.o
mc59d.o
mc64d.o
hsl_ma57d.o
hsl_ad02d.o
kb07i.o
hsl_kb22l.o
hsl_ma54d.o
hsl_ma64d.o
hsl_of01i.o
hsl_of01d.o
hsl_ma77d.o
hsl_mc34d.o
hsl_mc78i.o
hsl_ma86d.o
hsl_ma87d.o
hsl_zb01i.o
hsl_mc68i.o
hsl_mc69d.o
hsl_mc64d.o
mc30d.o
mc77d.o
hsl_mc80d.o
hsl_ma97d.o
mc60d.o
mc61d.o
mc13d.o
mc20d.o
ma33d.o
hsl_zb01d.o
hsl_ma48d.o
hsl_mi28d.o
fa14d.o
fd15d.o
mc29d.o
la15d.o
la04d.o
and these object files contain the (dummy) references your linker claims are missing. I wonder if the order matters, i.e., perhaps libgalahad_hsl.a needs to occur before libgalahad.a?

@nimgould
Contributor

You are right about the missing -lhwloc, I missed it as I compiled with dummy ssids rather than the real one. But once I add this, the script forms the shared library under Linux.

@amontoison
Member

I wonder if the order matters, i.e., perhaps libgalahad_hsl.a needs to occur before libgalahad.a?

It could explain a lot of things because MacOS doesn't have ranlib.

@dpo
Collaborator Author

dpo commented Nov 20, 2022

Yes I'm sure order matters, just as it does when compiling. However, I'm not finding the right order. The linux linker is too permissive.

@dpo
Collaborator Author

dpo commented Nov 20, 2022

Actually, I think the problem runs deeper. Those ...MOD... symbols do not appear in any of the libraries. They refer to the F90 modules. I tried adding -I$GALAHAD/modules/$ARCH/double to the final gfortran command, but that doesn't solve the problem.

It's important to realize that this way of generating shared libraries is just a patch. It's not the "right" way to generate them. I used it successfully on C and F77 projects, but I had never tried it on F90 libraries that depend on F90 modules. Perhaps it's simply insufficient.
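
One way to check whether any archive actually defines a given symbol is to scan the `nm` output of each library. The sketch below is an assumption about how such a check could be scripted, not something taken from the GALAHAD scripts. The helper name `defining_archives` is hypothetical, and it uses the POSIX output format `nm -P`, in which the second column is the symbol type and `U` marks an undefined reference.

```shell
# Hypothetical helper: print each archive that defines a symbol whose name
# contains the given substring.
# Usage: defining_archives SUBSTRING lib1.a [lib2.a ...]
defining_archives() {
  pattern=$1; shift
  for lib in "$@"; do
    # nm -P prints "name type ..." for every symbol. A one-letter type other
    # than U means that the symbol is defined in that archive member.
    if nm -P "$lib" 2>/dev/null |
       awk -v p="$pattern" '
         NF >= 2 && length($2) == 1 && $2 != "U" && index($1, p) { found = 1 }
         END { exit !found }'; then
      printf '%s\n' "$lib"
    fi
  done
}

# Example, using names from the log above (run in the objects directory):
#   defining_archives ma77_analyse_double libgalahad.a libgalahad_hsl.a
```

If the call prints nothing, no archive in the list provides the symbol, which would point away from a link-order problem.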

@amontoison
Member

Actually, I think the problem runs deeper. Those ...MOD... symbols do not appear in any of the libraries. They refer to the F90 modules. I tried adding -I$GALAHAD/modules/$ARCH/double to the final gfortran command, but that doesn't solve the problem.

It's important to realize that this way of generating shared libraries is just a patch. It's not the "right" way to generate them. I used it successfully on C and F77 projects, but I had never tried it on F90 libraries that depend on F90 modules. Perhaps it's simply insufficient.

Dominique, we are doing the same thing for the HSL packages, which are F90 libraries that sometimes depend on F90 modules, and it works.

@dpo
Collaborator Author

dpo commented Nov 21, 2022

Not on macOS.

@dpo
Collaborator Author

dpo commented Nov 21, 2022

@nimgould On a related topic, currently, it's hard to debug create_one_shared because it's called with $(BINSHELL) in src/makemaster. So adding -vx to the first line of the script has no effect (at least on macOS).

Since it's executable, it's enough to remove $(BINSHELL) like so:

diff --git a/src/makemaster b/src/makemaster
index fa5a7ed..f6d3d1b 100644
--- a/src/makemaster
+++ b/src/makemaster
@@ -1630,7 +1630,7 @@ create_one_shared:
        cd $(OBJ); CC=$(CC) FORTRAN=$(FORTRAN) OPTIMIZATION=$(OPTIMIZATION) \
                    SHARED=$(SHARED) DLEXT=$(DLEXT) LOADALL=$(LOADALL) \
                    LOADNONE=$(LOADNONE) \
-                   $(BINSHELL) $(GALAHAD)/bin/create_one_shared
+                   $(GALAHAD)/bin/create_one_shared

 #  book keeping

Then, -vx works as expected.
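
The effect can be reproduced in isolation. In this small illustration, a temporary script stands in for bin/create_one_shared and carries `-x` on its first line; the file name is made up. When the script is passed to a named interpreter, the first line is treated as a comment, so tracing only happens when the script is executed directly.

```shell
# Illustration only: the temporary script stands in for bin/create_one_shared.
demo=$(mktemp "${TMPDIR:-/tmp}/shebang_demo.XXXXXX")
printf '#!/bin/sh -x\necho building\n' > "$demo"
chmod +x "$demo"

# Named interpreter: the first line is just a comment, so -x is ignored.
via_shell=$(sh "$demo" 2>&1)

# Direct execution: the system reads the first line and starts /bin/sh -x,
# which writes a trace of each command to standard error.
direct=$("$demo" 2>&1)

printf 'via sh:\n%s\n\ndirect:\n%s\n' "$via_shell" "$direct"
rm -f "$demo"
```

Another option that leaves the script untouched is to pass the flags to the interpreter itself, as in `sh -vx script`.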

@amontoison
Member

More broadly, I now think we should provide two variables METIS4 and METIS5 in every compiler.* file in ARCHDEFS, setting as defaults

METIS4='-lgalahad_metis4'
METIS5=

This will then permeate into the $GALAHAD/makefiles/* on install, and will provide users opportunities subsequently to replace these with real versions of metis4 and metis5 if they are available. Incidentally, metis5 (and ParMETIS 4) is available as part of the Linux distributions, as is mumps with or without (p)scotch orderings. Is that true on Macs as well? I have always downloaded and compiled metis 4 as described in the README.external file in galahad (same address as provided by @dpo above).

On Mac, MUMPS, SCOTCH, METIS, and ParMETIS are available through Homebrew; Dominique added them.
We would like to use Ygg to easily provide the shared libraries precompiled by BinaryBuilder / Yggdrasil on all platforms with the same API.

What do you think @dpo @jfowkes @amontoison ? If you agree, someone will have to make a global change to the archdefs files, and we need to coordinate the change with those for galahad so that they stay in sync

Good idea Nick, I created a PR for METIS4 and METIS5 variables in ARCHDefs repository.

@nimgould
Contributor

OK, lots of things here.

  1. Thanks @dpo for the change to the main makefile to use the correct shell.

  2. Thanks @amontoison for adding those variables. I will check and approve after I've grabbed a cup of tea. On this, since you mention it, could we also add
    SCOTCH=
    PARAMETIS=
    MPI='-lgalahad_mpi'
    the first two for future-proofing (and because mumps uses the first), and the last because mumps uses mpi, and if mumps isn't available I need to pass in dummy mpi_init and mpi_terminate routines when building the interface to sls.

  3. The mod file issue. This is strange. A mod file is only required in fortran when one package uses (calls) another; it is quite like a .h file in c (although of course a module is more than just that). The fact that the "complete" shared library is complaining about the hsl modules, but not the galahad ones, suggests that this may not be the issue; I would expect complaints about all modules otherwise. Also, you say that you have julia interfaces to hsl, and hsl itself is module based, so I would have expected to see the same issue there.

What might be possible is to build the shared libraries as the compilation proceeds. Can one add objects to a shared library, or does it all have to happen at once? (My ignorance, this is trivial for random libraries, indeed the whole point of them, and it is a shame that macos doesn't have such a useful tool ... or does it?)

@dpo
Collaborator Author

dpo commented Nov 21, 2022

What might be possible is to build the shared libraries as the compilation proceeds.

No, unfortunately, as far as I know, that is not possible (not on linux either).

@nimgould
Contributor

OK, we need to find out why the process is not complaining about missing galahad mods but does for the hsl ones. Why does it not complain about hwloc when that is only available via its library? What about the other dummies like pastix? Is it simply that they are fortran 77 or c/c++? I really don't understand how macos shared libraries work, do you? You hinted that the hsl build doesn't work on macs on its own. Any idea why? Presumably it would be trivial to build a pair of 5 line modern fortran modules, one of which uses the other, to see this failure in its simplest form, and then to ask the question to the mac community?
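
A possible shape for such a minimal test is sketched below. Everything in it is invented for illustration: the module names `inner` and `outer`, the library names, and the helper `build_repro`. It assumes gfortran (or whatever `FORTRAN` points to) and mirrors the approach of create_one_shared by placing each module in its own static library and then merging both into one shared library, with `-all_load` on macOS and `--whole-archive` with GNU ld.

```shell
# Hypothetical reproducer: all file, module, and library names are invented.
repro_dir=$(mktemp -d "${TMPDIR:-/tmp}/modrepro.XXXXXX")

cat > "$repro_dir/inner.f90" <<'EOF'
module inner
contains
  subroutine inner_hello()
    print *, 'hello from inner'
  end subroutine inner_hello
end module inner
EOF

cat > "$repro_dir/outer.f90" <<'EOF'
module outer
  use inner
contains
  subroutine outer_hello()
    call inner_hello()
  end subroutine outer_hello
end module outer
EOF

# Build one static library per module, merge both into a shared library,
# then list the symbols of the inner module found in the result.
build_repro() {
  fc=${FORTRAN:-gfortran}
  if [ "$(uname -s)" = Darwin ]; then
    ext=dylib; load_all=-Wl,-all_load; load_none=
  else
    ext=so; load_all=-Wl,--whole-archive; load_none=-Wl,--no-whole-archive
  fi
  ( cd "$repro_dir" &&
    "$fc" -fPIC -c inner.f90 &&
    "$fc" -fPIC -c outer.f90 &&
    ar rcs libinner.a inner.o &&
    ar rcs libouter.a outer.o &&
    "$fc" -shared -o "libboth.$ext" $load_all libouter.a libinner.a $load_none &&
    nm -P "libboth.$ext" | grep inner_hello )
}

if command -v "${FORTRAN:-gfortran}" >/dev/null 2>&1; then
  build_repro || echo "shared library build failed in $repro_dir"
else
  echo "no Fortran compiler found; skipping the build"
fi
```

If this builds cleanly on macOS, the problem is more likely specific to how the HSL archives are produced than to Fortran modules in general.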

@nimgould
Contributor

Another possibility is to unpack all of the object files from the static libraries into a directory using ar x libname.a and then build the shared library from the .o files. If you want to try, there is another script, build_one_shared, and this can be tried using

make -s -f $GALAHAD/makefiles/(yourarch) build_one_shared

and this produces a libgalahad_all.so/dylib in

$GALAHAD/objects/(yourarch)/double/shared

Of course this does depend on ar working properly, but from what I see on google about Macs this is so. As I said before, I don't believe that the issue is not finding mod files, as there is nothing to suggest that it hasn't found the galahad ones. On linux,

nm -D -g libgalahad_all.so

tells you what is in the shared library, and everything seems to be there. I appreciate that this may be superseded by @dpo's meson build.
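
The unpacking step can be sketched as follows. This is not the build_one_shared script itself, whose contents are not shown in this thread; the helper `unpack_archives` is hypothetical. It extracts each archive into its own subdirectory, in case members with the same name exist in different archives, and lists the resulting object files.

```shell
# Hypothetical helper, not the actual build_one_shared script: extract every
# archive into its own subdirectory of DEST and list the object files.
# Usage: unpack_archives DEST lib1.a [lib2.a ...]
unpack_archives() {
  dest=$1; shift
  for lib in "$@"; do
    # ar x extracts into the current directory, so use an absolute path for
    # the archive before changing directory.
    lib_abs=$(cd "$(dirname "$lib")" && pwd)/$(basename "$lib")
    sub=$dest/$(basename "$lib" .a)
    mkdir -p "$sub" && ( cd "$sub" && ar x "$lib_abs" ) || return 1
  done
  # Print the extracted object files, one per line.
  find "$dest" -name '*.o' | sort
}

# Example (the final link command is an assumption, shown for orientation):
#   unpack_archives objs libgalahad.a libgalahad_hsl.a > objects.txt
#   gfortran -shared -o libgalahad_all.so $(cat objects.txt)
```

Linking from explicit object files removes any dependence on archive order, at the cost of pulling in every object whether or not it is needed.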

@dpo dpo added the CI label Jan 23, 2023
@amontoison
Member

Do we want to keep this PR to test the makemaster build system?

4 participants