GH Action to build and test #10
@nimgould Here is a GitHub Action to automatically build a standalone GALAHAD for Linux and macOS. By clicking on the "Details" link below 👇 ("standalone / ubuntu" or "standalone / macos"), you can see the build steps and the details of the build. Currently, both fail because MA27 is not present. If we merge this pull request and you submit your future changes in the form of pull requests, this action will run automatically and you'll be able to assess directly whether the builds succeed. P.S. The fact that the installation process is interactive is a real obstacle here (it's the same in CUTEst & Co.). We should really think of modernizing the install script, at the very least. The process is brittle because the "user selections" change each time you add a platform, a compiler, or some other interactive option. |
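To make the brittleness concrete, here is a minimal sketch of how a CI job typically drives an interactive installer: the answers are piped in on standard input, one per line. `mock_install` is a hypothetical stand-in, and its prompts and numeric answers are invented for illustration; they are not the real GALAHAD or ARCHDefs menus.

```shell
#!/bin/sh
# Hypothetical stand-in for an interactive installer: it asks two questions
# and reads the answers from standard input.
mock_install() {
    printf 'Select platform: '
    read -r platform
    printf 'Select compiler: '
    read -r compiler
    printf '\nplatform=%s compiler=%s\n' "$platform" "$compiler"
}

# In CI the answers are supplied in the order the prompts appear.
# If a new menu entry shifts the numbering, the same answers silently
# select something else, which is the fragility described above.
answers='6
2'
result=$(printf '%s\n' "$answers" | mock_install | tail -n 1)
echo "$result"
```

A non-interactive mode (command-line options or environment variables naming the platform and compiler) would avoid depending on menu positions at all.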
@nimgould A few updates here:
|
|
@nimgould I updated the CI to the latest ARCHDefs and GALAHAD. We see the shared libs for macOS fail to build. I'll have a look when I can. |
@nimgould I refreshed this with the latest changes in Build error on macOS apparently due to incorrect flags: https://github.com/ralna/GALAHAD/actions/runs/3412660790/jobs/5678455792#step:7:709 |
Hi @nimgould. I rebased this on top of the latest changes in
On macOS, still lots of undefined symbols when building the shared libs: https://github.com/ralna/GALAHAD/actions/runs/3439387926/jobs/5736620985#step:7:710 I can probably help with that. Has the shared library building process stabilized on Linux? |
I believe that this is because you are still using the makefile stanza "tests", not "test", in the workflow file. "test" just performs the comprehensive stand-alone tests, while "tests" does this plus the sif/cutest tests. Yes, please, for the shared libs. I think you need to check with @amontoison |
For the shared libraries, I need your help @nimgould. We can't use the "glue" workaround here and also this hack is only working on Linux for information (even if you just want the Python interface). Nick, if we can't compile with |
Ok, so the CI is now at a point where
My next task is to look into those dynamic libraries. |
SPRAL can take either Metis. HSL needs metis 4, MUMPS and Pastix need metis 5. I presume that we can simply have a libmetis4.a/so and libmetis.a/so (for v5). I believe that the functions in 4 and 5 can coexist as they have different signatures. I shall rename the dummy metis4 in galahad to be galahad_metis4.a to make this clear. |
More broadly, I now think we should provide two variables METIS4 and METIS5 in every compiler.* file in ARCHDEFS, setting as defaults METIS4='-lgalahad_metis4' This will then permeate into the $GALAHAD/makefiles/* on install, and will give users the opportunity later to replace these with real versions of metis4 and metis5 if they are available. Incidentally, metis5 (and ParMETIS 4) are available as part of the Linux distributions, as is mumps with or without (p)scotch orderings. Is that true on Macs as well? I have always downloaded and compiled metis 4 as described in the README.external file in galahad. What do you think @dpo @jfowkes @amontoison ? If you agree, someone will have to make a global change to the archdefs files, and we need to coordinate the change with those for galahad so that they stay in sync |
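As a rough illustration of the proposal, the sketch below shows an overridable default in shell. Only the value `-lgalahad_metis4` comes from the comment above; the `link_flags` helper, the `-lgalahad` flag, and the `/opt/metis4` path are assumptions made up for the example, and METIS5 would presumably follow the same pattern.

```shell
#!/bin/sh
# Default taken from the proposal above. A user may export METIS4 beforehand
# to point at a real METIS 4 build instead of the dummy library.
: "${METIS4:=-lgalahad_metis4}"

# Hypothetical helper: assemble the link flags a makefile would receive.
# "-lgalahad" is an illustrative library name, not a confirmed one.
link_flags() {
    printf '%s %s\n' "-lgalahad" "$METIS4"
}

link_flags
```

Running the script with, say, `METIS4='-L/opt/metis4/lib -lmetis4'` in the environment (a hypothetical install location) would substitute the real library without touching the script.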
I have turned off the build of ssids in the single precision case for the time being. I had a look at the C++ and CUDA parts that would need to be auto-sed-matically translated, but I do not feel confident enough in my C (or at all in my C++) to be sure that I got this right. Maybe one day. |
Happy to have a look at this with you at some point, but no promises we'll be able to make it work. |
I am afraid that ssids in double precision is as leaky as a paper bag in a storm. Even on the small comprehensive test program, valgrind reports ==4076300== LEAK SUMMARY: all of which relate to ssids C++ creates. Not good |
Thank you @jfowkes , but not high priority I fear |
Yes unfortunately we have several open bugs in SPRAL relating to memory leaks, but no idea how to fix them... |
Fortunately once we switch to a robust solver (such as dsytrf/s), the rest of GALAHAD is now leak free (excepting the leaks of valgrind instructions and a few c kernel functions). That's where my two weeks have gone! On to mumps next. |
None of that is available on Macs by default in any official way. Precompiled binaries are usually available from Homebrew. METIS 5 is available but more obscure packages like MUMPS and ParMETIS are not officially available (I maintain the "taps" in my "free" time). I would be very reluctant to rely on Linux package managers as they are distribution dependent. Unless you're on Ubuntu 22, you're stuck with buggy MUMPS 5.2 from 5 years ago. What about the 2000 other linux distros? Though it's not perfect, I've found that Homebrew is my preferred package manager on linux too. |
@dpo, I think we should provide precompiled binaries of GALAHAD with Yggdrasil and rely on it even if the user is not using Julia. |
@nimgould @jfowkes It means that a user can just download an archive of SPRAL (and its dependencies) for their platform and it works directly. Every time we recompile SPRAL with BinaryBuilder / Yggdrasil, new archives are automatically uploaded. |
Here's the latest on the shared libraries for macOS. Firstly, I have to ask gfortran to compile with -O2 in order to work around a bug in the linker that should be fixed in the next release of the command-line tools. I'll open a pull request to ARCHDefs. Secondly, if I install GALAHAD without the shared libraries, and then call
I can resolve the |
Thanks to |
I really don't understand what the mac loader is doing. All of the dependencies should be in libgalahad_hsl.a. Could you try an ar t on libgalahad_hsl.a (in your /Users/dpo/dev/ralna/GALAHAD/objects/mac64.osx.gfo/double)? Here I get % ar t libgalahad_hsl.a |
You are right about the missing -lhwloc, I missed it as I compiled with dummy ssids rather than the real one. But once I add this, the script forms the shared library under Linux. |
It could explain a lot of things because MacOS doesn't have |
Yes I'm sure order matters, just as it does when compiling. However, I'm not finding the right order. The linux linker is too permissive. |
Actually, I think the problem runs deeper. Those It's important to realize that this way of generating shared libraries is just a patch. It's not the "right" way to generate them. I used it successfully on C and F77 projects, but I had never tried it on F90 libraries that depend on F90 modules. Perhaps it's simply insufficient. |
We are doing the same thing for HSL packages Dominique, which are F90 libraries that sometimes depend on F90 modules, and it works. |
Not on macOS. |
@nimgould On a related topic, currently, it's hard to debug Since it's executable, it's enough to remove
diff --git a/src/makemaster b/src/makemaster
index fa5a7ed..f6d3d1b 100644
--- a/src/makemaster
+++ b/src/makemaster
@@ -1630,7 +1630,7 @@ create_one_shared:
 cd $(OBJ); CC=$(CC) FORTRAN=$(FORTRAN) OPTIMIZATION=$(OPTIMIZATION) \
 SHARED=$(SHARED) DLEXT=$(DLEXT) LOADALL=$(LOADALL) \
 LOADNONE=$(LOADNONE) \
- $(BINSHELL) $(GALAHAD)/bin/create_one_shared
+ $(GALAHAD)/bin/create_one_shared
 # book keeping
Then, |
On Mac, MUMPS, SCOTCH, METIS, and ParMETIS are available with Homebrew. It was Dominique who added them.
Good idea Nick, I created a PR for METIS4 and METIS5 variables in ARCHDefs repository. |
OK, lots of things here.
What might be possible is to build the shared libraries as the compilation proceeds. Can one add objects to a shared library, or does it all have to happen at once? (My ignorance, this is trivial for random libraries, indeed the whole point of them, and it is a shame that macos doesn't have such a useful tool ... or does it?) |
No, unfortunately, as far as I know, that is not possible (not on linux either). |
OK, we need to find out why the process is not complaining about missing galahad mods but does for the hsl ones. Why does it not complain about hwloc when that is only available via its library? What about the other dummies like pastix? Is it simply that they are Fortran 77 or C/C++? I really don't understand how macos shared libraries work, do you? You hinted that the hsl build doesn't work on macs on its own. Any idea why? Presumably it would be trivial to build a pair of 5 line modern fortran modules, one of which uses the other, to see this failure in its simplest form, and then to ask the question to the mac community? |
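A possible starting point for such a minimal test case is sketched below. Everything in it is invented for illustration: the module names `mod_a` and `mod_b`, the library names, and in particular the choice of `-Wl,--whole-archive` (GNU ld) and `-Wl,-all_load` (Apple ld) as the "load everything from the archive" flags. Whether these match what LOADALL and LOADNONE expand to in the GALAHAD makefiles is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch of a minimal reproducer: two Fortran modules, the second using the
# first, each archived separately and then converted to a shared library.
demo=$(mktemp -d) || exit 1

cat > "$demo/mod_a.f90" <<'EOF'
module mod_a
  implicit none
contains
  integer function twice(n)
    integer, intent(in) :: n
    twice = 2 * n
  end function twice
end module mod_a
EOF

cat > "$demo/mod_b.f90" <<'EOF'
module mod_b
  use mod_a, only: twice
  implicit none
contains
  integer function four_times(n)
    integer, intent(in) :: n
    four_times = twice(twice(n))
  end function four_times
end module mod_b
EOF

# Assumed per-platform linker flags (not taken from the GALAHAD makefiles).
case $(uname -s) in
    Darwin) ext=dylib; loadall='-Wl,-all_load';       loadnone='' ;;
    *)      ext=so;    loadall='-Wl,--whole-archive'; loadnone='-Wl,--no-whole-archive' ;;
esac

if command -v gfortran >/dev/null 2>&1; then
    (
        cd "$demo" || exit 1
        gfortran -fPIC -c mod_a.f90 && ar rc libmod_a.a mod_a.o &&
        gfortran -fPIC -c mod_b.f90 && ar rc libmod_b.a mod_b.o &&
        gfortran -shared -o "libmod_a.$ext" $loadall libmod_a.a $loadnone &&
        gfortran -shared -o "libmod_b.$ext" $loadall libmod_b.a $loadnone -L. -lmod_a
    ) || echo "build failed; see the linker messages above" >&2
else
    echo "gfortran not found; sources written to $demo" >&2
fi
echo "$demo"
```

Dropping `-L. -lmod_a` from the last link line should show whether the platform's linker tolerates the unresolved `mod_a` symbols in `libmod_b`, which is the kind of difference between Linux and macOS being discussed here.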
Another possibility is to unpack all of the object files from the static libraries into a directory using ar x libname.a and then build the shared library from the .o files. If you want to try, there is another script, build_one_shared, and this can be tried using make -s -f $GALAHAD/makefiles/(yourarch) build_one_shared and this produces a libgalahad_all.so/dylib in $GALAHAD/objects/(yourarch)/double/shared Of course this does depend on ar working properly, but from what I see on google about Macs this is so. As I said before, I don't believe that the issue is not finding mod files, as there is nothing to suggest that it hasn't found the galahad ones. On linux, nm -D -g libgalahad_all.so tells you what is in the shared library, and everything seems to be there. I appreciate that this may be superseded by @dpo 's meson build |
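The unpack-and-relink idea can be sketched as follows. This is not the build_one_shared script itself, whose contents are not shown in this thread; `shared_from_archive`, the `DRY_RUN` switch, and the use of gfortran with `-shared` as the link driver are assumptions for illustration, and macOS may need different flags.

```shell
#!/bin/sh
# Sketch: rebuild a shared library from the members of a static archive.
# Usage: shared_from_archive /abs/path/libname.a /abs/path/libname.so
# Both paths must be absolute because the work happens in a scratch directory.
# Set DRY_RUN=1 to print the commands instead of running them.
shared_from_archive() {
    archive=$1
    output=$2
    workdir=$(mktemp -d) || return 1
    run() { if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi; }
    (
        cd "$workdir" || exit 1
        # Extract every object file from the archive.
        run ar x "$archive" || exit 1
        # Link with the Fortran driver so the Fortran runtime is pulled in.
        run "${FC:-gfortran}" -shared -o "$output" ./*.o
    )
    status=$?
    rm -rf "$workdir"
    return $status
}
```

One caveat with this approach: if an archive contains two members with the same file name, `ar x` overwrites the first with the second, so the resulting shared library would silently miss symbols.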
Do we want to keep this PR to test the |