Add support for Alpine and other musl-based Linux distributions #331

Open
SeanMsi opened this issue Dec 7, 2021 · 13 comments

@SeanMsi

SeanMsi commented Dec 7, 2021

Hi,

We've recently discovered an issue while using ClearScript version 7.2.0.

When trying to use ClearScript in an Alpine Docker Image, the following error occurs:

Unhandled exception. System.TypeLoadException: Cannot load ClearScript V8 library. Load failure information for ClearScriptV8.linux-x64.so:
/app/runtimes/linux-x64/native/ClearScriptV8.linux-x64.so: Unable to load shared library '/app/runtimes/linux-x64/native/ClearScriptV8.linux-x64.so' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /app/runtimes/linux-x64/native/ClearScriptV8.linux-x64.so)

Adding the glibc compatibility layer to try to solve this causes the following error:

terminate called after throwing an instance of 'std::system_error'
 what():  No error information

Going back to version 7.0.0 with the glibc compatibility layer works in an Alpine image without issues. Maybe this was caused by the following change in 7.1?

Switched to static linking of C/C++ libraries to broaden Linux support

Steps to reproduce

Create a simple C# project using ClearScript 7.2.0:

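using System;
using Microsoft.ClearScript.V8;

class Program
{
	static void Main(string[] args)
	{
		// Creating the engine is what loads the native ClearScriptV8 library.
		using var engine = new V8ScriptEngine();
		engine.AddHostType("Console", typeof(Console));
		engine.Evaluate("Console.WriteLine('Hello from JS')");
	}
}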

Build/Publish project (dotnet build && dotnet publish)

Build Docker image with:

  • Alpine base (such as mcr.microsoft.com/dotnet/aspnet:5.0-alpine)
  • Alpine's glibc compatibility layer (RUN apk add gcompat)
  • Copy the dotnet publish output and set the entrypoint to the simple app

Run the Docker image; the output should be the error shown above.

@SeanMsi SeanMsi changed the title Error/Crasg when running in an Alpine Docker Image Dec 7, 2021
@ClearScriptLib ClearScriptLib self-assigned this Dec 8, 2021
@ClearScriptLib
Collaborator

Hi @SeanMsi,

We've reproduced the issue with ClearScript 7.2 on Alpine in WSL. For us, ClearScript 7.0 with gcompat doesn't work either, failing quietly as soon as we try to instantiate V8ScriptEngine. There's no error message, no core dump, nothing.

Anyway, it looks like Alpine requires special .NET SDK and runtime builds for some reason, and ClearScript could be in the same boat. Unfortunately, as it's a side project, supporting the three major desktop platforms – including the eight platform-specific V8 packages – is already stretching the limits of the resources available.

Sorry!

@ClearScriptLib ClearScriptLib changed the title Error/Crash when running in an Alpine Docker Image Dec 8, 2021
@ClearScriptLib ClearScriptLib changed the title Add support for Alpine and other musl-based Linux distributions. Dec 8, 2021
@William-Froelich

I built the latest mainline in a Linux container and debugged in Alpine. I was able to reproduce the error:

terminate called after throwing an instance of 'std::system_error'
  what():  No error information

There's an exception coming from a missing dependency linkage.

Exception has occurred: CLR/System.DllNotFoundException
Exception thrown: 'System.DllNotFoundException' in System.Private.CoreLib.dll: 'Unable to load shared library '/workspaces/ClearScript/MuslTest/bin/Debug/net6.0/runtimes/linux-x64/native/ClearScriptV8.linux-x64.so' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: Error loading shared library /workspaces/ClearScript/MuslTest/bin/Debug/net6.0/runtimes/linux-x64/native/ClearScriptV8.linux-x64.so: No such file or directory'
   at System.Runtime.InteropServices.NativeLibrary.LoadFromPath(String libraryName, Boolean throwOnError)
   at System.Runtime.InteropServices.NativeLibrary.Load(String libraryPath)
   at Microsoft.ClearScript.V8.V8Proxy.LoadLibrary(String path) in /workspaces/ClearScript/ClearScript/V8/V8Proxy.NetCore.c

I'm going to keep looking, but does this give any more insight?

@William-Froelich

William-Froelich commented Dec 18, 2021

The exception in my previous message was my fault. I wasn't copying the lib to the expected subpath.

Update:
After fixing the above issue, I debugged it through to gthr-default.h. __gthread_active_p() returns false when checking to see if __thread_active_ptr has been set. After this, flow control proceeds to the end of std::call_once(). SIGABRT is thrown shortly after.

That makes it seem to be related to threading not running, not being supported, or not being detected at runtime. I'm not sure what the fix is exactly, but I did see on a quick glance through GitHub issues elsewhere that there's been some issue with threading when compiled on GCC and run on musl. I don't know if these issues are related or not.
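
For anyone trying to reproduce this outside of V8, the failing path can be exercised with a few lines of C++. The sketch below is not ClearScript code, and probe_call_once is a hypothetical helper name. It calls std::call_once and reports whether the callback ran or the call threw std::system_error. In libstdc++, __gthread_once appears to return an error code when __gthread_active_p() is false, and std::call_once turns a non-zero code into std::system_error, which would be consistent with the abort described above.

```cpp
#include <mutex>
#include <string>
#include <system_error>

// Hypothetical diagnostic helper, not part of ClearScript or V8.
// Returns "ok" if std::call_once ran the callback, or a string starting
// with "system_error:" if the C++ runtime reported a threading failure.
std::string probe_call_once() {
    std::once_flag flag;  // fresh flag, so the callback is expected to run
    bool ran = false;
    try {
        std::call_once(flag, [&ran] { ran = true; });
    } catch (const std::system_error& e) {
        // Keep the message so it can be compared with the crash output.
        return std::string("system_error: ") + e.what();
    }
    return ran ? "ok" : "callback did not run";
}
```

Building this into a small shared library with the same compiler and linker flags as ClearScriptV8, then calling probe_call_once() from a test program on Alpine, might show whether the problem is specific to V8/ICU or affects any std::call_once user in a library built that way.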

Exception displayed by GDB

Exception has occurred.
Unknown stopping event

Stack Trace from V8Environment_InitializeICU

ClearScriptV8.linux-x64.so!__gthread_active_p()() (\usr\include\c++\10.3.1\x86_64-alpine-linux-musl\bits\gthr-default.h:252)
ClearScriptV8.linux-x64.so!__gthread_once(int*, void (*)())() (\usr\include\c++\10.3.1\x86_64-alpine-linux-musl\bits\gthr-default.h:699)
ClearScriptV8.linux-x64.so!void std::call_once<void (&)()>(std::once_flag&, void (&)())() (\usr\include\c++\10.3.1\mutex:729)
ClearScriptV8.linux-x64.so!icu_69::UMutex::getMutex()() (\workspaces\ClearScript\V8\build\v8\third_party\icu\source\common\umutex.cpp:83)
ClearScriptV8.linux-x64.so!icu_69::UMutex::lock()() (\workspaces\ClearScript\V8\build\v8\third_party\icu\source\common\umutex.h:235)
ClearScriptV8.linux-x64.so!umtx_lock_69() (\workspaces\ClearScript\V8\build\v8\third_party\icu\source\common\umutex.cpp:116)
ClearScriptV8.linux-x64.so!setCommonICUData(UDataMemory*, signed char, UErrorCode*)() (\workspaces\ClearScript\V8\build\v8\third_party\icu\source\common\udata.cpp:187)
ClearScriptV8.linux-x64.so!udata_setCommonData_69() (\workspaces\ClearScript\V8\build\v8\third_party\icu\source\common\udata.cpp:909)
ClearScriptV8.linux-x64.so!v8::internal::InitializeICU(char const*)() (\workspaces\ClearScript\V8\build\v8\src\init\icu_util.cc:93)
ClearScriptV8.linux-x64.so!V8Environment_InitializeICU(const StdChar * pDataPath) (\workspaces\ClearScript\ClearScriptV8\V8SplitProxyNative.cpp:122)
[Unknown/Just-In-Time compiled code] (Unknown Source:0)

SIGABRT Stack Trace

ld-musl-x86_64.so.1!setjmp (Unknown Source:0)
ld-musl-x86_64.so.1!raise (Unknown Source:0)
ld-musl-x86_64.so.1![Unknown/Just-In-Time compiled code] (Unknown Source:0)

Debug Console:

Thread 1 "dotnet" received signal SIGABRT, Aborted.
0x00007ff9da4353f2 in setjmp () from /lib/ld-musl-x86_64.so.1

@ClearScriptLib
Collaborator

Hi @William-Froelich,

Thanks for looking into this.

That makes it seem to be related to threading not running, not being supported, or not being detected at runtime.

Hmm, std::call_once is a standard C++ API, so std::system_error probably indicates a serious library mismatch of some sort. Note that ClearScript explicitly specifies pthread linkage here.

I did see on a quick glance through GitHub issues elsewhere that there's been some issue with threading when compiled on GCC and run on musl.

ClearScript uses Clang to build its native Linux libraries. A quick search suggests that Clang may support a special target for musl.

Please keep us posted on any additional insights or discoveries.

Thank you!

@William-Froelich

I've tried recompiling against musl but I've hit a few issues I'm hoping someone can advise on.

  1. The libv8 download step fails and I'm looking into what might be the cause. Is this something anyone else has seen before? The Docker container I'm running in has Python 2.7, so I don't think it's specifically related to the Python version.

I'm going to keep looking into the V8 project to see if I can figure something out.

vscode ➜ /workspaces/ClearScript (musl-debug-experiment ✗) $ make -f ./Unix/Makefile DEBUG=1
make -f /workspaces/ClearScript/Unix/ClearScriptV8/Makefile
make[1]: Entering directory '/workspaces/ClearScript'
cd /workspaces/ClearScript/Unix; ./V8Update.sh -n -y x64 Debug
Build: x64 Debug
*** BUILD DIRECTORY NOT FOUND; DOWNLOAD REQUIRED ***
V8 revision: Tested (9.6.180.14)
Creating build directory ...
Downloading Depot Tools ...
Downloading V8 and dependencies ...
WARNING: Your metrics.cfg file was invalid or nonexistent. A new one will be created.
Error: Command 'vpython third_party/depot_tools/update_depot_tools_toggle.py --disable' returned non-zero exit status 1 in /workspaces/ClearScript/V8/build/v8
[E2022-01-05T23:06:54.723430Z 5211 0 annotate.go:273] goroutine 1:
#0 go.chromium.org/luci/vpython/venv/config.go:309 - venv.(*Config).resolvePythonInterpreter()
  reason: none of [/workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python] matched specification 2.7.0

#1 go.chromium.org/luci/vpython/venv/config.go:153 - venv.(*Config).resolveRuntime()
  reason: failed to resolve system Python interpreter

#2 go.chromium.org/luci/vpython/venv/venv.go:143 - venv.With()
  reason: failed to resolve python runtime

#3 go.chromium.org/luci/vpython/run.go:60 - vpython.Run()
#4 go.chromium.org/luci/vpython/application/application.go:327 - application.(*application).mainImpl()
#5 go.chromium.org/luci/vpython/application/application.go:416 - application.(*Config).Main.func1()
#6 go.chromium.org/luci/vpython/application/support.go:46 - application.run()
#7 go.chromium.org/luci/vpython/application/application.go:415 - application.(*Config).Main()
#8 vpython/main.go:112 - main.mainImpl()
#9 vpython/main.go:118 - main.main()
#10 runtime/proc.go:225 - runtime.main()
#11 runtime/asm_amd64.s:1371 - runtime.goexit()

*** THE PREVIOUS STEP FAILED ***
make[1]: *** [/workspaces/ClearScript/Unix/ClearScriptV8/Makefile:140: /workspaces/ClearScript/V8/build/v8/out/x64/Debug/obj/libv8_monolith.a] Error 1
make[1]: Leaving directory '/workspaces/ClearScript'
make: *** [Unix/Makefile:24: all] Error 2
  2. V8SplitProxyNative.tt seems to assume just arch and os are needed to determine the native library to load. This probably would need to get changed, though I'm not clear what could be done to determine if musl is in use instead of glibc. This class is used to generate V8SplitProxyNativeGenerated.cs right?

  3. Packaging - I assume this would have to be a brand new nuget package and not just added to Microsoft.ClearScript.V8.Native.Linux-* packages?

@ClearScriptLib
Collaborator

Hi @William-Froelich,

Thanks for investigating this!

none of [/workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python] matched specification 2.7.0

Not sure what's going on there. On Ubuntu LTS, that path is Python 2.7.18. What happens when you run it in your container?

This class is used to generate V8SplitProxyNativeGenerated.cs right?

Yes. It'd probably be best to treat this as a new architecture – perhaps something like "x64_musl". For the native call routing to work as it does today, we'd need to be able to detect this architecture at runtime. Currently we use RuntimePlatform.OSArchitecture, but we'd need some other technique to differentiate between musl and glibc.
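
One possible way to tell glibc from musl at run time is sketched below in C++, since the managed-side design is still open. It relies on glibc exporting a function named gnu_get_libc_version, which musl is assumed not to provide. detect_libc_flavor is a hypothetical helper name, not an existing ClearScript API, and a compatibility shim that exports the same symbol could fool the check.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper, not an existing ClearScript API.
// Returns "glibc <version>" when the process exports gnu_get_libc_version,
// "non-glibc" when it does not (which would include musl-based systems),
// and "unknown" if the process handle cannot be obtained.
std::string detect_libc_flavor() {
    // A null path yields a handle for the main program and its dependencies.
    void* self = dlopen(nullptr, RTLD_NOW);
    if (self == nullptr) {
        return "unknown";
    }
    std::string result = "non-glibc";
    void* sym = dlsym(self, "gnu_get_libc_version");
    if (sym != nullptr) {
        using version_fn = const char* (*)();
        auto get_version = reinterpret_cast<version_fn>(sym);
        result = std::string("glibc ") + get_version();
    }
    dlclose(self);
    return result;
}
```

Older glibc versions need -ldl when linking this. In managed code the equivalent would presumably be a P/Invoke probe for the same symbol, with the result used to select a library name such as the "x64_musl" variant suggested above.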

I assume this would have to be a brand new nuget package and not just added to Microsoft.ClearScript.V8.Native.Linux-* packages?

We might be able to make it work either way, but a separate package would probably be best.

Thanks again, and please keep us posted!

@William-Froelich

Not sure what's going on there. On Ubuntu LTS, that path is Python 2.7.18. What happens when you run it in your container?

In the Alpine container the path is what I would expect and the version matches Ubuntu:

vscode ➜ /workspaces/ClearScript/V8/build (musl-debug-experiment ✗) $ which python
/usr/bin/python
vscode ➜ /workspaces/ClearScript/V8/build (musl-debug-experiment ✗) $ ls -al /usr/bin/python
lrwxrwxrwx 1 root root 7 Jan  5 22:43 /usr/bin/python -> python2
vscode ➜ /workspaces/ClearScript/V8/build (musl-debug-experiment ✗) $ python --version
Python 2.7.18

Running the Python binary from the depot_tools directory suggests that the minimal libc on Alpine isn't sufficient, judging by the "symbol not found" errors:

vscode ➜ /workspaces/ClearScript/V8/build (musl-debug-experiment ✗) $ /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_pointer: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: clnt_perror: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: clnt_pcreateerror: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_bool: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_opaque: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_u_int: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_bytes: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: clnt_spcreateerror: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_char: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_enum: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_vector: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_free: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: xdr_string: symbol not found
Error relocating /workspaces/ClearScript/V8/build/depot_tools/bootstrap-2@3.8.10.chromium.23_bin/python/bin/python2.7: clnt_create: symbol not found

I guess I can try to build with musl on Ubuntu, copy the result to Alpine, and see if that works.

@ClearScriptLib
Collaborator

Yeah, the V8 build process is finicky about its platform. For example, all 32-bit versions must be cross-compiled. Apparently, it's now using its own glibc-based Python.

@William-Froelich

William-Froelich commented Jan 7, 2022

I was trying out the musl compilation without adding the higher-level changes (just replacing the existing ClearScriptV8.linux-x64.so with the musl version). I built on Ubuntu and then swapped to Alpine Linux to test.

You can view the diff here: William-Froelich#1

I verified the library builds targeting musl instead of glibc with ldd:

vscode ➜ /workspaces/ClearScript/MuslTest (musl-debug-experiment ✗) $ ldd ./bin/Debug/net6.0/ClearScriptV8.linux-x64.so 
        /lib/ld-musl-x86_64.so.1 (0x7f5ab030b000)
        libm.so.6 => /lib/ld-musl-x86_64.so.1 (0x7f5ab030b000)
        libpthread.so.0 => /lib/ld-musl-x86_64.so.1 (0x7f5ab030b000)
        libc.so => /lib/ld-musl-x86_64.so.1 (0x7f5ab030b000)

Unfortunately, the error output is exactly the same and the app crashes right after executing std::call_once(). Am I missing something here? Did I build it right or is there still something else I need to do to make it happy about stdlib?

I realize that my branch breaks the glibc build by adding 'alpine' to the target in the Makefile, so if I do get this working I will be fixing that!

Steps to reproduce:

  1. Open the current folder in the provided Ubuntu dev container (you will have to swap what's in .devcontainer/ with .devcontainer/ubuntu)
  2. From the VS Code terminal, run make -f Unix/Makefile
  3. Swap back to Alpine (replace .devcontainer/ with .devcontainer/alpine)
  4. Run dotnet run --project ./MuslTest/MuslTest.csproj

@ClearScriptLib
Collaborator

Hi @William-Froelich,

Quick question about your diff: You're defining CXXALPINEFLAGS, but where is that symbol used?

Thanks!

@William-Froelich

William-Froelich commented Jan 10, 2022

It's not being used. I had originally added it because I thought it would be necessary, since it was in most of the examples I found. However, as far as I can tell, the --sysroot flag is only needed when you have musl in a non-standard install location such as when you download musl-gcc and other tools and compile them from source.

@Nikolay-Ch

Hi,
Is there any update on this issue?
I have an Alpine Docker image and get this error:

can't run Unable to load shared library '/app/runtimes/linux-x64/native/ClearScriptV8.linux-x64.so

@ClearScriptLib
Collaborator

Hi @Nikolay-Ch,

Unfortunately, we have no update. Support for musl-based Linux platforms is still something we'd like to add in the future.

Good luck!

4 participants