Upgrade 9.9.2963 -> 9.10.4067 causes server silent reboot #4224

Open
apavelm opened this issue May 9, 2024 · 17 comments

apavelm commented May 9, 2024

Version: 9.9.2963 -> 9.10.4067
Language: C#, .NET 8, ASP.NET Core (latest stable)

CP-SAT
Windows (on Azure App Services)

After upgrading from 9.9.* to 9.10.*, our ASP.NET Core web application restarts without any error message or exception. We upgraded the library by accident, assuming that a minor version upgrade would not affect anything. We were wrong. Downgrading makes it work again.

Collaborator

lperron commented May 9, 2024

Can you send me the model that triggers the error? Windows silently crashes on floating-point errors, for instance.

Author

apavelm commented May 9, 2024

The problem is that I'm not certain where it crashes. It works on localhost but crashes on Azure App Service (where the previous version worked fine). But I'm confident it happens here:

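// Excerpt from the application: inputData, numSpots and gapSize are defined elsewhere.
// Requires: using System.Collections.Generic; using System.Linq; using Google.OrTools.Sat;
CpModel model = new CpModel();
var allBookings = inputData.Items.OrderBy(x => x.Start).ToArray();
var numBookings = allBookings.Length;
var allTasks = new Dictionary<int, IntVar>();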
foreach (var booking in allBookings)
{
    allTasks[booking.Id] = booking.IsFixed ?
        model.NewConstant(booking.Spot.Value) :
        model.NewIntVar(0, numSpots - 1, $"{booking.Id}_spot");

    if (booking.Spot.HasValue && !booking.IsFixed)
    {
        model.AddHint(allTasks[booking.Id], booking.Spot.Value);
    }
}

// Compare every pair of bookings once (j > i).
for (int i = 0; i < numBookings; i++)
{
    for (int j = i + 1; j < numBookings; j++)
    {
        var booking = allBookings[i];
        var otherBooking = allBookings[j];
        // Gap between the end of the earlier booking and the start of the later one.
        var distance = otherBooking.Start - booking.Start - booking.Duration;

        var bookingLeft = allTasks[booking.Id];
        var bookingRight = allTasks[otherBooking.Id];

        var reqDistance = booking.UseGaps && otherBooking.UseGaps ? gapSize : 0;

        if (distance < reqDistance)
        {
            // The literal is always false here, so its negation is always true
            // and the "different spot" constraint is always enforced.
            ILiteral couldBeInChainAtTheSameSpot = model.FalseLiteral();
            model.Add(bookingLeft != bookingRight).OnlyEnforceIf(couldBeInChainAtTheSameSpot.Not());
        }
    }
}

CpSolver solver = new CpSolver
{
    StringParameters = "linearization_level:1 num_workers:4"
};

CpSolverStatus status = solver.Solve(model);

allBookings may contain only one record and it still crashes on 9.10, so the problem is unlikely to be in the model.

Collaborator

lperron commented May 9, 2024

Can you check that protobuf was correctly updated?

Author

apavelm commented May 9, 2024

No doubt: 3.26.1.
I have the same versions on localhost and in Azure App Service. On localhost there are no problems; in the cloud it crashes silently.

Author

apavelm commented May 10, 2024

Microsoft Azure Support shared the stack trace:


Your app crashed because of System.ExecutionEngineException and aborted the requests it was processing when the overflow occurred. As a result, your app’s users may have experienced HTTP 502 errors.

This call stack caused the exception:
InlinedCallFrame
InlinedCallFrame
ILStubClass.IL_STUB_PInvoke
Google.OrTools.Sat.SolveWrapper.Solve
Google.OrTools.Sat.CpSolver.Solve
<Next comes the app service's Solve method from the application>

....

Collaborator

lperron commented May 10, 2024

Still, it works locally. So the issue is a configuration issue.

Author

apavelm commented May 10, 2024

The configuration is as described above, and it still works with previous versions down to 9.6 (the version we started with).
Everything runs on the Windows x64 platform. I'm not sure which Windows version the Azure VM (App Service) uses; I'm guessing at least Windows Server 2019, or even 2022.

Author

apavelm commented May 10, 2024

I probably know why it works on localhost: there I always run it under the debugger, and as far as I remember InlinedCallFrame never appears in debug mode.

Collaborator

lperron commented May 10, 2024

I cannot do anything until you send me something I can reproduce.

Author

apavelm commented May 10, 2024

Sorry, I don't have anything else. I have already sent everything I have. The fact that it happens only in the cloud makes the task harder.
What can be done is to compare the sources of CpModel and CpSolver between the versions mentioned above. Maybe something new or suspicious can be found in the diff, because in the same environment all previous versions from 9.6 to 9.9 work fine.

Collaborator

lperron commented May 10, 2024

See the other issues I just closed; the cause there was a missing Visual Studio update.

Author

apavelm commented May 10, 2024

We are using Azure Pipelines to build the artifact.
The Azure agent image is named "windows-latest"; according to the documentation, it contains windows-2022 and Visual Studio 2022 (version 17.9.34728.123). On localhost I have 17.8... (maybe this is the reason; I'll try to update).
The full list of installed software on the agent is here.

Collaborator

lperron commented May 10, 2024 via email

@leoduret

> We are using Azure Pipelines to build the artifact. Azure Agent named "windows-latest", according to the documentation it contains windows-2022 and visual studio 2022 (version: 17.9.34728.123). On localhost I have 17.8... (maybe this is the reason. I'll try to update) The full list of installed software on the agent is here.

I have the exact same issue with Python ortools==9.10 CP-SAT on an Azure Pipelines runner with windows-latest (it works on ubuntu-latest). I can't reproduce it locally either, with Visual Studio 16. The fix here is also to downgrade to 9.9.
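
For the Python case, one way to answer the earlier question about whether protobuf was updated correctly is to print the installed package versions on the failing runner and on a working one, then compare the two outputs. The following is a minimal sketch that uses only the standard library; the helper names `installed_version` and `version_report` are illustrative and not part of OR-Tools.

```python
import platform
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(packages=("ortools", "protobuf")):
    """Map each package name to its installed version (None when not installed)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print the platform first so reports from different runners can be told apart.
    print(platform.platform())
    for name, version in version_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this on both windows-latest and ubuntu-latest would show whether the two runners actually resolve the same ortools and protobuf versions.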

Mizux self-assigned this May 11, 2024
Mizux added Bug Lang: .NET .Net wrapper issue labels May 11, 2024
Mizux added this to the v10.0 milestone May 11, 2024
Mizux added this to To do in ToDo via automation May 11, 2024
Author

apavelm commented May 14, 2024

I ran many tests. Please note:

I built for win-x64 in an Azure pipeline and deployed to an Azure Function and an Azure App Service: same result, AccessViolationException.
Then I built the same version (9.10) with the same toolset and deployed to the Linux versions of Azure Functions and Azure App Service: both work.

Apparently, the issue is in the win-x64 platform binaries.
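
The Windows/Linux comparison can also be scripted for the Python packages mentioned earlier in the thread. The sketch below builds a one-variable CP-SAT model and prints a line before and after the solve, so that a silent native crash would show up as a missing final line. It is an assumption that such a trivial model triggers the crash; the thread only reports that a model with a single booking does. The function names are illustrative.

```python
import platform
import sys


def solve_trivial_model():
    """Build and solve a one-variable CP-SAT model; return the status name."""
    # Imported lazily so an import failure is reported separately from a solver crash.
    from ortools.sat.python import cp_model

    model = cp_model.CpModel()
    x = model.NewIntVar(0, 3, "x")
    model.Add(x >= 1)

    solver = cp_model.CpSolver()
    solver.parameters.num_workers = 4  # same worker count as in the report
    status = solver.Solve(model)
    return solver.StatusName(status)


def main():
    # flush=True so this line is written even if the process dies inside Solve.
    print(f"python {sys.version.split()[0]} on {platform.platform()}", flush=True)
    try:
        status = solve_trivial_model()
    except ImportError as exc:
        print(f"ortools is not importable: {exc}")
        return 1
    print(f"solve returned: {status}", flush=True)
    return 0


if __name__ == "__main__":
    main()
```

If the first line is printed but the second never appears and the process exits, the crash happened inside the native solve call rather than in the Python code.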

Collaborator

Mizux commented May 23, 2024

My 2 cents:
I've used VS 2022 Preview (up to date?) to build the binaries, since a few months ago VS 2022 was crashing in macro/template parsing (IIRC).

If VS 2022 Preview ships with an "advanced" redistributable VS runtime, it might explain why the Azure base image can't load them...

I'll try to see if I can use a regular VS 2022 Community install to build (i.e. remove the Preview from my Windows VM).

Mizux changed the title Upgrade 9.9.2963 -> 9.104067 causes server silent reboot May 30, 2024