Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cassandra Clustering implementation #8925

Merged
merged 13 commits into from
Jul 8, 2024

Conversation

rkargMsft
Copy link
Contributor

@rkargMsft rkargMsft commented Mar 25, 2024

Built off of https://github.com/OrleansContrib/OrleansCassandraUtils with the following major changes:

  • Target Orleans 8
  • Better multi-datacenter support
  • Allow multiple serviceId and clusterId to be stored in the same keyspace
  • Integration tests via Testcontainers
Microsoft Reviewers: Open in CodeFlow
@rkargMsft
Copy link
Contributor Author

@dotnet-policy-service agree company="Microsoft"

@rkargMsft
Copy link
Contributor Author

@ReubenBond Can this PR get flagged to run the integration tests? Specifically to validate that the new Cassandra tests:

"Category=${{ "Cassandra" }}&(Category=BVT|Category=SlowBVT|Category=Functional)"

@ReubenBond
Copy link
Member

@rkargMsft done

@rkargMsft
Copy link
Contributor Author

@ReubenBond Sorry, looks like this needs approval for each commit to run the tests again

@rkargMsft rkargMsft marked this pull request as ready for review April 10, 2024 16:57
@rkargMsft
Copy link
Contributor Author

@ReubenBond
One open question on a unit test pulled from MembershipTableTestBase that doesn't pass:

// Cassandra doesn't have the capability to prevent a duplicate insert in the way this test requires
/*[Fact]
public async Task MembershipTable_ReadRow_Insert_Read()
{
var (membershipTable, gatewayProvider) = await CreateNewMembershipTableAsync("Phalanx", "blu");
MembershipTableData data = await membershipTable.ReadAll();
_testOutputHelper.WriteLine("Membership.ReadAll returned TableVersion={0} Data={1}", data.Version, data);
Assert.Empty(data.Members);
TableVersion newTableVersion = data.Version.Next();
MembershipEntry newEntry = CreateMembershipEntryForTest();
bool ok = await membershipTable.InsertRow(newEntry, newTableVersion);
Assert.True(ok, "InsertRow failed");
ok = await membershipTable.InsertRow(newEntry, newTableVersion);
Assert.False(ok, "InsertRow should have failed - same entry, old table version");
ok = await membershipTable.InsertRow(CreateMembershipEntryForTest(), newTableVersion);
Assert.False(ok, "InsertRow should have failed - new entry, old table version");
data = await membershipTable.ReadAll();
Assert.Equal(1, data.Version.Version);
TableVersion nextTableVersion = data.Version.Next();
ok = await membershipTable.InsertRow(newEntry, nextTableVersion);
Assert.False(ok, "InsertRow should have failed - duplicate entry");
data = await membershipTable.ReadAll();
Assert.Single(data.Members);
data = await membershipTable.ReadRow(newEntry.SiloAddress);
Assert.Equal(newTableVersion.Version, data.Version.Version);
_testOutputHelper.WriteLine("Membership.ReadAll returned TableVersion={0} Data={1}", data.Version, data);
Assert.Single(data.Members);
Assert.NotNull(data.Version.VersionEtag);
Assert.NotEqual(newTableVersion.VersionEtag, data.Version.VersionEtag);
Assert.Equal(newTableVersion.Version, data.Version.Version);
var membershipEntry = data.Members[0].Item1;
string eTag = data.Members[0].Item2;
_testOutputHelper.WriteLine("Membership.ReadRow returned MembershipEntry ETag={0} Entry={1}", eTag, membershipEntry);
Assert.NotNull(eTag);
Assert.NotNull(membershipEntry);
}*/

This test expects to reject the insert of a membership entry if it already exists for the specific Silo. Cassandra doesn't have a way to support this: doing both a Light Weight Transaction (LWT) for the table version check and an Exists check for the entry table isn't something that it supports.

That said, it seems like it would require an incorrectly functioning Silo to even get in this situation. It would need to read the current membership table, see that an entry exists for this Silo, decide to insert a new entry anyway, increment to the correct New Table version, then insert the entry. No other Silo would be adding an entry that would conflict, so I think this would be something that wouldn't ever happen.

@ReubenBond
Copy link
Member

@rwkarg that is very odd. Are you saying that INSERT ... IF NOT EXISTS is not supported under an LWT? They use this in their docs example: https://docs.datastax.com/en/cql-oss/3.3/cql/cql_using/useInsertLWT.html. I may be misunderstanding

@rkargMsft
Copy link
Contributor Author

rkargMsft commented Apr 10, 2024

Can't do both an IF version = :expected_version; (for the table version check) and an IF NOT EXISTS (for the "insert only" check) in the same query.

@rkargMsft
Copy link
Contributor Author

rkargMsft commented Apr 11, 2024

Can't do both an IF version = :expected_version; (for the table version check) and an IF NOT EXISTS (for the "insert only" check) in the same query.

@ReubenBond Found a better way to explain the situation.
INSERT allows only for IF NOT EXISTS
UPDATE allows only for IF column = :value (or IF EXISTS).
There isn't a way to mix those. Generally that isn't an issue.
INSERT is for adding a new entry so why would you do a conditional check there?
UPDATE is for changing an existing row so why would you make an IF NOT EXISTS check there?

Where this comes up is that the membership table has rows for each Silo entry but there's a static column (basically one cell that all rows share) on those rows for the overall membership table version. This creates the situation where, to meet the expectations of the commented out unit test, we need to both:

  • only write the row if there isn't an entry with the same primary key (ie. only if there isn't an entry for the Silo already)
  • only write the row if the static column for the table version is the expected, last read, value (for the CAS check)

Cassandra doesn't directly support doing both of those in a query. The recommendations I've seen to work around this are to have the application logic do the "does this row exist" check based on querying the data and then make the write as an UPDATE with the CAS check to ensure that the view the application had when checking for row existence is still the current view when making the write.

MembershipTableManager already makes this application side check by:

@ReubenBond
Copy link
Member

ReubenBond commented Apr 11, 2024

Given we have a version check, can the Cassandra implementation fail the insert if it knows that the version row is there already? The version check should be all we need here, given we have lightweight transactions for consistency

The recommendations I've seen to work around this are to have the application logic do the "does this row exist" check based on querying the data and then make the write as an UPDATE with the CAS check to ensure that the view the application had when checking for row existence is still the current view when making the write.

Essentially, this. If MembershipTableManager is already performing the check, then the behavior checked by this test shouldn't be necessary.

@rkargMsft
Copy link
Contributor Author

can the Cassandra implementation fail the insert if it knows that the version row is there already?

Yes, this Cassandra implementation does that version check and will fail the query if the version is mismatched (this is covered by other tests).

That was the only open question I had. Otherwise I believe this is good to go, aside from the two Azure Pipeline check failures.

@rkargMsft
Copy link
Contributor Author

@ReubenBond Are the failing dotnet.orleans Azure Pipelines checks something that I need to address?

It looks like it's running on Windows and there aren't any Cassandra container images for Windows.

@rkargMsft
Copy link
Contributor Author

@ReubenBond @benjaminpetit Can we get CassandraCSharpDriver vetted for inclusion in dotnet-public?

It's Apache 2 licensed: https://www.nuget.org/packages/CassandraCSharpDriver

rkargMsft and others added 2 commits June 25, 2024 13:38
Avoid trying to run on Windows

Co-authored-by: Reuben Bond <203839+ReubenBond@users.noreply.github.com>
@ReubenBond
Copy link
Member

ReubenBond commented Jul 8, 2024

@rkargMsft The one remaining thing: I think we should re-enable the commented-out test and implement this logic within CassandraClusteringTable, even if it's duplicating the pattern which MembershipTableManager already implements (and will be a little bit slower due to added round-trips):
image

@ReubenBond
Copy link
Member

I pushed an update to re-enable that test & add per-row ETags for silo entries

@ReubenBond ReubenBond merged commit fc9937e into dotnet:main Jul 8, 2024
22 checks passed
@rkargMsft rkargMsft deleted the cassandra-clustering branch July 8, 2024 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants