Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ordering in Snowflake's Concatenate aggregate deterministic #10412

Open
radeusgd opened this issue Jul 1, 2024 · 1 comment
Open

Make ordering in Snowflake's Concatenate aggregate deterministic #10412

radeusgd opened this issue Jul 1, 2024 · 1 comment
Assignees
Labels
-libs Libraries: New libraries to be implemented l-db-connector Libraries: database connectors

Comments

@radeusgd
Copy link
Member

radeusgd commented Jul 1, 2024

The Snowflake Dialect (#9486) supports the Concatenate aggregate, but it is currently not deterministic.

This is expected, because we only rely on an ORDER BY clause for a sub-query, but the documentation says:

If you do not specify WITHIN GROUP (ORDER BY), the order of elements within each list is unpredictable. (An ORDER BY clause outside the WITHIN GROUP clause applies to the order of the output rows, not to the order of the list elements within a row.)

To make the ordering predictable, we should inherit any ORDER BY clauses from the inner query's context into the aggregate itself. This solution will likely be similar to #10321.

@radeusgd radeusgd added -libs Libraries: New libraries to be implemented l-db-connector Libraries: database connectors labels Jul 1, 2024
@radeusgd
Copy link
Member Author

radeusgd commented Jul 1, 2024

When this task is implemented, the following patch should be reverted:

Subject: [PATCH] workaround for https://github.com/enso-org/enso/issues/10412
---
Index: test/Snowflake_Tests/src/Snowflake_Spec.enso
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/test/Snowflake_Tests/src/Snowflake_Spec.enso b/test/Snowflake_Tests/src/Snowflake_Spec.enso
--- a/test/Snowflake_Tests/src/Snowflake_Spec.enso	(revision dfeeaa207ecceedf129f8fb93f05a5ac84d6a859)
+++ b/test/Snowflake_Tests/src/Snowflake_Spec.enso	(revision 0d192541be6a8ef0eb0659d226aee1b7dddbd3cf)
@@ -568,7 +568,7 @@
     Common_Spec.add_specs suite_builder prefix create_connection_fn
 
     common_selection = Common_Table_Operations.Main.Test_Selection.Config supports_case_sensitive_columns=True order_by_unicode_normalization_by_default=True allows_mixed_type_comparisons=False fixed_length_text_columns=True removes_trailing_whitespace_casting_from_char_to_varchar=True supports_decimal_type=True supported_replace_params=supported_replace_params run_advanced_edge_case_tests_by_default=False
-    aggregate_selection = Common_Table_Operations.Aggregate_Spec.Test_Selection.Config first_last=False first_last_row_order=False aggregation_problems=False
+    aggregate_selection = Common_Table_Operations.Aggregate_Spec.Test_Selection.Config first_last=False first_last_row_order=False aggregation_problems=False text_concat=False
     agg_in_memory_table = ((Project_Description.new enso_dev.Table_Tests).data / "data.csv") . read
 
     agg_table_fn = _->
Index: test/Table_Tests/src/Common_Table_Operations/Aggregate_Spec.enso
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/test/Table_Tests/src/Common_Table_Operations/Aggregate_Spec.enso b/test/Table_Tests/src/Common_Table_Operations/Aggregate_Spec.enso
--- a/test/Table_Tests/src/Common_Table_Operations/Aggregate_Spec.enso	(revision dfeeaa207ecceedf129f8fb93f05a5ac84d6a859)
+++ b/test/Table_Tests/src/Common_Table_Operations/Aggregate_Spec.enso	(revision 0d192541be6a8ef0eb0659d226aee1b7dddbd3cf)
@@ -908,6 +908,20 @@
                 materialized.columns.at 1 . name . should_equal "Shortest B"
                 materialized.columns.at 1 . to_vector . should_equal ["f"]
 
+
+    # Special case for Snowflake until the https://github.com/enso-org/enso/issues/10412 ticket is resolved.
+    if setup.prefix.contains "Snowflake" then
+        suite_builder.group prefix+"Table.aggregate Concatenate" group_builder->
+            group_builder.specify "should be supported" <|
+                table = table_builder [["X", ["A", "B", "C"]]]
+                result = table.aggregate columns=[(Concatenate "X")]
+                result.row_count . should_equal 1
+                str = result.at 0 . at 0
+                # No assumptions about ordering
+                str . should_contain "A"
+                str . should_contain "B"
+                str . should_contain "C"
+
     suite_builder.group prefix+"Table.aggregate Concatenate" (pending = resolve_pending test_selection.text_concat) group_builder->
         build_sorted_table table_structure =
             # Workaround for https://github.com/enso-org/enso/issues/10321
@@ -1597,7 +1611,7 @@
                     expect_sum_and_unsupported_errors 2 <|
                         table.aggregate columns=[Sum "X", Shortest "Y", Longest "Y"]
 
-            if test_selection.text_concat.not then
+            if test_selection.text_concat.not && (setup.prefix.contains "Snowflake" . not) then
                 group_builder.specify "with Concatenate" <|
                     table = table_builder [["X", [1,2,3]], ["Y", ["a", "bb", "ccc"]]]
                     expect_sum_and_unsupported_errors 1 <|
radeusgd added a commit that referenced this issue Jul 1, 2024
radeusgd added a commit that referenced this issue Jul 2, 2024
radeusgd added a commit that referenced this issue Jul 2, 2024
radeusgd added a commit that referenced this issue Jul 3, 2024
radeusgd added a commit that referenced this issue Jul 3, 2024
radeusgd added a commit that referenced this issue Jul 4, 2024
radeusgd added a commit that referenced this issue Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-libs Libraries: New libraries to be implemented l-db-connector Libraries: database connectors
1 participant