[Bug] wrong colocate join used #36324

Closed
3 tasks done
cambyzju opened this issue Jun 14, 2024 · 0 comments · Fixed by #37361
cambyzju commented Jun 14, 2024

Search before asking

  • I searched the issues and found no similar issues.

Version

All versions; the problem is in the Nereids planner.

What's Wrong?

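When the distribution keys of the tables in a colocate group are not the columns equated by the join's ON condition, colocate join should not be used. The Nereids planner chooses it anyway, which makes the query result incorrect.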
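The rule described above can be sketched as a small check. This is only an illustration of the condition, not the planner's actual code: the function name `can_colocate_join` and the string-based column names are hypothetical, and the sketch assumes that colocate join is safe only when every pair of distribution columns, matched by position, is equated by an equal-join conjunct.

```python
from typing import Iterable, Sequence, Tuple

Column = str  # qualified column name such as "a.k1" (illustrative only)


def can_colocate_join(
    left_dist_cols: Sequence[Column],
    right_dist_cols: Sequence[Column],
    equal_conjuncts: Iterable[Tuple[Column, Column]],
) -> bool:
    """Return True only if every pair of distribution columns, taken
    position by position, is equated by some equal-join conjunct."""
    if not left_dist_cols or len(left_dist_cols) != len(right_dist_cols):
        return False
    # Store conjuncts as unordered pairs so "x = y" also matches "y = x".
    pairs = {frozenset(conjunct) for conjunct in equal_conjuncts}
    return all(
        frozenset((left, right)) in pairs
        for left, right in zip(left_dist_cols, right_dist_cols)
    )


# Query from this issue: ON a.k1 = b.k1 AND a.k2 = b.k2,
# with a distributed by k1 and b distributed by k2.
issue_conjuncts = [("a.k1", "b.k1"), ("a.k2", "b.k2")]
print(can_colocate_join(["a.k1"], ["b.k2"], issue_conjuncts))  # False

# A join that does equate the two distribution columns: ON a.k1 = b.k2
print(can_colocate_join(["a.k1"], ["b.k2"], [("a.k1", "b.k2")]))  # True
```

Under this assumption, the query in this issue fails the check because neither conjunct equates `a.k1` with `b.k2`.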

What You Expected?

The query returns the correct result.

How to Reproduce?

  1. Create table a, using k1 as the distribution key:
CREATE TABLE `a` (
  `k1` INT NOT NULL,
  `k2` INT NOT NULL,
  `v` INT SUM NULL
) ENGINE=OLAP
AGGREGATE KEY(`k1`, `k2`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k1`) BUCKETS 10
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"colocate_with" = "k1_group"
);
  2. Create table b, using k2 as the distribution key:
CREATE TABLE `b` (
  `k1` INT NOT NULL,
  `k2` INT NOT NULL,
  `v` INT SUM NULL
) ENGINE=OLAP
AGGREGATE KEY(`k1`, `k2`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`k2`) BUCKETS 10
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"colocate_with" = "k1_group"
);
  3. Note that tables a and b use the same colocate group, k1_group.
  4. Insert some data into table a and table b:
insert into a values(1,2,3);
insert into b values(1,2,3);
  5. Check the plan. Colocate join is used, although it should not be in this case:
> explain select * from a join b on a.k1=b.k1 and a.k2=b.k2;
+----------------------------------------------------------------------------------+
| Explain String(Nereids Planner)                                                  |
+----------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                  |
|   OUTPUT EXPRS:                                                                  |
|     k1[#12]                                                                      |
|     k2[#13]                                                                      |
|     v[#14]                                                                       |
|     k1[#15]                                                                      |
|     k2[#16]                                                                      |
|     v[#17]                                                                       |
|   PARTITION: UNPARTITIONED                                                       |
|                                                                                  |
|   HAS_COLO_PLAN_NODE: false                                                      |
|                                                                                  |
|   VRESULT SINK                                                                   |
|      MYSQL_PROTOCAL                                                              |
|                                                                                  |
|   3:VEXCHANGE                                                                    |
|      offset: 0                                                                   |
|      distribute expr lists: k1[#12]                                              |
|                                                                                  |
| PLAN FRAGMENT 1                                                                  |
|                                                                                  |
|   PARTITION: HASH_PARTITIONED: k1[#3]                                            |
|                                                                                  |
|   HAS_COLO_PLAN_NODE: true                                                       |
|                                                                                  |
|   STREAM DATA SINK                                                               |
|     EXCHANGE ID: 03                                                              |
|     UNPARTITIONED                                                                |
|                                                                                  |
|   2:VHASH JOIN(136)                                                              |
|   |  join op: INNER JOIN(COLOCATE[])[]                                           |
|   |  equal join conjunct: (k1[#3] = k1[#0])                                      |
|   |  equal join conjunct: (k2[#4] = k2[#1])                                      |
|   |  cardinality=1                                                               |
|   |  vec output tuple id: 3                                                      |
|   |  output tuple id: 3                                                          |
|   |  vIntermediate tuple ids: 2                                                  |
|   |  hash output slot ids: 0 1 2 3 4 5                                           |
|   |  final projections: k1[#6], k2[#7], v[#8], k1[#9], k2[#10], v[#11]           |
|   |  final project output tuple id: 3                                            |
|   |  distribute expr lists: k1[#3]                                               |
|   |  distribute expr lists: k2[#1]                                               |
|   |                                                                              |
|   |----0:VOlapScanNode(133)                                                      |
|   |       TABLE: doris0.b(b), PREAGGREGATION: OFF. Reason: No aggregate on scan. |
|   |       partitions=1/1 (b)                                                     |
|   |       tablets=10/10, tabletList=30133,30135,30137 ...                        |
|   |       cardinality=0, avgRowSize=0.0, numNodes=1                              |
|   |       pushAggOp=NONE                                                         |
|   |                                                                              |
|   1:VOlapScanNode(132)                                                           |
|      TABLE: doris0.a(a), PREAGGREGATION: OFF. Reason: No aggregate on scan.      |
|      partitions=1/1 (a)                                                          |
|      tablets=10/10, tabletList=30082,30084,30086 ...                             |
|      cardinality=0, avgRowSize=0.0, numNodes=1                                   |
|      pushAggOp=NONE                                                              |
|                                                                                  |
|                                                                                  |
| Statistics                                                                       |
|  planed with unknown column statistics                                           |
+----------------------------------------------------------------------------------+
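
To illustrate why a COLOCATE join in the plan above can return wrong results, here is a toy simulation of bucket-local joining. It is a sketch under stated assumptions: `bucket_of` uses a simple modulo instead of the real hash function, and all helper names are hypothetical. Whether the real rows land in different buckets depends on the actual hash; the sketch only shows the mechanism.

```python
NUM_BUCKETS = 10


def bucket_of(key: int) -> int:
    # Toy stand-in for the real hash function (assumption).
    return key % NUM_BUCKETS


def distribute(rows, key_index):
    """Place each row in the bucket chosen by its distribution column."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for row in rows:
        buckets[bucket_of(row[key_index])].append(row)
    return buckets


def join_rows(left, right):
    """Inner join on k1 = k1 AND k2 = k2 (columns 0 and 1)."""
    return [l + r for l in left for r in right if l[:2] == r[:2]]


a_rows = [(1, 2, 3)]  # table a, distributed by k1 (column 0)
b_rows = [(1, 2, 3)]  # table b, distributed by k2 (column 1)

a_buckets = distribute(a_rows, key_index=0)
b_buckets = distribute(b_rows, key_index=1)

# Colocate-style join: bucket i of a only ever meets bucket i of b.
colocate_result = [
    row
    for i in range(NUM_BUCKETS)
    for row in join_rows(a_buckets[i], b_buckets[i])
]

# Reference result: join over all rows, ignoring bucket placement.
expected_result = join_rows(a_rows, b_rows)

print(colocate_result)  # []
print(expected_result)  # [(1, 2, 3, 1, 2, 3)]
```

In this toy model the row in a goes to bucket 1 (by k1 = 1) while the matching row in b goes to bucket 2 (by k2 = 2), so a bucket-local join never compares them and the matching row is lost.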

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
