[Enhancement](tvf) select tvf supports using resource #35139

Merged on May 24, 2024 (3 commits)

@BePPPower commented May 21, 2024
Contributor

Proposed changes

Issue Number: close #xxx

Create an S3/HDFS resource that a TVF can use directly to access the data source.
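
As a rough illustration of the intended workflow, the sketch below builds the two SQL statements a user might issue: one that creates a named S3 resource and one that queries through the `s3` TVF by referring to that resource instead of repeating credentials. The `"resource"` property name, the `s3.*` property keys, and all values (resource name, endpoint, bucket path) are assumptions made for illustration and are not taken from this PR's diff. The helper functions are hypothetical and only format strings.

```python
def render_properties(props: dict[str, str]) -> str:
    """Render a mapping as a comma-separated "key" = "value" property list."""
    return ", ".join(f'"{key}" = "{value}"' for key, value in props.items())


def create_resource_sql(name: str, props: dict[str, str]) -> str:
    """Build a CREATE RESOURCE statement (property keys are assumed)."""
    return f'CREATE RESOURCE "{name}" PROPERTIES ({render_properties(props)});'


def select_from_tvf_sql(tvf: str, props: dict[str, str]) -> str:
    """Build a SELECT over a table-valued function such as s3 or hdfs."""
    return f"SELECT * FROM {tvf}({render_properties(props)});"


# Hypothetical resource definition; every value here is a placeholder.
resource_stmt = create_resource_sql(
    "my_s3_resource",
    {
        "type": "s3",
        "s3.endpoint": "s3.example.com",
        "s3.region": "us-east-1",
        "s3.access_key": "<access key>",
        "s3.secret_key": "<secret key>",
    },
)

# The TVF call names the resource rather than listing credentials again.
query_stmt = select_from_tvf_sql(
    "s3",
    {
        "uri": "s3://bucket/path/data.csv",
        "format": "csv",
        "resource": "my_s3_resource",  # assumed property name
    },
)

print(resource_stmt)
print(query_stmt)
```

The point of the design is that connection details live in one place (the resource), so queries stay short and credentials do not need to appear in every TVF call.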


@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-DS: Total hot run time: 181762 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0073863dbe4d6c813e0a370804952113d52ae9ca, data reload: false

query1	907	382	370	370
query2	6457	2502	2338	2338
query3	6649	221	217	217
query4	23835	21337	21258	21258
query5	4144	408	438	408
query6	260	175	172	172
query7	4592	286	292	286
query8	236	182	187	182
query9	8481	2426	2373	2373
query10	445	246	263	246
query11	14917	14253	14330	14253
query12	131	91	88	88
query13	1639	367	359	359
query14	10078	8518	8420	8420
query15	247	167	169	167
query16	8133	264	250	250
query17	1855	546	538	538
query18	2055	275	265	265
query19	206	144	165	144
query20	92	86	87	86
query21	190	132	125	125
query22	4970	4806	4853	4806
query23	34149	33516	33397	33397
query24	11043	2840	2892	2840
query25	614	395	365	365
query26	1150	156	154	154
query27	2833	315	326	315
query28	7306	2035	2035	2035
query29	855	602	594	594
query30	309	173	178	173
query31	990	771	757	757
query32	99	52	52	52
query33	742	267	242	242
query34	1033	473	472	472
query35	842	676	703	676
query36	1056	945	897	897
query37	138	74	70	70
query38	2883	2787	2757	2757
query39	1632	1555	1570	1555
query40	198	123	124	123
query41	45	41	40	40
query42	103	98	94	94
query43	592	540	551	540
query44	1220	723	739	723
query45	272	254	260	254
query46	1079	694	721	694
query47	1942	1878	1877	1877
query48	367	293	286	286
query49	1087	385	393	385
query50	762	397	385	385
query51	6957	6792	6814	6792
query52	103	95	90	90
query53	350	275	276	275
query54	908	428	418	418
query55	75	74	76	74
query56	250	233	229	229
query57	1242	1115	1136	1115
query58	223	197	195	195
query59	3359	3010	3121	3010
query60	260	229	235	229
query61	91	86	93	86
query62	668	477	466	466
query63	306	280	278	278
query64	9220	2250	1699	1699
query65	3175	3152	3100	3100
query66	1417	350	354	350
query67	15259	14757	15125	14757
query68	4594	513	519	513
query69	465	306	312	306
query70	1136	1082	1046	1046
query71	414	260	264	260
query72	8116	2558	2381	2381
query73	705	314	313	313
query74	6600	6216	6381	6216
query75	3308	2633	2625	2625
query76	2455	995	1005	995
query77	430	268	262	262
query78	10568	9948	10054	9948
query79	2502	507	506	506
query80	1019	427	432	427
query81	519	242	238	238
query82	737	98	96	96
query83	240	169	166	166
query84	237	87	86	86
query85	1612	276	274	274
query86	502	298	322	298
query87	3254	3174	3132	3132
query88	4219	2350	2346	2346
query89	458	377	388	377
query90	1935	193	184	184
query91	127	112	107	107
query92	58	52	52	52
query93	1872	497	486	486
query94	1181	182	186	182
query95	400	300	301	300
query96	586	265	268	265
query97	3183	3003	3004	3003
query98	234	223	217	217
query99	1130	889	925	889
Total cold run time: 284519 ms
Total hot run time: 181762 ms
@doris-robot

ClickBench: Total hot run time: 30.69 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0073863dbe4d6c813e0a370804952113d52ae9ca, data reload: false

query1	0.04	0.03	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.06
query4	1.68	0.07	0.07
query5	0.50	0.49	0.52
query6	1.12	0.72	0.72
query7	0.02	0.01	0.01
query8	0.05	0.05	0.04
query9	0.55	0.48	0.49
query10	0.55	0.54	0.53
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.60	0.59	0.58
query14	0.78	0.77	0.77
query15	0.83	0.80	0.80
query16	0.36	0.36	0.37
query17	0.95	0.96	0.96
query18	0.23	0.24	0.21
query19	1.75	1.67	1.68
query20	0.02	0.01	0.02
query21	15.48	0.71	0.69
query22	3.97	7.30	2.16
query23	18.31	1.36	1.27
query24	1.29	0.48	0.20
query25	0.14	0.09	0.08
query26	0.27	0.17	0.18
query27	0.08	0.07	0.08
query28	13.28	1.03	1.00
query29	12.70	3.30	3.27
query30	0.25	0.06	0.06
query31	2.86	0.39	0.38
query32	3.31	0.48	0.46
query33	2.91	2.90	2.83
query34	17.02	4.51	4.47
query35	4.59	4.48	4.50
query36	0.66	0.46	0.46
query37	0.18	0.17	0.15
query38	0.16	0.15	0.15
query39	0.04	0.03	0.04
query40	0.16	0.14	0.14
query41	0.09	0.04	0.05
query42	0.06	0.05	0.05
query43	0.04	0.03	0.04
Total cold run time: 108.49 s
Total hot run time: 30.69 s
@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41661 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 259b52f8c9823c792bb6340ef0bd5649c5f86aa0, data reload: false

------ Round 1 ----------------------------------
q1	17598	4360	4227	4227
q2	2013	200	191	191
q3	10436	1245	1260	1245
q4	10208	887	718	718
q5	7469	2730	2720	2720
q6	219	131	135	131
q7	1046	607	604	604
q8	9252	2141	2078	2078
q9	10543	7329	7336	7329
q10	9711	3950	3908	3908
q11	460	258	233	233
q12	454	217	215	215
q13	17258	3180	3241	3180
q14	265	237	226	226
q15	518	478	475	475
q16	498	398	381	381
q17	973	643	696	643
q18	8281	7814	7809	7809
q19	2149	1544	1557	1544
q20	626	320	299	299
q21	5312	3233	4223	3233
q22	357	272	290	272
Total cold run time: 115646 ms
Total hot run time: 41661 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4535	4464	4406	4406
q2	394	274	277	274
q3	3178	2958	2957	2957
q4	1928	1605	1615	1605
q5	5453	5525	5496	5496
q6	218	128	128	128
q7	2330	1996	2007	1996
q8	3238	3433	3404	3404
q9	9332	9439	9431	9431
q10	3934	3803	3911	3803
q11	587	524	526	524
q12	771	621	650	621
q13	16230	3138	3211	3138
q14	300	281	269	269
q15	515	474	477	474
q16	510	441	430	430
q17	1755	1507	1446	1446
q18	7865	7693	7402	7402
q19	1958	1535	1603	1535
q20	2032	1805	1782	1782
q21	9194	4972	4890	4890
q22	601	485	494	485
Total cold run time: 76858 ms
Total hot run time: 56496 ms
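
The robot output does not label its columns. The sketch below assumes that each TPC-H row lists the query name, one cold run, two hot runs, and the faster of the two hot runs, all in milliseconds, and that each total is a plain sum over queries. Under that assumption it recomputes the Round 1 totals from the table above; the function name `summarize` is made up for this example.

```python
# (query, cold_ms, hot1_ms, hot2_ms) copied from the Round 1 table above.
ROUND1 = [
    ("q1", 17598, 4360, 4227), ("q2", 2013, 200, 191),
    ("q3", 10436, 1245, 1260), ("q4", 10208, 887, 718),
    ("q5", 7469, 2730, 2720), ("q6", 219, 131, 135),
    ("q7", 1046, 607, 604), ("q8", 9252, 2141, 2078),
    ("q9", 10543, 7329, 7336), ("q10", 9711, 3950, 3908),
    ("q11", 460, 258, 233), ("q12", 454, 217, 215),
    ("q13", 17258, 3180, 3241), ("q14", 265, 237, 226),
    ("q15", 518, 478, 475), ("q16", 498, 398, 381),
    ("q17", 973, 643, 696), ("q18", 8281, 7814, 7809),
    ("q19", 2149, 1544, 1557), ("q20", 626, 320, 299),
    ("q21", 5312, 3233, 4223), ("q22", 357, 272, 290),
]


def summarize(rows):
    """Return (cold_total_ms, hot_total_ms) for a list of benchmark rows.

    The hot total takes the faster of the two hot runs for each query,
    which is the assumed meaning of the table's last column.
    """
    cold_total = sum(cold for _, cold, _, _ in rows)
    hot_total = sum(min(hot1, hot2) for _, _, hot1, hot2 in rows)
    return cold_total, hot_total


cold_total, hot_total = summarize(ROUND1)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

With the Round 1 rows this reproduces the reported 115646 ms cold and 41661 ms hot totals, which supports the assumed reading of the columns.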
@doris-robot

TPC-DS: Total hot run time: 181396 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 259b52f8c9823c792bb6340ef0bd5649c5f86aa0, data reload: false

query1	906	405	369	369
query2	6434	2409	2400	2400
query3	6654	225	213	213
query4	22941	21368	21352	21352
query5	4142	431	420	420
query6	257	182	189	182
query7	4590	296	284	284
query8	242	186	187	186
query9	8662	2368	2338	2338
query10	431	246	240	240
query11	14737	14250	14408	14250
query12	137	92	90	90
query13	1644	377	369	369
query14	9668	8692	6949	6949
query15	254	164	173	164
query16	8037	275	254	254
query17	1803	548	555	548
query18	2031	302	280	280
query19	202	147	152	147
query20	91	87	84	84
query21	192	128	126	126
query22	5134	4921	4894	4894
query23	34089	33497	33586	33497
query24	6706	2975	2840	2840
query25	542	366	363	363
query26	700	160	156	156
query27	1886	319	310	310
query28	3800	2030	2019	2019
query29	863	627	604	604
query30	255	177	174	174
query31	937	761	732	732
query32	89	52	56	52
query33	482	254	242	242
query34	846	478	472	472
query35	763	668	686	668
query36	1052	922	919	919
query37	104	70	70	70
query38	2898	2828	2754	2754
query39	1636	1549	1533	1533
query40	193	125	151	125
query41	45	44	41	41
query42	103	94	94	94
query43	569	552	524	524
query44	1061	729	740	729
query45	269	249	252	249
query46	1058	748	696	696
query47	1943	1894	1878	1878
query48	374	298	285	285
query49	769	391	411	391
query50	768	384	377	377
query51	6866	6746	6866	6746
query52	111	88	92	88
query53	350	280	287	280
query54	540	422	430	422
query55	74	73	75	73
query56	252	215	222	215
query57	1222	1184	1132	1132
query58	218	203	210	203
query59	3394	2977	3193	2977
query60	252	243	232	232
query61	88	105	94	94
query62	601	452	487	452
query63	311	284	285	284
query64	8404	2252	1808	1808
query65	3151	3110	3081	3081
query66	805	353	347	347
query67	15259	15402	15333	15333
query68	4580	531	525	525
query69	480	302	368	302
query70	1109	1161	1105	1105
query71	389	270	281	270
query72	7360	2612	2333	2333
query73	724	317	317	317
query74	6632	6246	6290	6246
query75	3424	2657	2672	2657
query76	2238	1041	1010	1010
query77	388	263	265	263
query78	10694	10120	10381	10120
query79	2382	519	510	510
query80	947	436	427	427
query81	525	240	245	240
query82	733	97	92	92
query83	247	168	176	168
query84	246	91	86	86
query85	1098	314	262	262
query86	471	276	313	276
query87	3298	3162	3122	3122
query88	4077	2318	2324	2318
query89	469	381	381	381
query90	2037	186	184	184
query91	123	96	94	94
query92	62	49	48	48
query93	1462	494	477	477
query94	1137	179	178	178
query95	396	304	302	302
query96	586	271	269	269
query97	3150	3014	3041	3014
query98	244	230	223	223
query99	1207	877	921	877
Total cold run time: 268802 ms
Total hot run time: 181396 ms
@doris-robot

ClickBench: Total hot run time: 30.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 259b52f8c9823c792bb6340ef0bd5649c5f86aa0, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.22	0.05	0.05
query4	1.68	0.07	0.07
query5	0.50	0.48	0.51
query6	1.13	0.73	0.73
query7	0.02	0.01	0.01
query8	0.05	0.04	0.04
query9	0.53	0.50	0.49
query10	0.55	0.54	0.53
query11	0.16	0.12	0.12
query12	0.15	0.12	0.12
query13	0.58	0.58	0.61
query14	0.79	0.79	0.79
query15	0.84	0.82	0.82
query16	0.36	0.36	0.37
query17	1.04	0.96	1.00
query18	0.24	0.25	0.24
query19	1.82	1.71	1.71
query20	0.02	0.01	0.01
query21	15.51	0.69	0.68
query22	5.12	6.48	2.17
query23	18.25	1.32	1.19
query24	1.76	0.22	0.22
query25	0.15	0.08	0.08
query26	0.26	0.17	0.18
query27	0.09	0.07	0.08
query28	13.46	1.03	0.99
query29	13.22	3.24	3.24
query30	0.24	0.06	0.05
query31	2.87	0.40	0.39
query32	3.27	0.48	0.47
query33	2.90	2.88	2.88
query34	17.21	4.48	4.48
query35	4.50	4.52	4.72
query36	0.66	0.46	0.47
query37	0.18	0.16	0.15
query38	0.15	0.15	0.14
query39	0.05	0.03	0.04
query40	0.16	0.13	0.14
query41	0.09	0.05	0.04
query42	0.05	0.05	0.04
query43	0.04	0.03	0.04
Total cold run time: 110.99 s
Total hot run time: 30.8 s
@BePPPower
Contributor Author

run p0

@morningman left a comment
Contributor

We missed the authorization issue

@BePPPower
Contributor Author

run buildall

@BePPPower
Contributor Author

run external performance

@BePPPower
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41616 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f0033f109808e965c5ba19f651770c28a97a8e64, data reload: false

------ Round 1 ----------------------------------
q1	17603	4421	4237	4237
q2	2036	183	186	183
q3	10456	1285	1202	1202
q4	10194	799	784	784
q5	7480	2791	2609	2609
q6	233	134	135	134
q7	974	601	598	598
q8	9211	2111	2070	2070
q9	9344	6696	6654	6654
q10	9682	3839	3934	3839
q11	445	249	248	248
q12	485	216	234	216
q13	18145	3198	3233	3198
q14	247	213	208	208
q15	504	471	465	465
q16	507	409	403	403
q17	959	629	607	607
q18	8524	7949	7826	7826
q19	6835	1594	1559	1559
q20	652	320	322	320
q21	5190	3984	3982	3982
q22	381	292	274	274
Total cold run time: 120087 ms
Total hot run time: 41616 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4452	4389	4377	4377
q2	391	278	277	277
q3	3118	2937	2714	2714
q4	1863	1598	1640	1598
q5	5461	5459	5493	5459
q6	217	121	123	121
q7	2166	1845	1782	1782
q8	3231	3392	3351	3351
q9	8570	8610	8545	8545
q10	3872	3747	3822	3747
q11	581	494	497	494
q12	784	615	621	615
q13	15953	3190	3130	3130
q14	286	265	265	265
q15	547	475	487	475
q16	493	435	433	433
q17	1814	1490	1457	1457
q18	7667	7594	7414	7414
q19	1644	1557	1524	1524
q20	1999	1807	1787	1787
q21	8445	4661	4687	4661
q22	587	501	483	483
Total cold run time: 74141 ms
Total hot run time: 54709 ms
@doris-robot

TPC-DS: Total hot run time: 172260 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f0033f109808e965c5ba19f651770c28a97a8e64, data reload: false

query1	909	380	370	370
query2	6430	2367	2304	2304
query3	6670	204	201	201
query4	19487	17484	17188	17188
query5	4192	429	412	412
query6	236	158	159	158
query7	4595	290	285	285
query8	232	192	182	182
query9	8459	2451	2442	2442
query10	453	291	266	266
query11	10475	10069	10034	10034
query12	132	88	85	85
query13	1664	367	381	367
query14	9286	7497	7372	7372
query15	234	173	169	169
query16	7730	259	254	254
query17	1729	543	512	512
query18	1843	271	262	262
query19	198	154	156	154
query20	94	81	87	81
query21	192	125	130	125
query22	4246	3859	3914	3859
query23	33439	32722	32922	32722
query24	6852	2843	2795	2795
query25	569	366	358	358
query26	709	153	155	153
query27	2012	325	323	323
query28	3681	2081	2086	2081
query29	832	621	586	586
query30	252	170	171	170
query31	947	737	738	737
query32	85	51	53	51
query33	488	285	260	260
query34	841	468	477	468
query35	685	606	595	595
query36	1022	929	893	893
query37	105	73	69	69
query38	2890	2782	2734	2734
query39	816	789	791	789
query40	195	125	129	125
query41	47	43	43	43
query42	103	93	96	93
query43	556	540	535	535
query44	1036	710	732	710
query45	178	165	161	161
query46	1052	720	706	706
query47	1902	1735	1797	1735
query48	365	306	298	298
query49	856	368	377	368
query50	763	385	381	381
query51	6866	6800	6694	6694
query52	101	86	88	86
query53	345	281	279	279
query54	529	435	421	421
query55	70	71	72	71
query56	259	244	244	244
query57	1135	1073	1006	1006
query58	229	204	207	204
query59	3497	3217	3346	3217
query60	270	250	287	250
query61	88	88	86	86
query62	573	457	461	457
query63	308	275	288	275
query64	8408	2189	1784	1784
query65	3125	3074	3115	3074
query66	820	350	323	323
query67	15508	15116	14966	14966
query68	4567	532	530	530
query69	451	271	271	271
query70	1080	1090	1090	1090
query71	371	270	263	263
query72	7143	5694	5604	5604
query73	721	321	319	319
query74	6037	5603	5629	5603
query75	3307	2604	2642	2604
query76	2335	1026	1000	1000
query77	407	282	264	264
query78	10114	9635	9750	9635
query79	2437	524	520	520
query80	1849	507	427	427
query81	539	245	239	239
query82	1297	98	94	94
query83	298	171	171	171
query84	268	84	86	84
query85	1274	284	264	264
query86	458	312	297	297
query87	3276	3123	3222	3123
query88	3987	2419	2400	2400
query89	470	399	381	381
query90	2020	186	184	184
query91	120	97	97	97
query92	56	46	47	46
query93	2578	502	492	492
query94	1269	186	232	186
query95	391	303	299	299
query96	598	268	268	268
query97	3185	2999	3006	2999
query98	238	218	216	216
query99	1157	832	858	832
Total cold run time: 259193 ms
Total hot run time: 172260 ms
@doris-robot

ClickBench: Total hot run time: 30.5 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f0033f109808e965c5ba19f651770c28a97a8e64, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.23	0.06	0.05
query4	1.66	0.08	0.08
query5	0.51	0.49	0.50
query6	1.13	0.72	0.72
query7	0.02	0.01	0.01
query8	0.06	0.05	0.04
query9	0.54	0.49	0.50
query10	0.54	0.56	0.55
query11	0.15	0.11	0.12
query12	0.14	0.12	0.11
query13	0.59	0.59	0.62
query14	0.77	0.77	0.78
query15	0.83	0.82	0.82
query16	0.36	0.37	0.38
query17	1.03	1.03	1.00
query18	0.22	0.23	0.24
query19	1.88	1.72	1.71
query20	0.01	0.01	0.01
query21	15.58	0.67	0.64
query22	4.27	8.06	1.76
query23	18.33	1.39	1.29
query24	1.72	0.29	0.22
query25	0.16	0.09	0.08
query26	0.26	0.17	0.17
query27	0.08	0.08	0.08
query28	13.33	1.01	0.98
query29	13.09	3.35	3.30
query30	0.24	0.06	0.06
query31	2.94	0.39	0.38
query32	3.22	0.46	0.46
query33	2.91	2.93	2.85
query34	17.19	4.44	4.42
query35	4.51	4.57	4.50
query36	0.70	0.49	0.49
query37	0.18	0.15	0.15
query38	0.17	0.15	0.14
query39	0.04	0.03	0.03
query40	0.16	0.15	0.15
query41	0.08	0.04	0.05
query42	0.05	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 110.04 s
Total hot run time: 30.5 s
@morningman left a comment
Contributor

LGTM
But the authorization check is missing; it will be fixed in the next PR.

@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on May 23, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@morningman merged commit 686e48f into apache:master on May 24, 2024
29 of 32 checks passed
yiguolei pushed a commit that referenced this pull request May 24, 2024
Create an S3/HDFS resource that a TVF can use directly to access the data source.
dataroaring pushed a commit that referenced this pull request May 26, 2024
Create an S3/HDFS resource that a TVF can use directly to access the data source.
seawinde pushed a commit to seawinde/doris that referenced this pull request May 27, 2024
Create an S3/HDFS resource that a TVF can use directly to access the data source.
@morningman mentioned this pull request Jun 1, 2024
morningman pushed a commit to apache/doris-website that referenced this pull request Jun 21, 2024