All Questions
Tagged with snowflake-cloud-data-platform pyspark
155
questions
0
votes
3
answers
44
views
How to connect to Snowflake with PySpark on Google Colab?
I am trying to connect to Snowflake with PySpark on Google Colab.
Spark version 3.4
Scala version 2.12.17
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark import ...
0
votes
0
answers
33
views
Setting Snowflake SQL variables using PySpark
I am trying to read a view in Snowflake using spark.read.format("snowflake").options(**options).option("query","Set date='03-02-2018';Select * from View123").load()
This ...
1
vote
1
answer
99
views
Snowflake: Is there any way to generate a PDF report from a Snowflake table?
This may not be a common question, but I have a requirement to generate a PDF report from a Snowflake table.
Is there any way to do that?
0
votes
0
answers
45
views
Docker Image Build Issue. curl: (60) SSL certificate problem: unable to get local issuer certificate
I am trying to build an open-source Docker image that allows configuring PySpark and the Spark Snowflake connector. I have provided the error log and the contents of the Dockerfile below.
Here's the link ...
1
vote
2
answers
302
views
'spark.jars.packages' not working as expected in AWS Glue and Spark
I want to use some JAR files from a Maven repository in my Spark session, so I am creating the session with 'spark.jars.packages', which should download the JARs automatically. This is not working as expected ...
0
votes
1
answer
226
views
How to overwrite a single partition in Snowflake when using the Spark connector
Is there a way for Spark to read a single date partition from a Snowflake table, update it, and then overwrite that single date partition? Concurrent writes should be supported. Currently, Spark has ...
0
votes
1
answer
84
views
Safe way to read STREAMS from Snowflake within Databricks across retries
We are exploring a scenario of reading Snowflake stream data from within Azure Databricks.
Details:
We have a large Snowflake table on which streams have been set up.
We have other sources with which ...
1
vote
1
answer
118
views
Snowflake stored procedure parallelism
I have a Snowflake stored procedure written in JavaScript. From my main procedure, sp1, I read the list of table names. Using an iterator, I pass each table name to sp2 for the data load process.
create ...
0
votes
1
answer
158
views
How to insert JSON into a Snowflake VARIANT column using PySpark
I have JSON data that I am pulling from an API. Below is a sample of that data:
{'Clients' : [{'id' : 123, 'name' : 'client ABC inc'},
{'id' : 456, 'name' : 'client XYZ inc'}]}
I want to insert ...
1
vote
1
answer
119
views
Same query giving different results in Databricks vs Snowflake
The following query gives different results in Snowflake and Databricks.
Snowflake: 766967
Databricks: 749309
SELECT SUM(ORDER_ID) FROM
(SELECT DISTINCT B.VISITOR_ID,TO_TIMESTAMP_NTZ(CREATE_TS) as ...
0
votes
0
answers
63
views
Databricks Read Snowflake Returns Inconsistent Row Counts
I'm attempting to read data from Snowflake into Databricks. I want to read a whole table or view, not push down a query. However, the number of rows returned varies depending on the letter case of the ...
0
votes
0
answers
61
views
Snowpark UDF locking mechanism
In Snowpark version 0.7.0, a new locking mechanism was added to UDFs. From Snowflake's release notes:
Added a lock to a UDF or UDTF when it is called for the first time per thread.
When I ...
0
votes
1
answer
321
views
rlike not working for a Snowpark DataFrame
I am trying to fetch records containing the string 'o' in a DataFrame column, but no records are showing up.
Currently, in column 'Variant_nm' of the DataFrame, I have
Junk_df1=df.filter(col('Variant_Nm')=='...
0
votes
0
answers
73
views
Not able to run COPY INTO commands during a Spark write to Snowflake
GET @spark_connector_load_stage_jtVp1DVrvm/ file:///tmp/dummy_location_spark_connector_tmp/
PUT file:///tmp/dummy_location_spark_connector_tmp/ @spark_connector_load_stage_jtVp1DVrvm
copy into <...
0
votes
0
answers
110
views
I copied the exact same code from the AWS Glue Visual editor and created a new ETL script. Getting a connection timed out error for Snowflake
I have an AWS Glue ETL job that loads data from PostgreSQL to Snowflake. I copied the exact same code from the AWS Glue Visual editor and created a new ETL script. I also set the same VPC and subnet in the script job ...