I have a large set of temperature data from upstream and downstream gauges, and I am trying to find the influence of dam releases on downstream temperatures. To do this, I am comparing the correlation between the tailwater gauge (placed right below the dam) and various downstream sites. Here is the head of my dataset:
> dput(head(TravelTimeAdjustedSaltData))
structure(list(Date = structure(c(1709942400, 1709943300, 1709944200,
1709945100, 1709946000, 1709946900), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), S1 = c(12.824443359375, 12.824443359375, 12.824443359375,
12.824443359375, 12.824443359375, 12.78154296875), S2 = c(12.86734375,
12.86734375, 12.86734375, 12.910244140625, 12.86734375, 12.824443359375
), S3 = c(12.223837890625, 12.223837890625, 12.26673828125, 12.26673828125,
12.223837890625, 12.26673828125), S4 = c(NA, NA, NA, NA, 7.8908984375,
7.847998046875), S5 = c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), S6 = c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), S7 = c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), S8 = c(12.309638671875, 12.309638671875,
12.26673828125, 12.3525390625, 12.3525390625, 12.309638671875
), S9 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), S10 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), S11 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_), GaugeTemp = c(8.2, 8.2, 8.2, 8.2, 8.2, 8.2), GaugeHeight = c(70.83,
70.84, 70.84, 70.85, 70.83, 70.83)), row.names = c(NA, 6L), class = c("tbl_df",
"tbl", "data.frame"))
The data are adjusted for the water's travel time downstream, which is why some columns have NAs in the first few rows. I ran Spearman correlations but found a pattern I did not expect, so I suspect the correlation is not testing what I actually want to find out. Sites further downstream (S10 / S11) had a correlation with the tailwater gauge that was higher than, or about the same as, that of the first site downstream (S4). I have included the tests for both S4 (the closest site downstream from the tailwater) and S11 (the furthest from the tailwater) to show what I mean. This should not be the case, because the influence of the dam releases on downstream temperature should decrease with distance. This leads me to believe that a correlation test is not the answer to my question.
# cor.test() drops incomplete pairs itself; it has no na.rm argument, so that argument was silently ignored
cor.test(TravelTimeAdjustedSaltData$GaugeTemp, TravelTimeAdjustedSaltData$S11, method = "spearman")
cor.test(TravelTimeAdjustedSaltData$GaugeTemp, TravelTimeAdjustedSaltData$S4, method = "spearman")
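For reference, the same comparison can be run across every site column at once. This is only a minimal sketch: `site_correlations` is a helper name made up for illustration, and it assumes the column names from the dput() output above (GaugeTemp and S1 to S11).

```r
# Spearman correlation of the tailwater gauge against every site column.
# Hypothetical helper; assumes columns named GaugeTemp and S1..S11 as in the dput() output.
site_correlations <- function(df, gauge = "GaugeTemp",
                              sites = grep("^S[0-9]+$", names(df), value = TRUE)) {
  # Number of rows where both the gauge and the site have a reading
  n_pairs <- vapply(sites, function(s) {
    sum(complete.cases(df[[gauge]], df[[s]]))
  }, integer(1))

  # Spearman's rho on the complete pairs only; NA when there are too few pairs
  rho <- vapply(sites, function(s) {
    ok <- complete.cases(df[[gauge]], df[[s]])
    if (sum(ok) < 3) return(NA_real_)
    suppressWarnings(cor(df[[gauge]][ok], df[[s]][ok], method = "spearman"))
  }, numeric(1))

  data.frame(site = sites, rho = unname(rho), n_pairs = unname(n_pairs))
}
```

Calling `site_correlations(TravelTimeAdjustedSaltData)` returns one row per site, and the `n_pairs` column shows how many non-missing pairs each coefficient is based on. The `sites` argument can be restricted, for example to `paste0("S", 4:11)`, if only some columns should be compared.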
I do not know how to test for cause (the dam release, i.e. the tailwater gauge temperature readings) and effect (the downstream temperature readings). I am looking into some sort of non-parametric regression (LOESS) between the two, but I am not sure whether that is the correct approach, and I am not very familiar with local regression. I just want a statistical way to show whether or not the dam release is having an effect downstream. The correlation does not seem to serve that purpose, though I am not sure exactly why. Any help would be very much appreciated.
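To make the LOESS idea concrete, this is roughly what I have in mind. It is a sketch under assumptions rather than something I know to be the right method: `fit_site_loess` is a hypothetical helper, the span of 0.75 is just the `loess()` default, and the pseudo R-squared is only a rough descriptive summary of how much of the downstream variation the smooth captures. It describes association and does not by itself show cause.

```r
# Fit a LOESS curve of one site's temperature on the tailwater gauge temperature.
# Hypothetical helper; assumes a GaugeTemp column and site columns as in the dput() output.
fit_site_loess <- function(df, site, gauge = "GaugeTemp", span = 0.75) {
  # Keep only rows where both readings are present
  d <- data.frame(x = df[[gauge]], y = df[[site]])
  d <- d[complete.cases(d), ]

  # Local regression of the site temperature (y) on the gauge temperature (x)
  fit <- loess(y ~ x, data = d, span = span)

  # Share of the variance in y captured by the smooth (descriptive only)
  pseudo_r2 <- 1 - sum(residuals(fit)^2) / sum((d$y - mean(d$y))^2)

  list(fit = fit, pseudo_r2 = pseudo_r2, n = nrow(d))
}
```

For example, `fit_site_loess(TravelTimeAdjustedSaltData, "S4")$pseudo_r2` and the same call with `"S11"` would give one number per site that could be compared along the river.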