Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


sql - R dbplyr translation produces "operator does not exist" error

I'm trying to query a database and calculate a weighted mean with dbplyr. I wrote it out with sum, * and /, and the generated SQL looks correct (it follows the pattern from "Calculate weighted average in single query"), but the query fails with an error when I try to collect. I don't know what "adding explicit type casts" means or how to do that with dbplyr.

LINE 1: ... TYPE", "REGISTRATION TYPE", "SEGMENT", SUM("VIO" * "EPA MIL...
                                                             ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.

I also reviewed https://github.com/tidyverse/dbplyr/issues/376, where it was noted that writing it out with sum, * and / should work, since R's weighted.mean is not translated by dbplyr. Compared with the simpler example there, baby_db %>% summarize(wavg_prop = sum(prop * n) / sum(n)), my column names are all wrapped in double quotes ("). Could that be causing the problem?

query_polk_fe <- polk_data %>%
  group_by(STATE, `COUNTY`, `YEAR MODEL`, `FUEL TYPE`, `REGISTRATION TYPE`, `SEGMENT`) %>%
  summarize(WMPG_EPA = sum(VIO * `EPA MILEAGE COMBINED`, na.rm = TRUE) / sum(VIO, na.rm = TRUE),
            WGPM_EPA = sum(VIO / `EPA MILEAGE COMBINED`, na.rm = TRUE) / sum(VIO, na.rm = TRUE),
            VIO = sum(VIO, na.rm = TRUE))
query_polk_fe %>% show_query()

<SQL>
SELECT "STATE", "COUNTY", "YEAR MODEL", "FUEL TYPE", "REGISTRATION TYPE", "SEGMENT", SUM("VIO" * "EPA MILEAGE COMBINED") / SUM("VIO") AS "WMPG_EPA", SUM("VIO" / "EPA MILEAGE COMBINED") / SUM("VIO") AS "WGPM_EPA", SUM("VIO") AS "VIO"
FROM polk.polk18
GROUP BY "STATE", "COUNTY", "YEAR MODEL", "FUEL TYPE", "REGISTRATION TYPE", "SEGMENT"

polk_fe <- query_polk_fe %>% collect()
Error: Failed to prepare query: ERROR:  operator does not exist: smallint * character varying
LINE 1: ... TYPE", "REGISTRATION TYPE", "SEGMENT", SUM("VIO" * "EPA MIL...
                                                             ^
HINT:  No operator matches the given name and argument types. You might need to add explicit type casts.
Question from: https://stackoverflow.com/questions/65545756/r-dbplyr-translation-produces-operator-does-not-exist-error


1 Answer


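Case closed: the "EPA MILEAGE COMBINED" column is stored as a string (character varying), not a number, which is why the database reports that no * operator exists for smallint * character varying. The quoted column names are not the problem. Thanks @Blue Star.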
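For anyone wondering what "adding explicit type casts" could look like from dbplyr, here is a minimal sketch rather than a definitive fix. It assumes the string column holds only numeric text. The table polk_sim is a hypothetical stand-in for polk_data, built with dbplyr's lazy_frame() and simulate_postgres() so the translation can be inspected without a live connection, and MPG is a helper column name introduced only for this illustration. Wrapping the column in as.numeric() should make dbplyr emit a CAST in the generated SQL.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for polk_data: a lazy table backed by a simulated
# PostgreSQL connection, so no real database is needed to view the SQL.
polk_sim <- lazy_frame(
  STATE = "IA",
  VIO = 1L,
  `EPA MILEAGE COMBINED` = "25",  # stored as text, as in the question
  con = simulate_postgres()
)

query_cast <- polk_sim %>%
  # as.numeric() is translated to an SQL CAST, giving the explicit type cast
  # that the error hint asks for. MPG is an illustrative name only.
  mutate(MPG = as.numeric(`EPA MILEAGE COMBINED`)) %>%
  group_by(STATE) %>%
  summarize(
    WMPG_EPA = sum(VIO * MPG, na.rm = TRUE) / sum(VIO, na.rm = TRUE),
    WGPM_EPA = sum(VIO / MPG, na.rm = TRUE) / sum(VIO, na.rm = TRUE)
  )

# Render the SQL as a string to check that the cast is present.
sql_text <- as.character(sql_render(query_cast))
cat(sql_text, "\n")
```

Against the real table, the same mutate() step would go before the group_by() in query_polk_fe. If any value in the column is not valid numeric text, the cast will likely fail on the database side when the query runs, so it may be worth checking the column contents first.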

