My dataframes sometimes contain NA values. These were originally blanks, strings like 'BAD', or literal 'NA' strings in the imported .csv file. I have converted every column in my dataframes to numeric, which coerces all non-numeric entries to NA. So far, so good.
I am aware that, for a single dataframe 'df', I can use the following to ensure a line is always drawn between data points, so there are no gaps:
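To illustrate the coercion I mean, here is a minimal sketch. The vector `raw` and its values are made up for this example and are not from my real data:

```r
# Hypothetical raw column as it might arrive from the .csv import
raw <- c("7.1", "", "BAD", "NA", "6.8")

# as.numeric() returns NA for anything it cannot parse as a number;
# the coercion warning is silenced here
pH <- suppressWarnings(as.numeric(raw))

# Which entries ended up as NA?
is.na(pH)
```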
ggplot(na.omit(df), aes(x=Time, y=pH)) +
geom_line()
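One thing worth keeping in mind: na.omit() drops a row if any column in it is NA, not just the columns being plotted. A small sketch of the difference, using an invented dataframe with a hypothetical extra column `Temp`:

```r
# Invented example data; Temp is a hypothetical extra column that is not plotted
df <- data.frame(
  Time = 1:5,
  pH   = c(7.1, NA, 6.9, NA, 6.8),
  Temp = c(20, 21, NA, 22, 23)
)

# Keeps only rows that are complete in every column
all_complete <- na.omit(df)

# Keeps rows that are complete in the plotted columns only
plot_complete <- df[complete.cases(df[, c("Time", "pH")]), ]

nrow(all_complete)
nrow(plot_complete)
```

If the dataframe only holds the two plotted columns, both approaches should give the same rows.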
However, sometimes I wish to plot 2 or more dataframes on a single ggplot2 plot. I do this because my x axis (Time) is the same variable for all dataframes, but the specific time values differ. I was having immense trouble merging these dataframes because they have different numbers of rows. Otherwise I would merge, melt the data and use ggplot2 as normal to make a multi-line plot.
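For reference, one possible way around the unequal row counts might be to stack the dataframes on top of each other rather than merge them side by side. This is only a rough sketch: the values are invented, and the `series` column and the `long` dataframe are hypothetical names introduced here for illustration.

```r
# Invented stand-ins for two dataframes with different numbers of rows
df1 <- data.frame(Time1 = c(0, 1, 2), pH1 = c(7.0, NA, 6.8))
df2 <- data.frame(Time2 = c(0.5, 1.5), pH2 = c(6.5, 6.6))

# Give both dataframes the same column names plus a label column,
# then stack them row-wise
long <- rbind(
  data.frame(series = "Variable 1", Time = df1$Time1, pH = df1$pH1),
  data.frame(series = "Variable 2", Time = df2$Time2, pH = df2$pH2)
)

# Drop incomplete rows so each line would be drawn without gaps
long <- na.omit(long)

# The stacked data could then be plotted in the usual way, e.g.
# ggplot(long, aes(x = Time, y = pH, colour = series)) + geom_line()
```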
I have since learnt you can plot multiple dataframes manually on ggplot at the 'geom level':
ggplot() +
  geom_line(data = df1, aes(x = Time1, y = pH1), colour = 'green') +
  geom_line(data = df2, aes(x = Time2, y = pH2), colour = 'red') +
  geom_line(data = df3, aes(x = Time3, y = pH3), colour = 'blue') +
  geom_line(data = df4, aes(x = Time4, y = pH4), colour = 'yellow')
However, how can I now ensure NA values are omitted and the lines are connected? It all seems to work, but my 4 lines have gaps in them where the NA values are!
I am new to R, but enjoying it so far and realise there are usually multiple solutions to an issue. Any help or advice appreciated.
EDIT (for anyone who later sees this)
So, after playing around for 30 mins I realised I could first use the na.omit function separately on each dataframe, name these new objects and then just plot these instead with ggplot. This works fine. Also, the above code was incorrect anyway if I wanted a suitable legend, because the colours were set outside aes().
New, correct code:
df1.omit <- na.omit(df1)
df2.omit <- na.omit(df2)
df3.omit <- na.omit(df3)
df4.omit <- na.omit(df4)
ggplot() +
  geom_line(data = df1.omit, aes(x = Time1, y = pH1, colour = "Variable 1")) +
  geom_line(data = df2.omit, aes(x = Time2, y = pH2, colour = "Variable 2")) +
  geom_line(data = df3.omit, aes(x = Time3, y = pH3, colour = "Variable 3")) +
  geom_line(data = df4.omit, aes(x = Time4, y = pH4, colour = "Variable 4"))
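With the colour mapped inside aes(), ggplot2 picks its own default colours. If the original green/red colours are wanted back together with the legend, something like the sketch below might work. It assumes ggplot2 is installed, uses only two invented toy dataframes, and the axis labels are my own choice:

```r
library(ggplot2)

# Toy stand-ins for two of the dataframes (values invented for illustration)
df1 <- data.frame(Time1 = c(0, 1, 2, 3), pH1 = c(7.0, NA, 6.8, 6.7))
df2 <- data.frame(Time2 = c(0.5, 1.5, 2.5), pH2 = c(6.5, 6.6, NA))

p <- ggplot() +
  # na.omit() is applied per layer, so each line is drawn without gaps
  geom_line(data = na.omit(df1), aes(x = Time1, y = pH1, colour = "Variable 1")) +
  geom_line(data = na.omit(df2), aes(x = Time2, y = pH2, colour = "Variable 2")) +
  # Map the legend labels back to explicit colours
  scale_colour_manual(
    name   = NULL,
    values = c("Variable 1" = "green", "Variable 2" = "red")
  ) +
  labs(x = "Time", y = "pH")

# print(p) draws the plot
```

The names in `values` have to match the strings used for `colour` inside aes(), otherwise the lines will not pick up the intended colours.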
question from:
https://stackoverflow.com/questions/66046724/omitting-na-values-from-ggplot-when-using-multiple-dataframes-to-plot-multiple-l