Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share

ggplot2 - Omitting NA values from ggplot when using multiple dataframes to plot multiple lines

My dataframes sometimes contain NA values. These were previously blanks, strings like 'BAD', or literal 'NA' strings from the imported .csv file. I have converted everything in my dataframes to numeric, which turns all non-numeric values into NA. So far, so good.
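As a minimal sketch of that coercion step (the vector `raw_pH` and its values are made up for illustration, not taken from the original data), base R's `as.numeric()` behaves like this on blanks and stray strings:

```r
# Hypothetical raw column as it might arrive from a .csv import
raw_pH <- c("7.1", "", "BAD", "NA", "6.8")

# as.numeric() turns anything it cannot parse into NA;
# suppressWarnings() only silences the "NAs introduced by coercion" warning
pH <- suppressWarnings(as.numeric(raw_pH))

print(pH)
sum(is.na(pH))  # how many values became NA
```

Checking `sum(is.na(...))` before and after the conversion is a cheap way to see how many values were lost.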

I am aware that, with a dataframe 'df', I can use the following to ensure a line is always drawn between data points, so there are no gaps:

ggplot(na.omit(df), aes(x=Time, y=pH)) +
  geom_line()
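One caveat worth keeping in mind: `na.omit()` drops a row if any column contains NA, not just the plotted ones. The sketch below uses a made-up data frame with an extra `Temp` column (an assumption for illustration) to show the difference, and uses base R's `complete.cases()` to restrict the check to the columns that are actually plotted:

```r
# Hypothetical data: pH has one NA, an unrelated column (Temp) has another
df <- data.frame(
  Time = 1:5,
  pH   = c(7.0, NA, 7.2, 7.1, 6.9),
  Temp = c(20, 21, NA, 22, 23)
)

# na.omit() on the whole data frame drops rows 2 AND 3
nrow(na.omit(df))

# Checking only the plotted columns drops just row 2
df_plot <- df[complete.cases(df[, c("Time", "pH")]), ]
nrow(df_plot)
```

If the data frame only holds the two plotted columns, both approaches give the same result.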

However, sometimes I wish to plot two or more dataframes with ggplot2 in a single plot. My x axis (Time) represents the same quantity in every dataframe, but the specific time values differ. I had immense trouble merging these dataframes because they have different numbers of rows. Otherwise I would merge, melt the data and use ggplot2 as normal to make a multi-line plot.
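For what it's worth, dataframes with different row counts can usually be stacked rather than merged. The following is only a sketch under assumptions: the column names follow the `Time1`/`pH1` pattern used in this question, the values are invented, and `long` and `Series` are names chosen here for illustration.

```r
library(ggplot2)

# Hypothetical data frames with different numbers of rows
df1 <- data.frame(Time1 = c(0, 1, 2, 3), pH1 = c(7.0, NA, 7.2, 7.1))
df2 <- data.frame(Time2 = c(0.5, 1.5, 2.5), pH2 = c(6.5, 6.6, NA))

# Give each frame the same column names plus a label, then stack with rbind()
long <- rbind(
  data.frame(Time = df1$Time1, pH = df1$pH1, Series = "Variable 1"),
  data.frame(Time = df2$Time2, pH = df2$pH2, Series = "Variable 2")
)

# One na.omit() call now cleans every series, and colour = Series
# produces the legend automatically
p <- ggplot(na.omit(long), aes(x = Time, y = pH, colour = Series)) +
  geom_line()
print(p)
```

Stacking does not require the time values to line up, which is what makes it workable when a merge is not.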

I have since learnt you can plot multiple dataframes manually on ggplot at the 'geom level':

ggplot() + 
  geom_line(data = df1, aes(x = Time1, y = pH1), colour = 'green') + 
  geom_line(data = df2, aes(x = Time2, y = pH2), colour = 'red') +
  geom_line(data = df3, aes(x = Time3, y = pH3), colour = 'blue') +
  geom_line(data = df4, aes(x = Time4, y = pH4), colour = 'yellow')

However, how can I now ensure NA values are omitted and the lines are connected?! It all seems to work, but my 4 plots have gaps in them where the NA values are!

I am new to R, but enjoying it so far and realise there are usually multiple solutions to an issue. Any help or advice appreciated.

EDIT (for anyone who later sees this)

So, after playing around for 30 minutes I realised I could first use the na.omit function separately on each dataframe, name these new objects and then just plot these instead in ggplot. This works fine. Also, the above code was incorrect anyway if I wanted a suitable legend.

New, correct code:

df1.omit <- na.omit(df1)
df2.omit <- na.omit(df2)
df3.omit <- na.omit(df3)
df4.omit <- na.omit(df4)

ggplot() + 
  geom_line(data = df1.omit, aes(x = Time1, y = pH1, colour = "Variable 1")) + 
  geom_line(data = df2.omit, aes(x = Time2, y = pH2, colour = "Variable 2")) +
  geom_line(data = df3.omit, aes(x = Time3, y = pH3, colour = "Variable 3")) +
  geom_line(data = df4.omit, aes(x = Time4, y = pH4, colour = "Variable 4"))
question from:https://stackoverflow.com/questions/66046724/omitting-na-values-from-ggplot-when-using-multiple-dataframes-to-plot-multiple-l


1 Answer


So, after playing around for 30 minutes I realised I could first use the na.omit function separately on each dataframe, name these new objects and then just plot these instead in ggplot. This works fine. Also, the above code was incorrect anyway if I wanted a suitable legend.

df1.omit <- na.omit(df1)
df2.omit <- na.omit(df2)
df3.omit <- na.omit(df3)
df4.omit <- na.omit(df4)

ggplot() + 
  geom_line(data = df1.omit, aes(x = Time1, y = pH1, colour = "Variable 1")) + 
  geom_line(data = df2.omit, aes(x = Time2, y = pH2, colour = "Variable 2")) +
  geom_line(data = df3.omit, aes(x = Time3, y = pH3, colour = "Variable 3")) +
  geom_line(data = df4.omit, aes(x = Time4, y = pH4, colour = "Variable 4"))
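
A note on the legend: a string placed inside aes() is treated as a legend label, not as a colour, so ggplot2 picks default colours for it. If specific colours are wanted, one possible approach is scale_colour_manual(). The sketch below uses two invented stand-in dataframes, and `series_colours` is a name made up here for illustration:

```r
library(ggplot2)

# Hypothetical stand-ins for two of the cleaned data frames
df1.omit <- na.omit(data.frame(Time1 = 0:3, pH1 = c(7.0, NA, 7.2, 7.1)))
df2.omit <- na.omit(data.frame(Time2 = 0:3, pH2 = c(6.5, 6.6, NA, 6.4)))

# Map each legend label used inside aes() to an actual colour
series_colours <- c("Variable 1" = "green", "Variable 2" = "red")

p <- ggplot() +
  geom_line(data = df1.omit, aes(x = Time1, y = pH1, colour = "Variable 1")) +
  geom_line(data = df2.omit, aes(x = Time2, y = pH2, colour = "Variable 2")) +
  scale_colour_manual(name = "Series", values = series_colours) +
  labs(x = "Time", y = "pH")  # shared axis titles instead of Time1 / pH1
print(p)
```

The names in `series_colours` have to match the strings used in aes() exactly, otherwise the corresponding line is drawn without a colour.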
