The answer to the question is fairly straightforward. Since you are new to R, I recommend making liberal use of the 'ggplot2' package. For many R users, this one package covers most plotting needs.
To get the "combined" barchart described in the original post, put all of the data into one dataset and then add grouping variables, like so:
Step 1: Make the dataset.
data <- read.table(text="
Country,Profession,Income
China,Government employee,20000
China,CEO,17000
China,Doctor,15000
China,Athlete,14000
China,Artist,13000
USA,Doctor,40000
USA,Athlete,35000
USA,Artist,30000
USA,Lawyer,25000
USA,Teacher,20000", header=TRUE, sep=",")
You'll notice I'm using the 'read.table' function here. This is not required and is purely for readability in this example. The important part is that we have our values (Income) and our grouping variables (Country, Profession).
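Since 'read.table' is optional here, this is a minimal sketch of the same dataset built directly with data.frame(). The column names and values are copied from Step 1; only the construction method differs.

```r
# Same ten rows as in Step 1, built without parsing any text.
data <- data.frame(
  Country = rep(c("China", "USA"), each = 5),
  Profession = c("Government employee", "CEO", "Doctor", "Athlete", "Artist",
                 "Doctor", "Athlete", "Artist", "Lawyer", "Teacher"),
  Income = c(20000, 17000, 15000, 14000, 13000,
             40000, 35000, 30000, 25000, 20000)
)

# Inspect the structure: one value column and two grouping columns.
str(data)
```

Either way, the result is one data frame with a value column (Income) and two grouping columns (Country, Profession), which is all the plotting code needs.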
Step 2: Create a barchart with Income as the height of the bars, Profession as the x-axis, and color the bars by Country.
library(ggplot2)
ggplot(data, aes(x=Profession, y=Income, fill=Country)) +
  geom_bar(stat="identity", position="dodge") +
  theme(axis.text.x = element_text(angle = 90))
Here we first load the 'ggplot2' package. If it is not installed yet, run install.packages("ggplot2") once beforehand.
Then, we specify what data we want to use and how to separate it.
ggplot(data, aes(x=Profession, y=Income, fill=Country))
This tells 'ggplot' to use the dataset in the 'data' data frame. The aes() function specifies how 'ggplot' should map the columns onto the plot. We map the grouping variable Profession onto the x-axis, map Income onto the y-axis, and set the color (fill) of each bar according to the grouping variable Country.
Next, we specify what kind of barchart we want.
geom_bar(stat="identity", position="dodge")
This tells 'ggplot' to make a barchart (geom_bar()). By default, the 'geom_bar' function counts the rows for each x value, much like a histogram, but we already have the totals we want to use. We tell it to use our totals by specifying that Income holds the actual values (identity) that we want to chart (stat="identity"). Finally, I made a judgement call about how to display the data and decided to place the bars side by side when a single profession has multiple income values (position="dodge").
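To make the difference between counting and stat="identity" concrete, here is a small sketch. It assumes a reasonably recent 'ggplot2' (2.2.0 or later), which provides geom_col() as a shorthand for geom_bar(stat="identity"). The four-row data frame is a stand-in that reuses values from Step 1, and the names p_col and p_count are just for illustration.

```r
library(ggplot2)

# Small stand-in for the 'data' frame from Step 1 (a subset of the same values).
data <- data.frame(
  Country = c("China", "China", "USA", "USA"),
  Profession = c("Doctor", "Artist", "Doctor", "Artist"),
  Income = c(15000, 13000, 40000, 30000)
)

# geom_col() uses the y values as bar heights, like geom_bar(stat = "identity").
p_col <- ggplot(data, aes(x = Profession, y = Income, fill = Country)) +
  geom_col(position = "dodge")

# With no y aesthetic, geom_bar() counts the rows for each Profession instead.
p_count <- ggplot(data, aes(x = Profession, fill = Country)) +
  geom_bar(position = "dodge")

print(p_col)
print(p_count)
```

In the second plot every bar has height 1, because each Country/Profession pair appears exactly once in the data. That is why the counting default is not what we want here.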
Finally, we need to rotate the x-axis labels, since some of them are quite long. We do this with a simple 'theme' command that changes the rotation of the x-axis text elements.
theme(axis.text.x = element_text(angle = 90))
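One optional refinement, offered as a suggestion rather than part of the original answer: rotated labels often line up better with their tick marks if you also set hjust and vjust in element_text(). The sketch below uses a three-row stand-in for the Step 1 data.

```r
library(ggplot2)

# Stand-in data with one long label (values taken from Step 1).
data <- data.frame(
  Country = c("China", "China", "USA"),
  Profession = c("Government employee", "Doctor", "Doctor"),
  Income = c(20000, 15000, 40000)
)

p <- ggplot(data, aes(x = Profession, y = Income, fill = Country)) +
  geom_bar(stat = "identity", position = "dodge") +
  # hjust = 1 pushes each rotated label up against the axis;
  # vjust = 0.5 centers it on its tick mark.
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

print(p)
```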
We chain all of these commands together with the + operator, and it's done!
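If you want to keep the chart as a file, one possible approach is to store the chained plot in a variable and pass it to ggsave(). The variable names, the file name, and the dimensions below are placeholders chosen for this sketch, not something from the original question.

```r
library(ggplot2)

# Step 1: the dataset, exactly as above.
data <- read.table(text = "Country,Profession,Income
China,Government employee,20000
China,CEO,17000
China,Doctor,15000
China,Athlete,14000
China,Artist,13000
USA,Doctor,40000
USA,Athlete,35000
USA,Artist,30000
USA,Lawyer,25000
USA,Teacher,20000", header = TRUE, sep = ",")

# Step 2: the chained plot, stored in a variable instead of printed directly.
p <- ggplot(data, aes(x = Profession, y = Income, fill = Country)) +
  geom_bar(stat = "identity", position = "dodge") +
  theme(axis.text.x = element_text(angle = 90))

# Write the plot to a PDF in the session's temporary directory.
# Width and height are in inches by default.
out_file <- file.path(tempdir(), "income_by_profession.pdf")
ggsave(out_file, plot = p, width = 8, height = 5)
```

Swap tempdir() for a folder of your choice to keep the file after the R session ends.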