I want to calculate the `mean` (or any other summary statistic of length one, e.g. `min`, `max`, `length`, `sum`) of a numeric variable ("value") within each level of a grouping variable ("group").

The summary statistic should be assigned to a new variable which has the *same length* as the *original data*. That is, each row of the original data should carry the statistic for that row's group, and the data set should *not* be collapsed to one row per group. For example, consider the group `mean`:

Before

```
id group value
1 a 10
2 a 20
3 b 100
4 b 200
```

After

```
id group value grp.mean.values
1 a 10 15
2 a 20 15
3 b 100 150
4 b 200 150
```
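To make the example reproducible, the "Before" data can be built as follows. This is a minimal sketch; the object name `df` is an assumption, chosen to match the name used in the first two answers.

```r
# Example data from the "Before" table
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# One row per observation, three columns
dim(df)  # 4 3
```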

## Solution 1

Have a look at the `ave` function, which applies `mean` within each group by default and returns a vector as long as its input. Something like

```
df$grp.mean.values <- ave(df$value, df$group)
```

If you want to use `ave` to calculate something else per group, you need to specify `FUN = your-desired-function`, e.g. `FUN = min`:

```
df$grp.min <- ave(df$value, df$group, FUN = min)
```
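One detail worth spelling out: `ave` is declared as `ave(x, ..., FUN = mean)`, so every argument after `x` that is not named `FUN` is treated as a grouping variable. A minimal sketch, assuming the example data is stored in `df` (the column names `grp.max` and `grp.n` are made up for illustration):

```r
df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# Default FUN is mean
df$grp.mean.values <- ave(df$value, df$group)            # 15 15 150 150

# Any other length-one summary must be passed by name
df$grp.max <- ave(df$value, df$group, FUN = max)         # 20 20 200 200
df$grp.n   <- ave(df$value, df$group, FUN = length)      # 2 2 2 2

# The result always has one value per original row
stopifnot(length(df$grp.mean.values) == nrow(df))
```

Passing the function positionally, as in `ave(df$value, df$group, max)`, would not compute a group maximum, because `max` would be taken as a grouping variable.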

## Solution 2

You may do this in `dplyr` using `mutate`:

```
library(dplyr)
df %>%
  group_by(group) %>%
  mutate(grp.mean.values = mean(value))
```

...or use `data.table` to assign the new column by reference (`:=`):

```
library(data.table)
setDT(df)[ , grp.mean.values := mean(value), by = group]
```
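Because `:=` assigns by reference, and `setDT(df)` converts `df` itself in place, it can help to see what is and is not modified. The sketch below assumes the `data.table` package is installed and works on a copy called `dt` (a hypothetical name) so that the original `df` stays a plain data frame; `grp.min` and `grp.n` are illustrative column names.

```r
library(data.table)

df <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(10, 20, 100, 200)
)

# as.data.table() returns a new data.table; df itself is left untouched
dt <- as.data.table(df)

# Add one column by reference, computed within each group
dt[, grp.mean.values := mean(value), by = group]

# Several columns at once using the functional form of `:=`
dt[, `:=`(grp.min = min(value), grp.n = .N), by = group]

nrow(dt)                              # still 4 rows, not collapsed
"grp.mean.values" %in% names(df)      # FALSE: only dt was modified
```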

## Solution 3

One option is to use `plyr`. `ddply` expects a `data.frame` (the first d) and returns a `data.frame` (the second d). Other XXply functions work in a similar way; e.g. `ldply` expects a `list` and returns a `data.frame`, `dlply` does the opposite, and so on. The second argument is the grouping variable(s). The third argument is the function to apply to each group's subset (here `transform`), and any further arguments are passed on to that function.

```
library(plyr)
# dat is the example data frame from the question
ddply(dat, "group", transform, grp.mean.values = mean(value))
#   id group value grp.mean.values
# 1  1     a    10              15
# 2  2     a    20              15
# 3  3     b   100             150
# 4  4     b   200             150
```

## Solution 4

Here is another option using the base functions `aggregate` and `merge`:

```
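# x is the example data frame; both inputs have a "value" column,
# so merge() disambiguates them with its default suffixes .x and .y
merge(x, aggregate(value ~ group, data = x, mean),
      by = "group")
#   group id value.x value.y
# 1     a  1      10      15
# 2     a  2      20      15
# 3     b  3     100     150
# 4     b  4     200     150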
```

You can get "better" column names with `suffixes`:

```
merge(x, aggregate(value ~ group, data = x, mean),
      by = "group", suffixes = c("", ".mean"))
#   group id value value.mean
# 1     a  1    10         15
# 2     a  2    20         15
# 3     b  3   100        150
# 4     b  4   200        150
```
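One thing to keep in mind with this approach is that `merge` sorts its result by the `by` column by default, so the rows may come back in a different order than in the input. The sketch below uses a deliberately unsorted variant of the example data (an assumption made for illustration), restores the original order via `id`, and checks the result against `ave`:

```r
# Same values as the example, but with the groups interleaved
x <- data.frame(
  id    = 1:4,
  group = c("b", "a", "b", "a"),
  value = c(100, 10, 200, 20)
)

res <- merge(x, aggregate(value ~ group, data = x, FUN = mean),
             by = "group", suffixes = c("", ".mean"))

# merge() returned the rows sorted by group; put them back in id order
res <- res[order(res$id), ]

# The merged group means agree with ave(), which keeps the input order
stopifnot(isTRUE(all.equal(res$value.mean, ave(x$value, x$group))))
```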