How to aggregate categorical data in R?

I have a data frame that consists of two columns of categorical values (Better, Similar, Worse). I would like to produce a table that counts how many times each category appears in each of the two columns.
The data frame I am using is as follows:

       Category.x  Category.y
1      Better      Better
2      Better      Better
3      Similar     Similar
4      Worse       Similar

I would like to come up with a table like this:

           Category.x    Category.y
Better     2             2
Similar    1             2
Worse      1             0

How would you go about it?

asked Apr 2 at 16:26 by Daniel

Looks like you need table(df1)

– akrun
Apr 2 at 16:27

Is it possible to reformat the table, so that I get it as a 3x2 table instead of a 3x3?

– Daniel
Apr 2 at 16:29

I would convert the columns to factors with common levels, lvls <- unique(unlist(df1)); df1 <- lapply(df1, factor, levels = lvls), and then call table(df1)

– akrun
Apr 2 at 16:43
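
A minimal sketch of the common-levels idea from the comment above, assuming the columns are character vectors and using the sample data from the question. Note that calling table() on the whole list cross-tabulates the two columns against each other (3x3 here), whereas tabulating each factor separately gives the 3x2 layout asked for. The names f and counts are illustrative, not from the thread.

```r
# Sample data from the question; columns kept as character vectors.
df1 <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# Levels shared by both columns.
lvls <- unique(unlist(df1))

# Convert every column to a factor with the shared levels (returns a list).
f <- lapply(df1, factor, levels = lvls)

# Tabulate each column on its own; sapply binds the per-column
# counts into a matrix with one row per level and one column per variable.
counts <- sapply(f, table)
counts
```

Because both factors share the same levels, a category that never occurs in a column (Worse in Category.y) still gets a row with a count of zero.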

r aggregate

3 Answers

As mentioned in the comments, table is standard for this, like

table(stack(DT))

         ind
values    Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

or

table(value = unlist(DT), cat = names(DT)[col(DT)])

         cat
value     Category.x Category.y
  Better           2          2
  Similar          1          2
  Worse            1          0

or

with(reshape(DT, direction = "long", varying = 1:2),
  table(value = Category, cat = time)
)

         cat
value     x y
  Better  2 2
  Similar 1 2
  Worse   1 0
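
For anyone who wants to run the first variant end to end, here is a small sketch. It assumes DT is the question's data with character (not factor) columns, since stack() only stacks vector columns and, per its documentation, ignores factor columns. The conversion with as.data.frame.matrix() is an optional extra step, not part of the answer above, for when a plain data frame is preferred over a table object.

```r
# Sample data from the question, stored under the name DT used in this answer.
DT <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

# stack() produces a long data frame with columns 'values' and 'ind';
# table() then counts every value/column combination.
tab <- table(stack(DT))

# Optional: turn the two-way table into a data frame that keeps the wide layout
# (categories as row names, original columns as columns).
wide <- as.data.frame.matrix(tab)
wide
```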

answered Apr 2 at 16:48 by Frank

sapply(df1, function(x) sapply(unique(unlist(df1)), function(y) sum(y == x)))
#        Category.x Category.y
#Better           2          2
#Similar          1          2
#Worse            1          0

answered Apr 2 at 16:33 by d.b

One dplyr and tidyr possibility could be:

library(dplyr)
library(tidyr)

df %>%
 gather(var, val) %>%
 count(var, val) %>%
 spread(var, n, fill = 0)

  val     Category.x Category.y
  <chr>        <dbl>      <dbl>
1 Better           2          2
2 Similar          1          2
3 Worse            1          0

First, it transforms the data from wide to long format, with column "var" holding the variable names and column "val" the corresponding values. Second, it counts the rows for each combination of "var" and "val". Finally, it spreads the counts back into the desired wide format.

Or with dplyr and reshape2 you can do:

library(dplyr)
library(reshape2)

df %>%
 mutate(rowid = row_number()) %>%
 melt(., id.vars = "rowid") %>%
 count(variable, value) %>%
 dcast(value ~ variable, value.var = "n", fill = 0)

    value Category.x Category.y
1  Better          2          2
2 Similar          1          2
3   Worse          1          0
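
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same wide-to-long, count, long-to-wide steps could be sketched as follows; this is a possible translation under that assumption, not part of the original answer, and the name res is illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question.
df <- data.frame(
  Category.x = c("Better", "Better", "Similar", "Worse"),
  Category.y = c("Better", "Better", "Similar", "Similar"),
  stringsAsFactors = FALSE
)

res <- df %>%
  # Wide to long: one row per cell, with the column name in 'var'
  # and the category in 'val'.
  pivot_longer(everything(), names_to = "var", values_to = "val") %>%
  # Count each var/val combination; the counts land in column 'n'.
  count(var, val) %>%
  # Long to wide: one column per original variable,
  # filling combinations that never occur with 0.
  pivot_wider(names_from = var, values_from = n,
              values_fill = list(n = 0L))
res
```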

answered Apr 2 at 16:41 by tmfmnk, edited Apr 2 at 17:58

Is var = Category.x and val= c('Better', 'Similar', 'Worse')?

– Daniel
Apr 2 at 16:56

Please see the updated post for commentary.

– tmfmnk
Apr 2 at 17:04
