Massimo Bressan
2018-Jun-07 12:21 UTC
[R] aggregate and list elements of variables in data.frame
Sorry, but on looking further at the example I realised that the posted solution is not quite what I need: I do not need the row indices, but the corresponding values of the 'id' column.

# please consider this new example
t<-data.frame(id=c(18,91,20,68,54,27,26,15,4,97),A=c(123,345,123,678,345,123,789,345,123,789))
t

# I need to get this result
r<-data.frame(unique_A=c(123, 345, 678, 789),list_id=c('18,20,27,4','91,54,15','68','26,97'))
r

# any help with this, please?

From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
To: "r-help" <R-help at r-project.org>
Sent: Thursday, 7 June 2018 10:09:55
Subject: Re: aggregate and list elements of variables in data.frame

Thanks for the help. I'm posting the complete solution here:

t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789))
t$A <- factor(t$A)
l<-sapply(levels(t$A), function(x) which(t$A==x))
r<-data.frame(list_id=unlist(lapply(l, paste, collapse = ", ")))
r<-cbind(unique_A=row.names(r),r)
row.names(r)<-NULL
r

Best

From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
To: "r-help" <R-help at r-project.org>
Sent: Wednesday, 6 June 2018 10:13:10
Subject: aggregate and list elements of variables in data.frame

# given the following reproducible and simplified example
t<-data.frame(id=1:10,A=c(123,345,123,678,345,123,789,345,123,789))
t

# I need to get the following result
r<-data.frame(unique_A=c(123, 345, 678, 789),list_id=c('1,3,6,9','2,5,8','4','7,10'))
r

# i.e. aggregate over the variable "A" and list all elements of the variable "id" that share the same corresponding value of "A"
# any help with that?

# so far I've only managed to "aggregate" and "count", like this:
library(sqldf)
sqldf('select count(*) as count_id, A as unique_A from t group by A')

library(dplyr)
t%>%group_by(unique_A=A) %>% summarise(count_id = n())

# thank you

--
------------------------------------------------------------
Massimo Bressan
ARPAV
Agenzia Regionale per la Prevenzione e Protezione Ambientale del Veneto
Dipartimento Provinciale di Treviso
Via Santa Barbara, 5/a
31100 Treviso, Italy
tel: +39 0422 558545
fax: +39 0422 558516
e-mail: massimo.bressan at arpa.veneto.it
------------------------------------------------------------
Ivan Calandra
2018-Jun-07 12:28 UTC
[R] aggregate and list elements of variables in data.frame
Using which() to subset t$id should do the trick:

sapply(levels(t$A), function(x) t$id[which(t$A==x)])

Ivan

--
Dr. Ivan Calandra
TraCEr, laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772-243
https://www.researchgate.net/profile/Ivan_Calandra
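The which() hint can be carried through to the requested two-column result. What follows is only a minimal sketch, not code posted in the thread: the names `dat` and `ids_by_A` are made up for illustration (`dat` avoids masking base::t), and the column A is assumed to have been converted to a factor, as in the earlier solution, so that levels() returns the distinct values.

```r
# Example data from the question; `dat` is an illustrative name
dat <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                  A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))
dat$A <- factor(dat$A)  # levels() below needs a factor

# For each level of A, pick the id values found in the matching rows.
# simplify = FALSE guarantees a named list even if all groups had equal size.
ids_by_A <- sapply(levels(dat$A),
                   function(x) dat$id[which(dat$A == x)],
                   simplify = FALSE)

# Collapse each group into one comma-separated string
r <- data.frame(unique_A = names(ids_by_A),
                list_id  = unname(vapply(ids_by_A, paste, character(1),
                                         collapse = ",")),
                stringsAsFactors = FALSE)
r
```

Note that unique_A comes back as character here, because it is taken from the factor levels; as.numeric(names(ids_by_A)) would restore numbers if they are needed.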
Ben Tupper
2018-Jun-07 12:47 UTC
[R] aggregate and list elements of variables in data.frame
Hi,

Does this do what you want? I had to change the id values to something more obvious. It uses tibbles, which allow each variable to be a list.

library(tibble)
library(dplyr)
x <- tibble(id=LETTERS[1:10], A=c(123,345,123,678,345,123,789,345,123,789))
uA <- unique(x$A)
idx <- lapply(uA, function(v) which(x$A %in% v))
vals <- lapply(idx, function(index) x$id[index])
r <- tibble(unique_A = uA, list_idx = idx, list_vals = vals)

> r
# A tibble: 4 x 3
  unique_A list_idx  list_vals
     <dbl> <list>    <list>
1     123. <int [4]> <chr [4]>
2     345. <int [3]> <chr [3]>
3     678. <int [1]> <chr [1]>
4     789. <int [2]> <chr [2]>

> r$list_idx[1][[1]]
[1] 1 3 6 9

> r$list_vals[1][[1]]
[1] "A" "C" "F" "I"

Cheers,
ben

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org
Ecological Forecasting: https://eco.bigelow.org/
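The two lapply() steps above (find the indices, then look up the ids) can likely be folded into a single base R call. This is a sketch that swaps in split(), which was not used in the thread; the data are the same LETTERS example, held in a plain data.frame rather than a tibble.

```r
# Same example data as above, as a plain data.frame
x <- data.frame(id = LETTERS[1:10],
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789),
                stringsAsFactors = FALSE)

# split() groups the id values by the distinct values of A in one step;
# the result is a list named after those values ("123", "345", ...)
vals <- split(x$id, x$A)

vals[["123"]]   # ids that share A == 123
lengths(vals)   # group sizes, i.e. the counts from the original question
```

Unlike unique(), split() orders the groups by the sorted values of A rather than by first appearance; for this data the two orders coincide.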
Massimo Bressan
2018-Jun-07 13:27 UTC
[R] aggregate and list elements of variables in data.frame
Thank you for the help. This is my solution, based on your valuable hint but without needing a 'tibble':

x<-data.frame(id=LETTERS[1:10], A=c(123,345,123,678,345,123,789,345,123,789))
uA<-unique(x$A)
idx<-lapply(uA, function(v) which(x$A %in% v))
vals<- lapply(idx, function(index) x$id[index])
data.frame(unique_A = uA, list_vals=unlist(lapply(vals, paste, collapse = ", ")))

Best
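For comparison, the same table can probably be produced in one call with aggregate() from base R, which matches the "aggregate over A" wording of the original question. This is a sketch, not something posted in the thread. It uses the numeric ids from the second example, and the data frame is named `x` rather than `t` so that base::t is not masked.

```r
# Example data from the question (numeric ids)
x <- data.frame(id = c(18, 91, 20, 68, 54, 27, 26, 15, 4, 97),
                A  = c(123, 345, 123, 678, 345, 123, 789, 345, 123, 789))

# Group id by A and paste each group's ids into one string;
# `collapse` is forwarded to paste() through aggregate()'s `...`
r <- aggregate(id ~ A, data = x, FUN = paste, collapse = ",")
names(r) <- c("unique_A", "list_id")
r
```

The rows come back sorted by A, and within each group the ids keep their original row order.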