I have the following (simple!?) problem which I am unable to find a
relatively trivial solution to.
If I have a dataframe,

A 1
A 7
B 4
B 5
C 3
D 3
D 2
E 5
F 5
F 6

I would like to create a new data.frame in the form

ID pt1 pt2
A   1   7
B   4   5
C   3  NA
D   3   2
E   5  NA
F   5   6

so that for each identifier, in this example, A...F I have a column for
each observation for each identifier... (with a maximum of 2 obs per
identifier, if only 1 obs exist then the second obs pt2 is set to NA)
This is so I can find the absolute differences between the obs for each
identifier, that is abs(pt1-pt2)

ID Diff
A   6
B   1
C  NA
D   1
E  NA
F   1

for which there may be another approach so as not to mess about creating
a new dataframe
Any ideas?
Gary

__________________________________________________
Gary S. Collins, PhD,
Statistics Research Fellow,
Quality of Life Unit,
European Organisation for Research and Treatment of Cancer,
EORTC Data Center,
Avenue E. Mounier 83, bte. 11,
B-1200 Brussels, Belgium.

Tel: +32 2 774 1 606
Fax: +32 2 779 4 568
http://www.eortc.be/home/qol/
__________________________________________________

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Gary Collins wrote:
> for which there may be another approach so as not to mess about creating
> a new dataframe
> Any ideas?
> Gary

What about

    by(y,x,function(x) x[2]-x[1])

if x is your factor and y are your values?

-d

-- 
Mag. David Meyer                    Wiedner Hauptstrasse 8-10
Vienna University of Technology     A-1040 Vienna/AUSTRIA
Department for                      Tel.: (+431) 58801/10772
Statistics and Probability Theory   mail: david.meyer at ci.tuwien.ac.at
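
A minimal sketch of the by() idea from the reply above, run on the ten rows from the original question. The data frame name `df` and the column names `ID` and `value` are assumptions (the original post does not name its columns). This variant passes the whole data frame to by() rather than the bare vector, and wraps the difference in abs() because the question asks for absolute differences.

```r
# Example data from the original question; column names are assumed.
df <- data.frame(
  ID    = c("A", "A", "B", "B", "C", "D", "D", "E", "F", "F"),
  value = c(1, 7, 4, 5, 3, 3, 2, 5, 5, 6)
)

# For each ID, take the absolute difference of the first two observations.
# Indexing position 2 of a length-1 vector yields NA, so identifiers with
# a single observation get NA automatically.
res <- by(df, df$ID, function(g) abs(g$value[2] - g$value[1]))

# Flatten the "by" object into a plain named numeric vector.
diffs <- setNames(as.numeric(res), dimnames(res)[[1]])
print(diffs)
```

Identifiers with more than two observations would silently use only the first two, so this relies on the "at most 2 obs per identifier" condition stated in the question.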

> I would like to create a new data.frame in the form
>
> ID pt1 pt2
> A   1   7
> B   4   5
> C   3  NA
> D   3   2
> E   5  NA
> F   5   6
>

    t(as.data.frame(split(1:4, factor(c("A", "A", "B", "B")))))

(for complete data)

Torsten
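
The split() one-liner above is stated to work for complete data only. As a sketch of one way to extend it to identifiers with a single observation, each group can be padded to length 2 before the groups are bound together. The names `df`, `ID`, `value`, `groups` and `wide` are assumptions introduced for illustration.

```r
# Example data from the original question; column names are assumed.
df <- data.frame(
  ID    = c("A", "A", "B", "B", "C", "D", "D", "E", "F", "F"),
  value = c(1, 7, 4, 5, 3, 3, 2, 5, 5, 6)
)

# Split the values by identifier, then pad every group to length 2.
# Indexing past the end of a vector returns NA, which supplies the missing pt2.
groups <- split(df$value, df$ID)
wide   <- t(sapply(groups, function(v) v[1:2]))
colnames(wide) <- c("pt1", "pt2")

# Assemble the requested layout and add the absolute difference.
newdf <- data.frame(ID = rownames(wide), wide, row.names = NULL)
newdf$Diff <- abs(newdf$pt1 - newdf$pt2)
print(newdf)
```

Padding with `v[1:2]` also truncates any group with more than two observations, so it is only appropriate under the two-observation limit given in the question.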

This gives you what you want, I think (maybe up to the sign). However,
you must be sure that each identifier occurs at most twice.

Giovanni

> a
   V1 V2
1   A  1
2   A  7
3   B  4
4   B  5
5   C  3
6   D  3
7   D  2
8   E  5
9   F  5
10  F  6
> tapply(a$V2, a$V1, diff)
$A
[1] 6

$B
[1] 1

$C
numeric(0)

$D
[1] -1

$E
numeric(0)

$F
[1] 1

> From: "Gary Collins" <gco at eortc.be>
> Date: Tue, 22 Jan 2002 14:29:22 +0100
>
> This is so I can find the absolute differences between the obs for each
> identifier, that is abs(pt1-pt2)

-- 

 __________________________________________________
[                                                  ]
[ Giovanni Petris                 GPetris at uark.edu ]
[ Department of Mathematical Sciences              ]
[ University of Arkansas - Fayetteville, AR  72701 ]
[ Ph: (501) 575-6324, 575-8630 (fax)               ]
[ http://definetti.uark.edu/~gpetris/              ]
[__________________________________________________]
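
The tapply() output above returns numeric(0) for identifiers with one observation and a signed difference for the others. A small sketch of how that could be turned into the requested NA and absolute values follows. The helper name `abs_diff` is hypothetical, and the data frame `a` with columns `V1` and `V2` mirrors the session shown in the reply.

```r
# Same data as in the session above.
a <- data.frame(
  V1 = c("A", "A", "B", "B", "C", "D", "D", "E", "F", "F"),
  V2 = c(1, 7, 4, 5, 3, 3, 2, 5, 5, 6)
)

# Hypothetical helper: absolute difference for exactly two observations,
# NA otherwise (diff() alone would give numeric(0) for a single value).
abs_diff <- function(v) if (length(v) == 2) abs(diff(v)) else NA_real_

# Because every group now yields one number, tapply() simplifies the
# result to a named array instead of a list.
result <- tapply(a$V2, a$V1, abs_diff)
print(result)
```

Returning NA for any group that does not have exactly two values makes the "at most twice" condition explicit rather than assumed.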

On Tue, 22 Jan 2002, Gary Collins wrote:

> I would like to create a new data.frame in the form
>
> ID pt1 pt2
> A   1   7
> B   4   5
> C   3  NA
> D   3   2
> E   5  NA
> F   5   6
>

In addition to the specific suggestions already given there is a general
solution to this sort of problem with reshape()

You would need to create a time variable to indicate which is the first or
second observation, which in your case could be

   df$time<-c(0,df$ID[-1]==df$ID[-10])

then

   newdf<-reshape(df,timevar="time",idvar="ID",direction="wide")

In your case this isn't a big saving over the other approaches. It becomes
really useful when you have many variables, especially if some are
constant over time and others aren't.

	-thomas
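
The reshape() recipe above can be sketched end to end as follows. The column names `ID` and `value` are assumptions (the original post does not name its columns), `nrow(df)` stands in for the hard-coded 10, and the final renaming to `pt1`/`pt2` is added only to match the layout asked for in the question. The time variable relies on the rows being sorted by identifier with at most two observations each.

```r
# Example data from the original question; column names are assumed.
df <- data.frame(
  ID    = c("A", "A", "B", "B", "C", "D", "D", "E", "F", "F"),
  value = c(1, 7, 4, 5, 3, 3, 2, 5, 5, 6),
  stringsAsFactors = FALSE
)

# time is 0 for the first observation of an ID and 1 when the ID equals
# the one in the previous row (that is, for the second observation).
df$time <- c(0, df$ID[-1] == df$ID[-nrow(df)])

# Long to wide: one row per ID, one value column per time point.
newdf <- reshape(df, timevar = "time", idvar = "ID", direction = "wide")

# reshape() names the new columns value.0 and value.1; rename them.
names(newdf)[names(newdf) == "value.0"] <- "pt1"
names(newdf)[names(newdf) == "value.1"] <- "pt2"

newdf$Diff <- abs(newdf$pt1 - newdf$pt2)
print(newdf)
```

Identifiers without a second observation get NA in `pt2`, so `Diff` is NA for them as requested.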