Dear R-users,

With a bit of grief I had to repeat an extensive analysis because I did not suspect (and therefore did not read the documentation) that round was implemented as "for rounding off a 5, the IEC 60559 standard is expected to be used, 'go to the even digit'", resulting in

round(1.5) = 2
round(2.5) = 2.

As a non-mathematician I am both puzzled and intrigued by this rule, as it is against what I have learned in my math courses, i.e.

round(1.5) = 2
round(2.5) = 3.

I would like to understand the reason behind this rule.

Thanks for your comments.

Markus

--
Markus Didion
Forest Ecology
Inst. of Terrestrial Ecosystems
Dept. of Environmental Sciences
Swiss Fed. Inst. of Technology
ETH-Zentrum CHN G78
Universitaetstr. 22
CH-8092 Zurich
Switzerland

Tel +41 (0)44 632 5629
Fax +41 (0)44 632 1358
Email markus.didion@env.ethz.ch
homepage: http://www.fe.ethz.ch/people/didionm
Hi Markus,

The R function round() uses the round-to-even method, which is explained on http://en.wikipedia.org/wiki/Rounding#Round-to-even_method.

If you would instead like "traditional rounding", then you should add 0.5 and take the integer part, as is suggested in the examples on ?round, e.g.

x <- c(1.49, 1.50, 1.51, 2.49, 2.50, 2.51, 3.49, 3.50, 3.51)
r2e <- round(x)
trad <- floor(x + 0.5)
data.frame(x, r2e, trad)

The reason for the IEEE standard has to do with signal processing and the bias introduced by traditional rounding if you have a lot of data points whose decimal expansion ends in ....5.

Personally, I am a mathematician and statistician, and I find that in 99% of cases traditional rounding is what is required (basically always, except in some very specific examples involving very large sets of data), so for R to have put the IEEE standard as the default (rather than, say, an option) is a bit odd. However, R's benefits and advantages by far outweigh its little oddities, as I presume you know since you are using it.

Effectively, I never use the round() command and always calculate using the floor function.

Toby Marthews
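As a small sketch of the floor(x + 0.5) idea above, the following wraps it in a helper and compares it with round() on a few halves, including negative ones. The names round_half_up and round_half_away are hypothetical (they are not base R functions). One thing worth checking before relying on it: floor(x + 0.5) sends negative halves toward +Inf (-2.5 becomes -2), whereas the sign/abs variant sends halves away from zero.

```r
# Hypothetical helpers, not part of base R.
round_half_up   <- function(x) floor(x + 0.5)                  # halves go toward +Inf
round_half_away <- function(x) sign(x) * floor(abs(x) + 0.5)   # halves go away from zero

# All of these values are exactly representable in binary floating point.
x <- c(-2.5, -1.5, 0.5, 1.5, 2.5)

data.frame(x,
           r2e       = round(x),            # round half to even
           half_up   = round_half_up(x),
           half_away = round_half_away(x))
```

Which of the two variants matches "what I learned in school" depends on how negative numbers were treated there, so it is probably best to pick one deliberately.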
The logic behind the round-to-even rule is that we are trying to represent an underlying continuous value. If x comes from a truly continuous distribution, then the probability that x == 2.5 is 0, and the 2.5 was probably already rounded once from some value between 2.45 and 2.54999999999999.... If we use the "round up on 0.5" rule that we learned in grade school, then the double rounding means that values between 2.45 and 2.50 will all round to 3 (having been rounded first to 2.5). This will tend to bias estimates upwards.

To remove the bias we need to either go back to before the rounding to 2.5 (which is often impossible or impractical), or just round up half the time and round down half the time. (Better still would be to round in proportion to how likely we are to see values below or above 2.5 rounded to 2.5, but that will be close to 50/50 for most underlying distributions.) The stochastic approach would be to have the round function randomly choose which way to round, but deterministic types are not comfortable with that, so "round to even" was chosen as a consistent rule that rounds up and down about 50/50 (round to odd should work about the same).

If you are dealing with data where 2.5 is likely to represent an exact value (money, for example), then you may do better by multiplying all values by 10 or 100 and working in integers, then converting back only for the final printing.

Note that 2.50000001 rounds to 3, so if you keep more digits of accuracy until the final printing, then rounding will go in the expected direction. Alternatively, you can add 0.000000001 (or another small number) to your values just before rounding, but that can bias your estimates upwards.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
(801) 408-8111
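The bias argument in the message above can be illustrated with a minimal sketch. It assumes a made-up set of values that all end in exactly .5 (chosen only because they are exactly representable, not taken from any real data) and compares the mean after "round half up" with the mean after R's round().

```r
# Ten illustrative values ending in exactly .5: 0.5, 1.5, ..., 9.5
x <- (0:9) + 0.5

half_up <- floor(x + 0.5)   # "grade school" rule: every half goes up
to_even <- round(x)         # R's default: halves go to the even neighbour

mean(x)         # 5
mean(half_up)   # 5.5, shifted upwards by half a unit
mean(to_even)   # 5, the ups and downs cancel
```

With real data only a fraction of the values sit on a half, so the shift would be smaller than in this deliberately extreme case, but it points in the same direction.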
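For the case where 2.5 is an exact value, the "multiply by 10 and work in integers" suggestion might look roughly like the sketch below. It assumes non-negative amounts stored as integer tenths (so 25L stands for 2.5); the variable names are hypothetical and the values are for illustration only.

```r
# Amounts held as integer tenths: 15L is 1.5, 24L is 2.4, 25L is 2.5, 35L is 3.5.
# Non-negative values are assumed; negative amounts would need separate handling.
tenths <- c(15L, 24L, 25L, 35L)

# Round half up to whole units using integer arithmetic only.
units_half_up <- (tenths + 5L) %/% 10L
units_half_up   # 2 2 3 4

# Convert back to a decimal representation only for the final printing.
sprintf("%d.%d", tenths %/% 10L, tenths %% 10L)
```

Because every step stays in integers, there is no question of whether a value is "really" 2.5 or something slightly above or below it.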