Hi, I need to compare two matrices with each other. If you can get one of them out of the other one by resorting the rows and/or the columns, then both of them are equal, otherwise they're not. A matrix could look like this: [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [1,] 0 1 1 1 0 1 1 0 [2,] 1 1 0 0 0 1 0 1 [3,] 1 0 1 0 0 0 1 1 [4,] 1 1 0 0 1 0 0 0 [5,] 1 0 1 1 1 0 0 0 [6,] 0 1 0 1 1 0 0 0 [7,] 0 0 0 0 0 1 1 1 Note that each matrix consists of ones and zeros, in each row and in each column there are at least three ones and one zero and each pair of rows/columns may have at most two positions where both are ones (e.g. for the 1. and 2. rows those positions are 2 and 6). I was advised to sort both matrices in the same way and then to compare them element by element. But I don't manage to get them sorted... My approach is as following: 1. Sort the rows after the row sums (greater sums first). 2. Sort the columns after the first column (columns with ones in the first row go left, columns with zeros go right). 3. Save the left part (all columns with ones in the first row) and the right part in separate matrices. 4. Repeat steps 2 and 3 with both of the created matrices (now taking the second row for sorting), repeat until all fragments consist of a single column. 5. Compose the columns to a sorted matrix. This algorithm has several problems: 1. How to make a loop that is branching out in two subloops on each iteration? 2. How to organize the intermediate results and compose them without losing the order? Maybe save them in lists and sublists? 3. A fundamental problem: If there are rows with equal sums the result may depend on which of them is sorted after first. Maybe this algorithm won't work at all because of this problem? Thanks in advance for your input, Michael
Hi, On Sat, Aug 22, 2009 at 2:45 PM, Michael Kogan <michael.kogan@gmx.net>wrote:> Hi, > > I need to compare two matrices with each other. If you can get one of them > out of the other one by resorting the rows and/or the columns, then both of > them are equal, otherwise they're not. A matrix could look like this: > > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] > [1,] 0 1 1 1 0 1 1 0 > [2,] 1 1 0 0 0 1 0 1 > [3,] 1 0 1 0 0 0 1 1 > [4,] 1 1 0 0 1 0 0 0 > [5,] 1 0 1 1 1 0 0 0 > [6,] 0 1 0 1 1 0 0 0 > [7,] 0 0 0 0 0 1 1 1 > > Note that each matrix consists of ones and zeros, in each row and in each > column there are at least three ones and one zero and each pair of > rows/columns may have at most two positions where both are ones (e.g. for > the 1. and 2. rows those positions are 2 and 6). > > I was advised to sort both matrices in the same way and then to compare > them element by element. But I don't manage to get them sorted... My > approach is as following: > > 1. Sort the rows after the row sums (greater sums first). > 2. Sort the columns after the first column (columns with ones in the first > row go left, columns with zeros go right). > 3. Save the left part (all columns with ones in the first row) and the > right part in separate matrices. > 4. Repeat steps 2 and 3 with both of the created matrices (now taking the > second row for sorting), repeat until all fragments consist of a single > column. > 5. Compose the columns to a sorted matrix. > > This algorithm has several problems: > > 1. How to make a loop that is branching out in two subloops on each > iteration? > 2. How to organize the intermediate results and compose them without losing > the order? Maybe save them in lists and sublists? > 3. A fundamental problem: If there are rows with equal sums the result may > depend on which of them is sorted after first. Maybe this algorithm won't > work at all because of this problem?Ouch, this seems like a real PITA. If you want to go about this by implementing the algo you described, I think you'd be best suited via some divide-and-conquer/recursion route: http://en.wikipedia.org/wiki/Divide_and_conquer_algorithm Perhaps you can take inspiration from some concrete sorting algorithms that are implemented this way: Merge sort: http://en.wikipedia.org/wiki/Merge_sort Quick sort: http://en.wikipedia.org/wiki/Quicksort Hope that helps, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact [[alternative HTML version deleted]]
On Sat, Aug 22, 2009 at 2:45 PM, Michael Kogan<michael.kogan at gmx.net> wrote:> Hi, > > I need to compare two matrices with each other. If you can get one of them > out of the other one by resorting the rows and/or the columns, then both of > them are equal, otherwise they're not. A matrix could look like this: > ? ?[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] > [1,] ? ?0 ? ?1 ? ?1 ? ?1 ? ?0 ? ?1 ? ?1 ? ?0 > [2,] ? ?1 ? ?1 ? ?0 ? ?0 ? ?0 ? ?1 ? ?0 ? ?1 > [3,] ? ?1 ? ?0 ? ?1 ? ?0 ? ?0 ? ?0 ? ?1 ? ?1 > [4,] ? ?1 ? ?1 ? ?0 ? ?0 ? ?1 ? ?0 ? ?0 ? ?0 > [5,] ? ?1 ? ?0 ? ?1 ? ?1 ? ?1 ? ?0 ? ?0 ? ?0 > [6,] ? ?0 ? ?1 ? ?0 ? ?1 ? ?1 ? ?0 ? ?0 ? ?0 > [7,] ? ?0 ? ?0 ? ?0 ? ?0 ? ?0 ? ?1 ? ?1 ? ?1 > > Note that each matrix consists of ones and zeros, in each row and in each > column there are at least three ones and one zero and each pair of > rows/columns may have at most two ?positions where both are ones (e.g. for > the 1. and 2. rows those positions are 2 and 6). > > I was advised to sort both matrices in the same way and then to compare them > element by element. But I don't manage to get them sorted... My approach is > as following: > > 1. Sort the rows after the row sums (greater sums first). > 2. Sort the columns after the first column (columns with ones in the first > row go left, columns with zeros go right). > 3. Save the left part (all columns with ones in the first row) and the right > part in separate matrices. > 4. Repeat steps 2 and 3 with both of the created matrices (now taking the > second row for sorting), repeat until all fragments consist of a single > column. > 5. Compose the columns to a sorted matrix. > > This algorithm has several problems: > > 1. How to make a loop that is branching out in two subloops on each > iteration? > 2. How to organize the intermediate results and compose them without losing > the order? Maybe save them in lists and sublists? > 3. A fundamental problem: If there are rows with equal sums the result may > depend on which of them is sorted after first. Maybe this algorithm won't > work at all because of this problem? > > Thanks in advance for your input, > Michael >Define two test matrices, m1 and m2, and then a function which sorts its argument into standardized form. Finally compare the standardized versions: m1 <- matrix(c(0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1), 7) m2 <- m1[7:1, ] stdize <- function(x) x[do.call(order, as.data.frame(x)),] identical(stdize(m1), stdize(m2))
> -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Michael Kogan > Sent: Saturday, August 22, 2009 11:45 AM > To: r-help at r-project.org > Subject: [R] Help on comparing two matrices > > Hi, > > I need to compare two matrices with each other. If you can get one of > them out of the other one by resorting the rows and/or the > columns, then > both of them are equal, otherwise they're not. A matrix could > look like > this: > > [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] > [1,] 0 1 1 1 0 1 1 0 > [2,] 1 1 0 0 0 1 0 1 > [3,] 1 0 1 0 0 0 1 1 > [4,] 1 1 0 0 1 0 0 0 > [5,] 1 0 1 1 1 0 0 0 > [6,] 0 1 0 1 1 0 0 0 > [7,] 0 0 0 0 0 1 1 1 > > Note that each matrix consists of ones and zeros, in each row and in > each column there are at least three ones and one zero and > each pair of > rows/columns may have at most two positions where both are > ones (e.g. > for the 1. and 2. rows those positions are 2 and 6). > > I was advised to sort both matrices in the same way and then > to compare > them element by element. But I don't manage to get them sorted... My > approach is as following: > > 1. Sort the rows after the row sums (greater sums first). > 2. Sort the columns after the first column (columns with ones in the > first row go left, columns with zeros go right). > 3. Save the left part (all columns with ones in the first > row) and the > right part in separate matrices. > 4. Repeat steps 2 and 3 with both of the created matrices (now taking > the second row for sorting), repeat until all fragments consist of a > single column. > 5. Compose the columns to a sorted matrix. > > This algorithm has several problems: > > 1. How to make a loop that is branching out in two subloops on each > iteration? > 2. How to organize the intermediate results and compose them without > losing the order? Maybe save them in lists and sublists? > 3. A fundamental problem: If there are rows with equal sums > the result > may depend on which of them is sorted after first. Maybe this > algorithm > won't work at all because of this problem? > > Thanks in advance for your input, > Michael >Michael, I see that you have received a number of responses to your request, and you may have already solved your problem. It seems to me that the responses so far don't guarantee finding a match if both row and column exchanges are necessary. I put together some code as an R learning task for me. It is only partially automated. Someone with more R programming skills than me could probably totally automate this process. I think your initial plan of sorting by row total was moving in the right direction. I took your original matrix, m, and created a matrix, x, as a random ordering of rows and columns of m. Then sorted m and x by both row and column totals. Next, all possible orderings of the ordered x matrix, x.o, were constructed by permuting rows (and columns) with the same row ( or column) totals and were compared to the ordered m matrix, m.o . I used the permutations function from the gtools package, so you would need to install and load that package. If a match was found, TRUE was printed out. A better R programmer than me could probably finish automating the entire process, wrapping it all up in a nice function, but this at least gives you a framework for a solution. require(gtools) m <- matrix(c( 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1), 7, 8, byrow = TRUE) ##----create comparison matrix by randomly ordering rows and columns (should compare TRUE)----## i <- sample(1:nrow(m)) j <- sample(1:ncol(m)) x <- m[i,j] ##----if row and column totals aren't the same, then matrices can't be "equal"----## # no need to proceed if false identical(sort(apply(m,1,sum)),sort(apply(x,1,sum))) & identical(sort(apply(m,2,sum)),sort(apply(x,2,sum))) ##----Order both arrays by row and column totals----## m.o <- m[order(apply(m,1,sum)),order(apply(m,2,sum))] x.o <- x[order(apply(x,1,sum)),order(apply(x,2,sum))] ##----build row permutation list----## # these are groups of rows that have same sum rowtot <- rle(apply(x.o,1,sum))$lengths rb <- list() begin <- 1 for(i in 1:length(rowtot)){ rb[[i]] <- permutations(rowtot[i],rowtot[i],v=begin:cumsum(rowtot)[i]) begin <- begin + rowtot[i] } # construction of rperm could (should) be automated rperm <- cbind(rb[[1]][rep(1:nrow(rb[[1]]), each = nrow(rb[[2]])), ], rb[[2]][rep(1:nrow(rb[[2]]), nrow(rb[[1]])), ] ) rperm <- cbind(rperm[rep(1:nrow(rperm), each = nrow(rb[[3]])), ], rb[[3]][rep(1:nrow(rb[[3]]), nrow(rperm)), ] ) ##----build column permutation list----## # these are groups of columns that have same sum coltot <- rle(apply(x.o,2,sum))$lengths cb <- list() begin <- 1 for(i in 1:length(coltot)){ cb[[i]] <- permutations(coltot[i],coltot[i],v=begin:cumsum(coltot)[i]) begin <- begin + coltot[i] } # construction of cperm could (should) be automated cperm <- cbind(cb[[1]][rep(1:nrow(cb[[1]]), each = nrow(cb[[2]])), ], cb[[2]][rep(1:nrow(cb[[2]]), nrow(cb[[1]])), ] ) ##----compare m.o with all x.o where rows and columns of x.o with tied totals are permuted----## for(i in 1:nrow(rperm)){ for(j in 1:nrow(cperm)){ if(identical(m.o,x.o[rperm[i,],cperm[j,]])) { cat('TRUE','\n') break } } } Hope this is helpful, Dan Daniel Nordlund Bothell, WA USA