If I run "reboot" on the A node, there are no .snap files on the A node after reboot.
Does the snap file only appear after an unexpected reboot?
Why is its size 0 bytes?
In this situation, is removing the snap file the right way to solve this problem?

thanks
xin

Sent from my iPhone

> On Feb 25, 2016, at 19:05, Atin Mukherjee <amukherj at redhat.com> wrote:
>
> + Rajesh, Avra
>
>> On 02/25/2016 04:12 PM, songxin wrote:
>> Thanks for your reply.
>>
>> Do I need to check all files in /var/lib/glusterd/*?
>> Must all files be the same on the A node and the B node?
> Yes, they should be identical.
>>
>> I found that the size of the
>> file /var/lib/glusterd/snaps/.nfs0000000001722f4000000002 is 0 bytes
>> after the A board reboots,
>> so glusterd can't restore from this snap file on the A node.
>> Is that right?
> Yes, looks like that.
>>
>> At 2016-02-25 18:25:50, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>>> I believe you and Abhishek are from the same group and sharing the
>>> common set up. Could you check that the content of /var/lib/glusterd/* on
>>> board B (post reboot and before starting glusterd) matches
>>> /var/lib/glusterd/* from board A?
>>>
>>> ~Atin
>>>
>>>> On 02/25/2016 03:48 PM, songxin wrote:
>>>> Hi,
>>>> I have the problem described below when I start gluster after rebooting a board.
>>>>
>>>> Precondition:
>>>> I use two boards for this test.
>>>> The version of glusterfs is 3.7.6.
>>>>
>>>> A board ip: 128.224.162.255
>>>> B board ip: 128.224.95.140
>>>>
>>>> Reproduce steps:
>>>>
>>>> 1. systemctl start glusterd (A board)
>>>> 2. systemctl start glusterd (B board)
>>>> 3. gluster peer probe 128.224.95.140 (A board)
>>>> 4. gluster volume create gv0 replica 2 128.224.95.140:/tmp/brick1/gv0
>>>>    128.224.162.255:/data/brick/gv0 force (local board)
>>>> 5. gluster volume start gv0 (A board)
>>>> 6. Press the reset button on the A board. It is a development board, so it
>>>>    has a reset button similar to the reset button on a PC. (A board)
>>>> 7. Run "systemctl start glusterd" after the A board reboots.
>>>>    The command failed because of the file /var/lib/glusterd/snaps/.nfsxxxxxxxxx
>>>>    (local board).
>>>>    The log is as below.
>>>>    [2015-12-07 07:55:38.260084] E [MSGID: 101032]
>>>>    [store.c:434:gf_store_handle_retrieve] 0-: Path corresponding to
>>>>    /var/lib/glusterd/snaps/.nfs0000000001722f4000000002
>>>>    [2015-12-07 07:55:38.260120] D [MSGID: 0]
>>>>    [store.c:439:gf_store_handle_retrieve] 0-: Returning -1
>>>>    [2015-12-07 07:55:38.260152] E [MSGID: 106200]
>>>>    [glusterd-store.c:3332:glusterd_store_update_snap] 0-management: snap
>>>>    handle is NULL
>>>>    [2015-12-07 07:55:38.260180] E [MSGID: 106196]
>>>>    [glusterd-store.c:3427:glusterd_store_retrieve_snap] 0-management:
>>>>    Failed to update snapshot for .nfs0000000001722f40
>>>>    [2015-12-07 07:55:38.260208] E [MSGID: 106043]
>>>>    [glusterd-store.c:3589:glusterd_store_retrieve_snaps] 0-management:
>>>>    Unable to restore snapshot: .nfs0000000001722f400
>>>>    [2015-12-07 07:55:38.260241] D [MSGID: 0]
>>>>    [glusterd-store.c:3607:glusterd_store_retrieve_snaps] 0-management:
>>>>    Returning with -1
>>>>    [2015-12-07 07:55:38.260268] D [MSGID: 0]
>>>>    [glusterd-store.c:4339:glusterd_restore] 0-management: Returning -1
>>>>    [2015-12-07 07:55:38.260325] E [MSGID: 101019]
>>>>    [xlator.c:428:xlator_init] 0-management: Initialization of volume
>>>>    'management' failed, review your volfile again
>>>>    [2015-12-07 07:55:38.260355] E [graph.c:322:glusterfs_graph_init]
>>>>    0-management: initializing translator failed
>>>>    [2015-12-07 07:55:38.260374] E [graph.c:661:glusterfs_graph_activate]
>>>>    0-graph: init failed
>>>> 8. rm /var/lib/glusterd/snaps/.nfsxxxxxxxxx (A board)
>>>> 9. Run "systemctl start glusterd" again; this time it succeeds.
>>>> 10. At this point the peer status is Peer in Cluster (Connected) and all
>>>>     processes are online.
>>>>
>>>> If a node resets abnormally, must I remove
>>>> /var/lib/glusterd/snaps/.nfsxxxxxx before starting glusterd?
>>>>
>>>> I want to know whether this is normal.
>>>>
>>>> Thanks,
>>>> Xin
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
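Atin's suggested comparison of /var/lib/glusterd/* across the two boards can be scripted. A minimal sketch, assuming both boards run GNU coreutils and that checksumming every file is an acceptable way to compare (the function name and output paths are placeholders, not from the thread):

```shell
#!/bin/sh
# Sketch: produce a sorted md5 listing of every file under the glusterd
# state directory, so listings from board A and board B can be diffed.
# Run on each board with glusterd stopped; default path is from the thread.
glusterd_state_sum() {
    state_dir=${1:-/var/lib/glusterd}
    find "$state_dir" -type f -print0 | sort -z | xargs -0 md5sum
}

# On each board:  glusterd_state_sum > /tmp/glusterd-state.md5
# Then copy one listing over (scp) and diff the two files; any differing
# line points at a file that is not identical across the nodes.
```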
Hi,

/var/lib/glusterd/snaps/ contains only one file, called missed_snaps_list. Apart from that, it contains only directories created with the snap names.

Is the .nfs0000000001722f4000000002 that you saw in /var/lib/glusterd a file or a directory? If it's a file, then it was not placed there as part of snapshotting any volume. If it's a directory, did you try creating a snapshot with such a name?

Regards,
Avra

On 02/25/2016 05:10 PM, songxin wrote:
> If I run "reboot" on the A node, there are no .snap files on the A node after reboot.
> Does the snap file only appear after an unexpected reboot?
> Why is its size 0 bytes?
> In this situation, is removing the snap file the right way to solve this problem?
>
> thanks
> xin
>
> [rest of quoted thread snipped; it repeats the message above verbatim]
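Avra's file-or-directory question can be answered directly on the board. A minimal sketch (the path is the one from the log; the helper name is mine):

```shell
#!/bin/sh
# Sketch: report whether the stray snaps/ entry is a file or a directory,
# and its size in bytes if it is a file.
classify_entry() {
    p=$1
    if [ -d "$p" ]; then
        echo "directory"
    elif [ -f "$p" ]; then
        echo "file, $(wc -c < "$p" | tr -d ' ') bytes"
    else
        echo "absent"
    fi
}

classify_entry /var/lib/glusterd/snaps/.nfs0000000001722f4000000002
```

If it reports a zero-byte file, that fits the log: glusterd tries to restore a snapshot from it and fails. As an aside, names of the form .nfsXXXXXXXX are typically created by an NFS client's "silly rename" when an open file is deleted; whether that is what happened here is my assumption, not something established in the thread.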
Hi,

Precondition:
A node: 128.224.95.140
B node: 128.224.162.255
brick on A node: /data/brick/gv0
brick on B node: /data/brick/gv0

Reproduce steps:
1. gluster peer probe 128.224.162.255 (on A node)
2. gluster volume create gv0 replica 2 128.224.95.140:/data/brick/gv0
   128.224.162.255:/data/brick/gv0 force (on A node)
3. gluster volume start gv0 (on A node)
4. mount -t glusterfs 128.224.95.140:/gv0 gluster (on A node)
5. Create some files (a, b, c) in the dir gluster. (on A node)
6. Shut down the B node.
7. Change the files (a, b, c) in the dir gluster. (on A node)
8. Reboot the B node.
9. Start glusterd on the B node, but glusterfsd is offline. (on B node)
10. gluster volume remove-brick gv0 replica 1 128.224.162.255:/data/brick/gv0 force (on A node)
11. gluster volume add-brick gv0 replica 2 128.224.162.255:/data/brick/gv0 force (on A node)

Now the files are not the same between the two bricks.

12. "gluster volume heal gv0 info" shows the entry count is 0. (on A node)

What should I do now if I want to sync the files (a, b, c) across the two bricks?
I know that "heal full" can work, but I think that command takes too long.
So I ran "tail -n 1 file" on every file on the A node, but some files were synced and some were not.

My questions are:
1. Why can't tail sync all the files?
2. Can the command "tail -n 1 filename" trigger self-heal, just like "ls -l filename"?

Thanks,
Xin
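The per-file approach Xin is attempting with "tail -n 1" amounts to forcing a client-side lookup on each file. A minimal sketch of doing that systematically, assuming the volume is mounted at ./gluster as in step 4 (that a plain stat from the mount is enough to schedule a heal is my assumption from how replicate self-heal is usually described, not something confirmed in this thread):

```shell
#!/bin/sh
# Sketch: stat every entry under the client mount point, forcing a lookup
# on each file so the replicate translator can notice brick mismatches.
touch_all() {
    mount_point=${1:-./gluster}
    find "$mount_point" -mindepth 1 -exec stat {} + > /dev/null
}

# Usage (on the A node, against the glusterfs mount):
#   touch_all ./gluster
```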