If I run "reboot" on the A node, there are no .snap files on the A node after reboot.
Does the snap file only appear after an unexpected reboot?
Why is its size 0 bytes?
In this situation, is removing the snap file the right way to solve this problem?

thanks
xin

Sent from my iPhone

> On Feb 25, 2016, at 19:05, Atin Mukherjee <amukherj at redhat.com> wrote:
>
> + Rajesh, Avra
>
>> On 02/25/2016 04:12 PM, songxin wrote:
>> Thanks for your reply.
>>
>> Do I need to check all files in /var/lib/glusterd/*?
>> Must all files be the same on the A node and the B node?
> Yes, they should be identical.
>>
>> I found that the size of the
>> file /var/lib/glusterd/snaps/.nfs0000000001722f4000000002 is 0 bytes
>> after the A board reboots,
>> so glusterd can't restore from this snap file on the A node.
>> Is that right?
> Yes, looks like that.
>>
>> At 2016-02-25 18:25:50, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>>> I believe you and Abhishek are from the same group and sharing the
>>> common set up. Could you check that the content of /var/lib/glusterd/* on
>>> board B (post reboot and before starting glusterd) matches
>>> /var/lib/glusterd/* from board A?
>>>
>>> ~Atin
>>>
>>>> On 02/25/2016 03:48 PM, songxin wrote:
>>>> Hi,
>>>> I have the problem described below when I start gluster after rebooting a board.
>>>>
>>>> Precondition:
>>>> I use two boards for this test.
>>>> The version of glusterfs is 3.7.6.
>>>>
>>>> A board ip: 128.224.162.255
>>>> B board ip: 128.224.95.140
>>>>
>>>> Reproduce steps:
>>>>
>>>> 1. systemctl start glusterd (A board)
>>>> 2. systemctl start glusterd (B board)
>>>> 3. gluster peer probe 128.224.95.140 (A board)
>>>> 4. gluster volume create gv0 replica 2 128.224.95.140:/tmp/brick1/gv0
>>>>    128.224.162.255:/data/brick/gv0 force (local board)
>>>> 5. gluster volume start gv0 (A board)
>>>> 6. Press the reset button on the A board. It is a development board, so it
>>>>    has a reset button similar to the reset button on a PC. (A board)
>>>> 7. Run "systemctl start glusterd" after the A board reboots.
>>>>    The command failed because of the file /var/lib/glusterd/snaps/.nfsxxxxxxxxx
>>>>    (local board).
>>>>    The log is as below.
>>>>    [2015-12-07 07:55:38.260084] E [MSGID: 101032]
>>>>    [store.c:434:gf_store_handle_retrieve] 0-: Path corresponding to
>>>>    /var/lib/glusterd/snaps/.nfs0000000001722f4000000002
>>>>    [2015-12-07 07:55:38.260120] D [MSGID: 0]
>>>>    [store.c:439:gf_store_handle_retrieve] 0-: Returning -1
>>>>    [2015-12-07 07:55:38.260152] E [MSGID: 106200]
>>>>    [glusterd-store.c:3332:glusterd_store_update_snap] 0-management: snap
>>>>    handle is NULL
>>>>    [2015-12-07 07:55:38.260180] E [MSGID: 106196]
>>>>    [glusterd-store.c:3427:glusterd_store_retrieve_snap] 0-management:
>>>>    Failed to update snapshot for .nfs0000000001722f40
>>>>    [2015-12-07 07:55:38.260208] E [MSGID: 106043]
>>>>    [glusterd-store.c:3589:glusterd_store_retrieve_snaps] 0-management:
>>>>    Unable to restore snapshot: .nfs0000000001722f400
>>>>    [2015-12-07 07:55:38.260241] D [MSGID: 0]
>>>>    [glusterd-store.c:3607:glusterd_store_retrieve_snaps] 0-management:
>>>>    Returning with -1
>>>>    [2015-12-07 07:55:38.260268] D [MSGID: 0]
>>>>    [glusterd-store.c:4339:glusterd_restore] 0-management: Returning -1
>>>>    [2015-12-07 07:55:38.260325] E [MSGID: 101019]
>>>>    [xlator.c:428:xlator_init] 0-management: Initialization of volume
>>>>    'management' failed, review your volfile again
>>>>    [2015-12-07 07:55:38.260355] E [graph.c:322:glusterfs_graph_init]
>>>>    0-management: initializing translator failed
>>>>    [2015-12-07 07:55:38.260374] E [graph.c:661:glusterfs_graph_activate]
>>>>    0-graph: init failed
>>>> 8. rm /var/lib/glusterd/snaps/.nfsxxxxxxxxx (A board)
>>>> 9. Run "systemctl start glusterd" again; this time it succeeds.
>>>> 10. At this point the peer status is Peer in Cluster (Connected) and all
>>>>     processes are online.
>>>>
>>>> If a node resets abnormally, must I remove
>>>> /var/lib/glusterd/snaps/.nfsxxxxxx before starting glusterd?
>>>>
>>>> I want to know whether this is normal.
>>>>
>>>> Thanks,
>>>> Xin
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
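Atin's suggested comparison of /var/lib/glusterd/* across the two boards can be scripted. A minimal sketch, assuming both boards run GNU coreutils and that checksumming every file is an acceptable way to compare (the function name and output paths are placeholders, not from the thread):

```shell
#!/bin/sh
# Sketch: produce a sorted md5 listing of every file under the glusterd
# state directory, so listings from board A and board B can be diffed.
# Run on each board with glusterd stopped; default path is from the thread.
glusterd_state_sum() {
    state_dir=${1:-/var/lib/glusterd}
    find "$state_dir" -type f -print0 | sort -z | xargs -0 md5sum
}

# On each board:  glusterd_state_sum > /tmp/glusterd-state.md5
# Then copy one listing over (scp) and diff the two files; any differing
# line points at a file that is not identical across the nodes.
```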
Hi,

/var/lib/glusterd/snaps/ contains only one file, called missed_snaps_list. Apart from that, it contains only directories created with the snap names.

Is the .nfs0000000001722f4000000002 that you saw in /var/lib/glusterd a file or a directory? If it's a file, then it was not placed there as part of snapshotting any volume. If it's a directory, did you try creating a snapshot with such a name?

Regards,
Avra

On 02/25/2016 05:10 PM, songxin wrote:
> If I run "reboot" on the A node, there are no .snap files on the A node after reboot.
> Does the snap file only appear after an unexpected reboot?
> Why is its size 0 bytes?
> In this situation, is removing the snap file the right way to solve this problem?
>
> thanks
> xin
>
> [rest of quoted thread snipped; it repeats the message above verbatim]
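Avra's file-or-directory question can be answered directly on the board. A minimal sketch (the path is the one from the log; the helper name is mine):

```shell
#!/bin/sh
# Sketch: report whether the stray snaps/ entry is a file or a directory,
# and its size in bytes if it is a file.
classify_entry() {
    p=$1
    if [ -d "$p" ]; then
        echo "directory"
    elif [ -f "$p" ]; then
        echo "file, $(wc -c < "$p" | tr -d ' ') bytes"
    else
        echo "absent"
    fi
}

classify_entry /var/lib/glusterd/snaps/.nfs0000000001722f4000000002
```

If it reports a zero-byte file, that fits the log: glusterd tries to restore a snapshot from it and fails. As an aside, names of the form .nfsXXXXXXXX are typically created by an NFS client's "silly rename" when an open file is deleted; whether that is what happened here is my assumption, not something established in the thread.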
Hi,

Precondition:
A node: 128.224.95.140
B node: 128.224.162.255
brick on A node: /data/brick/gv0
brick on B node: /data/brick/gv0

Reproduce steps:
1. gluster peer probe 128.224.162.255 (on A node)
2. gluster volume create gv0 replica 2 128.224.95.140:/data/brick/gv0
   128.224.162.255:/data/brick/gv0 force (on A node)
3. gluster volume start gv0 (on A node)
4. mount -t glusterfs 128.224.95.140:/gv0 gluster (on A node)
5. Create some files (a, b, c) in the dir gluster. (on A node)
6. Shut down the B node.
7. Change the files (a, b, c) in the dir gluster. (on A node)
8. Reboot the B node.
9. Start glusterd on the B node, but glusterfsd is offline. (on B node)
10. gluster volume remove-brick gv0 replica 1 128.224.162.255:/data/brick/gv0 force (on A node)
11. gluster volume add-brick gv0 replica 2 128.224.162.255:/data/brick/gv0 force (on A node)

Now the files are not the same between the two bricks.

12. "gluster volume heal gv0 info" shows the entry count is 0. (on A node)

What should I do now if I want to sync the files (a, b, c) across the two bricks?
I know that "heal full" can work, but I think that command takes too long.
So I ran "tail -n 1 file" on every file on the A node, but some files were synced and some were not.

My questions are:
1. Why can't tail sync all the files?
2. Can the command "tail -n 1 filename" trigger self-heal, just like "ls -l filename"?

Thanks,
Xin
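The per-file approach Xin is attempting with "tail -n 1" amounts to forcing a client-side lookup on each file. A minimal sketch of doing that systematically, assuming the volume is mounted at ./gluster as in step 4 (that a plain stat from the mount is enough to schedule a heal is my assumption from how replicate self-heal is usually described, not something confirmed in this thread):

```shell
#!/bin/sh
# Sketch: stat every entry under the client mount point, forcing a lookup
# on each file so the replicate translator can notice brick mismatches.
touch_all() {
    mount_point=${1:-./gluster}
    find "$mount_point" -mindepth 1 -exec stat {} + > /dev/null
}

# Usage (on the A node, against the glusterfs mount):
#   touch_all ./gluster
```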