Kris Laib
2016-Jan-27 16:09 UTC
[Gluster-users] How to maintain HA using NFS clients if the NFS daemon process gets killed on a gluster node?
Hi all,

We're getting ready to roll out Gluster using standard NFS from the
clients, with CTDB and RRDNS to help facilitate HA. I thought we were
good to go, but we recently had an issue where there wasn't enough
memory on one of the gluster nodes in a test cluster, and the OOM
killer took out the NFS daemon process. Since there was still IP
traffic between nodes and the gluster-backed local CTDB mount holding
the lock file was intact, CTDB didn't kick in and initiate failover,
and all clients connected to the node where NFS was killed lost their
connections.

We'll obviously fix the lack of memory, but going forward, how can we
protect clients from being disconnected if the NFS daemon is stopped
for any reason?

Our cluster is 3 nodes: 1 is a silent witness node to help with split
brain, and the other 2 host the volumes, with one brick per node and
1x2 replication.

Is there something incorrect about my setup, or is this a known
downfall of using standard NFS mounts with gluster?

Thanks,

Kris
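In case it helps frame the question: CTDB decides node health through
its event scripts, and exiting non-zero from the "monitor" event marks
a node unhealthy so its public IPs fail over. A minimal sketch of a
custom monitor check is below; the file name/path and the rpcinfo
probe are my assumptions, not something we currently run:

    #!/bin/sh
    # Hypothetical CTDB event script, e.g. /etc/ctdb/events.d/61.glusternfs
    # (the "61" prefix and the events.d path are assumptions; adjust for
    # your CTDB layout). CTDB passes the event name as $1; a non-zero
    # exit from "monitor" flags this node unhealthy and triggers IP
    # failover.

    case "$1" in
    monitor)
        # Check that the NFS service is actually answering RPC calls,
        # not merely that inter-node IP traffic is flowing.
        if ! rpcinfo -t localhost nfs 3 >/dev/null 2>&1 ; then
            echo "NFS is not responding to RPC; marking node unhealthy"
            exit 1
        fi
        ;;
    esac

    exit 0

Would something along these lines be the recommended way to cover the
dead-NFS-daemon case?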
Soumya Koduri
2016-Jan-28 04:15 UTC
[Gluster-users] How to maintain HA using NFS clients if the NFS daemon process gets killed on a gluster node?
On 01/27/2016 09:39 PM, Kris Laib wrote:
> Hi all,
>
> We're getting ready to roll out Gluster using standard NFS from the
> clients, with CTDB and RRDNS to help facilitate HA. I thought we were
> good to go, but we recently had an issue where there wasn't enough
> memory on one of the gluster nodes in a test cluster, and the OOM
> killer took out the NFS daemon process. Since there was still IP
> traffic between nodes and the gluster-backed local CTDB mount holding
> the lock file was intact, CTDB didn't kick in and initiate failover,
> and all clients connected to the node where NFS was killed lost their
> connections.

For gluster-NFS, CTDB is typically configured to maintain high
availability, and I guess you have done the same. Could you check why
CTDB hasn't initiated IP failover?

An alternative is to use nfs-ganesha [1][2] to provide NFS support for
gluster volumes; it can be configured to maintain HA using the gluster
CLI.

Thanks,
Soumya

[1] http://blog.gluster.org/2015/10/linux-scale-out-nfsv4-using-nfs-ganesha-and-glusterfs-one-step-at-a-time/
[2] http://gluster.readthedocs.org/en/latest/Administrator%20Guide/NFS-Ganesha%20GlusterFS%20Intergration/
    (section: Using Highly Available Active-Active NFS-Ganesha And
    GlusterFS cli)

> We'll obviously fix the lack of memory, but going forward, how can we
> protect clients from being disconnected if the NFS daemon is stopped
> for any reason?
>
> Our cluster is 3 nodes: 1 is a silent witness node to help with split
> brain, and the other 2 host the volumes, with one brick per node and
> 1x2 replication.
>
> Is there something incorrect about my setup, or is this a known
> downfall of using standard NFS mounts with gluster?
>
> Thanks,
>
> Kris
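For reference, the basic nfs-ganesha HA setup from [2] looks roughly
like the sketch below. The hostnames, VIPs, cluster name, and volume
name are placeholders, and the exact options can differ between
gluster/nfs-ganesha versions, so please check the docs for your
release:

    # /etc/ganesha/ganesha-ha.conf (placeholder values)
    HA_NAME="ganesha-ha-demo"           # name for the HA cluster
    HA_VOL_SERVER="server1"             # node the gluster CLI talks to
    HA_CLUSTER_NODES="server1,server2"  # nodes participating in HA
    VIP_server1="10.0.0.101"            # virtual IP that fails over
    VIP_server2="10.0.0.102"

    # Bring up the ganesha HA cluster, then export a volume over NFS:
    gluster nfs-ganesha enable
    gluster volume set <volname> ganesha.enable on

With this in place, clients mount the virtual IPs, and if the
nfs-ganesha daemon on one node dies, its VIP moves to a surviving
node.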