Peter Spain
2016-Feb-03 14:33 UTC
[Gluster-users] Gluster FS Native Client Behaviour (3.7)
Hello

I am very new to GlusterFS and have been playing around with it over the last few weeks, with a view to using it in production. So far I have found Gluster to be very interesting and easy to get along with. However, there seems to be a giant hole where the Gluster Native Client documentation should live.

After using it for a few weeks and playing around (inside various VMs) I am still not entirely sure how the client behaves.

From network captures it is clear that the client communicates with all the nodes for a particular volume, and that the client gets this information from a volfile (which it retrieves when mounting a volume). Various blog posts confirm this and go on to mention that the client is responsible for replicating data across nodes, and not the nodes themselves. I assume this is still the case?

Beyond that I really have no idea how the client behaves in a replicated volume. My questions are:

There is a "ping-timeout" option to adjust how long it takes the client to connect to a different node, in case of node failure. If the client knows about all nodes and actively communicates with all of them, why does it need a timeout at all?

Why does the client "stick" to a particular node?

Does the client go back to the original node once it recovers?

Is it possible to dictate which node a client will initially connect to on mounting a volume?

If all this information is contained in some documentation I would love to be pointed to it, as so far I cannot find the answer to these questions.

Regards

Peter
On Wed, Feb 3, 2016 at 8:03 PM, Peter Spain <pspain at gmail.com> wrote:
> Hello
>
> I am very new to GlusterFS and have been playing around with over the last
> few weeks, with a view to using it in production. So far I found Gluster to
> be very interesting and easy to get along with. However, there seems to be a
> giant hole where the Gluster Native Client documentation should live.
>
> After using it for a few weeks and playing around (inside various VMs) I am
> still not entirely sure how the client behaves.
>
> From network captures it is clear that the client communicates to all the
> nodes for a particular volume, and that the client gets this information
> from a volfile (which it retrieves when mounting a volume). Various blog
> posts confirm this and go on to mention that the client is responsible for
> replicating data across nodes, and not the nodes themselves. I assume this
> is still the case?

Yes. This is how GlusterFS works.

> Beyond that I really have no idea how the client behaves in a replicated
> volume. My questions are:
>
> There is a "ping-timeout" option to adjust how long it takes the client to
> connect to a different node, in case of node failure. If the client knows
> about all nodes and actively communicates with all of them, why does it need
> a time out at all?

'ping-timeout' defines the time the client will wait for a server to reply, before assuming the node is down and disconnecting from it.

> Why does the client "stick" to a particular node?

When using replication, the client reads from the first node that responds. So it sticks to this node for further reads. I'm not sure if the replicate xlator does the selection for each file operation or not. But for writes, the client makes sure writes happen to both servers before replying to the applications. So if one of the servers isn't responding, the operation doesn't return till the ping-timeout happens.

> Does the client go back to the original node once it recovers?

If the original node comes back up, and replies quicker than the other nodes, the client will use it.

> Is it possible to dictate which node a client will initially connect to on
> mounting a volume?

When you mount a volume, you give the address only for fetching the volfile. The client then connects to all servers in the volfile.

> If all this information is contained in some documentation I would love to
> be pointed to it, as so far I cannot find the answer to these questions.
>
> Regards
>
> Peter
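The write behaviour described in the reply above can be pictured with a small toy model. This is only a sketch, not AFR code: the `Brick` class, the `replicated_write` function and the reply times are made up for illustration, and the model assumes writes are sent to all replicas in parallel and that an unresponsive brick is given up on exactly when ping-timeout expires. The 42-second figure is the default value of the `network.ping-timeout` volume option.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Default of the network.ping-timeout volume option, in seconds.
DEFAULT_PING_TIMEOUT = 42.0


@dataclass
class Brick:
    """Hypothetical stand-in for one replica brick (not a Gluster type)."""
    name: str
    reply_after: Optional[float]  # seconds until it acks; None = never replies


def replicated_write(bricks: List[Brick],
                     ping_timeout: float = DEFAULT_PING_TIMEOUT
                     ) -> Tuple[float, List[str]]:
    """Return (seconds until the write returns, names of bricks that acked).

    Assumes the write goes to every brick in parallel, so the caller waits
    for the slowest ack, or for ping-timeout if a brick never answers.
    """
    waited = 0.0
    acked = []
    for brick in bricks:
        if brick.reply_after is not None and brick.reply_after <= ping_timeout:
            acked.append(brick.name)
            waited = max(waited, brick.reply_after)
        else:
            # No reply: the client only gives up on this brick at ping-timeout.
            waited = max(waited, ping_timeout)
    return waited, acked


healthy = [Brick("server1", 0.002), Brick("server2", 0.003)]
degraded = [Brick("server1", 0.002), Brick("server2", None)]

print(replicated_write(healthy))   # (0.003, ['server1', 'server2'])
print(replicated_write(degraded))  # (42.0, ['server1'])
```

The model covers only the first operation issued while a server is silently unresponsive. Once the client has marked that brick as disconnected, later operations would not be expected to wait for it again.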
Ravishankar N
2016-Feb-03 15:53 UTC
[Gluster-users] Gluster FS Native Client Behaviour (3.7)
On 02/03/2016 08:03 PM, Peter Spain wrote:
> Hello
>
> I am very new to GlusterFS and have been playing around with over the
> last few weeks, with a view to using it in production. So far I found
> Gluster to be very interesting and easy to get along with. However,
> there seems to be a giant hole where the Gluster Native Client
> documentation should live.
>
> After using it for a few weeks and playing around (inside various VMs)
> I am still not entirely sure how the client behaves.
>
> From network captures it is clear that the client communicates to all
> the nodes for a particular volume, and that the client gets this
> information from a volfile (which it retrieves when mounting a
> volume). Various blog posts confirm this and go on to mention that the
> client is responsible for replicating data across nodes, and not the
> nodes themselves. I assume this is still the case?

That is correct. The client volfile dictates which translators (xlators) are loaded in the client process. You can look at the various volfiles in /var/lib/glusterd/vols/<volname> on the server to get an idea of which xlators are loaded for the different processes and how they are stacked to form a graph. The graph is also printed in the log file of each process as it starts. Each xlator performs a specific function. The replication xlator (AFR) is a client-side xlator that is loaded (among others) in the client graph for replicated volumes, and it contains the replication logic.

> Beyond that I really have no idea how the client behaves in a
> replicated volume.

If you specifically want to know how AFR works, you can see https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md. It is a bit dated, but most of it is still valid.

> My questions are:
>
> There is a "ping-timeout" option to adjust how long it takes the
> client to connect to a different node, in case of node failure.

Actually, it is the time duration for which the client waits to check if the server is responsive (see `gluster volume help`), even after the connection is established.

> If the client knows about all nodes and actively communicates with all
> of them, why does it need a time out at all?

You need some sort of heartbeat for a client to know whether its connections to the servers are still intact, or have been lost because a brick went down, there was a network disconnect, etc.

> Why does the client "stick" to a particular node?

Not sure what you mean. The client process connects to all the bricks of the volume. Open one of the client volfiles (or the log file) and study the graph; you'll see many 'protocol/client' xlators, one for each brick that the client needs to talk to.

> Does the client go back to the original node once it recovers?
>
> Is it possible to dictate which node a client will initially connect
> to on mounting a volume?

Like I said, the client connects to all bricks of the volume.

> If all this information is contained in some documentation I would
> love to be pointed to it, as so far I cannot find the answer to these
> questions.

There are many presentations on the glusterfs architecture that you can find online. https://gluster.readthedocs.org/en/latest/ is another good place to start.

Hope that helps,
Ravi

> Regards
>
> Peter
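To make the suggestion of studying the client graph concrete, here is a minimal sketch that reads a volfile and lists the 'protocol/client' xlators, one per brick. The sample volfile is hypothetical and heavily trimmed (the volume name `testvol`, the hosts and the brick paths are invented, and real client volfiles contain more xlators and options). The parser is an illustration of the `volume` / `type` / `option` / `subvolumes` / `end-volume` layout only, not Gluster's own parser.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical, heavily trimmed client volfile for a two-brick replica volume.
SAMPLE_VOLFILE = """
volume testvol-client-0
    type protocol/client
    option remote-host server1
    option remote-subvolume /bricks/b1
end-volume

volume testvol-client-1
    type protocol/client
    option remote-host server2
    option remote-subvolume /bricks/b1
end-volume

volume testvol-replicate-0
    type cluster/replicate
    subvolumes testvol-client-0 testvol-client-1
end-volume
"""


def parse_volfile(text: str) -> List[Dict]:
    """Turn each 'volume ... end-volume' block into a dict describing one xlator."""
    xlators: List[Dict] = []
    current: Optional[Dict] = None
    for raw in text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        key = words[0]
        if key == "volume":
            current = {"name": words[1], "type": None,
                       "options": {}, "subvolumes": []}
        elif current is None:
            continue  # ignore anything outside a volume block
        elif key == "end-volume":
            xlators.append(current)
            current = None
        elif key == "type":
            current["type"] = words[1]
        elif key == "option":
            current["options"][words[1]] = " ".join(words[2:])
        elif key == "subvolumes":
            current["subvolumes"] = words[1:]
    return xlators


def brick_connections(xlators: List[Dict]) -> List[Tuple[str, str]]:
    """Return (host, brick path) for every protocol/client xlator in the graph."""
    return [(x["options"].get("remote-host"),
             x["options"].get("remote-subvolume"))
            for x in xlators if x["type"] == "protocol/client"]


graph = parse_volfile(SAMPLE_VOLFILE)
for host, brick in brick_connections(graph):
    print(f"{host}:{brick}")
# server1:/bricks/b1
# server2:/bricks/b1
```

Pointing `parse_volfile` at the text of a real client volfile from /var/lib/glusterd/vols/<volname> should give one entry per brick, matching the point that the client talks to every brick rather than to a single chosen node.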