Highly Available NFS-based Kerberos KDC, a.k.a. Ganesha + GlusterFS + HAProxy.

NFS, the Network File System created by Sun Microsystems, has been around for a while, and like any tool, it is very powerful when used properly. NFS is still used heavily across tens of thousands of systems. However, load balancing NFS is a real pain, especially when it comes to the locked-mounts issue. In this post I'll explain how to create a highly available NFS server for Kerberos KDCs.

Why is NFS so popular?

It's not meant for sharing large files or for providing backups. It serves a very simple human need: interacting with digital systems efficiently on a day-to-day basis. It stores small to medium user files centrally instead of duplicating them across a vast infrastructure, which would potentially leave sensitive user information scattered across many server disks in a data center.

And so it's no wonder that NFS has seen its share of development in the Open Source community.

Enter NFS Ganesha.

NFS is not without its quirks, particularly around locked mounts. Creating a highly available NFS server for our Kerberos KDCs is how we'll counter that particular issue and provide failover.

We'll cut to the chase and demonstrate how to configure a highly available NFS Ganesha server, based on our trials with the following:

  • GlusterFS
  • NFS Ganesha
  • CentOS 7
  • HAProxy
  • keepalived
  • firewalld
  • SELinux

And now for the quick steps:

Build at least two CentOS 7 servers. Add the hosts to your DNS server for a clean setup, or alternatively add them to /etc/hosts (ugly) on both hosts (nfs01 / nfs02).

192.168.0.80    nfs-c01    # VIP DNS entry (floats between nfs01 and nfs02)
192.168.0.131   nfs01
192.168.0.119   nfs02
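
If you go the /etc/hosts route, the entries on both nodes (and on any client) would look something like the sketch below; the nix.mine.dom FQDNs match the ones used in the HAProxy and Ganesha configs later on:

# /etc/hosts (both nodes, and any client), if not using DNS
192.168.0.80     nfs-c01
192.168.0.131    nfs01    nfs01.nix.mine.dom
192.168.0.119    nfs02    nfs02.nix.mine.dom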

On both nodes, compile and build NFS Ganesha 2.6.0+ from the latest stable source (at the time of writing, the RPM packages did not work). We do this to take advantage of the latest feature set and to minimize failures. Install the listed packages as well:

wget https://github.com/nfs-ganesha/nfs-ganesha/archive/V2.6.0.tar.gz

PACKAGES NEEDED:

yum install glusterfs-api-devel.x86_64
yum install xfsprogs-devel.x86_64 xfsprogs.x86_64
yum install xfsdump libguestfs-xfs              # xfsdump-3.1.4-1.el7, libguestfs-xfs-1.36.3-6.el7_4.3 used here
yum install libntirpc libntirpc-devel           # libntirpc-1.5.4-1.el7, libntirpc-devel-1.5.4-1.el7 used here


COMMANDS

mkdir -p /root/ganesha && cd /root/ganesha      # the ccmake path below assumes the clone lives here
git clone https://github.com/nfs-ganesha/nfs-ganesha.git
cd nfs-ganesha
git checkout V2.6-stable
git submodule update --init --recursive
ccmake /root/ganesha/nfs-ganesha/src/
# Press c, e, c, g to configure and generate the Makefiles.
make
make install
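
Alternatively, a non-interactive build along these lines should also work. This is only a sketch: -DUSE_FSAL_GLUSTER=ON is the switch recent Ganesha releases use to build the Gluster FSAL, so verify the option names against your checkout (ccmake lists them all):

mkdir -p /root/ganesha/build && cd /root/ganesha/build
cmake /root/ganesha/nfs-ganesha/src/ -DCMAKE_BUILD_TYPE=Release -DUSE_FSAL_GLUSTER=ON
make
make install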

Add a separate disk, such as /dev/sdb, to each of the VMs.

Add the Gluster filesystem to both nodes and link them together. These steps are run on both nodes unless marked otherwise:

mkfs.xfs /dev/sdb
mkdir -p /bricks/0
mount /dev/sdb /bricks/0

yum install centos-release-gluster
yum -y install glusterfs glusterfs-fuse glusterfs-server glusterfs-api glusterfs-cli
systemctl enable glusterd.service
systemctl start glusterd.service

( node01 ONLY ) gluster peer probe nfs02
( node01 ONLY ) gluster volume create gv01 replica 2 nfs01:/bricks/0/gv01 nfs02:/bricks/0/gv01
( node01 ONLY ) gluster volume start gv01

gluster volume info
gluster volume status
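
On both nodes, mount the brick and the Gluster volume persistently; /n is the path Ganesha will export below, and these are the same fstab entries that appear in the verification section further down. A sketch (on nfs02, point the Gluster mount at nfs02:/gv01 instead):

mkdir -p /n
# append to /etc/fstab:
/dev/sdb     /bricks/0    xfs        defaults    0 0
nfs01:/gv01  /n           glusterfs  defaults    0 0
# then mount everything:
mount -a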

Install and configure HAProxy on both nodes. The config:

PACKAGES:
yum install haproxy     # ( 1.5.18-6.el7.x86_64 used in this case )

/etc/haproxy/haproxy.cfg

global
    log         127.0.0.1 local2
    stats       socket /var/run/haproxy.sock mode 0600 level admin
    # stats     socket /var/lib/haproxy/stats
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    debug

defaults
    mode                    tcp
    log                     global
    option                  dontlognull
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

frontend nfs-in
    bind nfs-c01:2049
    mode tcp
    option tcplog
    default_backend             nfs-back


backend nfs-back
    balance     roundrobin
    server      nfs01.nix.mine.dom    nfs01.nix.mine.dom:2049 check
    server      nfs02.nix.mine.dom    nfs02.nix.mine.dom:2049 check
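
Before starting HAProxy, validate the config and enable the service. Note that HAProxy binds to the VIP (nfs-c01), so on the node that does not currently hold the VIP this will only succeed once the net.ipv4.ip_nonlocal_bind sysctl from the next step is in place:

haproxy -c -f /etc/haproxy/haproxy.cfg    # parse the config and report any errors
systemctl enable haproxy
systemctl start haproxy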

The following kernel bind parameters must be set before configuring keepalived below (more on this is available in the Red Hat documentation):

On nfs01 and nfs02:

# echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf
# echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
# sysctl -p
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1

Configure keepalived on both nodes:

PACKAGES:

yum install keepalived    # ( Used 1.3.5-1.el7.x86_64 in this case )

NFS01 (/etc/keepalived/keepalived.conf):

vrrp_script chk_haproxy {
  script "killall -0 haproxy"           # check the haproxy process
  interval 2                            # every 2 seconds
  weight 2                              # add 2 points if OK
}

vrrp_instance VI_1 {
  interface eth0                        # interface to monitor
  state MASTER                          # MASTER on haproxy1, BACKUP on haproxy2
  virtual_router_id 51
  priority 101                          # 101 on haproxy1, 100 on haproxy2
  virtual_ipaddress {
       192.168.0.80                        # virtual ip address
  }
  track_script {
       chk_haproxy
  }
}

NFS02 (/etc/keepalived/keepalived.conf):

vrrp_script chk_haproxy {
  script "killall -0 haproxy"           # check the haproxy process
  interval 2                            # every 2 seconds
  weight 2                              # add 2 points if OK
}

vrrp_instance VI_1 {
  interface eth0                        # interface to monitor
  state BACKUP                          # MASTER on haproxy1, BACKUP on haproxy2
  virtual_router_id 51
  priority 100                          # 101 on haproxy1, 100 on haproxy2
  virtual_ipaddress {
    192.168.0.80                        # virtual ip address
  }
  track_script {
    chk_haproxy
  }
}
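
Enable keepalived on both nodes and confirm that only the current MASTER holds the VIP (eth0, as per the configs above):

systemctl enable keepalived
systemctl start keepalived
ip addr show eth0 | grep 192.168.0.80     # the VIP should be listed on the MASTER only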

Configure firewalld on both nodes. DO NOT disable firewalld.

# cat public.bash

firewall-cmd --zone=public --permanent --add-port=2049/tcp
firewall-cmd --zone=public --permanent --add-port=111/tcp
firewall-cmd --zone=public --permanent --add-port=111/udp
firewall-cmd --zone=public --permanent --add-port=24007-24008/tcp
firewall-cmd --zone=public --permanent --add-port=49152/tcp
firewall-cmd --zone=public --permanent --add-port=38465-38469/tcp
firewall-cmd --zone=public --permanent --add-port=4501/tcp
firewall-cmd --zone=public --permanent --add-port=4501/udp
firewall-cmd --zone=public --permanent --add-port=20048/udp
firewall-cmd --zone=public --permanent --add-port=20048/tcp
firewall-cmd --reload

# cat dmz.bash

firewall-cmd --zone=dmz --permanent --add-port=2049/tcp
firewall-cmd --zone=dmz --permanent --add-port=111/tcp
firewall-cmd --zone=dmz --permanent --add-port=111/udp
firewall-cmd --zone=dmz --permanent --add-port=24007-24008/tcp
firewall-cmd --zone=dmz --permanent --add-port=49152/tcp
firewall-cmd --zone=dmz --permanent --add-port=38465-38469/tcp
firewall-cmd --zone=dmz --permanent --add-port=4501/tcp
firewall-cmd --zone=dmz --permanent --add-port=4501/udp
firewall-cmd --zone=dmz --permanent --add-port=20048/tcp
firewall-cmd --zone=dmz --permanent --add-port=20048/udp
firewall-cmd --reload

# On both nodes (allow VRRP multicast for keepalived)

firewall-cmd --permanent --direct --add-rule ipv4 filter INPUT 0 -m pkttype --pkt-type multicast -j ACCEPT
firewall-cmd --reload


HANDY STUFF:

firewall-cmd --zone=dmz --list-all
firewall-cmd --zone=public --list-all
firewall-cmd --set-log-denied=all
firewall-cmd --permanent --add-service=haproxy
firewall-cmd --list-all

Configure SELinux. Don't disable it; it makes your host safer and is easy to work with using just these commands.

Run any of the following commands, or a combination of them, against the denial (AVC) entries in /var/log/audit/audit.log that may appear as you stop, start, or install the services above:

METHOD 1:
grep AVC /var/log/audit/audit.log | tail -n1 | audit2allow -M systemd-allow
semodule -i systemd-allow.pp

METHOD 2:
audit2allow -a
audit2allow -a -M ganesha_<NUM>_port
semodule -i ganesha_<NUM>_port.pp
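
To sanity-check the result, confirm SELinux is still enforcing and review any fresh denials after restarting the services:

getenforce                        # should print Enforcing
ausearch -m AVC -ts recent        # any new denials can be fed back into audit2allow as above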

Configure NFS Ganesha on both nodes. The configs differ slightly:

NODE 1:

[root@nfs01 ~]# cat /etc/ganesha/ganesha.conf
###################################################
#
# EXPORT
#
# To function, all that is required is an EXPORT
#
# Define the absolute minimal export
#
###################################################


NFS_Core_Param {
        Bind_addr = 192.168.0.131;
        NFS_Port = 2049;
        MNT_Port = 20048;
        NLM_Port = 38468;
        Rquota_Port = 4501;
}

%include "/etc/ganesha/export.conf"
[root@nfs01 ~]# cat /etc/ganesha/export.conf
EXPORT{
    Export_Id = 1 ;                             # Export ID unique to each export
    Path = "/n";                                # Path of the volume to be exported. Eg: "/test_volume"

    FSAL {
        name = GLUSTER;
        hostname = "nfs01.nix.mine.dom";         # IP of one of the nodes in the trusted pool
        volume = "gv01";                        # Volume name. Eg: "test_volume"
    }

    Access_type = RW;                           # Access permissions
    Squash = No_root_squash;                    # To enable/disable root squashing
    Disable_ACL = FALSE;                        # To enable/disable ACL
    Pseudo = "/n";                              # NFSv4 pseudo path for this export. Eg: "/test_volume_pseudo"
    Protocols = "3","4";                        # NFS protocols supported
    Transports = "UDP","TCP" ;                  # Transport protocols supported
    SecType = "sys";                            # Security flavors supported
}
[root@nfs01 ~]#

NODE 2:

[root@nfs02 ~]# cd /etc/ganesha/
[root@nfs02 ganesha]# cat ganesha.conf
###################################################
#
# EXPORT
#
# To function, all that is required is an EXPORT
#
# Define the absolute minimal export
#
###################################################


NFS_Core_Param {
        Bind_addr=192.168.0.119;
        NFS_Port=2049;
        MNT_Port=20048;
        NLM_Port=38468;
        Rquota_Port=4501;
}

%include "/etc/ganesha/export.conf"
[root@nfs02 ganesha]# cat export.conf
EXPORT{
    Export_Id = 1 ;                             # Export ID unique to each export
    Path = "/n";                                # Path of the volume to be exported. Eg: "/test_volume"

    FSAL {
        name = GLUSTER;
        hostname = "nfs02.nix.mine.dom";         # IP of one of the nodes in the trusted pool
        volume = "gv01";                        # Volume name. Eg: "test_volume"
    }

    Access_type = RW;                           # Access permissions
    Squash = No_root_squash;                    # To enable/disable root squashing
    Disable_ACL = FALSE;                        # To enable/disable ACL
    Pseudo = "/n";                              # NFSv4 pseudo path for this export. Eg: "/test_volume_pseudo"
    Protocols = "3","4";                        # NFS protocols supported
    Transports = "UDP","TCP" ;                  # Transport protocols supported
    SecType = "sys";                            # Security flavors supported
}
[root@nfs02 ganesha]#

STARTUP:

/usr/bin/ganesha.nfsd -L /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
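
As a quick check, and assuming rpcbind is running on the nodes (it normally is on CentOS 7), the export should be visible over the NFSv3 MOUNT protocol. Query the nodes directly rather than the VIP, since HAProxy only forwards port 2049:

showmount -e nfs01
showmount -e nfs02      # both should list /n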

Ensure mounts are done and everything is started up on both nodes.

[root@nfs01 ~]# cat /etc/fstab|grep -Ei "brick|gv01"
/dev/sdb /bricks/0                              xfs     defaults        0 0
nfs01:/gv01 /n                                  glusterfs defaults      0 0
[root@nfs01 ~]#

[root@nfs01 ~]# mount|grep -Ei "brick|gv01"
/dev/sdb on /bricks/0 type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
nfs01:/gv01 on /n type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
[root@nfs01 ~]#



[root@nfs01 ~]# ps -ef|grep -Ei "haproxy|keepalived|ganesha"; netstat -pnlt|grep -Ei "haproxy|ganesha|keepalived"
root      1402     1  0 00:59 ?        00:00:00 /usr/sbin/keepalived -D
root      1403  1402  0 00:59 ?        00:00:00 /usr/sbin/keepalived -D
root      1404  1402  0 00:59 ?        00:00:02 /usr/sbin/keepalived -D
root     13087     1  0 01:02 ?        00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy  13088 13087  0 01:02 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
haproxy  13089 13088  0 01:02 ?        00:00:01 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
root     13129     1 15 01:02 ?        00:13:11 /usr/bin/ganesha.nfsd -L /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT
root     19742 15633  0 02:30 pts/2    00:00:00 grep --color=auto -Ei haproxy|keepalived|ganesha
tcp        0      0 192.168.0.80:2049       0.0.0.0:*               LISTEN      13089/haproxy
tcp6       0      0 192.168.0.131:20048     :::*                    LISTEN      13129/ganesha.nfsd
tcp6       0      0 :::564                  :::*                    LISTEN      13129/ganesha.nfsd
tcp6       0      0 192.168.0.131:4501      :::*                    LISTEN      13129/ganesha.nfsd
tcp6       0      0 192.168.0.131:2049      :::*                    LISTEN      13129/ganesha.nfsd
tcp6       0      0 192.168.0.131:38468     :::*                    LISTEN      13129/ganesha.nfsd
[root@nfs01 ~]#

Since you compiled from source, you don't get the nice systemd startup scripts. To pull them out of an existing ganesha RPM, do the following, then use systemctl to stop and start nfs-ganesha as you would any other service.

yumdownloader nfs-ganesha.x86_64
rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/lib/systemd/system/nfs-ganesha-lock.service
rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/lib/systemd/system/nfs-ganesha.service
rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/lib/systemd/system/nfs-ganesha-config.service
rpm2cpio nfs-ganesha-2.5.5-1.el7.x86_64.rpm | cpio -idmv ./usr/libexec/ganesha/nfs-ganesha-config.sh

Copy the extracted files to the same paths under / instead of ./, reload systemd, then enable the service:
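
For example, a minimal sketch assuming rpm2cpio was run in the current working directory:

cp ./usr/lib/systemd/system/nfs-ganesha*.service /usr/lib/systemd/system/
mkdir -p /usr/libexec/ganesha
cp ./usr/libexec/ganesha/nfs-ganesha-config.sh /usr/libexec/ganesha/
systemctl daemon-reload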

systemctl enable nfs-ganesha.service
systemctl status nfs-ganesha.service

If you're still reading at this point, you're still with me and you're done with the setup. Good. So let's get down to the testing bits. But first, ensure all services are up and running on both nodes. Two handy one-liners for this are:

[root@nfs01 n]# ps -ef|grep -Ei "haproxy|ganesha|keepalived"; netstat -pnlt|grep -Ei ganesha; netstat -pnlt|grep -Ei haproxy; netstat -pnlt|grep -Ei keepalived
[root@nfs02 n]# gluster volume status

First, mount the volume on any client of your choice using the cluster VIP (nfs-c01):

[root@ipaclient01 /]# mount -t nfs4 nfs-c01:/n /n
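
Optionally, make the client mount persistent with an fstab entry along these lines (assuming the same /n mountpoint on the client):

# /etc/fstab on the client
nfs-c01:/n    /n    nfs4    defaults,_netdev    0 0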

From here, it's all the fun bits, because we get to blow things up. Stop and start either node, one at a time, to see whether your client ever loses the connection and whether data is replicated across via Gluster so that no files are ever lost. Note: shut down the nodes one at a time and let the previous one come back up before you shut down the second node. All in all, you should see an entirely uninterrupted session on the client end.
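
A simple way to watch for interruptions during failover is to keep a write loop running on the client while you reboot each node in turn; ha-test.log below is just an arbitrary file name on the mounted export:

[root@ipaclient01 /]# while true; do date >> /n/ha-test.log; sleep 1; done
# In a second terminal, confirm the timestamps keep flowing with no gaps:
[root@ipaclient01 /]# tail -f /n/ha-test.log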

You can find the full NFS Ganesha HA NFS server setup, as well as the condensed version, on my open source blog.