I'd like to serve around 2 TB over NFS and CIFS. I'm looking for a two (or more) server solution for high availability, with load balancing across the servers if possible. Any suggestions for clustering or high-availability solutions?
This is for business use, and we plan to grow to 5-10 TB over the next few years. Our facility runs nearly 24 hours a day, six days a week. We could tolerate 15-30 minutes of downtime, but we want to minimize data loss. I also want to minimize the 3 a.m. phone calls.
We currently run one server with ZFS on Solaris, and we're looking at AVS for the HA part, but we've had a number of small problems with Solaris (the CIFS implementation doesn't work with Vista, etc.).
We have started looking at:
DRBD with GFS on top (GFS for its distributed locking capability)
Gluster (needs a client-side piece; no native CIFS support?)
Windows DFS (the docs say it only replicates after the file is closed?)
We're looking for a "black box" that serves up the data.
We currently snapshot the data in ZFS and send the snapshots over the wire to a remote datacenter for offsite backup.
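For reference, that snapshot shipping is roughly this kind of thing (the pool/dataset name, snapshot names, and remote host here are only placeholders):

zfs snapshot tank/export@2009-06-01_0300
zfs send -i tank/export@2009-05-31_0300 tank/export@2009-06-01_0300 | \
    ssh offsite-dc zfs receive -F tank/export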
Our original plan was to have a second machine and rsync every 10-15 minutes. The problem on a failure is that in-progress production runs would lose 15 minutes of data and be left "somewhere in the middle". It's almost easier for them to start over from the beginning than to figure out where to pick up in the middle. That's what drove us to look at HA solutions.
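For contrast, the plan we're moving away from was essentially a cron job of this shape (host name and paths are purely illustrative):

# on the warm standby's crontab - pull the export every 15 minutes
*/15 * * * * rsync -a --delete primary:/export/ /export/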
I recently deployed hanfs using DRBD as the backend. In my situation I'm running active/standby, but I've also tested it successfully in primary/primary mode with OCFS2. Unfortunately there isn't much documentation on how best to do this, and most of what exists is barely useful. If you do go down the drbd route, I highly recommend joining the drbd mailing list and reading all of the documentation. Here's my ha/drbd setup and the script I wrote to handle ha's failures:
DRBD8 is required - this is provided by drbd8-utils and drbd8-source. Once those are installed (I believe they're provided by backports), you can use module-assistant to build and install the module - m-a a-i drbd8. Either depmod -a or reboot; if you depmod -a, you'll need to modprobe drbd.
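On a Debian/Ubuntu-style box that comes out to roughly the following (a sketch only - exact package names and the need for backports may vary with your release):

apt-get install drbd8-utils drbd8-source module-assistant
m-a a-i drbd8                # build and install the drbd8 kernel module
depmod -a && modprobe drbd   # or just reboot instead
cat /proc/drbd               # confirm the module is loaded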
You'll need a backing partition for drbd. Do not make this partition LVM, or you'll run into all sorts of problems. Do not put LVM on the drbd device either, or you'll run into all sorts of problems.
hanfs1's /etc/drbd.conf:

global { usage-count no; }
common { protocol C; disk { on-io-error detach; } }
resource export {
    syncer {
        rate 125M;
    }
    on hanfs2 {
        address 172.20.1.218:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
    on hanfs1 {
        address 172.20.1.219:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
}
hanfs2's /etc/drbd.conf:
global { usage-count no; }
common { protocol C; disk { on-io-error detach; } }
resource export {
    syncer {
        rate 125M;
    }
    on hanfs2 {
        address 172.20.1.218:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
    on hanfs1 {
        address 172.20.1.219:7789;
        device /dev/drbd1;
        disk /dev/sda3;
        meta-disk internal;
    }
}
Once these are configured, we next need to bring drbd up.
drbdadm create-md export
drbdadm attach export
drbdadm connect export
We now have to perform the initial sync of the data - obviously, if it's a brand-new drbd cluster, it doesn't matter which node you pick.
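The exact command isn't quoted above; with DRBD 8 the initial sync is normally forced from whichever node you want to start out as Primary, something like:

drbdadm -- --overwrite-data-of-peer primary export   # run on the node that should become Primary
watch cat /proc/drbd                                  # wait for the initial sync to finish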
Once that's done, you'll need to mkfs your filesystem of choice on the drbd device - the device from the config above is /dev/drbd1. http://www.drbd.org/users-guide/p-work.html is a useful document to read while working with drbd.
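As an illustration only (the filesystem choice and the /export mount point are assumptions on my part, though the wrapper script below expects /export):

mkfs.ext3 /dev/drbd1     # on the Primary only; use whatever filesystem you prefer
mkdir -p /export
mount /dev/drbd1 /export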
Heartbeat
Install heartbeat2. (Very simple: apt-get install heartbeat2.)
/etc/ha.d/ha.cf on each machine should consist of:
hanfs1:
logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120

ucast eth1 172.20.1.218
auto_failback no
node hanfs1
node hanfs2
hanfs2:
logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120

ucast eth1 172.20.1.219
auto_failback no
node hanfs1
node hanfs2
/etc/ha.d/haresources should be the same on both ha boxes:
hanfs1  IPaddr::172.20.1.230/24/eth1
hanfs1  HeartBeatWrapper
I wrote a wrapper script to deal with the quirks that nfs and drbd cause in a failover scenario. This script should live in /etc/ha.d/resource.d/ on each machine.
#!/bin/bash

#heartbeat fails hard.
#so this is a wrapper to get around that stupidity
#I'm just wrapping the heartbeat scripts, except for in the case of umount
#as they work, mostly

if [[ -e /tmp/heartbeatwrapper ]]; then
    runningpid=$(cat /tmp/heartbeatwrapper)
    if [[ -z $(ps --no-heading -p $runningpid) ]]; then
        echo "PID found, but process seems dead. Continuing."
    else
        echo "PID found, process is alive, exiting."
        exit 7
    fi
fi

echo $$ > /tmp/heartbeatwrapper
if [[ x$1 == "xstop" ]]; then

    /etc/init.d/nfs-kernel-server stop #>/dev/null 2>&1

    #NFS init script isn't LSB compatible, exit codes are 0 no matter what happens.
    #Thanks guys, you really make my day with this bullshit.
    #Because of the above, we just have to hope that nfs actually catches the signal
    #to exit, and manages to shut down its connections.
    #If it doesn't, we'll kill it later, then term any other nfs stuff afterwards.
    #I found this to be an interesting insight into just how badly NFS is written.

    sleep 1
    #we don't want to shutdown nfs first!
    #The lock files might go away, which would be bad.

    #The above seems to not matter much, the only thing I've determined
    #is that if you have anything mounted synchronously, it's going to break
    #no matter what I do. Basically, sync == screwed; in NFSv3 terms.
    #End result of failing over while a client that's synchronous is that
    #the client hangs waiting for its nfs server to come back - thing doesn't
    #even bother to time out, or attempt a reconnect.
    #async works as expected - it insta-reconnects as soon as a connection seems
    #to be unstable, and continues to write data. In all tests, md5sums have
    #remained the same with/without failover during transfer.

    #So, we first unmount /export - this prevents drbd from having a shit-fit
    #when we attempt to turn this node secondary.

    #That's a lie too, to some degree. LVM is entirely to blame for why DRBD
    #was refusing to unmount. Don't get me wrong, having /export mounted doesn't
    #help either, but still.

    #fix a usecase where one or other are unmounted already, which causes us to terminate early.

    if [[ "$(grep -o /varlibnfs/rpc_pipefs /etc/mtab)" ]]; then
        for ((test=1; test <= 10; test++)); do
            umount /export/varlibnfs/rpc_pipefs >/dev/null 2>&1
            if [[ -z $(grep -o /varlibnfs/rpc_pipefs /etc/mtab) ]]; then
                break
            fi
            if [[ $? -ne 0 ]]; then
                #try again, harder this time
                umount -l /var/lib/nfs/rpc_pipefs >/dev/null 2>&1
                if [[ -z $(grep -o /varlibnfs/rpc_pipefs /etc/mtab) ]]; then
                    break
                fi
            fi
        done
        if [[ $test -eq 10 ]]; then
            rm -f /tmp/heartbeatwrapper
            echo "Problem unmounting rpc_pipefs"
            exit 1
        fi
    fi

    if [[ "$(grep -o /dev/drbd1 /etc/mtab)" ]]; then
        for ((test=1; test <= 10; test++)); do
            umount /export >/dev/null 2>&1
            if [[ -z $(grep -o /dev/drbd1 /etc/mtab) ]]; then
                break
            fi
            if [[ $? -ne 0 ]]; then
                #try again, harder this time
                umount -l /export >/dev/null 2>&1
                if [[ -z $(grep -o /dev/drbd1 /etc/mtab) ]]; then
                    break
                fi
            fi
        done
        if [[ $test -eq 10 ]]; then
            rm -f /tmp/heartbeatwrapper
            echo "Problem unmount /export"
            exit 1
        fi
    fi

    #now, it's important that we shut down nfs. it can't write to /export anymore, so that's fine.
    #if we leave it running at this point, then drbd will screwup when trying to go to secondary.
    #See contradictory comment above for why this doesn't matter anymore. These comments are left in
    #entirely to remind me of the pain this caused me to resolve. A bit like why churches have Jesus
    #nailed onto a cross instead of chilling in a hammock.

    pidof nfsd | xargs kill -9 >/dev/null 2>&1

    sleep 1

    if [[ -n $(ps aux | grep nfs | grep -v grep) ]]; then
        echo "nfs still running, trying to kill again"
        pidof nfsd | xargs kill -9 >/dev/null 2>&1
    fi

    sleep 1

    /etc/init.d/nfs-kernel-server stop #>/dev/null 2>&1

    sleep 1

    #next we need to tear down drbd - easy with the heartbeat scripts
    #it takes input as resourcename start|stop|status
    #First, we'll check to see if it's stopped

    /etc/ha.d/resource.d/drbddisk export status >/dev/null 2>&1
    if [[ $? -eq 2 ]]; then
        echo "resource is already stopped for some reason..."
    else
        for ((i=1; i <= 10; i++)); do
            /etc/ha.d/resource.d/drbddisk export stop >/dev/null 2>&1
            if [[ $(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2) == "Secondary/Secondary" ]] || [[ $(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2) == "Secondary/Unknown" ]]; then
                echo "Successfully stopped DRBD"
                break
            else
                echo "Failed to stop drbd for some reason"
                cat /proc/drbd
                if [[ $i -eq 10 ]]; then
                    exit 50
                fi
            fi
        done
    fi

    rm -f /tmp/heartbeatwrapper

    exit 0

elif [[ x$1 == "xstart" ]]; then
    #start up drbd first
    /etc/ha.d/resource.d/drbddisk export start >/dev/null 2>&1
    if [[ $? -ne 0 ]]; then
        echo "Something seems to have broken. Let's check possibilities..."
        testvar=$(egrep -o "st:[A-Za-z/]*" /proc/drbd | cut -d: -f2)
        if [[ $testvar == "Primary/Unknown" ]] || [[ $testvar == "Primary/Secondary" ]]
        then
            echo "All is fine, we are already the Primary for some reason"
        elif [[ $testvar == "Secondary/Unknown" ]] || [[ $testvar == "Secondary/Secondary" ]]
        then
            echo "Trying to assume Primary again"
            /etc/ha.d/resource.d/drbddisk export start >/dev/null 2>&1
            if [[ $? -ne 0 ]]; then
                echo "I give up, something's seriously broken here, and I can't help you to fix it."
                rm -f /tmp/heartbeatwrapper
                exit 127
            fi
        fi
    fi

    sleep 1

    #now we remount our partitions

    for ((test=1; test <= 10; test++)); do
        mount /dev/drbd1 /export >/tmp/mountoutput
        if [[ -n $(grep -o export /etc/mtab) ]]; then
            break
        fi
    done

    if [[ $test -eq 10 ]]; then
        rm -f /tmp/heartbeatwrapper
        exit 125
    fi

    #I'm really unsure at this point of the side-effects of not having rpc_pipefs mounted.
    #The issue here, is that it cannot be mounted without nfs running, and we don't really want to start
    #nfs up at this point, lest it ruin everything.
    #For now, I'm leaving mine unmounted, it doesn't seem to cause any problems.

    #Now we start up nfs.

    /etc/init.d/nfs-kernel-server start >/dev/null 2>&1
    if [[ $? -ne 0 ]]; then
        echo "There's not really that much that I can do to debug nfs issues."
        echo "probably your configuration is broken. I'm terminating here."
        rm -f /tmp/heartbeatwrapper
        exit 129
    fi

    #And that's it, done.

    rm -f /tmp/heartbeatwrapper

    exit 0

elif [[ "x$1" == "xstatus" ]]; then
    #Lets check to make sure nothing is broken.

    #DRBD first
    /etc/ha.d/resource.d/drbddisk export status >/dev/null 2>&1
    if [[ $? -ne 0 ]]; then
        echo "stopped"
        rm -f /tmp/heartbeatwrapper
        exit 3
    fi

    #mounted?
    grep -q drbd /etc/mtab >/dev/null 2>&1
    if [[ $? -ne 0 ]]; then
        echo "stopped"
        rm -f /tmp/heartbeatwrapper
        exit 3
    fi

    #nfs running?
    /etc/init.d/nfs-kernel-server status >/dev/null 2>&1
    if [[ $? -ne 0 ]]; then
        echo "stopped"
        rm -f /tmp/heartbeatwrapper
        exit 3
    fi

    echo "running"
    rm -f /tmp/heartbeatwrapper
    exit 0

fi
Once all of the above is done, it's then just a matter of configuring /etc/exports:
/export 172.20.1.0/255.255.255.0(rw,sync,fsid=1,no_root_squash)
Then it's just a case of starting heartbeat on both machines and issuing hb_takeover on one of them. You can test that it's working by making sure the node you issued the takeover on is primary - check /proc/drbd, that the device is mounted correctly, and that you can access nfs.
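A rough test sequence might look like the following - the hb_takeover location depends on where your heartbeat package installed it (commonly under /usr/share/heartbeat or /usr/lib/heartbeat), and the client mount is purely illustrative:

/usr/share/heartbeat/hb_takeover all   # on the node that should become primary
cat /proc/drbd                         # should report Primary/Secondary on this node
grep /export /etc/mtab                 # /dev/drbd1 should be mounted on /export
mount -t nfs 172.20.1.230:/export /mnt # from a client, against the floating IP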
-
Good luck. Setting this up from scratch was an extremely painful experience for me.