The Puppet Labs Issue Tracker has Moved: https://tickets.puppetlabs.com

Bug #2019

puppet managed nfs mount point filebuckets all of its files when nfs servers goes away and then comes back

Added by jay - about 5 years ago. Updated about 2 years ago.

Status: Needs More Information
Start date: 02/23/2009
Priority: High
Due date:
Assignee: -
% Done: 0%
Category: mount
Target version: -
Affected Puppet version: 0.24.6
Branch:
Keywords:

Description

We are having an issue with Puppet trying to filebucket the contents of an NFS server mount point when the NFS server goes away and comes back. Puppet thinks the mount point has changed and tries to filebucket the contents of the NFS mount. Both the NFS mount and the mount point are managed, by the mount and file types respectively. We have seen this behavior for several versions now but did not know of a way to reproduce it until now. This happened on 9 hosts all sharing the same NFS server for this mount point.

History

#1 Updated by James Turnbull about 5 years ago

  • Status changed from Unreviewed to Needs More Information

Platform?

Can you show some logs from a run with --trace --debug --verbose, please?

#2 Updated by jay - about 5 years ago

We are running on CentOS 5 (we have seen this on 5.0, 5.1, and 5.2).

Logs are going to be hard to get, because this tends to happen only when something bad is going on, e.g. an NFS server crash or a connectivity issue. Is there a way to force a crash dump/stack trace from the running ruby/puppetd process? Reporting is on, so I might be able to track down some reports if they have not been rotated away.

#3 Updated by Jay Munzner about 5 years ago

Different Jay, but same problem.

We had a connectivity issue to an NFS server, and when it came back it was getting hammered by workstations running the Puppet client. Before noticing the Puppet client issue, we had stopped the puppetmaster to make sure Puppet was not causing the problem, only noticing a bit later that the clients were already copying files from the NFS server locally.

Around the time the server connection dropped, we found this line in the client log:

Tue Mar 10 16:10:25 +1100 2009 PATH/TO/NFS/MOUNT (notice): Recursively backing up to filebucket

We don't recall ever configuring anything that should be doing something like this, so any insight would be helpful.

Here is what we have configured in the manifests:

site.pp:

define nfsmount(
        $device,
        $options = "nfsvers=3,tcp,rsize=32768,wsize=32768,rw,async,soft,intr"
) {
        mount { $name:
                name => $name,
                device => $device,
                ensure => "mounted",
                fstype => "nfs",
                atboot => true,
                options => $options,
        }
}

templates.pp:

class drd_nfs_mounts {
        file { "/drd/users": ensure=>"directory" }
        nfsmount { "/drd/users":
                device => "homedir.drd.roam:/srv/users",
                require => [ File["/drd/users"] ],
        }
}

Thanks

#4 Updated by Luke Kanies about 5 years ago

The only thing I can think that could be happening here is that somehow Puppet thinks that your mount point is no longer a directory, so it has to remove the thing that’s there and replace it with something.

When the filesystem is mounted, what does ‘ralsh file /path/to/mount’ say?

Also, you can try adding 'backup => false' to the file statement that creates the directory; that should work around this.

#5 Updated by Jay Munzner about 5 years ago

luke wrote:

The only thing I can think that could be happening here is that somehow Puppet thinks that your mount point is no longer a directory, so it has to remove the thing that’s there and replace it with something.

When the filesystem is mounted, what does ‘ralsh file /path/to/mount’ say?

Also, you can try adding 'backup => false' to the file statement that creates the directory; that should work around this.

‘ralsh file /drd/users’ says:

file { '/drd/users':
    ensure => 'directory',
    type => 'directory',
    mode => '493',
    group => '0',
    checksum => '{mtime}Thu Mar 12 17:10:09 +1100 2009',
    owner => '0'
}

We’ve added ‘backup => false’ to the file statements, thanks for the tip.
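For reference, a minimal sketch of what that change might look like, assuming the drd_nfs_mounts class and nfsmount define from comment #3 (backup is a standard attribute of Puppet's file type):

class drd_nfs_mounts {
        # backup => false only stops the file resource from copying replaced
        # content into the filebucket; it does not remove anything by itself.
        file { "/drd/users":
                ensure => "directory",
                backup => false,
        }
        nfsmount { "/drd/users":
                device  => "homedir.drd.roam:/srv/users",
                require => [ File["/drd/users"] ],
        }
}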

BTW, we were able to reproduce this by blocking a specific client from the NFS server while puppetd was running and then unblocking it a few minutes later; the client then started backing up the mount to the filebucket.

#6 Updated by Luke Kanies about 5 years ago

Did the ‘backup => false’ stop that backing up from happening, at least?

#7 Updated by jay - about 5 years ago

debug: Puppet::Type::Mount::ProviderParsed: Executing '/bin/mount'
debug: //garry_mount/File[/mnt/isilon/cluster3/garry]: File does not exist
debug: //garry_mount/File[/mnt/isilon/cluster3/garry]: Changing ensure
debug: //garry_mount/File[/mnt/isilon/cluster3/garry]: 1 change(s)

This is happening when the NFS server is down. garry is the mount point, but any file operations on it fail because of the down NFS server. Even trying to TAB-complete the path hangs. Puppet run time also goes up while the NFS server is down, waiting for the soft mount to time out.

I am not going to add backup => false, because if it actually starts deleting content from the underlying directories, that is not good.

I will update when I get some status info from when the NFS server comes back up.

-j

#8 Updated by jay - about 5 years ago

Oh, here is an ls on 'garry':

$ ls -asl garry
ls: garry: Input/output error

#9 Updated by Luke Kanies about 5 years ago

jay wrote:

debug: Puppet::Type::Mount::ProviderParsed: Executing '/bin/mount'
debug: //garry_mount/File[/mnt/isilon/cluster3/garry]: File does not exist
debug: //garry_mount/File[/mnt/isilon/cluster3/garry]: Changing ensure
debug: //garry_mount/File[/mnt/isilon/cluster3/garry]: 1 change(s)

This is happening when the NFS server is down. garry is the mount point, but any file operations on it fail because of the down NFS server. Even trying to TAB-complete the path hangs. Puppet run time also goes up while the NFS server is down, waiting for the soft mount to time out.

So what does the mount point look like when the mount is down? Is that garry file still there? When Puppet tries to make it, does it just hang forever, or what?

The only thing I can think of that triggers a recursive backup is if Puppet becomes convinced it needs to replace the thing that’s there with something else, but you’ve said it should be a directory, and it is considered a directory, so that shouldn’t be it.

If you could do some more debugging to figure out what's triggering the backup (maybe include debug logs from a run that triggers it), that'd be great.

#10 Updated by Nigel Kersten over 3 years ago

This is a serious enough bug to leave open, but I have a nagging feeling that this was resolved along with a problem where Puppet would create files instead of directories under certain conditions?

#11 Updated by nathan halsey about 2 years ago

Is this still a problem on newer versions of Puppet?
