Minimalistic Practical Introduction to Puppet (Not Only) For Vagrant Users

I couldn’t find any good, brief, practical introduction to Puppet that gives you basic working knowledge in minimal time, so here it is. You will learn how to do the elementary things with Puppet – install packages, copy files, start services, execute commands. I won’t go into Puppet installation, nodes, etc., as this introduction focuses on users of Vagrant, which comes with Puppet pre-installed and working in the serverless configuration.

What is Puppet?

Puppet is a provisioner – cross-OS software that sets up operating systems by installing and configuring software etc., based on some form of instructions. Here is an example of such instructions – a manifest – for Puppet:

# my_sample_manifest.pp
class my_development_env {
  package {'vim': ensure => 'present' }
}

# Apply it
include my_development_env

Running it with

puppet apply --verbose --debug my_sample_manifest.pp

would install vim on the system.

Notice that while Puppet can be run only once, it is generally intended to be run repeatedly to fetch and apply the latest configuration (usually from some source code management system such as Git). Therefore all its operations must be idempotent – they can be performed safely multiple times.

The Puppet Configuration File a.k.a. Manifest

Puppet manifests are written in a Ruby-like syntax and are composed of the declaration of “resources” (packages, files, etc.), grouped optionally into one or more classes (i.e. templates that can be applied to a system). Each concrete resource has a title (e.g. ‘vim’) followed by a colon and comma-separated pairs of property => value. You may have multiple resources of the same type (e.g. package) as long as they have different titles.

The property values are most often strings, enclosed by ‘single quotes’ or alternatively with “double quotes” if you want variables within them replaced with their values. (A variable starts with the dollar sign.)

Instead of a single title or value you can also use an array of titles/values, enclosed in [  ].

(Note: It is a common practice to leave a trailing comma behind the last property => value pair.)

You can group resources within classes (class my_class_name { … }) and then apply the class either with include (include my_class_name) or with the more complex class declaration (class { ‘my_class_name’: }). You can also include a class in another class in the same way.
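To tie the syntax rules above together, here is a minimal sketch of a manifest that uses a variable, a double-quoted string with interpolation, an array of titles, trailing commas, and one class included in another. All class names, the variable, and the file path are made up for illustration.

```puppet
# Hypothetical classes illustrating the manifest syntax.
class my_tools {
  # A variable starts with the dollar sign.
  $owner = 'vagrant'

  # An array of titles declares several resources of the same type at once.
  package { ['vim', 'curl']:
    ensure => 'present',   # trailing comma after the last pair
  }

  # Double quotes so that ${owner} is replaced with its value.
  file { '/tmp/greeting.txt':
    ensure  => 'file',
    content => "Managed for ${owner}\n",
  }
}

class my_development_env {
  # Including one class inside another.
  include my_tools
}

# Apply it
include my_development_env
```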

Doing Things With Puppet

Installing Software With Package

The most common way to install software packages is with this simple usage of package:

package {['vim', 'apache2']: ensure => 'present' }

Puppet supports various package providers and by default picks the system one (such as apt on Ubuntu or yum on Red Hat). You can explicitly state another supported package provider such as Ruby’s gem or Python’s pip.

You can also request a particular version of the package (if supported) with ensure => ‘<version>’ or the latest one with ensure => ‘latest’ (this will also upgrade it whenever a new version is released and Puppet runs). In the case of ensure => ‘present’ (also called ‘installed’): if the package is already installed then nothing happens; otherwise the latest version is installed.
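The variations described above might look roughly like this. It is only a sketch: the gem name is an arbitrary example and the version string is hypothetical, so check what your package repository actually offers.

```puppet
# Install a Ruby gem instead of a system package.
package { 'bundler':
  ensure   => 'present',
  provider => 'gem',
}

# Pin a particular version (hypothetical version string).
package { 'apache2':
  ensure => '2.2.22-1ubuntu1',
}

# Always upgrade to the newest available version on each Puppet run.
package { 'vim':
  ensure => 'latest',
}
```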

Copying and Creating Files With File

Create a file with content specified in-line:

file {'/etc/myfile.example':
  ensure => 'file',
  content => "line1\nline2\n",
}

Copy a directory including its content, set ownership etc.:

file {'/etc/apache2':
  ensure  => 'directory',
  source  => '/vagrant/files/etc/apache2',
  recurse => 'remote',
  owner   => 'root',
  group   => 'root',
  mode    => '0755',
}

This requires that the directory /vagrant/files/etc/apache2 exists. (Vagrant automatically shares the directory with the Vagrantfile as /vagrant in the VM, so this actually copies files from the host machine. With the master-agent setup of Puppet you can also get files remotely, from the master, using the puppet:// protocol in the source.)

You can also create files based on ERB templates (with content => template(‘relative/path/to/it’)) but we won’t discuss that here.

You can also create symlinks (with ensure => link, target => ‘path/to/it’) and do other stuff; read more in the file resource documentation.
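A symlink declaration could look like the sketch below. Both paths are hypothetical and only illustrate the ensure/target pair.

```puppet
# The title is the link itself; target is what it points to.
file { '/etc/apache2/sites-enabled/mysite.conf':
  ensure => 'link',
  target => '/etc/apache2/sites-available/mysite.conf',
}
```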

(Re)Starting Daemons with Service

When you’ve installed the necessary packages and copied their configuration files, you’ll likely want to start the software, which is done with service:

service { 'apache2':
  ensure => running,
  require => Package['apache2'],
}

(We will talk about require later; it makes sure that we don’t try to start Apache before it’s installed.)

On Linux, Puppet starts the service and, if you also set enable => true, makes sure that it is registered with the system to be started after an OS restart. Puppet reuses the OS’s support for services, such as the service startup scripts in /etc/init.d/ (where service = script’s name) or Ubuntu’s upstart.

You can also declare your own start/stop/status commands with the properties of the same names, e.g. start => ‘/bin/myapp start’.
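For a program that ships without an init script, custom commands could be wired up roughly as follows. The service name and the /bin/myapp commands are hypothetical; the status command is assumed to exit with 0 when the program is running.

```puppet
service { 'myapp':
  ensure => 'running',
  start  => '/bin/myapp start',
  stop   => '/bin/myapp stop',
  # Puppet treats exit code 0 as "running".
  status => '/bin/myapp status',
}
```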

When Everything Fails: Executing Commands

You can also execute any shell command with exec:

exec { 'install hive':
  command => 'wget -O - | tar -xzC /tmp',
  creates => '/tmp/hive-0.8.1-bin',
  path => '/bin:/usr/bin',
  user => 'root',
}

Programs must have fully qualified paths or you must specify where to look for them with path.

It is critical that all such commands can be run multiple times without harm, i.e., they are idempotent. To achieve that you can instruct Puppet to skip the command if a file exists with creates => … or if a command succeeds or fails with unless/onlyif.
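As a small sketch of the unless guard, the exec below only runs when the check command fails, so repeated Puppet runs leave the system alone. The marker file path is made up for the example.

```puppet
exec { 'create example marker':
  command => 'touch /tmp/example.marker',
  # Skipped when this command succeeds (exit code 0), i.e. the file is already there.
  unless  => 'test -f /tmp/example.marker',
  path    => '/bin:/usr/bin',
}
```

onlyif works the other way round: the command runs only when the check succeeds.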

You can also run a command in reaction to a change to a dependent object by combining refreshonly and subscribe.
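Combining refreshonly and subscribe might look like this sketch, in which a hypothetical application is reloaded only when Puppet changes its configuration file. The file path and the reload command are assumptions for illustration.

```puppet
file { '/etc/myapp.conf':
  ensure  => 'file',
  content => "setting=1\n",
}

exec { 'reload myapp':
  command     => '/bin/myapp reload',
  # Never run on its own, only when a subscribed resource changes.
  refreshonly => true,
  subscribe   => File['/etc/myapp.conf'],
}
```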

Other Things to Do

You can create users and groups, register authorized ssh keys, define cron entries, mount disks and much more – check out Puppet Type Reference.
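To give a feel for a few of these other resource types, here is a brief sketch with a group, a user, and a cron entry. The names and the cleanup command are invented; see the Type Reference for the full list of properties.

```puppet
group { 'developers':
  ensure => 'present',
}

user { 'alice':
  ensure     => 'present',
  gid        => 'developers',
  shell      => '/bin/bash',
  managehome => true,   # also create the home directory
}

# Run a cleanup every night at 02:00 as alice.
cron { 'nightly cleanup':
  command => '/usr/bin/find /tmp -name "*.tmp" -mtime +7 -delete',
  user    => 'alice',
  hour    => 2,
  minute  => 0,
}
```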

Enforcing Execution Order With Require, Before, Notify etc.

Puppet processes the resources specified in a random order, not in the order of specification. So if you need a particular order – such as installing a package first, copying config files second, starting a service third – then you must tell Puppet about these dependencies. There are multiple ways to express dependencies and several types of dependencies:

  • Before and require – simple execution order dependency
  • Notify and subscribe – an enhanced version of before/require which also notifies the dependent resource whenever the resource it depends on changes, used with refreshable resources such as services; typically used between a service and its configuration file (Puppet will refresh it by restarting it)


service { 'apache2':
  ensure => running,
  subscribe => File['/etc/apache2'],
  require => [ Package['apache2'], File['some/other/file'] ],
}

Notice that, contrary to a resource declaration, a resource reference has the resource type capitalized and the resource title within [].
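The same package, then config file, then service chain can also be expressed from the other side with before and notify. This is just a sketch; the source path under /vagrant is an assumed location for your own copy of the config file.

```puppet
package { 'apache2':
  ensure => 'present',
  # Install the package before managing its config file.
  before => File['/etc/apache2/apache2.conf'],
}

file { '/etc/apache2/apache2.conf':
  ensure => 'file',
  source => '/vagrant/files/etc/apache2/apache2.conf',
  # Restart the service whenever this file changes.
  notify => Service['apache2'],
}

service { 'apache2':
  ensure => 'running',
}
```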

Puppet is clever enough to derive the “require” dependency between some resources that it manages, such as a file and its parent folder or an exec and its user – this is well documented for each resource in the Puppet Type Reference in the paragraphs titled “Autorequires:”.

You can also express dependencies between individual classes by defining stages, assigning selected classes to them, and declaring the ordering of the stages using before & require. Hopefully you won’t need that.
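If you do end up needing stages, a minimal sketch could look like this. The stage name 'pre' and the class my_repositories are hypothetical; 'main' is the stage that all classes belong to by default.

```puppet
# Declare a stage that runs before the default 'main' stage.
stage { 'pre':
  before => Stage['main'],
}

class my_repositories {
  exec { 'refresh package index':
    command => 'apt-get update',
    path    => '/bin:/usr/bin',
  }
}

# Assigning a class to a stage requires the class { ... } form.
class { 'my_repositories':
  stage => 'pre',
}
```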

Bonus Advanced Topic: Using Puppet Modules

Modules are self-contained pieces of Puppet configuration (manifests, templates, files) that you can easily include in your configuration by placing them into Puppet’s module directory. Puppet automatically finds them and makes their classes available to you for use in your manifest(s). You can download modules from the Puppet Forge.

See the examples on the puppetlabs/mysql module page about how such a module would be used in your manifest.

With Vagrant you would make modules from a particular directory available to Puppet with

config.vm.provision :puppet,
  :module_path => "my_modules" do |puppet|
  puppet.manifest_file = "my_manifest.pp"
end

(in this case you’d need my_modules/ next to your Vagrantfile, alongside the manifests/ directory holding my_manifest.pp) and then in your Puppet manifest you could have class { ‘mysql’: } etc.

Where to Go Next?

There are some things I haven’t covered that you’re likely to encounter such as variables and conditionals, built-in functions such as template(..), parametrized classes, class inheritance. I have also skipped all master-agent related things such as nodes and facts. It’s perhaps best to learn them when you encounter them.

In any case you should have a look at the Puppet Type Reference and, if you have plenty of time, you can start reading the Language Guide. In the on-line Puppet CookBook you can find many useful snippets. You may also want to download the Learning Puppet VM to experiment with Puppet (or just try Vagrant).

Published by Jakub Holý

I’m a JVM-based developer since 2005, consultant, and occasionally a project manager, working currently with Iterate AS in Norway.

6 thoughts on “Minimalistic Practical Introduction to Puppet (Not Only) For Vagrant Users”

  1. Hi!

    Do you know where Puppet tries to get packages from? From the OS native repo (yum, apt, etc.)? What if the needed packages are not there, say Java (JDK)?

    1. Hi Stas,
      if there is no [up to date] puppet package then install it manually as described on their page. I’m not sure why you’d need jdk for Puppet but again, I’d try to google it out / install whatever Oracle provides. Good luck!

      1. Thanks for reply. I am not talking about jdk for puppet, but I mean if I want to ensure jdk installation with puppet (like you do with vim in your example).

        Problem is with jdk is that it can’t be distributed via repositories, only downloaded from Oracle website directly (with accepting license on downloading). So it needs to be some how attached to puppet manifest when executing it.

        Also, what bothers me are interactive installations of some software. I am not proficient in *nix, but sometimes I came across with installation of packages which require some interaction with user when installing. Have you ever tried install it with puppet?

      2. @Stas: As Justin wrote. Particularly for JDK, we are installing OpenJDK on our Ubuntu servers; it’s in the repositories (you can also add 3rd-party repositories, as we do e.g. for MongoDB, and install from them). For some packages we manually install the .deb package they offer, e.g. hadoop (using exec to execute wget and package with the dpkg provider).

  2. Thanks, Jakub – I had the same difficulty when I started learning Vagrant and Puppet last year. It’s time I had another go, and learned a bit more. I need to take my first steps with a master/agent setup – maybe a good subject for a follow-up article. Thanks for the pointer to the Learning Puppet VM (and the associated tutorial) – I shall use that for my next steps.

    @Stas – Puppet’s package operations give an abstraction over the OS/distro-specific package management. For the problem you are referring to, you would need to take your own private copy of the JDK and use that to install. For example, if you accept the license agreement, download the binary installer and execute it (which may involve a bit of further interaction regarding registration), you end up with a JDK directory like jdk1.6.0_32. If you zip that up and put it on your own server, it should then be easy to write a Puppet script to download and install it (putting it in the right place and adding environment variable settings). A more polished approach would be to make your own package containing the JDK – then you could use Puppet’s package type to install it, specifying the source. (And that’s another thing I need to learn to do myself – I’ll see if I can use the OpenJDK packages as a model.)

    1. Hi Justin, thank you for the positive feedback 🙂 I’m not using Vagrant with master-slave so there won’t be an article about that. However the learning VM might well help with that. Good luck with Vagrant and Puppet.
