canvas-lms/gems/canvas_partman
Cody Cutrer 2c2e687e59 qualified names fixes for migrations
refs CNVS-21900

Change-Id: I6b7e323e3c700aee3a7c86cb8f06fce78e5606df
Reviewed-on: https://gerrit.instructure.com/63384
Tested-by: Jenkins
Reviewed-by: Jacob Fugal <jacob@instructure.com>
Product-Review: Cody Cutrer <cody@instructure.com>
QA-Review: Cody Cutrer <cody@instructure.com>
2015-09-17 22:03:28 +00:00
..
lib qualified names fixes for migrations 2015-09-17 22:03:28 +00:00
spec refactor and improve partition management 2015-05-15 02:15:09 +00:00
.rspec canvas-partman & quiz log auditing partitioning 2014-11-11 19:06:00 +00:00
Gemfile refactor and improve partition management 2015-05-15 02:15:09 +00:00
LICENSE.txt canvas-partman & quiz log auditing partitioning 2014-11-11 19:06:00 +00:00
README.md canvas-partman - rolling partition migrations 2014-12-03 19:53:53 +00:00
canvas_partman.gemspec canvas-partman & quiz log auditing partitioning 2014-11-11 19:06:00 +00:00

README.md

CanvasPartman

Helpers for dynamically managing postgres partitions and an ActiveRecord::Base layer for managing partitioned models.

Usage

Let's say we have a Newsletter model which we want to partition by periods of months. We will always be creating one partition in advance for the coming month's publishes, and initially we'll create one for the current month so we can start storing data.

First, we begin by making our model partitionable. We do that by including the concern CanvasPartman::Concerns::Partitioned.

class Newsletter < ActiveRecord::Base
    include CanvasPartman::Concerns::Partitioned

    # self.partitioning_field = 'created_at'
    # self.partitioning_interval = :months
end

You can tune several partitioning parameters which you can find in the concern's source docs.

The next step is to set up our partitions so that we can start creating records.

partman = CanvasPartman::PartitionManager.new(Newsletter)

# a partition for this month's data:
partman.create_partition(Time.now)

# and another for the coming month:
partman.create_partition(1.month.from_now)

That's actually all you need to get started. However, you should be aware that due to way inheritance (and by extension, partitioning) works in postgres, any indices or constraints you define on the master table are not inherited when we create a partition. This means our migrations can not solely deal with the master table if we're doing anything other than adding/changing/dropping columns.

We'll see how to manage the partition schema in the next section.

Partition schema management

canvas-partman defines a custom migration type you can use to manage the schema of all existing (and to-be-created) partition tables. The migrations work, and read, just like regular ActiveRecord migrations but are run in two different ways depending on the context. Basically:

  1. all partition migrations are run along with the regular ones when using rake db:migrate
  2. all partition migrations for a specific master table are run when a new partition of that master table is created (this is handled implicitly by the PartitionManager)

Item #2 ensures that all newly-created partitions have a consistent schema with their predecessors which may have been migrated over time into their current schema.

Writing partition migrations

A custom Rails generator partition_migration is made available for generating skeleton partition migrations. Let's create a migration that adds an index on created_at for sorting to all existing partitions as well as ones to be created in the future:

rails generate partition_migration AddCreatedAtIndexToNewsletters newsletters
# create  db/migrate/20141115282317_add_created_at_index_to_newsletters.partitions.rb

Note the .partitions.rb part of the generated migration filename; more on this in the notes below, at the end of the section.

The generator requires 1 parameter which is similar to the stock migration generator and that is the name of the migration. The second optional argument is a string denoting the master table for the partitions. If you leave this unspecified, you will have to manually specify it in the migration itself.

Let's actually write the migration. Follow the inline comments:

# db/migrate/20141115282317_add_created_at_index_to_newsletters.partitions.rb

# We must subclass from CanvasPartman::Migration
# instead of ActiveRecord::Migration
class AddCreatedAtIndexToNewsletters < CanvasPartman::Migration
  self.master_table = :newsletters

  # If the base class can not be infered from the master table name
  # because, for example, it is namespaced, you may explicitly specify
  # it here:
  # self.base_class = MyApp::Newsletter

  def up
    # #with_each_partition() is a helper available to the migration
    # that allows you iterate over all existing partition tables.
    #
    # The passed block receives the name of the partition table as
    # the only parameter:
    with_each_partition do |partition|
      add_index partition, :created_at
    end
  end

  def down
    with_each_partition do |partition|
      remove_index partition, :created_at
    end
  end
end

Alternatively, you can define a reversible change runner using a change_table block which will be yielded with a table for every partition as you would expect:

class AddCreatedAtIndexToNewsletters < CanvasPartman::Migration
  self.master_table = :newsletters

  def change
    with_each_partition do |partition|
      change_table(partition) do |t|
        t.index :created_at
      end
    end
  end
end

A few notes:

  • Partition migration files must be "scoped" for canvas-partman to identify them and pick them up when creating new partitions. A scope is an identifier that comes after the name of the migration file and right before the rb extension, prefixed by a dot. In the example above, the scope is partitions. You can customize this by overriding CanvasPartman.migrations_scope = 'my_scope' but DO NOT LEAVE IT EMPTY.

Cascading changes

Adding, modifying, or removing columns do not need to be applied to each partition; instead we can rely on PG inheritance to take care of cascading those changes down to all partitions (and future ones.)

Because of this, we don't have to use CanvasPartman::Migration migrations at all; just use the regular ActiveRecord ones. Let's add a new column called publisher of type string to our newsletters table:

rails g migration AddPublisherToNewsletters
# create  db/migrate/20141115282318_add_publisher_to_newsletters.rb

And the migration:

# db/migrate/20141115282318_add_publisher_to_newsletters.rb

class AddPublisherToNewsletters < ActiveRecord::Migration
  def change
    change_table('newsletters') do |t|
      t.string :name
    end

    # Done! No need to worry about handling each partition,
    # pg inheritance will add this column to the master
    # table as well as the existing partition tables.
  end
end

TODO