mysqlconf2009: taking activerecord to the next level
DESCRIPTION
Taking ActiveRecord to the Next Level contains tips and tricks for using ActiveRecord with enterprise Ruby on Rails applications. Learn how to import and export multiple records, read from replicas, handle deadlocks, and use temporary tables. Use MySQL functionality such as index hints, ON DUPLICATE KEY UPDATE, INSERT SELECT, and more.
TRANSCRIPT
Taking ActiveRecord to the Next Level
Blythe [email protected]
http://snowgiraffe.com
Goal
Leverage advanced MySQL functionality with ActiveRecord
Disclaimer!!!!
Premature Optimization
ActiveRecord 101
What's going on under the covers?
ActiveRecord 101
class User < ActiveRecord::Base
end
CREATE TABLE `users` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) default NULL,
`password` varchar(255) default NULL,
`email` varchar(255) default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Database Table <-> ActiveRecord Model
ActiveRecord 101 with Animals!
class Animal < ActiveRecord::Base
end
CREATE TABLE `animals` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(255) default NULL,
`password` varchar(255) default NULL,
`email` varchar(255) default NULL,
`species_id` int(11) default NULL,
`created_at` datetime default NULL,
`updated_at` datetime default NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Database Table <-> ActiveRecord Model
Creating a Record
animal = Animal.new
animal.name = 'Jerry Giraffe'
animal.password = 'jerry'
animal.save!
INSERT INTO `animals`
(`name`, `updated_at`, `species_id`,
`password`, `email`, `created_at`)
VALUES('Jerry Giraffe', '2009-03-15 00:48:28',
NULL, 'jerry', NULL, '2009-03-15 00:48:28')
Updating a Record
animal.name = 'Jerry G'
animal.save!
UPDATE `animals`
SET `updated_at` = '2009-03-15 03:01:06',
`name` = 'Jerry G'
WHERE `id` = 1
Finding a Record
jerry = Animal.find :first,
:conditions => ['name = ?', 'Jerry G']
SELECT * FROM `animals`
WHERE (name = 'Jerry G') LIMIT 1
#shortcut
Animal.find_by_name 'Jerry G'
Representing Relationships
Species: name
Animal: name, password, fav_beer, updated_at, created_at, species_id
Representing Relationships (DDL)
CREATE TABLE `animals` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(35) NOT NULL,
  `email` varchar(40) default NULL,
  `fav_beer` enum('Guiness','Pabst','Redhook','Chimay') default 'Pabst',
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  `password` varchar(25) character set latin1 collate latin1_bin NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `species` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(255),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8
Representing Relationships (AR)
class Animal < ActiveRecord::Base
belongs_to :species
end
class Species < ActiveRecord::Base
has_many :animals
end
Representing Relationships (AR)
jerry.species
SELECT * FROM `species`
WHERE (`species`.`id` = 1)
species.animals
SELECT * FROM `animals`
WHERE (`animals`.species_id = 1)
Representing Relationships (AR)
giraffe = Species.find_by_name 'giraffe'
giraffe.animals << jerry
SELECT * FROM `species` WHERE (`species`.`name` = 'giraffe' ) LIMIT 1
UPDATE `animals` SET `species_id` = 1, `updated_at` = '2009-03-19 23:15:54' WHERE `id` = 7
Migration
• Set limits
• Set default values
• Identify NOT NULL columns
• Use enumerated columns
• Custom DDL
• Add (unique) indices
• Foreign Keys are great
• Primary Key Modifications
Migration 101
ruby script/generate scaffold Animal name:string password:string email:string fav_beer:string
class CreateAnimals < ActiveRecord::Migration
  def self.up
    create_table :animals do |t|
      t.string :name
      t.string :password
      t.string :email
      t.string :fav_beer
      t.timestamps
    end
  end
  def self.down
    drop_table :animals
  end
end
Set Limits
Default String is VARCHAR(255)
create_table :animals do |t|
  t.string :name, :limit => 35
  t.string :password, :limit => 25
  t.string :email, :limit => 40
  t.string :fav_beer, :limit => 40
  t.timestamps
end
Numeric Type Limits
t.integer :mysmallint, :limit => 2
"Smart types" determines numeric type for MySQL
:limit       Numeric Type   Column Size
1            tinyint        1 byte
2            smallint       2 bytes
3            mediumint      3 bytes
4, nil, 11   int(11)        4 bytes
5 to 8       bigint         8 bytes
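The :limit table above can be sketched as a plain Ruby lookup. This is a minimal illustration, not the MySQL adapter's source; the method name mysql_integer_type is hypothetical.

```ruby
# Hypothetical helper mirroring the :limit table above; not Rails source code.
def mysql_integer_type(limit = nil)
  case limit
  when 1 then 'tinyint'
  when 2 then 'smallint'
  when 3 then 'mediumint'
  when nil, 4, 11 then 'int(11)'
  when 5..8 then 'bigint'
  else raise ArgumentError, "No integer type has byte size #{limit}"
  end
end

puts mysql_integer_type(2)   # smallint
puts mysql_integer_type      # int(11)
```

Leaving :limit off a t.integer column therefore gives the 4-byte int(11), which is often larger than needed.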
Set columns to NOT NULL
create_table :animals do |t|
  t.string :name, :limit => 35, :null => false
  t.string :password, :limit => 25, :null => false
  t.string :email, :limit => 40
  t.string :fav_beer, :limit => 40
  t.timestamps
end
Set default values
create_table :animals do |t|
  t.string :name, :limit => 35, :null => false
  t.string :password, :limit => 25, :null => false
  t.string :email, :limit => 40, :default => nil
  t.string :fav_beer, :limit => 40, :default => 'Pabst'
  t.timestamps
end
Remove unneeded columns
create_table :animals do |t|
  t.string :name, :limit => 35, :null => false
  t.string :password, :limit => 25, :null => false
  t.string :email, :limit => 40, :default => nil
  t.string :fav_beer, :limit => 40, :default => 'Pabst'
  t.timestamps
end
Enumerated Column Plugin
create_table :animals do |t|
  t.string :name, :limit => 35, :null => false
  t.string :password, :limit => 25, :null => false
  t.string :email, :limit => 40, :default => nil
  t.enum :fav_beer, :default => 'Pabst',
    :limit => %w(Chimay Pabst Redhook)
  t.timestamps
end
Think about the table parameters
create_table :animals, :options => 'ENGINE=MyISAM' do |t|
  t.string :name, :limit => 35, :null => false
  t.string :password, :limit => 25, :null => false
  t.string :email, :limit => 40, :default => nil
  t.enum :fav_beer, :default => nil,
    :limit => %w(Chimay Pabst Redhook)
  t.timestamps
end
Custom DDL
create_table :animals do |t|
  t.string :name, :limit => 35, :null => false
  t.string :email, :limit => 40, :default => nil
  t.enum :fav_beer, :default => nil,
    :limit => %w(Chimay Pabst Redhook)
  t.timestamps
end
#case sensitive password (encrypted)
execute "ALTER TABLE `animals` ADD `password` varchar(25) character set latin1 collate latin1_bin NOT NULL"
Create (Unique) Indices
create_table :species do |t|
  t.string :name, :null => false, :limit => 25
end
add_index :species, :name, :unique => true, :name => 'uk_species_name'
ActiveRecord Uniqueness
class Species < ActiveRecord::Base
  validates_uniqueness_of :name
end
Doesn't Guarantee Data Integrity!
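Why the validation alone is not enough: it checks with a SELECT before the INSERT, so two requests can both pass the check before either one writes. The sketch below simulates that interleaving with a hypothetical in-memory FakeTable (an assumption for illustration, not ActiveRecord). Only the table that enforces uniqueness itself, as a unique index does, rejects the second insert.

```ruby
# Hypothetical in-memory stand-in for a database table; not ActiveRecord.
class FakeTable
  class DuplicateKey < StandardError; end

  attr_reader :rows

  def initialize(unique_index)
    @rows = []
    @unique_index = unique_index
  end

  # Plays the role of the SELECT issued by the validation.
  def exists?(name)
    @rows.include?(name)
  end

  # Plays the role of the INSERT; a unique index rejects duplicates here.
  def insert(name)
    raise DuplicateKey, name if @unique_index && @rows.include?(name)
    @rows << name
  end
end

# Two requests validate before either one inserts, then both insert.
def simulate_race(table, name)
  request_a_valid = !table.exists?(name)
  request_b_valid = !table.exists?(name)
  table.insert(name) if request_a_valid
  table.insert(name) if request_b_valid
  table.rows.count(name)
end

puts simulate_race(FakeTable.new(false), 'giraffe')   # prints 2: both rows inserted

begin
  simulate_race(FakeTable.new(true), 'giraffe')
rescue FakeTable::DuplicateKey
  puts 'second insert rejected by the unique index'
end
```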
I Heart Foreign Keys
Referential Integrity
The AR Way: Foreign Keys
class Species < ActiveRecord::Base
  has_many :animals, :dependent => :nullify
end
The Rails Way: Foreign Keys
class Species < ActiveRecord::Base
  has_many :animals, :dependent => :nullify
end
Dependent Value SQL Equivalent:
:nullify => ON DELETE SET NULL
:delete_all => ON DELETE CASCADE
:destroy => No SQL equivalent. Every associated record is instantiated and its callbacks are executed before destruction.
Redhills Foreign Key Migration Plugin to the rescue!
add_column :animals, :species_id, :integer,
:references => :species,
:name => 'fk_animal_species',
:on_delete => :set_null,
:on_update => :cascade
ALTER TABLE `animals` ADD `species_id` int(11);
ALTER TABLE animals ADD CONSTRAINT fk_animal_species FOREIGN KEY (species_id) REFERENCES species (id) ON UPDATE CASCADE ON DELETE SET NULL;
CREATE TABLE `animals` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(35) NOT NULL,
  `email` varchar(40) default NULL,
  `fav_beer` enum('Guiness','Pabst','Redhook','Chimay') default 'Pabst',
  `species_id` int(11) default NULL,
  `created_at` datetime default NULL,
  `updated_at` datetime default NULL,
  `password` varchar(25) character set latin1 collate latin1_bin NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Primary Keys
CREATE TABLE `species` (
  `id` int(11) NOT NULL auto_increment,
  `name` varchar(255),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8
Modify the Rails Primary Key
Change type with change_column
MySQL Migration Optimization Plugin
create_table :animals, :primary_key => "special_key",
  :primary_column => { :type => :integer, :limit => 2,
                       :precision => :unsigned, :scale => 3 } do |t|
end
CREATE TABLE `animals` (`special_key` smallint(3) UNSIGNED NOT NULL auto_increment PRIMARY KEY) ENGINE=InnoDB;
Advanced ActiveRecord
• Insert and update options
• Import bulk data
• Finder Options
• Tools: plugin AR-Extensions
ActiveRecord on Steroids: ar-extensions plugin
Additional Create and Update options
save(options={})
save!(options={})
create(args, options={})
create!(args, options={})
Options
:ignore
:on_duplicate_key_update
:keywords
:reload
:pre_sql
:post_sql
:ignore
Standard ActiveRecord:
Create the record if it does not already exist
unless Animal.find_by_name('Jerry G')
Animal.create!(:name => 'Jerry G',
:password => 'jerry')
end
:ignore
Ignore duplicates! One query, less code, fewer queries!
Animal.create!({:name => 'Jerry G',
:password => 'jerry'},
{:ignore => true})
:on_duplicate_key_update
Update the record if it exists; if not, create a new one.
The standard way takes a lot of code and performs two SQL queries!
jerry = Animal.find_by_name 'Jerry G'
jerry ||= Animal.new(:name => 'Jerry G')
jerry.password = 'frenchfry'
jerry.save!
:on_duplicate_key_update
jerry = Animal.new :name => 'Jerry G',
:password => 'frenchfry'
jerry.save! :on_duplicate_key_update =>
[:password, :updated_at]
INSERT INTO animals
(`name`, `updated_at`, `species_id`,
`password`,`email`, `created_at`)
VALUES('Jerry G', '2009-03-15 06:17:51', NULL,
'frenchfry', NULL, '2009-03-15 06:17:51')
ON DUPLICATE KEY UPDATE
`animals`.`password`=VALUES(`password`),
`animals`.`updated_at`=VALUES(`updated_at`)
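To make the shape of that statement concrete, here is a minimal sketch that assembles an INSERT with ON DUPLICATE KEY UPDATE in plain Ruby. It is not how ar-extensions builds its SQL; the upsert_sql helper and its naive quoting are assumptions for illustration, and real code should rely on the adapter's quoting.

```ruby
# Hypothetical helper; illustrates the SQL shape only.
# Quoting is deliberately naive (single quotes doubled, nil becomes NULL).
def upsert_sql(table, attributes, update_columns)
  quote = lambda do |value|
    value.nil? ? 'NULL' : "'#{value.to_s.gsub("'", "''")}'"
  end

  columns = attributes.keys.map { |c| "`#{c}`" }.join(', ')
  values  = attributes.values.map { |v| quote.call(v) }.join(', ')
  updates = update_columns.map { |c|
    "`#{table}`.`#{c}`=VALUES(`#{c}`)"
  }.join(', ')

  "INSERT INTO `#{table}` (#{columns}) VALUES (#{values}) " \
    "ON DUPLICATE KEY UPDATE #{updates}"
end

puts upsert_sql('animals',
                { :name => 'Jerry G', :password => 'frenchfry' },
                [:password])
```

The upsert only fires when the insert collides with a primary or unique key, so the name column needs a unique index for this to replace the find-then-save version.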
Reloading the instance
AR Data can become inconsistent with DB after an IGNORE, UPDATE, or ON DUPLICATE KEY UPDATE
reload executes more queries
For UPDATE the duplicate is automatically reloaded
jerry.email = '[email protected]'
jerry.save! :on_duplicate_key_update =>
[:password, :updated_at],
:reload => true,
:duplicate_columns => [:name]
More Customization
jerry.save(:keywords => 'LOW_PRIORITY',
:pre_sql => '/*Pre comment*/',
:post_sql =>
"/*#{__FILE__} #{__LINE__}*/")
/*Pre comment*/ UPDATE LOW_PRIORITY `animals`
SET `created_at` = '2009-03-15 06:13:48',
`species_id` = NULL, `email` = NULL,
`password` = 'frenchfry',
`updated_at` = '2009-03-15 06:45:38',
`name` = 'Jerry G'
WHERE `id` = 7/*animal_controller.rb 147 */
Import (Bulk Insert)
Instead of one-by-one, insert a ton of records fast
Import (Bulk Insert)
Standard way: Insert each animal one by one
Animal.create!(:name => 'dolly dolphin',
:password => 'dolly')
Animal.create!(:name => 'big birdie',
:password => 'birdie')
and so on….
Fast Import: One INSERT
animals = [ Animal.new(:name => 'dolly dolphin',
:password => 'dolly'),
Animal.new(:name => 'big birdie',
:password => 'birdie')]
Animal.import animals
INSERT INTO `animals`
(`id`,`name`,`email`,`fav_beer`,`created_at`,`updated_at`,`password`)
VALUES
(NULL,'dolly dolphin',NULL,'Pabst',
'2009-03-20 00:17:15','2009-03-20 00:17:15','dolly'),
(NULL,'big birdie',NULL,'Pabst',
'2009-03-20 00:17:15','2009-03-20 00:17:15','birdie')
ON DUPLICATE KEY UPDATE `animals`.`updated_at`=VALUES(`updated_at`)
Fastest Import: fewer columns
columns = [ :name, :password ]
values = [['dolly dolphin', 'dolly'],
['big birdie', 'birdie']]
options = {:validate => false,
:timestamps => false}
Animal.import columns, values, options
INSERT INTO `animals` (`name`,`password`)
VALUES
('dolly dolphin','dolly'),('big birdie','birdie')
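The reason the import is fast is that many rows travel in a single statement. A minimal sketch of that idea, assuming a hypothetical multi_row_insert_sql helper with naive quoting (ar-extensions does this internally and differently):

```ruby
# Hypothetical helper; shows how many rows collapse into one statement.
# Values are quoted naively here; real code should use the adapter's quoting.
def multi_row_insert_sql(table, columns, rows)
  column_list = columns.map { |c| "`#{c}`" }.join(',')
  value_list = rows.map { |row|
    '(' + row.map { |v| "'#{v.to_s.gsub("'", "''")}'" }.join(',') + ')'
  }.join(',')
  "INSERT INTO `#{table}` (#{column_list}) VALUES #{value_list}"
end

columns = [:name, :password]
values  = [['dolly dolphin', 'dolly'], ['big birdie', 'birdie']]
puts multi_row_insert_sql('animals', columns, values)
```

Two rows or two thousand, the database sees one round trip instead of one per record.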
Insert Select
Standard: Query and Insert one by one
Species.find(:all).each do |s|
SpeciesZoo.create!(:species_id => s.id,
:zoo_id => zoo.id,
:extra_info => 'awesome')
end
Executes a query for each species
INSERT INTO `species_zoos` (`zoo_id`, `id`, `species_id`, `extra_info`)
VALUES (1, 3, 3, 'awesome')
INSERT INTO `species_zoos` (`zoo_id`, `id`, `species_id`, `extra_info`)
VALUES (1, 3, 2 , 'awesome')
And so on…
Insert Select Import
SpeciesZoo.insert_select(
:from => :species,
:select => ['species.id, ?', zoo],
:into => [:species_id, :zoo_id])
One INSERT statement
INSERT INTO `species_zoos`
( `species_id`, `zoo_id` )
SELECT species.id, 1 FROM `species`
Temporary Tables
Not so good for slave replication
Can be used as a sandbox then imported into a real table with ar-extensions gem
Animal.create_temporary_table do |t|
t.create!(:name => 'giraffe',
:password => 'goo')
Animal.insert_select(
:from => t,
:select => [:name, :password, :fav_beer],
:into => [:name, :password, :fav_beer],
:on_duplicate_key_update =>
[:password, :fav_beer])
end
Customizing Find
Additional finder options
:keywords
:pre_sql
:post_sql
:index_hint
Customizing Find
Animal.find(:all,
:conditions => ['name = ?', 'Jerry G'],
:keywords => 'HIGH_PRIORITY',
:pre_sql => '/*Pre comment*/',
:post_sql => 'FOR UPDATE /*After the fact*/',
:index_hint => 'USE INDEX (uk_animal_name)'
)
/*Pre comment*/ SELECT HIGH_PRIORITY *
FROM `animals` USE INDEX (uk_animal_name)
WHERE (name = 'Jerry G') FOR UPDATE
/*After the fact*/
Need more? Get dirty with find_by_sql
sql = Animal.send :finder_sql_to_string,
:conditions => ['name = ?', 'Jerry G']
sql.gsub! /WHERE/, 'where /* Dirty hand */'
Animal.find_by_sql sql
More: find_union & count_union
Animal.find_union(
{:conditions => ['animals.name like ?', 'Jerry%']},
{:conditions => ['species.name = ?', 'giraffe'],
:include => :species}
)
(SELECT `animals`.* FROM `animals`
WHERE (animals.name like 'Jerry%'))
UNION
(SELECT `animals`.* FROM `animals`
LEFT OUTER JOIN `species` ON
`species`.id = `animals`.species_id
WHERE (species.name = 'giraffe'))
Finder Issues: Speed and Memory
paginate - less loaded into memory
:select option - Data is retrieved faster when fewer columns are selected
Paginated Finders
Rails 2.3.2 includes :batch_size option
Animal.find_each(:batch_size => 2) do |animal|
#do something
end
Will Paginate Plugin
page = 1
begin
animals = Animal.paginate :per_page => 2, :page => page
animals.each{|animal| …do something… }
end while (page = animals.next_page)
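Both approaches above keep memory flat by loading a few rows at a time. The sketch below shows the idea with an in-memory array, assuming batches are fetched in id order starting after the last id seen. The fetch_batch and each_in_batches helpers are hypothetical stand-ins, not the Rails or will_paginate implementation.

```ruby
# Hypothetical stand-in for "SELECT ... WHERE id > ? ORDER BY id LIMIT ?".
def fetch_batch(rows, after_id, batch_size)
  rows.select { |r| r[:id] > after_id }
      .sort_by { |r| r[:id] }
      .first(batch_size)
end

# Yields every row while holding at most batch_size rows at a time.
def each_in_batches(rows, batch_size)
  last_id = 0
  loop do
    batch = fetch_batch(rows, last_id, batch_size)
    break if batch.empty?
    batch.each { |row| yield row }
    last_id = batch.last[:id]
  end
end

animals = (1..5).map { |i| { :id => i, :name => "animal #{i}" } }
each_in_batches(animals, 2) { |animal| puts animal[:name] }
```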
Paginating Find Plugin
:select
Data is retrieved faster when fewer columns are selected
Animal.find :first, :select => 'name'
:include hates :select
But :select is ignored with eager loading (:include)!
Animal.find :first,
:select => 'animals.name, species.name',
:include => :species,
:conditions => ['species.name like ?', 'giraffe']
SELECT `animals`.`id` AS t0_r0,
`animals`.`name` AS t0_r1,
`animals`.`email` AS t0_r2,
`animals`.`fav_beer` AS t0_r3,
`animals`.`created_at` AS t0_r4,
`animals`.`updated_at` AS t0_r5,
`animals`.`password` AS t0_r6,
`animals`.`species_id` AS t0_r7,
`species`.`id` AS t1_r0,
`species`.`name` AS t1_r1
FROM `animals` LEFT OUTER JOIN `species` ON
`species`.id = `animals`.species_id
WHERE (species.name like 'giraffe') LIMIT 1
Alternatives to Eager Loading
Eager loading for sparse :include can be time consuming
Use :joins instead of :include
Eager Load Plugin
Rails 2.3.2 Query Cache helps
ActiveRecordContext Plugin
PiggyBack Plugin
:joins instead of :include
Eager loading is slow in Rails and can be slow on the database.
Use an (inner) :joins instead of :include
animal = Animal.find :first,
:select => 'animals.name, species.name as spec_name',
:joins => :species,
:conditions => ['species.name like ?', 'giraffe']
animal.spec_name == 'giraffe'
Force it with Eager loading plugins
Eager loading is slow in Rails and can be slow on the database.
eload-select plugin
Help from Rails 2 Query Cache
Animals of the same species are only loaded once
ActiveRecord::Base.cache {
Animal.find(:all).each {|a| a.species }
}
Animal Load (1.8ms) SELECT * FROM `animals`
Species Load (0.3ms) SELECT * FROM `species` WHERE (`species`.`id` = 2)
CACHE (0.0ms) SELECT * FROM `species` WHERE (`species`.`id` = 2)
Species Load (0.3ms) SELECT * FROM `species` WHERE (`species`.`id` = 1)
CACHE (0.0ms) SELECT * FROM `species` WHERE (`species`.`id` = 1)
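The log above shows the second lookup of each species answered from the cache. The idea can be sketched as memoizing results by their SQL string. TinyQueryCache is a hypothetical class for illustration, not the Rails implementation, and it omits clearing the cache on writes.

```ruby
# Hypothetical query cache: results are memoized by their SQL string.
class TinyQueryCache
  attr_reader :log

  def initialize
    @results = {}
    @log = []
  end

  # The block stands in for the real database round trip.
  def select_all(sql)
    if @results.key?(sql)
      @log << "CACHE #{sql}"
    else
      @log << "LOAD  #{sql}"
      @results[sql] = yield
    end
    @results[sql]
  end
end

cache = TinyQueryCache.new
[2, 2, 1, 1].each do |species_id|
  sql = "SELECT * FROM `species` WHERE (`species`.`id` = #{species_id})"
  cache.select_all(sql) { { :id => species_id } }
end
puts cache.log
```

Because the key is the exact SQL text, two queries that differ only in whitespace or parameter order would both hit the database.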
ActiveRecordContext Plugin
ActiveRecord::Base.with_context do
animals = Animal.find :all
Species.prefetch animals
animals.each {|a| a.species }
end
Animal Load (0.4ms) SELECT * FROM `animals`
[Context] Storing Animal records: 1, 2, 3, 4, 5, 6, and 7
Species Load (0.4ms) SELECT * FROM `species` WHERE (`species`.`id` IN( 2,1 ))
[Context] Storing Species records: 2 and 1
[Context] Found Species #2
[Context] Found Species #2
[Context] Found Species #1
[Context] Found Species #1
Piggyback Plugin
Delegate records with :has_one and :belongs_to associations
Great for accessing extension tables with TEXT or BLOB
Piggyback Plugin
Uses joins to delegate records from :has_one and :belongs_to associations
class Animal < ActiveRecord::Base
belongs_to :species
piggy_back :species_name, :from => :species,
:attributes => [:name]
end
animal = Animal.find :first, :piggy => :species_name
animal.species_name == 'giraffe'
SELECT animals.*, species.name AS species_name
FROM `animals` LEFT JOIN species ON
species.id=animals.species_id LIMIT 1
Avoiding Deadlock
Deadlock Retry plugin - retries query up to 3 times
Batch Operations (AR-Extension plugin)
Animal.delete_all(['name like ?','giraffe%'],
:batch_size => 50)
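The retry idea behind the Deadlock Retry plugin can be sketched in a few lines. This is an illustration under assumptions, not the plugin's code: DeadlockError and with_deadlock_retry are hypothetical names, and a real version would inspect the adapter's error message for a deadlock.

```ruby
# Hypothetical error class standing in for the adapter's deadlock error.
class DeadlockError < StandardError; end

MAXIMUM_RETRIES = 3

# Runs the block, retrying up to MAXIMUM_RETRIES times on DeadlockError.
def with_deadlock_retry
  retries = 0
  begin
    yield
  rescue DeadlockError
    raise if retries >= MAXIMUM_RETRIES
    retries += 1
    retry
  end
end

attempts = 0
result = with_deadlock_retry do
  attempts += 1
  raise DeadlockError, 'Deadlock found when trying to get lock' if attempts < 3
  'committed'
end
puts "#{result} after #{attempts} attempts"   # committed after 3 attempts
```

Retrying is only safe when the whole block is one transaction that was rolled back, so nothing is applied twice.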
Reading from a Replica
Masochism plugin
Export Data
ar-fixtures: Export entire table to yaml
ar-extensions: export to csv
ar-dumper: export (paginated) to yaml, xml or csv
Cache! Show me the money!
Query Cache
Memcache
Static Record Cache plugin
ActiveRecord plugins and gems
AR-Extensions http://www.continuousthinking.com/tags/arext
Piggy Back http://railsexpress.de/svn/plugins/piggy_back/trunk/README
Eager Loading http://www.snowgiraffe.com/tech/?p=329
Active Record Context http://svn.techno-weenie.net/projects/plugins/active_record_context/
Will Paginate http://github.com/mislav/will_paginate/tree/master
Deadlock Retry http://agilewebdevelopment.com/plugins/deadlock_retry
Paginating Find http://www.railslodge.com/plugins/287-paginating-find
Static Record Cache http://github.com/blythedunham/static_record_cache/tree/master
Other Useful Plugins
Migration Plugins
Redhills Consulting Foreign Key Migrations http://agilewebdevelopment.com/plugins/foreign_key_migrations
Enum Column http://enum-column.rubyforge.org/
MySQL Migration Optimizer http://github.com/blythedunham/mysql_migration_optimizer/tree/master
Replica Plugins
Masochism http://github.com/technoweenie/masochism/tree/master
Active Delegate http://www.robbyonrails.com/articles/2007/10/05/multiple-database-connections-in-ruby-on-rails
Export Plugins
Ar-fixtures http://github.com/topfunky/ar_fixtures/tree/master
Ar-dumper http://github.com/blythedunham/ar_dumper/tree/master
Questions?
Thanks for attending! I hope you are on the next level.
Slides available at: http://snowgiraffe.com
Pics
Biz card: http://www.twozdai.com/
Toolbox: http://www.flickr.com/photos/29802022@N07/2971774302/
Cash: http://www.flickr.com/photos/gnerk/2466566500/
Emoo cow: http://blog.petegraham.co.uk/2007/05/15/animals-with-emo-haircuts/
Giraffe in Car: http://www.norrishallstudio.com/assets/img/products/switchplates/Giraffe_in_Car_SO.jpg
Leafcutters: http://www.makingthelink.co.uk/Leafcutter%20Ants%20Corcovado.jpg
Puppy: http://www.flickr.com/photos/todorrovic/2287792473/
Dump truck: http://farm3.static.flickr.com/2065/2197925262_bd2726c3fa.jpg?v=0
Ignore sign: http://www.flickr.com/photos/alexsuarez/2504638107/
Tigers: http://www.flickr.com/photos/sharynmorrow/19981568/
Single Leafcutter: http://www.flickr.com/photos/mattblucas/2176783448/
Giraffe with tongue: http://www.flickr.com/photos/ucumari/2570608134/
Giraffe in a can: http://www.flickr.com/photos/10159247@N04/2877489356/
Beaver: http://www.flickr.com/photos/krrrista/2286455954/
Mini Pig: http://www.richardaustinimages.com/
Giraffe herd: http://www.flickr.com/photos/acastellano/2260928018/
Giraffe skin: http://www.flickr.com/photos/tomorrowstand/1806095442
Dirty hands: http://www.flickr.com/photos/dragonhide/2372544373/
Blythe twins: http://www.flickr.com/photos/petitewanderlings/434252916/
Dead lock: http://www.flickr.com/photos/fpsurgeon/2453544236/
Foreign Keys: http://www.flickr.com/photos/zeeny79/347753999/
Gorilla: http://www.flickr.com/photos/shuttershrink/425613091/
Speed of light: http://www.flickr.com/photos/laserstars/908946494/
Fat Giraffe: http://funnies.com/funny-picture-fat-giraffe.htm