Machine Learning in Ruby An Introduction

Upload: brad-arner

Post on 12-Aug-2015


TRANSCRIPT

Page 1: Machine Learning In Ruby

Machine Learning in Ruby

An Introduction

Page 2: Machine Learning In Ruby

kNN Algorithm

Who am I?

Brad Arner

Head of Product

website: http://www.bradfordarner.com
email: [email protected]
twitter: @bradfordarner
github: github.com/arnerjohn

Page 3: Machine Learning In Ruby

The Goal

Page 4: Machine Learning In Ruby

Make Machine Learning Less Scary

Page 5: Machine Learning In Ruby

The Caveat

Page 6: Machine Learning In Ruby

Machine Learning is a Huge Topic

Page 7: Machine Learning In Ruby

I will talk about a general, workhorse algorithm

Page 8: Machine Learning In Ruby

k Nearest Neighbor (kNN)

Page 9: Machine Learning In Ruby

Fill in missing information

Page 10: Machine Learning In Ruby

kNN Algorithm

What is the value of a home?

[Figure: a neighborhood of twelve homes. Eleven have known values: $250K, $275K, $280K, $240K, $235K, $215K, $240K, $195K, $300K, $210K, $250K. One is unknown: $????. Area labels on the homes: 1650, 2100, 1950, 1700, 2100, 1800, 2100, 1700, 2100, 1650, 1800, and 1950 sq ft.]

Page 11: Machine Learning In Ruby

kNN Algorithm

What is in a name?

k => Number of Elements

N => Nearest / Distance

N => Neighbors / More Exist
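
The three parts of the name can be sketched in a few lines: measure a distance from the unknown item to every known item, rank by that distance, and keep the first k. The `k_nearest` helper and the one-dimensional area data below are made up for illustration and are not taken from the talk's repository.

```ruby
# Hypothetical helper: return the k known areas closest to the target area.
def k_nearest(known_areas, target_area, k)
  known_areas
    .sort_by { |area| (area - target_area).abs } # "Nearest": rank by distance
    .take(k)                                      # "k": how many neighbors to keep
end

known_areas = [1650, 2100, 1950, 1700, 1800]
nearest = k_nearest(known_areas, 1740, 3)
p nearest
```

With a larger k, more of the existing neighbors take part in the estimate; with k = 1, only the single closest one does.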

Page 12: Machine Learning In Ruby

kNN Algorithm

k = 3

[Figure: axes labeled Square Feet, Value, and Rooms]

Page 13: Machine Learning In Ruby

kNN Algorithm

k = 5

[Figure: axes labeled Square Feet, Value, and Rooms]

Page 14: Machine Learning In Ruby

kNN Algorithm

[Figure: axes labeled Square Feet, Value, and Rooms]

Page 15: Machine Learning In Ruby

kNN Algorithm

Value: $200K

Value: $220K

Value: $210K

Value: $195K

Value: $215K

Estimated Value: $208K
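
The estimate on this slide matches a plain average of the five neighbor values. A minimal sketch of that arithmetic, assuming an unweighted mean (the variable names are illustrative):

```ruby
# The five neighbor values shown on the slide, in dollars.
neighbor_values = [200_000, 220_000, 210_000, 195_000, 215_000]

# Unweighted mean of the neighbors' values.
estimate = neighbor_values.sum / neighbor_values.size.to_f

puts "Estimated Value: $#{(estimate / 1000).round}K"
```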

Page 16: Machine Learning In Ruby

The Code

Page 17: Machine Learning In Ruby

github.com/arnerjohn/machine_learning_in_ruby

eventually @ http://www.machinelearninginruby.com

Page 18: Machine Learning In Ruby

The Code

General Organization

class Property
  attr_accessor :rooms, :area, :type

  def initialize(options = {})
    options = {
      rooms: 1,
      area: 500,
      type: false
    }.merge(options)

    @rooms = options[:rooms]
    @area = options[:area]
    @type = options[:type]

    @neighbors = []
    @distance = nil
    @guess = nil
  end
  . . .
end

class Orchestrator
  def initialize(k)
    @property_list = []
    @k = k
    @rooms = { max: 0, min: 10000, range: 0 }
    @area = { max: 0, min: 10000, range: 0 }
  end
  . . .
end
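
The `Property` constructor relies on `Hash#merge` to fill in defaults: keys supplied by the caller win, and missing keys fall back to the default hash. A small sketch of that behavior on its own (the variable names are illustrative):

```ruby
defaults = { rooms: 1, area: 500, type: false }

# Caller-supplied keys override the defaults; everything else is kept.
partial = defaults.merge({ rooms: 3 })

# With no options at all, the result is just the defaults.
empty = defaults.merge({})

p partial
p empty
```

Because `type` defaults to `false`, a property created without a type starts out as one whose type still has to be guessed.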

Page 19: Machine Learning In Ruby

The Code

Program Execution

property_list = Orchestrator.new(3)

property_list.load_training_data

property_list.scale_features

property_list.add( Property.new({ rooms: 2, area: 1550, type: false }) )

property_list.add( Property.new({ rooms: 4, area: 1800, type: false }) )

property_list.determine_unknowns
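
For readers who want to see the whole flow in one place, here is a compact, self-contained sketch of the same steps (scale features, measure distances, take the k nearest, vote). It is not the talk's code: the hash-based data, the type labels, and the `feature_range` and `classify` helpers are all assumptions made for illustration.

```ruby
# Hypothetical training data; the talk's data.csv is not shown in the slides.
KNOWN = [
  { rooms: 1, area: 350,  type: "apartment" },
  { rooms: 2, area: 600,  type: "apartment" },
  { rooms: 2, area: 750,  type: "apartment" },
  { rooms: 4, area: 1700, type: "house" },
  { rooms: 4, area: 1900, type: "house" },
  { rooms: 5, area: 2100, type: "house" }
].freeze

# Range (max - min) of one feature across the known properties.
def feature_range(properties, key)
  min, max = properties.map { |property| property[key] }.minmax
  (max - min).to_f
end

# Classify one unknown property by majority vote of its k nearest neighbors.
def classify(unknown, known, k)
  room_range = feature_range(known, :rooms)
  area_range = feature_range(known, :area)

  nearest = known.sort_by do |property|
    # Divide each difference by the feature's range so both features count.
    rooms_delta = (property[:rooms] - unknown[:rooms]) / room_range
    area_delta = (property[:area] - unknown[:area]) / area_range
    Math.sqrt(rooms_delta**2 + area_delta**2)
  end.take(k)

  nearest.map { |property| property[:type] }.tally.max_by { |_type, count| count }.first
end

puts classify({ rooms: 4, area: 1800 }, KNOWN, 3)
puts classify({ rooms: 1, area: 500 }, KNOWN, 3)
```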

Page 20: Machine Learning In Ruby

The Code

Load Data

require "csv"

def load_training_data
  # headers: true makes each row addressable by column name
  file = CSV.read("data.csv", headers: true)

  file.each do |line|
    property = Property.new({
      rooms: line["rooms"].to_i,
      area: line["area"].to_i,
      type: line["type"]
    })

    add(property)
  end
end
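
To try the parsing step without a file on disk, the same row handling can be run against an inline string with `CSV.parse`. The column names come from the slide; the two sample rows and their type labels are made up, since the real data.csv is not shown.

```ruby
require "csv"

# Made-up sample rows using the column names from the slide.
sample = <<~DATA
  rooms,area,type
  1,350,apartment
  3,1400,house
DATA

# With headers: true, each row can be indexed by column name.
rows = CSV.parse(sample, headers: true).map do |line|
  { rooms: line["rooms"].to_i, area: line["area"].to_i, type: line["type"] }
end

p rows
```

CSV fields arrive as strings, which is why the numeric columns are converted with `to_i` before any distance is calculated.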

Page 21: Machine Learning In Ruby

The Code

Load Data

def scale_features
  # Collect each feature from the properties whose type is already known
  rooms_array = self.filter_knowns.map do |property|
    property.rooms
  end

  area_array = self.filter_knowns.map do |property|
    property.area
  end

  @rooms[:min] = rooms_array.min
  @rooms[:max] = rooms_array.max
  @rooms[:range] = rooms_array.max - rooms_array.min

  @area[:min] = area_array.min
  @area[:max] = area_array.max
  @area[:range] = area_array.max - area_array.min
end
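
The min, max, and range bookkeeping above can be tried in isolation. The `feature_stats` helper and the sample numbers below are hypothetical; the sketch only shows the shape of the result that `scale_features` stores for each feature.

```ruby
# Hypothetical helper: summarize one feature as min, max, and range.
def feature_stats(values)
  min, max = values.minmax
  { min: min, max: max, range: max - min }
end

rooms_stats = feature_stats([1, 2, 4, 6])
area_stats = feature_stats([350, 900, 1400, 2100])

p rooms_stats
p area_stats
```

The two ranges differ by orders of magnitude, which is the reason distances are later divided by the range of each feature.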

Page 22: Machine Learning In Ruby

The Code

Find Neighbors

def filter_unknowns
  property_list.select do |property|
    property.type == false
  end
end

def determine_unknowns
  self.filter_unknowns.each do |property|
    property.neighbors = self.filter_knowns
    property.calculate_neighbor_distances(self.rooms[:range], self.area[:range])

    property.guess_type(self.k)
  end
end
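
The slides do not show `filter_knowns`; presumably it is the mirror image of `filter_unknowns`. One way to get both groups in a single pass is `Enumerable#partition`, sketched here with a hypothetical `Listing` struct standing in for `Property`.

```ruby
# Hypothetical stand-in for Property, just enough for the example.
Listing = Struct.new(:rooms, :area, :type)

listings = [
  Listing.new(2, 600, "apartment"),
  Listing.new(4, 1800, false),      # type not known yet
  Listing.new(4, 1700, "house")
]

# partition returns [matching, non_matching] for the block's condition.
unknowns, knowns = listings.partition { |listing| listing.type == false }

puts "#{unknowns.size} unknown, #{knowns.size} known"
```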

Page 23: Machine Learning In Ruby

The Code

Calculate Distances

def calculate_neighbor_distances(room_range, area_range)
  @neighbors.each do |neighbor|
    rooms_delta = neighbor.rooms - self.rooms
    area_delta = neighbor.area - self.area
    rooms_delta = rooms_delta / room_range.to_f
    area_delta = area_delta / area_range.to_f

    neighbor.distance = Math.sqrt(rooms_delta*rooms_delta + area_delta*area_delta)
  end
end
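
A worked example may help show why each difference is divided by its feature's range before the Euclidean distance is taken. The `scaled_distance` helper, the neighbor, and the ranges below are made up for illustration; only the unknown property (2 rooms, 1550 sq ft) comes from the slides.

```ruby
# Hypothetical helper mirroring the calculation on the slide.
def scaled_distance(a, b, room_range, area_range)
  rooms_delta = (a[:rooms] - b[:rooms]) / room_range.to_f
  area_delta = (a[:area] - b[:area]) / area_range.to_f
  Math.sqrt(rooms_delta**2 + area_delta**2)
end

unknown = { rooms: 2, area: 1550 }
neighbor = { rooms: 5, area: 1950 }

# Without scaling, the 400 sq ft gap swamps the 3-room gap.
raw = Math.sqrt((5 - 2)**2 + (1950 - 1550)**2)

# With assumed ranges of 5 rooms and 500 sq ft, the deltas become 0.6 and 0.8.
scaled = scaled_distance(neighbor, unknown, 5, 500)

puts "raw: #{raw.round(2)}, scaled: #{scaled.round(2)}"
```

In the raw version the room count contributes almost nothing to the distance; after scaling, both features carry comparable weight.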

Page 24: Machine Learning In Ruby

The Code

Guess Type

def guess_type(k)
  guess_hash = gen_guess_hash(self.sort_neigbors_by_distance.take(k))

  @guess = assign_guess(guess_hash)

  msg = %Q{
    Property attrs => rooms: #{ @rooms }, area: #{ @area }
    The property type is guessed to be: #{ @guess }
  }

  puts msg

  return @guess
end

Page 25: Machine Learning In Ruby

The Code

Guess Type

def gen_guess_hash(properties)
  guess_hash = Hash.new(0)
  properties.each do |property|
    guess_hash[property.type] += 1
  end

  return guess_hash
end

def assign_guess(guess_hash)
  highest = 0
  guess = ""

  guess_hash.each do |key, value|
    if value > highest
      highest = value
      guess = key
    end
  end

  return guess
end
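
The two methods above amount to a majority vote: count how often each type appears among the k nearest neighbors, then pick the most frequent one. As a sketch, assuming Ruby 2.7 or later, the same counts can be produced with `Enumerable#tally`; the type labels here are made up.

```ruby
types = ["house", "apartment", "house"]

# Counting by hand, as on the slide: unseen keys start at 0.
counts = Hash.new(0)
types.each { |type| counts[type] += 1 }

# tally builds the same type => count hash in one call.
tallied = types.tally

# Pick the type with the highest count.
winner = counts.max_by { |_type, count| count }.first

puts winner
```

Choosing an odd k is a common way to reduce ties when there are only two types to choose from.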

Page 26: Machine Learning In Ruby

Questions?

Page 27: Machine Learning In Ruby

Thank You!