machine learning. quotes for machine learning “a breakthrough in machine learning would be worth...

Machine Learning

Quotes for Machine learning

“A breakthrough in machine learning would be worthten Microsofts” (Bill Gates, Chairman, Microsoft)

“Machine learning is the next Internet” (Tony Tether, Director, DARPA)

“Machine learning is the hot new thing” (John Hennessy, President, Stanford)

“Web rankings today are mostly a matter of machine learning” (Prabhakar Raghavan, Dir. Research, Yahoo)

“Machine learning is today’s discontinuity” (Jerry Yang, CEO, Yahoo)

“ What are all the important questions in Machine Learning”( Student(X) , Pre-Final year B.Tech-Cse, SRM University)

Traditional Programming

Machine Learning

ComputerComputerData

ProgramOutput

ComputerComputerData

OutputProgram

So What is Machine Learning?

• Getting computers to program themselves• Writing software is the bottleneck• Let the data do the work instead!

Applications• Web search • Computational biology• Finance• E-commerce• Space exploration• Robotics• Information extraction• Social networks• Debugging• [And your favorite area]

ML is growing growing…

• More than tens thousands of machine learning algorithms

• Hundreds are adding every year• Every machine learning algorithm has three

components:– Representation– Evaluation– Optimization

Representation

• Decision trees• Sets of rules / Logic programs• Instances• Graphical models (Bayes/Markov nets)• Neural networks• Support vector machines• Model ensembles• Etc.

Evaluation

• Accuracy• Precision and recall• Squared error• Likelihood• Posterior probability• Cost / Utility• Margin• Entropy• K-L divergence• Etc.

Optimization

• Combinatorial optimization– E.g.: Greedy search

• Convex optimization– E.g.: Gradient descent

• Constrained optimization– E.g.: Linear programming

Types of Machine Learning

• Supervised (inductive) learning– Training data includes desired outputs

• Unsupervised learning– Training data does not include desired outputs

• Semi-supervised learning– Training data includes a few desired outputs

• Reinforcement learning– Rewards from sequence of actions

Supervised Learning• It is the task of inferring a function from

labeled data. The trained data consist of set of trained examples. The supervised learning analyzes the trained data and produces an inferred function (which can be used for new samples)

Supervised Learning- Application

• Bio/che informatics• Handwriting recognition• OCR• Spam detection• Pattern recognition• Speech recognition

Unsupervised learning

• The examples given to the learner are unlabeled data. There is no wrong or improper values to achieve the optimized solution.

• Clustering (K means , hierarchical clustering)• Hidden markov models• Self organizing maps (AOM)• Adaptive resonance theory (ART)• Etc

Semi – supervised Learning• A sub class of supervised learning, taking small

amount of labeled data and large amount of unlabeled data.

• It falls between unsupervised learning ( without any labeled training data) and supervised

learning (completely labeled training data). Example : heuristic function (From the calulated

distance between two cities to discovers the distance between known to unknown cities)

Reinforcement Learning

• In the Markov decision processed environments, the dynamic programming techniques should apply.

• It differs from slandered supervised learning in that correct input /output pairs are never presented.

• Q-Learning algorithm• Sarsa algorithm

ML in Practice• Understanding domain, prior knowledge, and

goals• Data integration, selection, cleaning,

pre-processing, etc.• Learning models• Interpreting results• Consolidating and deploying discovered

knowledge• Loop

ML is Just like gardening?

Seeds

Nutrients

Gardener

Plants

You

Data

Algorithms

Programs

Current “hot” practical issues

• Learning from labeled and unlabeled data.• Nonlinear embeddings.• MDPs and stochastic games.• Adaptive programs and environments.• Kernels, comp bio, many others, ...

Learning from labeled and unlabeled data

• Often unlabeled data is cheap, labeled data is expensive.

• unlabeled data can suggest which hypothesis are a-priori more reasonable.

• Algorithmically, can use to bootstrap.– E.g., e.g., words on page; words pointing to

page

machine learning. quotes for machine learning “a breakthrough in machine learning would be worth...

Documents