Machine Learning Week 11 Application Example Photo OCR

Photo OCR

Problem Description and Pipeline

What is the photo OCR problem?

  • Photo OCR = photo optical character recognition
    • With the growth of digital photography, there are lots of digital pictures
    • One idea which has interested many people is getting computers to understand those photos
    • The photo OCR problem is getting computers to read text in an image
      • Possible applications for this would include
        • Make searching easier (e.g. searching for photos based on words in them)
        • Car navigation
  • OCR of scanned documents is a comparatively easy problem
    • OCR from photos is much harder
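
The lecture breaks photo OCR into a pipeline: text detection → character segmentation → character classification. A minimal sketch of how those stages might compose, assuming Python; every function here is a hypothetical placeholder for a trained component, not a real library API:

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) box

def detect_text(image) -> List[Region]:
    """Stage 1: find regions of the image likely to contain text."""
    return []  # placeholder for a trained text detector

def segment_characters(image, region: Region) -> list:
    """Stage 2: split one text region into per-character patches."""
    return []  # placeholder for a trained segmenter

def classify_character(patch) -> str:
    """Stage 3: recognize a single character patch."""
    return "?"  # placeholder for a trained classifier

def photo_ocr(image) -> List[str]:
    """Compose the stages: one recognized string per detected region."""
    return [
        "".join(classify_character(p)
                for p in segment_characters(image, region))
        for region in detect_text(image)
    ]
```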

Machine Learning Week 10 Large Scale Machine Learning

Gradient Descent with Large Datasets

Learning With Large Datasets

Why large datasets?

  • One of the best ways to get high performance is to take a low-bias algorithm and train it on a lot of data

    • e.g. classification between confusable words (to/two/too)
  • We saw that, so long as you feed them lots of data, very different algorithms all perform pretty similarly
  • So it's good to learn with large datasets
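
The week's main tool for this setting is stochastic gradient descent, which updates the parameters using one example at a time instead of scanning the whole training set per step. A minimal sketch for linear regression, assuming NumPy and an (m, n) design matrix:

```python
import numpy as np

def sgd_linear_regression(X, y, alpha=0.01, epochs=1):
    """Stochastic gradient descent for linear regression: one cheap
    parameter update per training example, rather than one expensive
    pass over all m examples per update."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        for i in np.random.permutation(m):   # shuffle the examples
            error = X[i] @ theta - y[i]      # h(x_i) - y_i
            theta -= alpha * error * X[i]    # step using this one example
    return theta
```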

Machine Learning Week 9 Anomaly Detection

Density Estimation

Problem Motivation

  • We have a dataset which contains normal (non-anomalous) data
    • How we ensure they're normal is up to us
    • In reality it's OK if there are a few which aren't actually normal
  • Using that dataset as a reference point we can see if other examples are anomalous
  • First, using our training dataset we build a model

    • We can access this model using p(x)
      • This asks, "What is the probability that example x is normal?"
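
A minimal sketch of the Gaussian density-estimation model this week develops, assuming each feature is modeled by an independent Gaussian fit on the normal examples:

```python
import numpy as np

def fit(X):
    """Estimate a per-feature mean and variance from the normal examples."""
    return X.mean(axis=0), X.var(axis=0)

def p(x, mu, sigma2):
    """p(x) as a product of independent per-feature Gaussian densities."""
    return np.prod(np.exp(-(x - mu) ** 2 / (2 * sigma2))
                   / np.sqrt(2 * np.pi * sigma2))

# x is flagged as anomalous when p(x) < epsilon, with the threshold
# epsilon typically tuned on a labeled cross-validation set.
```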

Machine Learning Week 8 Unsupervised Learning

Clustering

Unsupervised Learning Introduction

  • What is clustering good for?
    • Market segmentation - group customers into different market segments
    • Social network analysis - Facebook "smartlists"
    • Organizing computer clusters and data centers for network layout and location
    • Astronomical data analysis - Understanding galaxy formation
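
The algorithm this week builds up to is K-means. A minimal sketch, assuming NumPy, of how it would group, say, customer feature vectors into K market segments:

```python
import numpy as np

def kmeans(X, K, iters=100, seed=0):
    """Minimal K-means: alternate between assigning each example to its
    nearest centroid and moving each centroid to its cluster's mean."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(iters):
        # distance from every example to every centroid: shape (m, K)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):  # avoid emptying a cluster
                centroids[k] = X[labels == k].mean(axis=0)
    return labels, centroids
```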

Machine Learning Week 7 Support Vector Machines

Large Margin Classification

Optimization Objective

An alternative view of logistic regression

  • Begin with logistic regression, see how we can modify it to get the SVM

    • Recall hθ(x) = g(θᵀx), where g is the sigmoid function
    • With hθ(x) close to 1, θᵀx must be much larger than 0
    • With hθ(x) close to 0, θᵀx must be much less than 0
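
A quick numeric check of those two conditions, assuming NumPy for the sigmoid:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# hθ(x) = g(θᵀx): θᵀx ≫ 0 drives hθ(x) toward 1, θᵀx ≪ 0 toward 0
for z in (-5.0, 0.0, 5.0):
    print(f"z = {z:+.1f}  ->  g(z) = {sigmoid(z):.4f}")
# z = -5.0  ->  g(z) = 0.0067
# z = +0.0  ->  g(z) = 0.5000
# z = +5.0  ->  g(z) = 0.9933
```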

Machine Learning Week 6 Advice for Applying Machine Learning

Evaluating a Learning Algorithm

Deciding what to try next

  • We know many learning algorithms
    • But how do we choose which algorithm, and which techniques, to explore?
    • Here we focus on deciding what avenues to try

Machine Learning Week 5 Neural Networks Learning

Cost Function and Backpropagation

Cost Function

  • L = total number of layers in the network
  • sl = number of units (not counting the bias unit) in layer l
  • K = number of output units/classes
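
In this notation, the regularized cost function for a multi-class network (the standard form from this week's lectures) is:

```latex
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
    \Big[ y_k^{(i)} \log\big(h_\Theta(x^{(i)})\big)_k
        + \big(1 - y_k^{(i)}\big) \log\Big(1 - \big(h_\Theta(x^{(i)})\big)_k\Big) \Big]
  + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}}
    \big(\Theta_{ji}^{(l)}\big)^2
```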

Machine Learning Week 4 Neural Networks Representation

Motivations

Non-linear Hypotheses

Why do we need neural networks?

  • Consider a supervised learning classification problem
    • logistic regression
    • g as usual is sigmoid function
    • If you include enough polynomial terms, you can get a hypothesis that fits the data.
    • However, this problem involves just two features, x1 and x2; many machine learning problems have far more features.
  • e.g. our housing example
    • 100 house features, predict odds of a house being sold in the next 6 months
    • Here, if you included all the quadratic terms (second order)
      • There are lots of them (x1², x1x2, x1x3, x1x4, ..., x1x100)
      • For the case of n = 100, you have about 5000 features
      • The number of features grows as O(n²)
  • Not a good way to build classifiers when n is large
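
A quick check of that count: with n features, the number of distinct quadratic terms is

```latex
\binom{n}{2} + n \;=\; \frac{n(n+1)}{2} \;\in\; O(n^2),
\qquad n = 100 \;\Rightarrow\; \frac{100 \cdot 101}{2} = 5050 \approx 5000
```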

Machine Learning Week 3 Logistic Regression

Classification and Representation

Classification

  • y is a discrete value

  • The variable to predict in these problems is y

    • y is either 0 or 1
      • 0 = negative class (absence of something)
      • 1 = positive class (presence of something)
    • Start with binary class problems
      • Later look at multiclass classification problem

Machine Learning Week 2 Linear Regression with Multiple Variables

Multivariate Linear Regression

Multiple Features

  • Multiple variables = multiple features
  • Suppose we now have more features with which to predict the price of a house
    • x1, x2, x3, x4 are the four features
      • x1 - size (square feet)
      • x2 - Number of bedrooms
      • x3 - Number of floors
      • x4 - Age of home (years)
    • y is the output variable (price)
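
With these features the multivariate hypothesis takes the standard form from the lectures:

```latex
h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \theta_4 x_4
            = \theta^{T} x \qquad (\text{with } x_0 = 1)
```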