# Machine Learning Week11 Application Example Photo OCR

## Photo OCR

### Problem Description and Pipeline

#### What is the photo OCR problem?

• Photo OCR = photo optical character recognition
• With the growth of digital photography, there are now lots of digital pictures
• One idea which has interested many people is getting computers to understand those photos
• The photo OCR problem is getting computers to read text in an image
• Possible applications for this include
• Making search easier (e.g. searching for photos based on the words in them)
• OCR of scanned documents is a comparatively easy problem
• Doing it from photos is really hard, which is why the problem is broken into the pipeline sketched below
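The Week 11 pipeline breaks photo OCR into three stages: text detection, character segmentation, and character classification. A minimal sketch of how the stages compose; every function body here is a placeholder of my own, not course code (in the course each stage is its own ML module, e.g. a sliding-window classifier for text detection):

```python
# A minimal sketch of the photo OCR pipeline stages (placeholders only).

def detect_text_regions(image):
    # Stage 1: find regions of the image that contain text.
    return []  # placeholder: list of cropped text regions

def segment_characters(region):
    # Stage 2: split a text region into single-character images.
    return []  # placeholder: list of character images

def classify_character(char_image):
    # Stage 3: recognize which character a single image shows.
    return "?"  # placeholder: the predicted character

def photo_ocr(image):
    words = []
    for region in detect_text_regions(image):
        chars = [classify_character(c) for c in segment_characters(region)]
        words.append("".join(chars))
    return words
```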

# Machine Learning Week10 Large Scale Machine Learning

## Gradient Descent with Large Datasets

### Learning With Large Datasets

#### Why large datasets?

• One of the best ways to get high performance is to take a low-bias algorithm and train it on a lot of data

• e.g. classification between confusable words
• We saw that, so long as you feed them enough data, very different algorithms all perform similarly well
• So it's good to learn with large datasets - the catch is that each step of ordinary (batch) gradient descent must scan the whole training set, which motivates the stochastic version sketched below
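A minimal sketch of stochastic gradient descent for linear regression, this week's main technique; the learning rate, epoch count, and synthetic data are illustrative choices, not from the course:

```python
import numpy as np

def stochastic_gradient_descent(X, y, alpha=0.01, epochs=10):
    """SGD for linear regression: update theta after every single
    example, so each step stays cheap even for huge training sets."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        # Visit the examples in a fresh random order each pass.
        for i in np.random.permutation(m):
            error = X[i] @ theta - y[i]    # h_theta(x_i) - y_i
            theta -= alpha * error * X[i]  # step on this one example
    return theta

# Illustrative usage on synthetic data (first column is the bias term).
X = np.c_[np.ones(1000), np.random.randn(1000, 2)]
y = X @ np.array([1.0, 2.0, -3.0]) + 0.1 * np.random.randn(1000)
print(stochastic_gradient_descent(X, y))  # should approach [1.0, 2.0, -3.0]
```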

# Machine Learning Week9 Anomaly Detection

## Density Estimation

### Problem Motivation

• We have a dataset which contains normal data
• How we ensure the examples are normal is up to us
• In reality it's OK if a few of them aren't actually normal
• Using that dataset as a reference point, we can test whether other examples are anomalous
• First, using our training dataset, we build a model

• We can query this model using p(x)
• This asks, "What is the probability that example x is normal?" (a sketch of this model follows)
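Concretely, later in the week p(x) is built as a product of per-feature Gaussians fitted to the training set, and x is flagged as anomalous when p(x) < ε. A minimal sketch; the ε value below is illustrative (the course chooses it on a labeled cross-validation set):

```python
import numpy as np

def fit_gaussians(X):
    # Fit one Gaussian per feature: mu_j and sigma^2_j per column of X.
    return X.mean(axis=0), X.var(axis=0)

def p(x, mu, sigma2):
    # p(x) = product over features j of N(x_j; mu_j, sigma^2_j)
    dens = np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
    return dens.prod()

def is_anomaly(x, mu, sigma2, epsilon=1e-3):
    # Anomalous exactly when the modeled probability falls below epsilon.
    return p(x, mu, sigma2) < epsilon
```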

# Machine Learning Week8 Unsupervised Learning

## Clustering

### Unsupervised Learning Introduction

• What is clustering good for? (the week's main algorithm, K-means, is sketched after this list)
• Market segmentation - group customers into different market segments
• Social network analysis - Facebook "smartlists"
• Organizing computer clusters and data centers for network layout and location
• Astronomical data analysis - Understanding galaxy formation
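A minimal sketch of K-means, the clustering algorithm this week introduces, alternating its two steps: assign each example to its nearest centroid, then move each centroid to the mean of its assigned examples. The initialization scheme and iteration count here are illustrative:

```python
import numpy as np

def kmeans(X, K, iters=100):
    X = np.asarray(X, dtype=float)
    # Initialize centroids to K randomly chosen training examples.
    centroids = X[np.random.choice(len(X), K, replace=False)]
    for _ in range(iters):
        # Cluster assignment step: index of the nearest centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move-centroid step: mean of the examples assigned to each cluster.
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    return labels, centroids
```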

# Machine Learning Week7 Support Vector Machines

## Large Margin Classification

### Optimization Objective

#### An alternative view of logistic regression

• Begin with logistic regression and see how we can modify it to get the SVM

• For hθ(x) close to 1, θᵀx must be much larger than 0
• For hθ(x) close to 0, θᵀx must be much less than 0
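For reference, the logistic hypothesis and its per-example cost; the SVM keeps this overall shape but replaces the two log terms with piecewise-linear approximations, written cost₁(θᵀx) and cost₀(θᵀx) in the course:

```latex
h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}

\mathrm{cost}(x, y) = -y \,\log h_\theta(x) - (1 - y)\,\log\bigl(1 - h_\theta(x)\bigr)
```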

# Machine Learning Week6 Advice for Applying Machine Learning

## Evaluating a Learning Algorithm

### Deciding what to try next

• We know many learning algorithms
• But how do we choose which algorithm, and which of the various improvement techniques, to explore?
• Here we focus on deciding what avenues to try

# Machine Learning Week5 Neural Networks Learning

## Cost Function and Backpropagation

### Cost Function

• L = total number of layers in the network
• s_l = number of units (not counting the bias unit) in layer l
• K = number of output units/classes
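With this notation (and m training examples), the regularized neural network cost function generalizes the logistic regression cost to K output units:

```latex
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \left[ y_k^{(i)} \log \bigl( h_\Theta(x^{(i)}) \bigr)_k
       + \bigl( 1 - y_k^{(i)} \bigr)
         \log \Bigl( 1 - \bigl( h_\Theta(x^{(i)}) \bigr)_k \Bigr) \right]
  + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}}
    \bigl( \Theta_{ji}^{(l)} \bigr)^2
```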

# Machine Learning Week4 Neural Networks Representation

## Motivations

### Non-linear Hypotheses

#### Why do we need neural networks?

• Consider a supervised learning classification problem
• logistic regression
• g, as usual, is the sigmoid function
• If you include enough polynomial terms, you can get a hypothesis that fits the data
• However, this problem involves just two features, x1 and x2; many machine learning problems have far more features
• e.g. our housing example
• 100 house features; predict the odds of the house being sold in the next 6 months
• Here, if you included all the quadratic (second-order) terms
• There are lots of them (x1², x1x2, x1x4, ..., x1x100)
• For the case of n = 100, you have about 5000 features (counted below)
• The number of features grows as O(n²)
• Not a good way to build classifiers when n is large
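The "about 5000" is just the count of second-order terms: n squares plus the n-choose-2 cross terms:

```latex
n + \binom{n}{2} = \frac{n(n+1)}{2}
\qquad \text{for } n = 100: \quad \frac{100 \cdot 101}{2} = 5050 \approx 5000
```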

# Machine Learning Week3 Logistic Regression

## Classification and Representation

### Classification

• y is a discrete value

• The output variable in these problems is y

• y is either 0 or 1
• 0 = the negative class (absence of something, e.g. a benign tumor)
• 1 = the positive class (presence of something, e.g. a malignant tumor)
• Later we look at the multiclass classification problem

# Machine Learning Week2 Linear Regression with Multiple Variables

## Multivariate Linear Regression

### Multiple Features

• Multiple variables = multiple features
• Suppose, in a new scheme, we have more features with which to predict the price of a house
• x1, x2, x3, x4 are the four features
• x1 - size (square feet)
• x2 - Number of bedrooms
• x3 - Number of floors
• x4 - Age of home (years)
• y is the output variable (price); the hypothesis combining these features is written out below
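With the conventional x0 = 1, the multivariate hypothesis is the linear combination of these features:

```latex
h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \theta_4 x_4
            = \theta^T x, \qquad x_0 = 1
```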