CHAPTER 1
Introduction
The main focus of machine learning is making decisions or predictions based on data. There
are a number of other fields with significant overlap in technique, but differences in focus:
in economics and psychology, the goal is to discover underlying causal processes, and in
statistics it is to find a model that fits a data set well. (This story is paraphrased from a
post on 9/4/12 at andrewgelman.com.) In those fields, the end product is a
model. In machine learning, we often fit models, but as a means to the end of making good
predictions or decisions.
As machine-learning (ML) methods have improved in their capability and scope, ML
has become the best way, measured in terms of speed, human engineering time, and
robustness, to build many applications. Great examples are face detection, speech
recognition, and many kinds of language-processing tasks. Almost any application that involves
understanding data or signals that come from the real world can be best addressed using
machine learning.
One crucial aspect of machine-learning approaches to solving problems is that human
(and often undervalued) engineering plays an important role. A human still has to frame
the problem: acquire and
organize data, design a space of possible solutions, select a learning algorithm and its pa-
rameters, apply the algorithm to the data, validate the resulting solution to decide whether
it’s good enough to use, etc. These steps are of great importance.
The conceptual basis of learning from data is the problem of induction: Why do we think
that previously seen data will help us predict the future? This is a serious philosophical
problem of long standing. We will operationalize it by making assumptions, such as that
all training data are IID (independent and identically distributed) and that queries will be
drawn from the same distribution as the training data, or that the answer comes from a set
of possible answers known in advance.
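To make the operational assumption concrete, here is a minimal sketch in Python (using
numpy; the Gaussian distribution and the sample sizes are ours, purely for illustration). It
draws training data and queries from the same distribution, and contrasts that with queries
from a shifted distribution, where a prediction computed from the training data degrades.

import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the unknown data-generating process: the learner only ever
# sees samples, never the distribution itself.
train = rng.normal(loc=2.0, scale=1.0, size=500)

# A very simple "learned" predictor: always predict the training mean.
prediction = train.mean()

# Queries drawn from the SAME distribution (the IID assumption holds) ...
iid_queries = rng.normal(loc=2.0, scale=1.0, size=500)
# ... versus queries from a shifted distribution (the assumption is violated).
shifted_queries = rng.normal(loc=5.0, scale=1.0, size=500)

print("error on IID queries:    ", np.mean((iid_queries - prediction) ** 2))
print("error on shifted queries:", np.mean((shifted_queries - prediction) ** 2))

The exact numbers do not matter; the point is that the quality of a prediction rests on the
assumption that queries look like the training data.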
In general, we need to solve these two problems:
• estimation: When we have data that are noisy reflections of some underlying quan-
tity of interest, we have to aggregate the data and make estimates or predictions
about the quantity. How do we deal with the fact that, for example, the same treat-
ment may end up with different results on different trials? How can we predict how
well an estimate may compare to future results? (A small numeric sketch of this
appears after this list.)
• generalization: How can we predict results of a situation or experiment that we have
never encountered before in our data set?
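As promised above, here is a minimal sketch of the estimation problem in Python (using
numpy; the simulated treatment outcomes are invented for illustration): repeated noisy
trials are aggregated into a single estimate, and the standard error gives a rough sense of
how far that estimate is likely to be from the results of future trials.

import numpy as np

rng = np.random.default_rng(1)

# Repeated trials of the "same treatment": each trial gives a different
# noisy outcome, even though the underlying quantity of interest is fixed.
true_effect = 0.7                      # unknown to us in practice
trials = true_effect + rng.normal(scale=0.5, size=30)

# Aggregate the noisy data into a single estimate.
estimate = trials.mean()

# Quantify uncertainty: the standard error of the mean tells us roughly
# how far the estimate is likely to be from the underlying quantity,
# and hence from the average of future trials.
std_error = trials.std(ddof=1) / np.sqrt(len(trials))

print(f"estimate of the effect: {estimate:.3f}")
print(f"standard error:         {std_error:.3f}")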
We can describe problems and their solutions using six characteristics, three of which
characterize the problem and three of which characterize the solution:
1. Problem class: What is the nature of the training data and what kinds of queries will
be made at testing time?
2. Assumptions: What do we know about the source of the data or the form of the
solution?
3. Evaluation criteria: What is the goal of the prediction or estimation system? How
will the answers to individual queries be evaluated? How will the overall perfor-
mance of the system be measured?
4. Model type: Will an intermediate model be made? What aspects of the data will be
modeled? How will the model be used to make predictions?
5. Model class: What particular parametric class of models will be used? What criterion
will we use to pick a particular model from the model class?
6. Algorithm: What computational process will be used to fit the model to the data
and/or to make predictions?
Without making some assumptions about the nature of the process generating the data, we
cannot perform generalization. In the following sections, we elaborate on these ideas.
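To make the six characteristics concrete, the sketch below instantiates them for one small
supervised regression problem (in Python with numpy; the synthetic data, the linear model
class, and the least-squares algorithm are our illustrative choices, not requirements).

import numpy as np

rng = np.random.default_rng(2)

# 1. Problem class: supervised regression. Training pairs (x, y); at test
#    time we are queried with a new x and must predict its y.
X = rng.uniform(-1, 1, size=(50, 1))
y = 3.0 * X[:, 0] + 0.5 + rng.normal(scale=0.1, size=50)

# 2. Assumptions: y is roughly a linear function of x plus independent noise,
#    and queries come from the same distribution as the training data.
# 3. Evaluation criterion: mean squared error of the predictions.
def mse(pred, target):
    return np.mean((pred - target) ** 2)

# 4. Model type: we fit an intermediate model (a predictor) and then use it
#    to answer queries, rather than answering directly from the raw data.
# 5. Model class: linear functions y = theta * x + theta0, with the particular
#    model chosen by minimizing squared error on the training data.
# 6. Algorithm: ordinary least squares via numpy's linear-algebra routines.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])   # append a constant feature
theta, *_ = np.linalg.lstsq(X_aug, y, rcond=None)

train_predictions = X_aug @ theta
print("fitted parameters:", theta)
print("training MSE:     ", mse(train_predictions, y))

Any of these choices could be made differently (a different model class, a different
evaluation criterion, a different fitting algorithm) while leaving the problem itself the same.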
(Don’t feel you have to memorize all these kinds of learning, etc. We just want you to
have a very high-level view of part of the breadth of the field.)