Abstractly, vectors are objects that can be added together (to form new vectors) and that can be multiplied by scalars (i.e., numbers), also to form new vectors. Concretely (for us), vectors are points in some finite-dimensional space. Although you might not think of your data as vectors, they are a good way to represent numeric data.
For example, if you have the heights, weights, and ages of a large number of people, you can treat your data as three-dimensional vectors (height, weight, age). If you’re teaching a class with four exams, you can treat student grades as four-dimensional vectors (exam1, exam2, exam3, exam4). The simplest from-scratch approach is to represent vectors as lists of numbers. A list of three numbers corresponds to a vector in three-dimensional space, and vice versa:
height_weight_age = [70,   # inches
                     170,  # pounds
                     40]   # years

grades = [95,  # exam1
          80,  # exam2
          75,  # exam3
          62]  # exam4
One problem with this approach is that we will want to perform arithmetic on vectors. Because Python lists aren’t vectors (and hence provide no facilities for vector arithmetic), we’ll need to build these arithmetic tools ourselves. So let’s start with that. To begin with, we’ll frequently need to add two vectors. Vectors add componentwise. This means that if two vectors v and w are the same length, their sum is just the vector whose first element is v[0] + w[0], whose second element is v[1] + w[1], and so on. (If they’re not the same length, then we’re not allowed to add them.)
For example, adding the vectors [1, 2] and [2, 1] results in [1 + 2, 2 + 1] or [3, 3]. This addition is shown below:
We can easily implement this by zip-ing the vectors together and using a list comprehension to add the corresponding elements:
def vector_add(v, w):
    """adds corresponding elements"""
    return [v_i + w_i
            for v_i, w_i in zip(v, w)]
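One caveat worth seeing in code: zip stops at the end of the shorter input, so vector_add as written quietly truncates mismatched vectors instead of refusing to add them. The sketch below shows that behavior and one possible guard. The name vector_add_checked is a hypothetical helper introduced here for illustration, not something the rest of this post relies on.

```python
def vector_add(v, w):
    """adds corresponding elements (zip truncates to the shorter input)"""
    return [v_i + w_i for v_i, w_i in zip(v, w)]


def vector_add_checked(v, w):
    """Hypothetical variant: adds corresponding elements,
    but raises if the vectors have different lengths."""
    if len(v) != len(w):
        raise ValueError("vectors must be the same length")
    return [v_i + w_i for v_i, w_i in zip(v, w)]


print(vector_add([1, 2], [2, 1]))           # [3, 3]
print(vector_add([1, 2, 3], [10, 20]))      # [11, 22], the 3 is silently dropped
print(vector_add_checked([1, 2], [2, 1]))   # [3, 3]
```

Whether the extra check is worth it depends on how much you trust the inputs; for small from-scratch code it is probably a cheap safeguard.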
Similarly, to subtract two vectors we just subtract corresponding elements:
def vector_subtract(v, w):
    """subtracts corresponding elements"""
    return [v_i - w_i
            for v_i, w_i in zip(v, w)]
We’ll also sometimes want to componentwise sum a list of vectors. That is, create a new vector whose first element is the sum of all the first elements, whose second element is the sum of all the second elements, and so on. The easiest way to do this is by adding one vector at a time:
def vector_sum(vectors):
    """sums all corresponding elements"""
    result = vectors[0]                      # start with the first vector
    for vector in vectors[1:]:               # then loop over the others
        result = vector_add(result, vector)  # and add them to the result
    return result
If you think about it, we are just reduce-ing the list of vectors using vector_add, which means we can rewrite this more briefly using higher-order functions:
from functools import reduce  # in Python 3, reduce lives in functools

def vector_sum(vectors):
    return reduce(vector_add, vectors)
or even:
from functools import partial, reduce
vector_sum = partial(reduce, vector_add)
although this last one is probably more clever than helpful.
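To check that the reduce version behaves like the loop, here is a small self-contained sketch. It assumes Python 3, where reduce has to be imported from functools, and the sample vectors are made up for illustration.

```python
from functools import reduce


def vector_add(v, w):
    """adds corresponding elements"""
    return [v_i + w_i for v_i, w_i in zip(v, w)]


def vector_sum(vectors):
    """sums all corresponding elements by folding vector_add over the list"""
    return reduce(vector_add, vectors)


print(vector_sum([[1, 2], [3, 4], [5, 6]]))  # [9, 12]
```

Like the loop version, which fails on vectors[0], this one raises an error on an empty list, because reduce has no initial value to fall back on.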
We’ll also need to be able to multiply a vector by a scalar, which we do simply by multiplying each element of the vector by that number:
def scalar_multiply(c, v):
    """c is a number, v is a vector"""
    return [c * v_i for v_i in v]
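As a quick sanity check, scalar multiplication ties the earlier pieces together: subtracting w is the same as adding -1 times w. The sketch below repeats the definitions so it runs on its own; the sample values are arbitrary.

```python
def vector_add(v, w):
    return [v_i + w_i for v_i, w_i in zip(v, w)]


def vector_subtract(v, w):
    return [v_i - w_i for v_i, w_i in zip(v, w)]


def scalar_multiply(c, v):
    """c is a number, v is a vector"""
    return [c * v_i for v_i in v]


v, w = [5, 7], [2, 3]
print(scalar_multiply(2, v))                   # [10, 14]
print(vector_add(v, scalar_multiply(-1, w)))   # [3, 4]
print(vector_subtract(v, w))                   # [3, 4]
```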
This allows us to compute the componentwise means of a list of (same-sized) vectors:
def vector_mean(vectors):
    """compute the vector whose ith element is the mean of the
    ith elements of the input vectors"""
    n = len(vectors)
    return scalar_multiply(1 / n, vector_sum(vectors))
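Here is a runnable sketch of vector_mean with its helpers, on two made-up vectors. It assumes Python 3, where 1 / n is true division; in Python 2 the same expression would be integer division and would need something like 1.0 / n instead.

```python
from functools import reduce


def vector_add(v, w):
    return [v_i + w_i for v_i, w_i in zip(v, w)]


def vector_sum(vectors):
    return reduce(vector_add, vectors)


def scalar_multiply(c, v):
    return [c * v_i for v_i in v]


def vector_mean(vectors):
    """componentwise mean of a list of same-sized vectors"""
    n = len(vectors)
    return scalar_multiply(1 / n, vector_sum(vectors))


print(vector_mean([[1, 2], [3, 4]]))  # [2.0, 3.0]
```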
A less obvious tool is the dot product. The dot product of two vectors is the sum of their componentwise products:
def dot(v, w):
    """v_1 * w_1 + ... + v_n * w_n"""
    return sum(v_i * w_i
               for v_i, w_i in zip(v, w))
The dot product measures how far the vector v extends in the w direction. For example, if w = [1, 0] then dot(v, w) is just the first component of v. Another way of saying this is that, when w has magnitude 1, it’s the length of the vector you’d get if you projected v onto w, as shown in the figure below.
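A small numeric sketch may make the projection reading concrete. With a unit-length w the dot product is the projection length directly; for any other nonzero w you would divide by the length of w. The vectors here are arbitrary examples.

```python
import math


def dot(v, w):
    """v_1 * w_1 + ... + v_n * w_n"""
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


v = [3, 4]
print(dot(v, [1, 0]))   # 3, the first component of v
print(dot(v, [0, 1]))   # 4, the second component of v

# w = [2, 0] points the same way as [1, 0] but has length 2,
# so the raw dot product is scaled up by that length.
w = [2, 0]
print(dot(v, w))                            # 6
print(dot(v, w) / math.sqrt(dot(w, w)))     # 3.0, the projection length
```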
Using this, it’s easy to compute a vector’s sum of squares:
def sum_of_squares(v):
    """v_1 * v_1 + ... + v_n * v_n"""
    return dot(v, v)
Which we can use to compute its magnitude (or length):
import math

def magnitude(v):
    return math.sqrt(sum_of_squares(v))  # math.sqrt is the square root function
We now have all the pieces we need to compute the distance between two vectors, defined as:
def squared_distance(v, w):
    """(v_1 - w_1) ** 2 + ... + (v_n - w_n) ** 2"""
    return sum_of_squares(vector_subtract(v, w))

def distance(v, w):
    return math.sqrt(squared_distance(v, w))
Which is possibly clearer if we write it as (the equivalent):
def distance(v, w):
    return magnitude(vector_subtract(v, w))
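Putting the pieces together, the sketch below runs the whole chain on a 3-4-5 right triangle, where the answer is easy to verify by hand. The definitions are repeated so the block stands on its own.

```python
import math


def vector_subtract(v, w):
    return [v_i - w_i for v_i, w_i in zip(v, w)]


def dot(v, w):
    return sum(v_i * w_i for v_i, w_i in zip(v, w))


def sum_of_squares(v):
    return dot(v, v)


def magnitude(v):
    return math.sqrt(sum_of_squares(v))


def distance(v, w):
    return magnitude(vector_subtract(v, w))


print(sum_of_squares([3, 4]))      # 25
print(magnitude([3, 4]))           # 5.0
print(distance([1, 1], [4, 5]))    # 5.0, since the difference is [-3, -4]
```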