Thursday, May 10, 2012

Basic linear algebra with Numpy on Python 3.2

I'm taking Stanford's online course on machine learning through Coursera. The second week has a good overview of linear algebra and matrix operations. The instructor has provided a useful slide deck (as a PDF) in which he explains the basics. I'm going to go through those slides and implement the linear algebra using NumPy. I have NumPy 1.6 installed with Python 3.2.
(Embedded slides: Machine Learning, Linear Algebra Review, Lecture 3)
First, open an interactive Python shell and load NumPy:
>>> from numpy import *

Create a Matrix

Let's try creating the matrix he shows in slides 2 and 3. He shows a 4x2 matrix; here I'll use three of its rows, giving a 3x2 matrix. The basic form for creating arrays is to call the array() function:
a = array()
Inside the parentheses, nest some square brackets, and in those brackets just put comma-separated lists of elements in more square brackets. Each inner pair of square brackets represents a row. Make sure the rows all have the same number of columns. So to create the 3x2 matrix, use the following:
>>> a = array([[1402,191],[1371,821],[949,1437]])
>>> print(a)
[[1402  191]
 [1371  821]
 [ 949 1437]] 
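A quick way to confirm the dimensions (not something the slides cover, but handy) is the array's shape attribute:
>>> a.shape
(3, 2)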

Matrix Elements

Next he talks about accessing individual matrix elements, so A11 (row 1, column 1) should give us 1402.
>>> a[1,1]
821
Instead of returning the element in the first row and first column, it gave us the element in the second row and second column. This is because NumPy uses 0-indexing (counting from 0) rather than 1-indexing (counting from 1). So to get the element in the first row and first column, we index from 0:
>>> a[0,0]
1402
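In general, the 1-indexed math notation Aij maps to a[i-1, j-1] in NumPy. For example, A32 (row 3, column 2) is:
>>> a[2,1]
1437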

Matrix Addition

Next let's create two 3x2 matrices and add them together. First, instantiate the matrices:
>>> b = array([[1,0],[2,5],[3,1]])
>>> c = array([[4,0.5],[2,5],[0,1]])
>>> print(b)
[[1 0]
 [2 5]
 [3 1]]
>>> print(c)
[[ 4.   0.5]
 [ 2.   5. ]
 [ 0.   1. ]]
Add b and c and see what happens:
>>> b+c
array([[  5. ,   0.5],
       [  4. ,  10. ],
       [  3. ,   2. ]])
As you can see, NumPy correctly performed an element-wise addition.
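Subtraction is also element-wise (this example is mine, not from the slides); the output should look roughly like this:
>>> b - c
array([[-3. , -0.5],
       [ 0. ,  0. ],
       [ 3. ,  0. ]])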

Scalar Multiplication

Next, multiply a scalar by a 3x2 matrix. We'll use matrix 'b' from above:

>>> 3 * b
array([[ 3,  0],
       [ 6, 15],
       [ 9,  3]], dtype=int32)
Again, NumPy correctly multiplied each element of the matrix by 3. Note that this can be done in any order (i.e. scalar * matrix = matrix * scalar).
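To check that, reverse the order (my own quick check, not from the slides); the output is identical:
>>> b * 3
array([[ 3,  0],
       [ 6, 15],
       [ 9,  3]], dtype=int32)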
Division is just multiplication by a fraction. First define a 2x2 matrix 'd' to divide:
>>> d = array([[4,0],[6,3]])
>>> d / 4
array([[ 1.  ,  0.  ],
       [ 1.5 ,  0.75]])
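Equivalently, since dividing by 4 is the same as multiplying by 0.25, this gives the same result (my own check):
>>> d * 0.25
array([[ 1.  ,  0.  ],
       [ 1.5 ,  0.75]])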

Combination of Operands

Order of operations is important. In this slide the instructor sets up three 3x1 vectors and works an example that multiplies by a scalar, adds, subtracts, and divides. Let NumPy do it and see what happens:
>>> e = array([[1],[4],[2]])
>>> f = array([[0],[0],[5]])
>>> g = array([[3],[0],[2]])
>>> 3 * e + f - g / 3
array([[  2.        ],
       [ 12.        ],
       [ 10.33333333]])
As before, NumPy produces the same answer as the instructor found by doing it by hand.
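NumPy is just following Python's normal operator precedence here: the scalar multiplication and the division happen before the addition and subtraction. Writing the parentheses explicitly (my own restatement, not from the slides) gives the same array:
>>> (3 * e) + f - (g / 3)
array([[  2.        ],
       [ 12.        ],
       [ 10.33333333]])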

Matrix-Vector Multiplication

We can multiply a matrix by a vector as long as the number of columns of the matrix is the same as the number of rows of the vector. In other words, the matrix must be as wide as the vector is long.
>>> h = array([[1,3],[4,0],[2,1]]) # 3x2
>>> i = array([[1],[5]]) # 2x1
>>> h * i
Traceback (most recent call last):
  File "<pyshell#54>", line 1, in <module>
    h * i
ValueError: operands could not be broadcast together with shapes (3,2) (2,1)
My multiplication didn't work, and the error helpfully gave me the shapes of the arrays involved. This is because the '*' operator performs element-wise multiplication, and for that to work the arrays need to have the same shape (hence the error message showing us that the shapes were different). What we want here is the dot product.
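As an aside (my own check, not from the slides), '*' is perfectly happy when the shapes do match; multiplying h by itself just multiplies matching elements:
>>> h * h
array([[ 1,  9],
       [16,  0],
       [ 4,  1]])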

The Dot Product

To multiply two matrices using the dot product, use the dot() function:
>>> h = array([[1,3],[4,0],[2,1]]) # 3x2
>>> i = array([[1],[5]]) # 2x1
>>> dot(h,i)
array([[16],
       [ 4],
       [ 7]])
That gives us the correct answer, according to the slides. A longer example:
>>> j = array([[1,2,1,5],[0,3,0,4],[-1,-2,0,0]])
>>> k = array([[1],[3],[2],[1]])
>>> dot(j,k)
array([[14],
       [13],
       [-7]])
This one checks out with the slides as well.
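In NumPy 1.6 the dot product should also be available as a method on the array itself, which reads a little more naturally (my addition, not from the slides):
>>> j.dot(k)
array([[14],
       [13],
       [-7]])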

Matrix-Matrix Multiplication

As far as NumPy is concerned, matrix-matrix multiplication is just like matrix-vector multiplication.
>>> l = array([[1,3,2],[4,0,1]])
>>> m = array([[1,3],[0,1],[5,2]])
>>> dot(l,m)
array([[11, 10],
       [ 9, 14]])
He goes on to give one more example with a pair of 2x2 matrices:
>>> n = array([[1,3],[2,5]]) # 2x2
>>> o = array([[0,1],[3,2]]) # 2x2
>>> dot(n,o)
array([[ 9,  7],
       [15, 12]])
Next he provides some context by using the example of home prices. He sets up a 4x2 and 2x3 matrix and multiplies them to quickly come up with price predictions:
>>> p = array([[1,2104],[1,1416],[1,1534],[1,852]]) # 4x2
>>> q = array([[-40,200,-150],[0.25,0.1,0.4]]) # 2x3
>>> dot(p,q)
array([[ 486. ,  410.4,  691.6],
       [ 314. ,  341.6,  416.4],
       [ 343.5,  353.4,  463.6],
       [ 173. ,  285.2,  190.8]])
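As a sanity check of my own, the top-left entry is just the first row of p dotted with the first column of q:
>>> 1 * -40 + 2104 * 0.25
486.0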

Matrix Multiplication Properties

Show that matrix multiplication is not commutative:
>>> A = array([[1,1],[0,0]]) # 2x2
>>> B = array([[0,0],[2,0]]) # 2x2
>>> dot(A,B)
array([[2, 0],
       [0, 0]])
>>> dot(B,A)
array([[0, 0],
       [2, 2]])
To test this another way, I compared the two products with the equality operator and got a neat element-wise comparison:
>>> dot(A,B) == dot(B,A)
array([[False,  True],
       [False, False]], dtype=bool)
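NumPy also has an array_equal() function that collapses that comparison into a single boolean, which makes the non-commutativity easy to assert (my own addition, not from the slides):
>>> array_equal(dot(A,B), dot(B,A))
False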
To show the associative property of matrix multiplication, create another 2x2 array C and multiply the three matrices in different groupings, but in the same order, to show that the result is the same:
>>> A = array([[1,1],[0,0]]) # 2x2, same as above
>>> B = array([[0,0],[2,0]]) # 2x2, same as above
>>> C = array([[1,3],[0,2]]) # 2x2
>>> dot(A,dot(B,C))
array([[2, 6],
       [0, 0]])
>>> dot(dot(A,B),C)
array([[2, 6],
       [0, 0]])
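And array_equal() confirms that the two groupings match:
>>> array_equal(dot(A,dot(B,C)), dot(dot(A,B),C))
True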

Identity Matrix

NumPy comes with a built-in function for producing an identity matrix. Just pass it the dimension (number of rows or columns) as the argument. Optionally, tell it to output the elements as integers in order to clean up the output:
>>> identity(3)
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
>>> identity(3, dtype=int)
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
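A related built-in (not used in the slides) is eye(), which also builds identity-like matrices and accepts a dtype; for a square matrix it gives the same output:
>>> eye(3, dtype=int)
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])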
Show that for any matrix A, AI = IA = A. We'll use a new 3x3 matrix A:
>>> A = array([[4,2,1],[4,8,3],[1,1,0]]) # 3x3
>>> I = identity(3, dtype=int)
>>> dot(A,I)
array([[4, 2, 1],
       [4, 8, 3],
       [1, 1, 0]])
>>> A
array([[4, 2, 1],
       [4, 8, 3],
       [1, 1, 0]])
>>> dot(A,I)==dot(I,A)
array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]], dtype=bool)
>>> dot(A,I) == A
array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]], dtype=bool)

Inverse and Transpose

Show that if A is an m x m matrix, and if it has an inverse, then A(A⁻¹) = (A⁻¹)A = I. To do this, we can use the same 3x3 matrix A from above:
>>> A = array([[4,2,1],[4,8,3],[1,1,0]]) # 3x3
>>> inv(A)
Traceback (most recent call last):
  File "<pyshell#112>", line 1, in <module>
    inv(A)
NameError: name 'inv' is not defined
We got this error because, although we loaded NumPy, the inv() function lives in NumPy's linear algebra module, which has to be imported separately:
>>> from numpy.linalg import *
Then we can try the inversion again:
>>> inv(A)
array([[ 0.3, -0.1,  0.2],
       [-0.3,  0.1,  0.8],
       [ 0.4,  0.2, -2.4]])
Now show that A(A⁻¹) = (A⁻¹)A = I:
>>> dot(A, inv(A))
array([[  1.00000000e+00,  -2.77555756e-17,   0.00000000e+00],
       [ -2.22044605e-16,   1.00000000e+00,   0.00000000e+00],
       [ -5.55111512e-17,  -1.38777878e-17,   1.00000000e+00]])
>>> dot(inv(A), A)
array([[  1.00000000e+00,   0.00000000e+00,  -5.55111512e-17],
       [ -2.22044605e-16,   1.00000000e+00,  -5.55111512e-17],
       [  0.00000000e+00,   0.00000000e+00,   1.00000000e+00]])
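A handy way (my own addition, not from the slides) to check this programmatically is NumPy's allclose(), which compares arrays element-wise within a small tolerance:
>>> allclose(dot(A, inv(A)), identity(3))
True
>>> allclose(dot(inv(A), A), identity(3))
True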
Note that both products were meant to be the identity matrix; because these are floating-point operations, the off-diagonal entries come back not as exact zeros but as numbers very close to zero. Note also that a matrix of all zeros has no inverse:
>>> C = array([[0,0],[0,0]])
>>> C
array([[0, 0],
       [0, 0]])
>>> inv(C)
Traceback (most recent call last):
  File "<pyshell#121>", line 1, in <module>
    inv(C)
  File "C:\Python32\lib\site-packages\numpy\linalg\linalg.py", line 445, in inv
    return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
  File "C:\Python32\lib\site-packages\numpy\linalg\linalg.py", line 328, in solve
    raise LinAlgError('Singular matrix')
numpy.linalg.linalg.LinAlgError: Singular matrix
The error message agrees with the machine learning instructor, who calls these special matrices "singular" or "degenerate."

Matrix Transpose

Lastly, show some matrix transposition, whereby the rows of the original matrix become the columns of the transpose:

>>> A = array([[1,2,0],[3,5,9]])
>>> A.transpose()
array([[1, 3],
       [2, 5],
       [0, 9]])
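Arrays also expose the transpose as the T attribute, which is a convenient shorthand:
>>> A.T
array([[1, 3],
       [2, 5],
       [0, 9]])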
If we name the transposed array 'B', then we can show that the ijth element of A is the jith element of B:
>>> B = A.transpose()
>>> A[0,2] == B[2,0]
True
>>> A[1,2] == B[2,1]
True 
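The shapes swap as well, from 2x3 to 3x2 (my own check):
>>> A.shape
(2, 3)
>>> B.shape
(3, 2)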

Conclusion

We used NumPy and its linalg module to go through the linear algebra review slides from Coursera's Machine Learning course.




