What is NumPy?

NumPy is a Python library for numerical computing. Its core data structure is the n-dimensional array.

Why use NumPy?

In plain Python we would represent arrays with lists, which are slow to process element by element. NumPy is implemented largely in C and stores each array in a contiguous block of memory, so it can access and manipulate arrays very efficiently.
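
As a rough illustration of the difference, the sketch below computes the same sum of squares twice: once with a Python loop over a list and once with a single vectorized expression. The names values, loop_total and vectorized_total are made up for this example, and no timings are shown because they depend on the machine.

```python
import numpy as np

values = list(range(1000))

# Pure Python: the interpreter handles one element at a time.
loop_total = 0
for v in values:
    loop_total += v * v

# NumPy: the multiplication and the sum each run in compiled code
# over the whole array at once.
arr = np.array(values, dtype=np.int64)
vectorized_total = int(np.sum(arr * arr))

print(loop_total == vectorized_total)
```

Both approaches give the same number; the vectorized form simply avoids the per-element interpreter overhead.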

Import numpy

NumPy is usually imported under the alias np.

In [1]:
import numpy as np
print( 'Version:', np.__version__ )
Version: 1.18.1

Create arrays

NumPy arrays are of type ndarray. To create an array, pass a list, a tuple, or any array-like object (including nested sequences) to the np.array function.

In [2]:
a = np.array( [1, 2, 3] )  # Create array
print( 'type', type(a) )
print( 'shape', a.shape )
print( 'elements:', a[0], a[1], a[2] )
a[0] = 5  # Change an element of the array
print( a )
type <class 'numpy.ndarray'>
shape (3,)
elements: 1 2 3
[5 2 3]
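
To make the "array-like" part concrete, here is a small sketch showing that a list, a tuple, and a nested sequence are all accepted by np.array. The variable names are illustrative only.

```python
import numpy as np

from_list = np.array([1, 2, 3])
from_tuple = np.array((1, 2, 3))
from_nested = np.array([(1, 2), (3, 4)])  # a list of tuples gives a 2-D array

print(np.array_equal(from_list, from_tuple))
print(from_nested.shape)
```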

0-D

0-D arrays are scalar values

In [3]:
a = np.array( 27 )
In [4]:
a.shape
Out[4]:
()
In [5]:
a
Out[5]:
array(27)
In [6]:
type( a.item() )
Out[6]:
int
In [7]:
a
Out[7]:
array(27)

1-D

List that consists of 0-D arrays

In [8]:
a = np.array( [40,50,60] )
In [9]:
a.shape
Out[9]:
(3,)
In [10]:
a
Out[10]:
array([40, 50, 60])
In [11]:
print( a )
[40 50 60]

2-D

List that consists of 1-D arrays

In [12]:
b = np.array( [ 
                [4,5,6],
                [9,8,7]
    
            ])
In [13]:
print( 'Shape', b.shape )
Shape (2, 3)
In [14]:
b
Out[14]:
array([[4, 5, 6],
       [9, 8, 7]])
In [15]:
print( b )
[[4 5 6]
 [9 8 7]]

3-D

A list that consists of 2-D arrays. In general, an N-D array is a list that consists of (N-1)-D arrays.

In [16]:
c = np.array( [ 
    
                [
                    
                    [4,5,6],
                [9,8,7]
                
                ],
    
                [[44,55,66],
                [99,88,77]],
    
            ])
print( c )
print( 'Shape', c.shape )
[[[ 4  5  6]
  [ 9  8  7]]

 [[44 55 66]
  [99 88 77]]]
Shape (2, 2, 3)

Question: Find the position of number 8 in the 3d array

In [17]:
#c[  ][ ][  ]

Standard arrays

Numpy also provides many functions to create arrays:

All of zeros

In [18]:
a = np.zeros( (5,5,10) )
a.shape
Out[18]:
(5, 5, 10)

All of ones

In [19]:
b = np.ones( (10,10) )
b
Out[19]:
array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

Identity matrix

In [20]:
c = np.eye(7)
c
Out[20]:
array([[1., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 1.]])
In [21]:
d = np.eye(7,10)
d
Out[21]:
array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]])

Matrix with all its cells set to a value

In [22]:
e = np.full( (6,8), 4 )
e
Out[22]:
array([[4, 4, 4, 4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4, 4, 4, 4],
       [4, 4, 4, 4, 4, 4, 4, 4]])

Index/Access

To iterate through an array use the shape attribute to get its dimensions.

In [23]:
a = np.array( [ [43,42], [10,5], [27,8], [85,52] ] )

print( 'Shape', a.shape )

print('Values')
for i in range( a.shape[0] ):
    for j in range( a.shape[1] ):
        print( a[ i,j ], end=' ' )
    print()
Shape (4, 2)
Values
43 42 
10 5 
27 8 
85 52 
In [24]:
a
Out[24]:
array([[43, 42],
       [10,  5],
       [27,  8],
       [85, 52]])
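
Index-based loops are not the only option. The sketch below, which reuses the same 4x2 array, shows two alternatives that may read more naturally: iterating over the rows directly, and np.ndenumerate, which yields an index tuple together with each value. The names row_sums and cells are illustrative.

```python
import numpy as np

a = np.array([[43, 42], [10, 5], [27, 8], [85, 52]])

# Iterating over a 2-D array yields its rows as 1-D arrays.
row_sums = [int(row.sum()) for row in a]

# np.ndenumerate yields ((i, j), value) pairs for every cell.
cells = [(idx, int(value)) for idx, value in np.ndenumerate(a)]

print(row_sums)
print(cells[:3])
```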

To refer to a cell of matrix a, use either a[idx1][idx2]...[idxn], with one index per dimension, or a[idx1,idx2,...,idxn].

In [25]:
a[1][1]
Out[25]:
5
In [26]:
a[1,1]
Out[26]:
5

Subarrays

The function np.arange(n) returns the integers in [0, n).

In [27]:
a = np.arange(100).reshape( (10,10) )
In [28]:
a
Out[28]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

To refer to subarrays use the notation below

array[ firstRow:lastRow, firstCol:lastCol, ... ]

where firstRow:lastRow and firstCol:lastCol are slices: the first index is included and the last index is excluded.

In [29]:
b = a[ 1:4, 1:5 ]
In [30]:
b
Out[30]:
array([[11, 12, 13, 14],
       [21, 22, 23, 24],
       [31, 32, 33, 34]])
In [31]:
c = a[ 1 ]
c
Out[31]:
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
In [32]:
a[1].shape
Out[32]:
(10,)
In [33]:
c.shape
Out[33]:
(10,)
In [34]:
d = a[1]
In [35]:
d
Out[35]:
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
In [36]:
d.shape
Out[36]:
(10,)
In [37]:
a[ 1:4, 1:5 ] = 50
In [38]:
a
Out[38]:
array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 50, 50, 50, 50, 15, 16, 17, 18, 19],
       [20, 50, 50, 50, 50, 25, 26, 27, 28, 29],
       [30, 50, 50, 50, 50, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])
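
One detail worth checking for yourself is that a slice does not copy data: it is a view onto the original array. The following sketch, with illustrative names block and independent, shows the excluded end index and the effect of writing through a view.

```python
import numpy as np

a = np.arange(100).reshape((10, 10))

block = a[1:4, 1:5]        # rows 1..3 and columns 1..4; the end index is excluded
print(block.shape)

block[0, 0] = -1           # a basic slice is a view, so this also changes a
print(a[1, 1])

independent = a[1:4, 1:5].copy()
independent[0, 0] = 0      # the copy owns its data; a is unaffected
print(a[1, 1])
```

If the subarray must not affect the original, call .copy() on the slice.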

Reshape

In [39]:
d = np.arange( 10 ).reshape(  (1,1,1,1,1,1,10)  )
print( d )
print( d.shape )
[[[[[[[0 1 2 3 4 5 6 7 8 9]]]]]]]
(1, 1, 1, 1, 1, 1, 10)
In [40]:
e = d.reshape( (2,5) )
In [41]:
e
Out[41]:
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
In [42]:
e.shape
Out[42]:
(2, 5)
In [43]:
d = np.arange(10).reshape( (5,2) )
print( d.shape )
print( d.T.shape )
np.transpose(d).T
(5, 2)
(2, 5)
Out[43]:
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
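
A reshape only works when the total number of elements stays the same. As a small sketch, one dimension can be given as -1 and NumPy will infer it from the others; the array of 12 elements here is chosen only for illustration.

```python
import numpy as np

d = np.arange(12)

two_rows = d.reshape((2, -1))     # -1 is inferred as 6
three_cols = d.reshape((-1, 3))   # -1 is inferred as 4
flat = two_rows.reshape(-1)       # back to 1-D

print(two_rows.shape, three_cols.shape, flat.shape)
```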

Exercise: Make use of the function np.transpose or the T attribute of the ndarray object.

$\mathrm{transpose}(A) = A^{T}$

Write the corresponding code below each comment

In [44]:
a = np.array( [  [2,4,5,2,1], [45,2,45,1,76] ] )
In [45]:
print(a)
a.shape
[[ 2  4  5  2  1]
 [45  2 45  1 76]]
Out[45]:
(2, 5)
In [46]:
print( np.transpose(a), np.transpose(a).shape )
[[ 2 45]
 [ 4  2]
 [ 5 45]
 [ 2  1]
 [ 1 76]] (5, 2)
In [47]:
print( a.T, a.T.shape )
[[ 2 45]
 [ 4  2]
 [ 5 45]
 [ 2  1]
 [ 1 76]] (5, 2)
In [48]:
# Create an array of size 2x5

# Print the array and its shape

# Use np.transpose( array ) or the array.T attribute to calculate the transpose of the array

# Print the transposed array and its shape
In [ ]:
 

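A note on 1-D arrays

Notice that 1-D arrays have a shape of (N,) instead of (N,1). To get a row of shape (1,N) that can be transposed into a column, wrap the values in an extra pair of brackets.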

In [49]:
a = np.array( [ [1,2,3] ] )
In [50]:
a
Out[50]:
array([[1, 2, 3]])
In [51]:
a.T
Out[51]:
array([[1],
       [2],
       [3]])
In [52]:
a.T.shape
Out[52]:
(3, 1)
In [53]:
a.T.T.shape
Out[53]:
(1, 3)
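
For contrast, the sketch below shows what happens with a true 1-D array: transposing it changes nothing, because there is only one axis. Reshaping with -1 is one way to get an explicit column or row. The names v, column and row are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])
print(v.shape, v.T.shape)          # both (3,)

column = v.reshape((-1, 1))
row = v.reshape((1, -1))
print(column.shape, row.shape)     # (3, 1) and (1, 3)
```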

Exercise: What do you notice about the result of the transpose? Is it correct? Justify your answer.

In [ ]:
 

Copy

In [54]:
a = np.array( [ [1,2,3,4,5], [1,2,3,4,5] ] )
b = a
In [55]:
a
Out[55]:
array([[1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5]])
In [56]:
b
Out[56]:
array([[1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5]])
In [57]:
a[0,1] = 7
In [58]:
a
Out[58]:
array([[1, 7, 3, 4, 5],
       [1, 2, 3, 4, 5]])
In [59]:
b
Out[59]:
array([[1, 7, 3, 4, 5],
       [1, 2, 3, 4, 5]])
In [60]:
b = a.copy()
In [61]:
a[0,1] = 10
In [62]:
a
Out[62]:
array([[ 1, 10,  3,  4,  5],
       [ 1,  2,  3,  4,  5]])
In [63]:
b
Out[63]:
array([[1, 7, 3, 4, 5],
       [1, 2, 3, 4, 5]])

For nested sequences/lists that contain arrays, use deepcopy from the copy module.

In [64]:
from copy import deepcopy
In [65]:
# For example, a nested list that mixes lists and arrays:
#     b = [ np.array(...), np.array(...), [ [ [ np.array(...) ] ] ], [] ]
#     a = deepcopy(b)
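
To see why deepcopy matters for nested containers, here is a minimal sketch comparing it with a shallow copy. The list nested and its contents are invented for the example.

```python
from copy import copy, deepcopy
import numpy as np

nested = [np.array([1, 2, 3]), [np.array([4, 5, 6])]]

shallow = copy(nested)      # new outer list, same array objects inside
deep = deepcopy(nested)     # every level is duplicated

nested[0][0] = 100
nested[1][0][0] = 400

print(shallow[0][0], shallow[1][0][0])
print(deep[0][0], deep[1][0][0])
```

The shallow copy still sees the changes because it shares the inner arrays, while the deep copy keeps the original values.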

Stack

In [66]:
a = np.array( [ [1,2,3,4,5], [1,2,3,4,5] ] )
In [67]:
a
Out[67]:
array([[1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5]])

Vertical

In [68]:
np.concatenate( [ a, a ], axis=0 )
Out[68]:
array([[1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5]])

Horizontal

In [69]:
np.concatenate( [a,a], axis=1 )
Out[69]:
array([[1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
       [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]])
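
NumPy also offers np.vstack and np.hstack as shorthands. The sketch below checks that, for 2-D arrays, they match np.concatenate along axis 0 (rows are added) and axis 1 (columns are added) respectively.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]])

rows_added = np.vstack([a, a])   # same as np.concatenate([a, a], axis=0)
cols_added = np.hstack([a, a])   # same as np.concatenate([a, a], axis=1)

print(rows_added.shape, cols_added.shape)
```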

Exercise:

$ B = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix} $

Create matrix $[ I_{3} B ]$

In [ ]:
 

Split

In [70]:
a = np.arange( 20 ) + 1
b = np.array_split( a, 5 )
In [71]:
b
Out[71]:
[array([1, 2, 3, 4]),
 array([5, 6, 7, 8]),
 array([ 9, 10, 11, 12]),
 array([13, 14, 15, 16]),
 array([17, 18, 19, 20])]
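
The example above divides evenly, but np.array_split also accepts sizes that do not. As a sketch with an illustrative 10-element array, the leading parts get one extra element, whereas np.split refuses an uneven division.

```python
import numpy as np

a = np.arange(10) + 1
parts = np.array_split(a, 3)   # 10 elements cannot be divided evenly by 3

print([len(p) for p in parts])

try:
    np.split(a, 3)             # np.split requires an equal division
except ValueError as err:
    print("np.split refused:", err)
```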

Random

np.random.random returns numbers $\in [0,1)$

In [72]:
b = np.random.random( (10,3) )
In [73]:
b
Out[73]:
array([[0.63809927, 0.91678934, 0.47744116],
       [0.31358575, 0.71933166, 0.81652894],
       [0.658512  , 0.22192909, 0.79475529],
       [0.78700837, 0.93264252, 0.03436544],
       [0.51651081, 0.45760812, 0.04164653],
       [0.37220615, 0.27750414, 0.38176527],
       [0.19814033, 0.60345403, 0.28816499],
       [0.24849064, 0.52231646, 0.61232522],
       [0.4454142 , 0.99145742, 0.95541816],
       [0.61896372, 0.02119171, 0.10469099]])
In [74]:
a = np.random.permutation( 10 )[0:3]
In [75]:
b[ a ]
Out[75]:
array([[0.19814033, 0.60345403, 0.28816499],
       [0.37220615, 0.27750414, 0.38176527],
       [0.658512  , 0.22192909, 0.79475529]])
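
Random output changes on every run, which makes results hard to reproduce. One option, sketched below, is a seeded generator from np.random.default_rng (available since NumPy 1.17); the seed 42 and the names rng_a and rng_b are arbitrary choices for this example.

```python
import numpy as np

rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

x = rng_a.random((10, 3))
y = rng_b.random((10, 3))
print(np.array_equal(x, y))      # same seed, same numbers

rows = rng_a.permutation(10)[:3] # three distinct row indices
print(x[rows].shape)
```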

Dtype

Every array has a single data type (dtype) shared by all its elements, e.g. np.float32, np.float64, np.int8, np.int16, np.int32, np.int64.

In [76]:
a = np.arange(10).reshape( (2,5) )
b = np.array( [ [2.5,3.2], [6.8,8.9] ] )
In [77]:
a.dtype
Out[77]:
dtype('int64')
In [78]:
b.dtype
Out[78]:
dtype('float64')
In [79]:
c = np.array( [ [2.5,3.2], [6.8,8.9] ], 
             dtype=np.float32 )
In [80]:
c
Out[80]:
array([[2.5, 3.2],
       [6.8, 8.9]], dtype=float32)
In [81]:
d = np.array( [ [3,5], [8,13] ], dtype=np.int8 )
In [82]:
d
Out[82]:
array([[ 3,  5],
       [ 8, 13]], dtype=int8)

To convert to another data type use the astype method.

In [83]:
f = d.astype( np.float64 )
In [84]:
f
Out[84]:
array([[ 3.,  5.],
       [ 8., 13.]])
In [85]:
f.dtype
Out[85]:
dtype('float64')
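
Converting in the other direction can lose information. The sketch below, using made-up values, shows that casting floats to integers drops the fractional part, and uses np.iinfo to look up the range a small integer type can hold.

```python
import numpy as np

f = np.array([2.7, -2.7, 0.5])
as_int = f.astype(np.int64)       # the fractional part is dropped (toward zero)
print(as_int)

print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)
```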

Operations between matrices

Addition

In [86]:
import numpy as np
a = np.arange(25).reshape( (5,5) )
In [87]:
a
Out[87]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

As you would expect, the addition is executed elementwise.

In [88]:
a + a
Out[88]:
array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28],
       [30, 32, 34, 36, 38],
       [40, 42, 44, 46, 48]])

Likewise for subtraction

In [89]:
a - a
Out[89]:
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

Multiplication

Elementwise multiplication

In [90]:
a * a
Out[90]:
array([[  0,   1,   4,   9,  16],
       [ 25,  36,  49,  64,  81],
       [100, 121, 144, 169, 196],
       [225, 256, 289, 324, 361],
       [400, 441, 484, 529, 576]])
In [91]:
np.multiply(a,a)
Out[91]:
array([[  0,   1,   4,   9,  16],
       [ 25,  36,  49,  64,  81],
       [100, 121, 144, 169, 196],
       [225, 256, 289, 324, 361],
       [400, 441, 484, 529, 576]])

For matrix multiplication use np.dot

$$A \times B = C$$
$$(N \times M) \times (M \times N) = (N \times N)$$

In [92]:
a
Out[92]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
In [93]:
np.dot(a,a)
Out[93]:
array([[ 150,  160,  170,  180,  190],
       [ 400,  435,  470,  505,  540],
       [ 650,  710,  770,  830,  890],
       [ 900,  985, 1070, 1155, 1240],
       [1150, 1260, 1370, 1480, 1590]])
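
The matrices do not have to be square, as long as the inner dimensions agree. The following sketch uses two small illustrative matrices A and B and also checks that, for 2-D arrays, the @ operator and np.matmul give the same result as np.dot.

```python
import numpy as np

A = np.arange(6).reshape((2, 3))
B = np.arange(12).reshape((3, 4))

C = np.dot(A, B)
print(C.shape)                      # (2, 3) times (3, 4) gives (2, 4)
print(np.array_equal(C, A @ B))
print(np.array_equal(C, np.matmul(A, B)))
```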

Matrix x Vector

$$A \times x = b$$
$$(N \times M) \times (M \times 1) = (N \times 1)$$

In [94]:
x = np.array( [0,5,10,15,20] ).reshape( (-1,1) )
In [95]:
x
Out[95]:
array([[ 0],
       [ 5],
       [10],
       [15],
       [20]])
In [96]:
np.dot(a, x )
Out[96]:
array([[ 150],
       [ 400],
       [ 650],
       [ 900],
       [1150]])

Vector x Vector

In [97]:
x
Out[97]:
array([[ 0],
       [ 5],
       [10],
       [15],
       [20]])

Dot/Inner/Scalar product

$x^{T} \times x$

In [98]:
np.dot(x.T,x)
Out[98]:
array([[750]])
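
The result above is a 1x1 matrix because x is stored as a column. As a sketch of the alternative, the same values kept in a 1-D array (called x_flat here, an illustrative name) produce a plain scalar.

```python
import numpy as np

x_flat = np.array([0, 5, 10, 15, 20])
inner = np.dot(x_flat, x_flat)
print(inner, np.ndim(inner))        # a scalar with zero dimensions

x_col = x_flat.reshape((-1, 1))
print(np.dot(x_col.T, x_col).shape) # a 1x1 matrix
```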

Elementwise

In [99]:
np.multiply(x,x)
Out[99]:
array([[  0],
       [ 25],
       [100],
       [225],
       [400]])
In [100]:
x * x 
Out[100]:
array([[  0],
       [ 25],
       [100],
       [225],
       [400]])

Broadcast

When array shapes do not match, operations are still possible under certain conditions. For example, when a row such as [4,8,12] is added to a matrix with three columns, the row is replicated to match the shape of the matrix it is being added to.

In [101]:
a = np.zeros( (5,5) )
In [102]:
a
Out[102]:
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
In [103]:
b = np.array( [1,2,3,4,5] ).reshape( (-1,1) )
b
Out[103]:
array([[1],
       [2],
       [3],
       [4],
       [5]])

broadcast per row

In [104]:
a + b.T
Out[104]:
array([[1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.],
       [1., 2., 3., 4., 5.]])

broadcast per column

In [105]:
a + b
Out[105]:
array([[1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5.]])
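
The "certain conditions" can be checked by experiment. In the sketch below (names col and row are illustrative), a dimension of size 1 stretches to match the other operand, while two different sizes greater than 1 raise an error.

```python
import numpy as np

a = np.zeros((5, 5))
col = np.array([1, 2, 3, 4, 5]).reshape((-1, 1))   # shape (5, 1)
row = col.T                                         # shape (1, 5)

print((a + row)[0])       # the row is repeated down the rows
print((a + col)[:, 0])    # the column is repeated across the columns
print((col + row).shape)  # (5, 1) with (1, 5) broadcasts to (5, 5)

try:
    a + np.arange(4)      # trailing dimensions 5 and 4 are incompatible
except ValueError as err:
    print("cannot broadcast:", err)
```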

Exercise: For each cell calculate its percentage per row

In [106]:
a = np.array( [ [1,1,2], [24,42,12], [10,20,30], [80,40,20] ] ).T
In [107]:
a
Out[107]:
array([[ 1, 24, 10, 80],
       [ 1, 42, 20, 40],
       [ 2, 12, 30, 20]])
In [108]:
#hint use np.sum( ..., axis=0 ) to get the sum per col
In [ ]:
 

Tile

Instead of broadcasting we could repeat the pattern explicitly using np.tile( array, reps ), where reps gives the number of repetitions along each axis.

In [109]:
a = np.array( [1,2,3] ).reshape( (-1,1) )
In [110]:
a
Out[110]:
array([[1],
       [2],
       [3]])
In [111]:
np.tile( a, (1,1) )
Out[111]:
array([[1],
       [2],
       [3]])
In [112]:
np.tile( a, (1,10) )
Out[112]:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]])
In [113]:
np.tile( a, (3,1) )
Out[113]:
array([[1],
       [2],
       [3],
       [1],
       [2],
       [3],
       [1],
       [2],
       [3]])
In [114]:
np.tile( a, (2,3) )
Out[114]:
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])
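
To connect tiling back to broadcasting, this sketch adds a column to a matrix in both ways and checks that the results agree. The names col, grid and tiled are illustrative.

```python
import numpy as np

col = np.array([1, 2, 3]).reshape((-1, 1))
grid = np.zeros((3, 4))

tiled = np.tile(col, (1, 4))        # explicit (3, 4) copy of the column
print(tiled.shape)
print(np.array_equal(grid + col, grid + tiled))
```

Broadcasting gives the same answer without materializing the repeated copies.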

Other

Determinant

In [115]:
a = np.array( [ [10,20], [30,40] ] )
b = np.linalg.det(a)
print(a)
[[10 20]
 [30 40]]
In [116]:
b
Out[116]:
-200.0000000000001
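
The result is not exactly -200 because the determinant is computed in floating point. A reasonable way to compare such values, sketched below, is np.isclose rather than ==; the hand-computed value uses the 2x2 formula ad - bc.

```python
import numpy as np

a = np.array([[10, 20], [30, 40]])
det = np.linalg.det(a)

exact = 10 * 40 - 20 * 30          # ad - bc for a 2x2 matrix
print(np.isclose(det, exact))
```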

Inversion

In [117]:
a = np.array( [ [10,20], [30,40] ] )
In [118]:
b = np.linalg.inv( a )
In [119]:
a
Out[119]:
array([[10, 20],
       [30, 40]])
In [120]:
b
Out[120]:
array([[-0.2 ,  0.1 ],
       [ 0.15, -0.05]])
In [121]:
b.shape
Out[121]:
(2, 2)
In [122]:
np.dot( a, b )
Out[122]:
array([[1.00000000e+00, 1.11022302e-16],
       [0.00000000e+00, 1.00000000e+00]])
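
The product is the identity matrix only up to rounding error, as the tiny 1.11e-16 entry shows. The sketch below checks this with np.allclose and also shows what happens for a matrix that has no inverse; the matrix singular is a made-up example whose second row is twice the first.

```python
import numpy as np

a = np.array([[10, 20], [30, 40]])
a_inv = np.linalg.inv(a)
print(np.allclose(np.dot(a, a_inv), np.eye(2)))

singular = np.array([[1, 2], [2, 4]])
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as err:
    print("not invertible:", err)
```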

Argmax/Argmin

In [123]:
a = np.eye(5)
a
Out[123]:
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])
In [124]:
a[ np.arange(5) ] = a[ np.random.permutation(5) ] 
In [125]:
a
Out[125]:
array([[0., 0., 1., 0., 0.],
       [1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.],
       [0., 1., 0., 0., 0.]])

This kind of encoding is known as a one-hot vector in machine learning. Instead of zeros and ones, a row could hold real values that represent probabilities: each entry is the probability that the row belongs to a specific category (e.g. 5 columns = 5 classes/categories). Taking the argmax per row then chooses the category with the maximum probability/value.

In [126]:
np.argmax( a, axis=1 )
Out[126]:
array([2, 0, 3, 4, 1])
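
The same call works when the rows hold probabilities instead of zeros and ones. In the sketch below the matrix probs is invented for illustration (3 samples, 3 classes); np.argmin is the counterpart that returns the position of the smallest value.

```python
import numpy as np

probs = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.2, 0.6],
])

predicted = np.argmax(probs, axis=1)     # most likely class per row
least_likely = np.argmin(probs, axis=1)  # on ties, the first position is returned
print(predicted)
print(least_likely)
```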

Where

Given a condition, np.where returns the indices of the elements that satisfy it, as one array per dimension.

In [127]:
a = np.arange( 15 ).reshape(3,5)
b = np.where( a <= 5 )
In [128]:
a
Out[128]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
In [129]:
print( 'X', b[0] )
print( 'Y', b[1] )
X [0 0 0 0 0 1]
Y [0 1 2 3 4 0]
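
The two index arrays can be paired up to get (row, column) coordinates. The sketch below does that with zip, and also shows the three-argument form of np.where, which selects between two values elementwise; the replacement value -1 is an arbitrary choice for the example.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)
rows, cols = np.where(a <= 5)

coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)

clipped = np.where(a <= 5, a, -1)   # keep a where the condition holds, else -1
print(clipped[1])
```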