# Python NumPy

If you learn Python-based AI software such as TensorFlow, you’ll inevitably meet the NumPy package, the DataFrame type (from pandas), and so on. So you need to learn how to use NumPy and DataFrame before you can dig further into the TensorFlow-related source code.

You can use numpy to create an ndarray object, which is a multi-dimensional array. Without numpy, you can also build a multi-dimensional array from nested Python lists, like [[1,2,3],[4,5,6],[7,8,9]]. But a numpy ndarray provides many more methods for handling multi-dimensional data. So you’d better convert your list to an ndarray object first, then call numpy functions on it:

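```python
import numpy

data = [1, 2, 3, 4, 5, 6]
x = numpy.array(data)   # convert the list to a one-dimensional ndarray

print(x)        # [1 2 3 4 5 6]
print(x.dtype)  # int32 (int64 on many platforms and NumPy versions)
print(x.ndim)   # 1
print(x.shape)  # (6,)
```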

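```python
import numpy

data = [[1, 2], [3, 4], [5, 6]]
x = numpy.array(data)   # convert the nested list to a two-dimensional ndarray

print(x)
# [[1 2]
#  [3 4]
#  [5 6]]
print(x.dtype)  # int32 (int64 on many platforms and NumPy versions)
print(x.ndim)   # 2
print(x.shape)  # (3, 2)
```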

The code above shows that an ndarray object can be created with the numpy.array() function. Its dtype attribute is the data type of the elements it contains, its ndim attribute is the number of dimensions of the array, and its shape attribute is a tuple holding the size of each dimension. The elements of a list can have different types, but all the elements of an ndarray object must have the same type.
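The single-type rule can be seen with a small sketch. The variable names here are only illustrative; the list mixes integers and a float, and numpy converts everything to one common dtype:

```python
import numpy

mixed_list = [1, 2.5, 3]        # a Python list may mix int and float
arr = numpy.array(mixed_list)   # numpy picks one dtype that fits all elements

print(arr.dtype)                # float64
print(arr.ndim, arr.shape)      # 1 (3,)
```

The integers 1 and 3 are stored as floats, so every element of arr has the same type.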

How do you access an element of an ndarray? You can use x[i,j] or x[i][j]; both return the same element. x[i1:i2,j1:j2] selects the rows from i1 up to but not including i2 and, within each of those rows, the columns from j1 up to but not including j2. The number of comma-separated indices must be less than or equal to the number of dimensions of the array. If you give fewer integer indices than the array has dimensions, the result is another ndarray. If you give as many integer indices as the array has dimensions, the result is a single element of the array's dtype. A slice does not remove a dimension, so x[i1:i2,j1:j2] on a two-dimensional array is again a two-dimensional ndarray.
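A minimal sketch of these indexing forms, using a small 3x3 array chosen only for illustration:

```python
import numpy

x = numpy.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

elem_a = x[1, 2]      # two integer indices on a 2-D array: a single element, 6
elem_b = x[1][2]      # same element, reached by indexing the row first
row = x[1]            # one index on a 2-D array: a 1-D ndarray, [4 5 6]
block = x[0:2, 1:3]   # rows 0 and 1, columns 1 and 2: [[2 3] [5 6]]

print(elem_a, elem_b)
print(row)
print(block)
```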

It’s interesting that you can also use a list or an ndarray as the index of an ndarray, which is called fancy indexing. Typically, the number of indexing arrays equals the number of dimensions of the ndarray being indexed, and the indexing arrays all have the same shape. The result is an ndarray with the same shape as the indexing arrays/lists, not the shape of the ndarray being indexed. For example, if x is a one-dimensional array, x[[1,3]] selects the second and fourth elements to form the new ndarray. If x is a two-dimensional array, x[[1,3],[0,1]] selects x[1,0] and x[3,1] to form a one-dimensional result. So x[[1,3],[0,1]] contains only two elements of the original array.

If the shapes of the indexing arrays do not match, broadcasting rules are applied to them, i.e., they are expanded to a common shape where possible. If there are fewer indexing arrays than dimensions, each selected item is a whole sub-array, so the result keeps the remaining dimensions of the ndarray being indexed. For example, if x is two-dimensional, x[[1,2]] is two-dimensional too and contains the second and third rows of x.
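The fancy indexing cases can be sketched as follows. The arrays v and x are made-up sample data, not taken from any particular program:

```python
import numpy

v = numpy.array([10, 20, 30, 40, 50])
picked = v[[1, 3]]                   # second and fourth elements: [20 40]

x = numpy.arange(12).reshape(4, 3)   # rows: [0 1 2], [3 4 5], [6 7 8], [9 10 11]
pairs = x[[1, 3], [0, 1]]            # x[1,0] and x[3,1]: [3 10]
rows = x[[1, 2]]                     # one index list on a 2-D array: rows 1 and 2
bcast = x[[1, 3], 0]                 # the scalar 0 is broadcast: x[1,0] and x[3,0]

print(picked)
print(pairs)
print(rows.shape)                    # (2, 3)
print(bcast)
```

Note that pairs and bcast are one-dimensional because both dimensions of x are indexed, while rows stays two-dimensional because only the first dimension is indexed.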

The indexing array can also be a Boolean array. In this case, the elements of the indexed ndarray at the positions where the Boolean array is True are selected and form a one-dimensional result. For this element-by-element selection, the Boolean indexing array has the same shape as the ndarray being indexed.
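A short sketch of Boolean indexing, where the mask is built by comparing the array with a scalar (the threshold 2 is arbitrary):

```python
import numpy

x = numpy.array([[1, 2],
                 [3, 4],
                 [5, 6]])

mask = x > 2          # Boolean array with the same shape as x
selected = x[mask]    # elements where mask is True, as a 1-D array

print(mask.shape)     # (3, 2)
print(selected)       # [3 4 5 6]
```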

When an ndarray is combined with a scalar in an arithmetic operation, the operation is applied to every element of the ndarray. When an ndarray is combined with another ndarray of the same shape, the operation is applied element by element, between the corresponding elements of the two ndarrays.
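Both cases can be sketched with two small arrays of the same shape (the values are only for illustration):

```python
import numpy

x = numpy.array([1, 2, 3])
y = numpy.array([10, 20, 30])

doubled = x * 2       # scalar applied to every element: [2 4 6]
summed = x + y        # corresponding elements added: [11 22 33]
product = x * y       # element-wise product, not a dot product: [10 40 90]

print(doubled)
print(summed)
print(product)
```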

numpy.where(cond,x,y) creates a new ndarray based on the Boolean array cond. In the simplest case, where cond, x and y all have the same shape, the new array has that shape too. Where an element of cond is True, the corresponding element of the result takes its value from x; otherwise, it takes its value from y.
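A minimal sketch of numpy.where with three arrays of the same shape. The sample values are made up so that it is easy to see which source each result element came from:

```python
import numpy

cond = numpy.array([True, False, True, False])
x = numpy.array([1, 2, 3, 4])
y = numpy.array([10, 20, 30, 40])

result = numpy.where(cond, x, y)   # take from x where cond is True, else from y

print(result)                      # [ 1 20  3 40]
```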