In the first part of the series you learned how to install numpy and you saw an example of a numpy array. Today we’ll be talking about numpy arrays in more detail. We’ll see how they can be created.
Creating numpy Arrays from Lists and Tuples
In the previous part we created a numpy array from a regular Python list. Here’s how we can do it again:
>>> import numpy as np
>>> nums = [5, 3, 8, 2, 0, 9, 4, 1, 6]
>>> N = np.array(nums)
>>> print(N)
[5 3 8 2 0 9 4 1 6]
We can convert a tuple to a numpy array in the same way:
>>> nums = (5, 3, 8, 2, 0, 9, 4, 1, 6)
>>> N = np.array(nums)
>>> print(N)
[5 3 8 2 0 9 4 1 6]
The arange Function
We can create numpy arrays in quite a few other ways as well. Let’s start with arrays containing evenly spaced values. We can use the arange or the linspace function to do that. Let’s have a look at the former first.
In the following examples I’ll assume that you have already imported the numpy module.
The arange function is very much like the built-in range function as far as syntax is concerned, but it is not identical: for example, it also accepts float arguments and an optional dtype. We can call it with 1, 2, 3 or 4 arguments. The syntax is:
arange([start,] stop[, step][,dtype])
So, stop is the only mandatory parameter. The stop value itself is normally not included in the array, although with a float step rounding errors can occasionally let it in. If stop is the only argument, start defaults to 0. The arange function returns a numpy array, which is of type ndarray to be exact:
>>> a = np.arange(10)
>>> print(a)
[0 1 2 3 4 5 6 7 8 9]
>>> type(a)
<class 'numpy.ndarray'>
>>> b = np.arange(5, 15)
>>> print(b)
[ 5 6 7 8 9 10 11 12 13 14]
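To make the "not identical" part a bit more concrete, here’s a small sketch comparing arange with the built-in range. The variable names are just for illustration:

```python
import numpy as np

r = range(5, 15)       # built-in: a lazy range object
a = np.arange(5, 15)   # numpy: an ndarray holding the same values

print(list(r) == a.tolist())
print(type(r).__name__, type(a).__name__)

# range only accepts integers, arange also accepts floats
try:
    range(2.5)
except TypeError as exc:
    print("range rejects floats:", exc)

print(np.arange(2.5))
```

So for integer arguments the two produce the same values, but only arange gives us an array and only arange works with floats.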
We don’t have to limit ourselves to using integers as the start and stop parameters. Have a look at an example with floats:
>>> c = np.arange(8.36)
>>> print(c)
[0. 1. 2. 3. 4. 5. 6. 7. 8.]
So, if we don’t pass the start argument, it defaults to 0, and the elements are now floats from 0 up to, but not exceeding, 8.36, with the default step of 1.
If we use both start and stop, we’ll get floats between the two and not exceeding stop, with the default step of 1:
>>> d = np.arange(2.3, 12.5)
>>> print(d)
[ 2.3 3.3 4.3 5.3 6.3 7.3 8.3 9.3 10.3 11.3 12.3]
The optional step parameter defaults to 1. We can change this value to any nonzero integer or float value:
>>> e = np.arange(6.72, 93.12, 10)
>>> print(e)
[ 6.72 16.72 26.72 36.72 46.72 56.72 66.72 76.72 86.72]
>>> f = np.arange(3.45, 3.47, 0.004)
>>> print(f)
[3.45 3.454 3.458 3.462 3.466 3.47 ]
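Notice that the last example actually includes the stop value 3.47. According to the NumPy documentation, the length of the result is computed as ceil((stop - start) / step), so with a float step tiny rounding errors can push the count up by one. The following sketch checks this for the values used above; the variable names are only for illustration. It also shows linspace (covered later in this article) as an alternative when you need an exact number of elements:

```python
import math
import numpy as np

start, stop, step = 3.45, 3.47, 0.004

f = np.arange(start, stop, step)

# The documented length formula for arange
predicted_len = math.ceil((stop - start) / step)
print(len(f), predicted_len)

# linspace takes the number of elements instead of the step,
# so the count does not depend on rounding
g = np.linspace(start, stop, 5, endpoint=False)
print(g)
```

If the exact number of elements matters, passing the count to linspace is usually the safer choice with float steps.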
We use the last parameter, dtype, to specify the type of the output array. If you omit it, the type is inferred from the input arguments. Here’s how we can use it:
>>> g = np.arange(2, 10, 2, float)
>>> print(g)
[2. 4. 6. 8.]
As you can see, here the array contains floats.
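If you want to check which type was inferred, you can look at the dtype attribute of the array. Here’s a minimal sketch reusing the calls from the examples above. The exact integer type depends on your platform, so it isn’t shown here:

```python
import numpy as np

a = np.arange(10)               # only integer arguments
c = np.arange(8.36)             # a float argument
g = np.arange(2, 10, 2, float)  # dtype passed explicitly

print(a.dtype, c.dtype, g.dtype)

# Check the kind of each dtype without relying on the platform
print(np.issubdtype(a.dtype, np.integer))
print(np.issubdtype(c.dtype, np.floating))
print(np.issubdtype(g.dtype, np.floating))
```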
The linspace Function
Now, the other function that returns an array with evenly spaced values is linspace.
The syntax of this function is:
linspace(start, stop, num = 50, endpoint = True, retstep = False, dtype = None)
The start and stop parameters are pretty straightforward. One difference is that by default the stop element will also be included in the array. This is because by default the endpoint parameter is True. If you set it to False, the stop element will not be included, just like in the arange function or the built-in range function.
Now, by default the function returns 50 evenly spaced elements between start and stop. If you need a different number of elements, you can set the num parameter to the number you want. One thing to bear in mind is that the step is calculated automatically, so depending on whether the stop element is included or not, the step will be different. Here are some examples to demonstrate what we’re talking about:
Here we have 50 elements between 1 and 2, both 1 and 2 included:
>>> print(np.linspace(1, 2))
[1. 1.02040816 1.04081633 1.06122449 1.08163265 1.10204082
1.12244898 1.14285714 1.16326531 1.18367347 1.20408163 1.2244898
1.24489796 1.26530612 1.28571429 1.30612245 1.32653061 1.34693878
1.36734694 1.3877551 1.40816327 1.42857143 1.44897959 1.46938776
1.48979592 1.51020408 1.53061224 1.55102041 1.57142857 1.59183673
1.6122449 1.63265306 1.65306122 1.67346939 1.69387755 1.71428571
1.73469388 1.75510204 1.7755102 1.79591837 1.81632653 1.83673469
1.85714286 1.87755102 1.89795918 1.91836735 1.93877551 1.95918367
1.97959184 2. ]
The same, but this time 2 is not included. Watch how the step is now different:
>>> print(np.linspace(1, 2, endpoint = False))
[1. 1.02 1.04 1.06 1.08 1.1 1.12 1.14 1.16 1.18 1.2 1.22 1.24 1.26
1.28 1.3 1.32 1.34 1.36 1.38 1.4 1.42 1.44 1.46 1.48 1.5 1.52 1.54
1.56 1.58 1.6 1.62 1.64 1.66 1.68 1.7 1.72 1.74 1.76 1.78 1.8 1.82
1.84 1.86 1.88 1.9 1.92 1.94 1.96 1.98]
And now 4 elements between 1 and 2:
>>> print(np.linspace(1, 2, 4))
[1. 1.33333333 1.66666667 2. ]
And if we exclude the stop value of 2, we get:
>>> print(np.linspace(1, 2, 4, endpoint = False))
[1. 1.25 1.5 1.75]
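The two steps above can be worked out by hand: with the stop value included, the interval is split into num - 1 gaps, and with the stop value excluded, into num gaps. Here’s a short sketch that compares these hand-computed steps with what linspace produces. The variable names are only for illustration:

```python
import numpy as np

start, stop, num = 1, 2, 4

# stop included: num - 1 gaps between num elements
step_with_end = (stop - start) / (num - 1)
# stop excluded: num gaps, the last one leading up to stop
step_without_end = (stop - start) / num

with_end = np.linspace(start, stop, num)
without_end = np.linspace(start, stop, num, endpoint=False)

# np.diff returns the differences between neighboring elements
print(step_with_end, np.diff(with_end))
print(step_without_end, np.diff(without_end))
```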
We can also explicitly set the value of the dtype parameter, just like with the arange function.
This is what we get if we leave dtype unset. Note that the result is a float array, even though start and stop are integers here:
>>> print(np.linspace(10, 100, 20))
[ 10. 14.73684211 19.47368421 24.21052632 28.94736842
33.68421053 38.42105263 43.15789474 47.89473684 52.63157895
57.36842105 62.10526316 66.84210526 71.57894737 76.31578947
81.05263158 85.78947368 90.52631579 95.26315789 100. ]
And now let’s set dtype to int:
>>> print(np.linspace(10, 100, 20, dtype = int))
[ 10 14 19 24 28 33 38 43 47 52 57 62 66 71 76 81 85 90
95 100]
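As the output above suggests, the fractional parts are simply dropped, not rounded: 14.73684211 became 14. If you’d rather round to the nearest integer, one possible approach (just a sketch, not the only way) is to create the float array first and then round it with np.rint before converting:

```python
import numpy as np

values = np.linspace(10, 100, 20)

# dtype=int drops the fractional part
truncated = np.linspace(10, 100, 20, dtype=int)

# round to the nearest integer first, then convert
rounded = np.rint(values).astype(int)

print(truncated[:5])
print(rounded[:5])
```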
There is one more parameter: retstep. If we set it to True, the function will return a 2-tuple containing the array and the step:
>>> print(np.linspace(3, 7, 20, retstep = True))
(array([3. , 3.21052632, 3.42105263, 3.63157895, 3.84210526,
4.05263158, 4.26315789, 4.47368421, 4.68421053, 4.89473684,
5.10526316, 5.31578947, 5.52631579, 5.73684211, 5.94736842,
6.15789474, 6.36842105, 6.57894737, 6.78947368, 7. ]), 0.21052631578947367)
Or we can use tuple unpacking to store the array and the step in separate variables:
>>> arr, step = np.linspace(8, 20, 10, retstep = True)
>>> print(arr)
[ 8. 9.33333333 10.66666667 12. 13.33333333 14.66666667
16. 17.33333333 18.66666667 20. ]
>>> print(step)
1.3333333333333333
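To convince ourselves that the returned step really is the distance between neighboring elements, we can compare it with the differences computed by np.diff. This is a minimal sketch reusing the call from the last example:

```python
import numpy as np

arr, step = np.linspace(8, 20, 10, retstep=True)

# Differences between consecutive elements
gaps = np.diff(arr)

# Compare with a tolerance, because these are floats
print(np.allclose(gaps, step))

# The step also matches (stop - start) / (num - 1)
print(np.isclose(step, (20 - 8) / (10 - 1)))
```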