Python Itertools

Oct 15th, 2021 - written by Kimserey.

Python's itertools module provides a set of iterator building blocks that can be combined to create new iterators, each applying some transformation while the underlying sequences are iterated. For example, cycle allows us to cycle through a sequence infinitely, while groupby provides a new iterator that yields a group on each iteration. In today’s post, we will look at some of the functions provided by itertools with examples. The examples assume that the functions used have been imported from itertools, for example with from itertools import count.

Infinite iterators

The first type of iterators we will be looking at are the infinite iterators.

count()

count provides a sequential infinite integer iterator:

In [5]: for i in count():
   ...:     if i > 5:
   ...:         break
   ...: 
   ...:     print(i)
0
1
2
3
4
5

We can see that count starts at 0 and that we can keep iterating indefinitely until we decide to break out of the loop.
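
As a small sketch of the same idea, count also accepts a start value and a step, and islice (another itertools function, not covered in this post) can bound the infinite iterator instead of an explicit break. The variable names are only illustrative.

```python
from itertools import count, islice

# count(start, step) begins at `start` and advances by `step` forever.
evens_from_ten = count(10, 2)

# islice takes a fixed number of items, so no explicit break is needed.
first_five = list(islice(evens_from_ten, 5))
print(first_five)  # [10, 12, 14, 16, 18]
```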

cycle()

cycle provides a way to infinitely cycle through an iterable.

In [6]: i = 0
   ...: 
   ...: for c in cycle(['A', 'B']):
   ...:     if i > 5:
   ...:         break
   ...: 
   ...:     print(c)
   ...:     i += 1
A
B
A
B
A
B

We can see that we cycle through A, B until we break out of the loop.
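
One way cycle tends to be useful is round-robin assignment. The sketch below pairs a finite list with a cycled one; the task and worker names are made up for illustration.

```python
from itertools import cycle

tasks = ["t1", "t2", "t3", "t4", "t5"]  # hypothetical task names
workers = ["A", "B"]                    # hypothetical worker names

# zip stops when `tasks` is exhausted, which bounds the infinite cycle.
assignments = list(zip(tasks, cycle(workers)))
print(assignments)
# [('t1', 'A'), ('t2', 'B'), ('t3', 'A'), ('t4', 'B'), ('t5', 'A')]
```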

repeat()

repeat yields the same provided value over and over; unless we give it a number of repetitions, it does so indefinitely.

In [8]: i = 0
   ...: 
   ...: for c in repeat('A'):
   ...:     if i > 5:
   ...:         break
   ...: 
   ...:     print(c)
   ...:     i += 1
A
A
A
A
A
A

We can see that ‘A’ is infinitely repeated until we break out of the loop.
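
As a quick sketch, repeat also takes an optional number of repetitions, and it is handy for feeding a constant argument to map. The variable names here are only illustrative.

```python
from itertools import repeat

# With a second argument, repeat stops after that many items.
three_as = list(repeat("A", 3))
print(three_as)  # ['A', 'A', 'A']

# repeat(2) supplies the exponent for every call to pow;
# map stops as soon as range(4) is exhausted.
squares = list(map(pow, range(4), repeat(2)))
print(squares)  # [0, 1, 4, 9]
```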

Iterators terminating on the shortest input sequence

The next type of iterators we will be looking at are the iterators terminating on the shortest input sequence.

accumulate()

accumulate yields the accumulated result at each iteration; by default, it is the running sum of the elements.

In [3]: for c in accumulate([1, 2, 3, 4]):
   ...:     print(c)
1
3
6
10

Here we use it to get an iterator of running totals, but since the default operation is addition, we could also use it with strings to append on each iteration.

In [2]: for c in accumulate(['A', 'B', 'C']):
   ...:     print(c)
   ...: 
A
AB
ABC
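
Addition is only the default. As a sketch, accumulate also accepts a two-argument function, and (assuming Python 3.8 or later) an initial keyword to seed the result. The sample values are arbitrary.

```python
from itertools import accumulate
import operator

values = [3, 1, 4, 1, 5]

# Any two-argument function can replace the default addition.
running_product = list(accumulate(values, operator.mul))
running_max = list(accumulate(values, max))

# `initial` (Python 3.8+) is yielded first, so the output has one extra item.
with_initial = list(accumulate(values, initial=100))

print(running_product)  # [3, 3, 12, 12, 60]
print(running_max)      # [3, 3, 4, 4, 5]
print(with_initial)     # [100, 103, 104, 108, 109, 114]
```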

chain()

chain will chain several iterables into a single iterator:

In [4]: for c in chain(['A', 'B'], ['C', 'D']):
   ...:     print(c)
   ...: 
A
B
C
D
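
When the iterables are themselves stored in a single iterable, chain.from_iterable does the same job without unpacking. A minimal sketch with a made-up nested list:

```python
from itertools import chain

nested = [["A", "B"], ["C"], [], ["D", "E"]]

# from_iterable takes one iterable of iterables and flattens one level.
flat = list(chain.from_iterable(nested))
print(flat)  # ['A', 'B', 'C', 'D', 'E']
```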

compress()

compress will return only the elements whose corresponding value in the second sequence, the selectors, is truthy:

In [6]: for c in compress(['A', 'B', 'C'], [1, 0, 1]):
   ...:     print(c)
   ...: 
A
C

We only get A and C because the selector was 1 for them. We could also use True/False rather than 1/0.
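
The selectors do not have to be written by hand. In this sketch they are computed from a second list; the names and scores are invented for the example.

```python
from itertools import compress

names = ["alice", "bob", "carol", "dave"]  # hypothetical data
scores = [82, 45, 91, 60]                  # hypothetical data

# The selectors can be any iterable of truthy/falsy values,
# here a generator of booleans derived from the scores.
passed = list(compress(names, (score >= 60 for score in scores)))
print(passed)  # ['alice', 'carol', 'dave']
```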

dropwhile()

dropwhile will apply the function provided to each element and drop the elements for as long as the function returns True. Once the function returns False, that element and all the following ones are returned.

In [4]: for c in dropwhile(lambda x: x < 5, [1, 3, 5, 10, 3]):
   ...:     print(c)
   ...: 
5
10
3

We can see that we drop all elements while x < 5. One thing to note is that once we stop dropping, the iterator just iterates normally over the remaining elements without calling the function again. We can see that the last value 3 was still returned even though it is less than 5.
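
A typical use of this behaviour is skipping a leading header. The sketch below assumes a made-up list of lines where comments start with #; only the leading comments are dropped.

```python
from itertools import dropwhile

# Hypothetical file content: two header comments, then the body.
lines = ["# header", "# generated", "a=1", "# inline note", "b=2"]

# Only the leading comment lines are dropped; the later one is kept
# because the predicate is no longer checked after the first False.
body = list(dropwhile(lambda line: line.startswith("#"), lines))
print(body)  # ['a=1', '# inline note', 'b=2']
```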

takewhile()

takewhile is the opposite of dropwhile and takes elements until the function provided returns False.

In [8]: for c in takewhile(lambda x: x < 5, [1, 3, 5, 10]):
   ...:     print(c)
   ...: 
1
3

Here we take while x < 5; after that, we stop taking.
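
To make the difference with the built-in filter explicit, here is a small sketch using the same predicate on both. takewhile stops at the first failure, while filter keeps scanning the whole input.

```python
from itertools import takewhile

data = [1, 3, 5, 10, 3]

def below_five(x):
    return x < 5

# takewhile stops at the first element failing the predicate (5 here).
taken = list(takewhile(below_five, data))

# filter checks every element, so the trailing 3 is kept as well.
filtered = list(filter(below_five, data))

print(taken)     # [1, 3]
print(filtered)  # [1, 3, 3]
```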

filterfalse()

filterfalse is the opposite of filter. It returns the elements for which the function provided returns False.

In [10]: for c in filterfalse(lambda x: x == 5, [1, 3, 5, 10]):
    ...:     print(c)
    ...: 
1
3
10
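
Used together with the built-in filter, filterfalse gives a simple way to split a sequence in two. This is only a sketch; is_even is a helper defined for the example.

```python
from itertools import filterfalse

def is_even(n):
    return n % 2 == 0

numbers = range(10)  # a range can be iterated more than once

# filter keeps the elements where the predicate is True,
# filterfalse keeps the ones where it is False.
evens = list(filter(is_even, numbers))
odds = list(filterfalse(is_even, numbers))

print(evens)  # [0, 2, 4, 6, 8]
print(odds)   # [1, 3, 5, 7, 9]
```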

groupby()

groupby will group consecutive elements that share the same key, computed with the key function provided:

In [10]: x = [(1, "hello"), (1, "bye"), (2, "Test")]

In [11]: from itertools import groupby

In [12]: for k, g in groupby(x, lambda v: v[0]):
    ...:     print(k, list(g))
    ...: 
    ...: 
1 [(1, 'hello'), (1, 'bye')]
2 [(2, 'Test')]

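We group by the first value of our tuple and we can see that the iterator yields two pairs, each made of a key and an iterator over the elements of that group. Because groupby only groups consecutive elements, the input usually needs to be sorted by the same key beforehand.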
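
One thing worth seeing in action is that groupby starts a new group every time the key changes. The sketch below reorders the tuples from the example so that equal keys are no longer adjacent, then sorts them first; first_item is a helper defined for the example.

```python
from itertools import groupby

def first_item(pair):
    return pair[0]

pairs = [(1, "hello"), (2, "Test"), (1, "bye")]

# Unsorted input: key 1 appears in two separate groups.
unsorted_groups = [(k, list(g)) for k, g in groupby(pairs, first_item)]

# Sorting by the same key first brings equal keys together.
sorted_pairs = sorted(pairs, key=first_item)
sorted_groups = [(k, list(g)) for k, g in groupby(sorted_pairs, first_item)]

print(unsorted_groups)
# [(1, [(1, 'hello')]), (2, [(2, 'Test')]), (1, [(1, 'bye')])]
print(sorted_groups)
# [(1, [(1, 'hello'), (1, 'bye')]), (2, [(2, 'Test')])]
```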

tee()

tee allows us to split a single iterator into several independent iterators.

In [17]: from itertools import count, tee

In [18]: itcount = count()

In [19]: for c in itcount:
    ...:     if c > 5:
    ...:         break
    ...:     print(c)
    ...: 
0
1
2
3
4
5

In [20]: for c in itcount:
    ...:     if c > 10:
    ...:         break
    ...:     print(c)
    ...: 
7
8
9
10

In [21]: count1, count2 = tee(itcount, 2)

In [22]: for c in count1:
    ...:     if c > 15:
    ...:         break
    ...:     print(c)
    ...: 
12
13
14
15

In [23]: for c in count2:
    ...:     if c > 15:
    ...:         break
    ...:     print(c)
    ...: 
12
13
14
15

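We can see that after creating a count iterator, the values returned continue to increase as we continue iterating, even across different loops. Note that 6 and 11 are never printed: they are consumed by the iteration on which we break. If we need two separate iterators, we can use tee to create two or more iterators, which we can then iterate independently of each other. Once tee has been called, the original iterator should no longer be used directly, as advancing it would make the copies skip values.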
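
A classic use of tee is looking at consecutive pairs of a sequence. The helper below is a sketch written for this example; on Python 3.10 and later, itertools ships its own pairwise function that does the same thing.

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    first, second = tee(iterable)
    # Advance the second copy by one so the two copies are offset.
    next(second, None)
    return zip(first, second)

pairs = list(pairwise([1, 2, 4, 7]))
print(pairs)  # [(1, 2), (2, 4), (4, 7)]
```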

zip_longest()

zip_longest allows us to zip iterables of different lengths: the shorter sequences are filled with a default value, whereas zip stops at the shortest iterable.

In [31]: for c in zip_longest([1, 2, 3, 4, 5, 6], [10, 20, 30, 40], fillvalue=0):
    ...:     print(c)
(1, 10)
(2, 20)
(3, 30)
(4, 40)
(5, 0)
(6, 0)

5 and 6 didn’t have any counterpart to zip with in the other sequence, therefore the missing values were filled with fillvalue=0.
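
For comparison, here is a short sketch of zip and zip_longest on the same inputs. When fillvalue is omitted, the missing values are filled with None.

```python
from itertools import zip_longest

a = [1, 2, 3]
b = [10]

# zip stops at the shortest input.
short = list(zip(a, b))

# zip_longest continues to the longest input; fillvalue defaults to None.
padded = list(zip_longest(a, b))

print(short)   # [(1, 10)]
print(padded)  # [(1, 10), (2, None), (3, None)]
```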

Combinatoric iterators

Lastly we will look at the combinatoric iterators.

product()

product will provide the Cartesian product of the input iterables.

In [37]: for c in product('ABC', 'DEF'):
    ...:     print(c)
    ...: 
('A', 'D')
('A', 'E')
('A', 'F')
('B', 'D')
('B', 'E')
('B', 'F')
('C', 'D')
('C', 'E')
('C', 'F')

We can see that we get all the pairs from the two sequences, AD, AE, AF, BD, etc…
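
Another way to read product is as a replacement for nested for loops. The sketch below checks that equivalence and also shows the repeat keyword, which takes the product of an iterable with itself.

```python
from itertools import product

# product yields tuples in the same order as nested loops would.
nested = [(x, y) for x in "AB" for y in "CD"]
same_as_nested = list(product("AB", "CD")) == nested
print(same_as_nested)  # True

# repeat=2 is equivalent to product([0, 1], [0, 1]).
bits = list(product([0, 1], repeat=2))
print(bits)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```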

permutations()

permutations will return all the permutations possible given a sequence:

In [38]: for c in permutations('ABC'):
    ...:     print(c)
    ...: 
('A', 'B', 'C')
('A', 'C', 'B')
('B', 'A', 'C')
('B', 'C', 'A')
('C', 'A', 'B')
('C', 'B', 'A')

We can also specify a length so that we get all permutations for a specific length.

In [39]: for c in permutations('ABC', 2):
    ...:     print(c)
    ...: 
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'C')
('C', 'A')
('C', 'B')

combinations()

combinations will return all the combinations of a given length for the sequence:

In [45]: for c in combinations('ABC', 2):
    ...:     print(c)
    ...: 
('A', 'B')
('A', 'C')
('B', 'C')

The difference between permutation and combination is that a permutation is an arrangement of the elements where the order matters, so AB and BA would be different permutations, while a combination is a selection where the order doesn’t matter, so AB and BA would be the same combination.

We can see that we have in total 3 combinations of length 2 for ABC while we have 6 permutations of length 2 for ABC.
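
These counts can be checked against the usual formulas. The sketch below assumes Python 3.8 or later for math.perm and math.comb, and uses a four-letter string so the numbers differ from the ones above.

```python
from itertools import combinations, permutations
from math import comb, perm

items = "ABCD"
n, r = len(items), 2

# Count what itertools actually generates.
perm_count = len(list(permutations(items, r)))
comb_count = len(list(combinations(items, r)))

# Compare with n! / (n - r)! and n! / (r! * (n - r)!).
print(perm_count, perm(n, r))  # 12 12
print(comb_count, comb(n, r))  # 6 6
```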

combinations_with_replacement()

With replacement means that after each pick, the same value can be picked again.

In [46]: for c in combinations_with_replacement('ABC', 2):
    ...:     print(c)
    ...: 
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'B')
('B', 'C')
('C', 'C')

So here we can have AA as a combination with replacement. And that concludes today’s post!

Conclusion

Today we looked at the itertools module, a Python module providing a set of iterator building blocks used to combine iterables and construct new iterators. We started by looking at infinite iterators, then moved on to iterators terminating on the shortest input sequence and completed this post with combinatoric iterators. I hope you liked this post and I’ll see you on the next one!

Designed, built and maintained by Kimserey Lam.