Grokking Algorithms



Yüklə 348,95 Kb.
Pdf görüntüsü
səhifə104/122
tarix05.12.2023
ölçüsü348,95 Kb.
#173611
1   ...   100   101   102   103   104   105   106   107   ...   122
grokking-algorithms-illustrated-programmers-curious

The reduce function
The 
reduce
function confuses people sometimes. The idea is that you 
“reduce” a whole list of items down to one item. With 
map
, you go from 
one array to another.
With 
reduce
, you transform an array to a single item.
Here’s an example:
>>> arr1 = [1, 2, 3, 4, 5]
>>> reduce(lambda x,y: x+y, arr1)
15


211
Bloom filters and HyperLogLog
In this case, you sum up all the elements in the array: 
1 + 2 + 3 + 4 
+ 5 = 15
! I won’t explain 
reduce
in more detail here, because there are 
plenty of tutorials online.
MapReduce uses these two simple concepts to run queries about data 
across multiple machines. When you have a large dataset (billions 
of rows), MapReduce can give you an answer in minutes where a 
traditional database might take hours. 
Bloom filters and HyperLogLog
Suppose you’re running Reddit. When someone posts a link, you want 
to see if it’s been posted before. Stories that haven’t been posted before 
are considered more valuable. So you need to figure out whether this 
link has been posted before. 
Or suppose you’re Google, and you’re crawling web pages. You only 
want to crawl a web page if you haven’t crawled it already. So you need 
to figure out whether this page has been crawled before.
Or suppose you’re running bit.ly, which is a URL shortener. You don’t 
want to redirect users to malicious websites. You have a set of URLs that 
are considered malicious. Now you need to figure out whether you’re 
redirecting the user to a URL in that set. 
All of these examples have the same problem. You have a very large set.


212
Chapter 11

Yüklə 348,95 Kb.

Dostları ilə paylaş:
1   ...   100   101   102   103   104   105   106   107   ...   122




Verilənlər bazası müəlliflik hüququ ilə müdafiə olunur ©azkurs.org 2024
rəhbərliyinə müraciət

gir | qeydiyyatdan keç
    Ana səhifə


yükləyin