Grokking Algorithms


Why are distributed algorithms useful?



Suppose you have a table with billions or trillions of rows, and you 
want to run a complicated SQL query on it. A single MySQL server is a 
poor fit, because it tends to struggle after a few billion rows. Use 
MapReduce through Hadoop!
Or suppose you have to process a long list of jobs. Each job takes 10 
seconds to process, and you need to process 1 million jobs like this. If 
you do this on one machine, it will take you months (10 million seconds 
is about 116 days)! If you could run it across 100 machines, you might 
be done in a little over a day.
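The arithmetic behind this example can be checked with a short sketch. It assumes the jobs are independent and split evenly across machines, with no coordination overhead; the function name `total_days` is made up for illustration.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

def total_days(num_jobs, seconds_per_job, num_machines=1):
    """Days needed if jobs are split evenly across machines (no overhead assumed)."""
    # The busiest machine handles ceil(num_jobs / num_machines) jobs.
    jobs_per_machine = math.ceil(num_jobs / num_machines)
    return jobs_per_machine * seconds_per_job / SECONDS_PER_DAY

one_machine = total_days(1_000_000, 10)
hundred_machines = total_days(1_000_000, 10, num_machines=100)

print(f"1 machine:    {one_machine:.1f} days")       # 115.7 days
print(f"100 machines: {hundred_machines:.1f} days")  # 1.2 days
```

Real clusters rarely hit this ideal, since scheduling, data transfer, and stragglers all add time, but the order of magnitude holds.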
Distributed algorithms are great when you have a lot of work to do 
and want to cut the time required to do it. MapReduce in particular is 
built up from two simple ideas: the map function and the reduce function.
The map function
The map function is simple: it takes an array and applies the same 
function to each item in the array. For example, here we’re doubling 
every item in the array:
>>> arr1 = [1, 2, 3, 4, 5]
>>> arr2 = list(map(lambda x: 2 * x, arr1))  # list() is needed in Python 3, where map returns an iterator
>>> arr2
[2, 4, 6, 8, 10]
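To make the behavior concrete, here is a minimal sketch of the same operation written as an explicit loop. The name `my_map` is hypothetical and exists only for illustration; Python's built-in `map` does the same job but returns a lazy iterator in Python 3.

```python
def my_map(func, items):
    """Apply func to every item and return the results as a new list."""
    result = []
    for item in items:
        # Each call depends only on its own item, never on the others.
        result.append(func(item))
    return result

arr1 = [1, 2, 3, 4, 5]
arr2 = my_map(lambda x: 2 * x, arr1)

print(arr2)  # [2, 4, 6, 8, 10]
print(arr1)  # [1, 2, 3, 4, 5] (the input list is left unchanged)
```

Because each call to `func` is independent of the others, the iterations could in principle run in any order, or at the same time.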


arr2 now contains [2, 4, 6, 8, 10]: every element in arr1 was 
doubled! Doubling an element is pretty fast. But suppose you apply a 
function that takes more time to process. Look at this pseudocode:
>>> arr1 = [...]  # A list of URLs
>>> arr2 = list(map(download_page, arr1))
Here you have a list of URLs, and you want to download each page and 
store the contents in arr2. This could take a couple of seconds for each 
URL. If you had 1,000 URLs, this would take more than half an hour!
Wouldn’t it be great if you had 100 machines, and map could 
automatically spread out the work across all of them? Then you would 
be downloading 100 pages at a time, and the work would go a lot faster! 
This is the idea behind the “map” in MapReduce.
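The same idea can be tried on a single machine. The sketch below is not MapReduce or Hadoop: it uses Python's standard `concurrent.futures` thread pool as a small-scale stand-in for spreading work across machines. `download_page` here is a hypothetical placeholder that sleeps instead of touching the network, and the URLs are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def download_page(url):
    """Hypothetical stand-in for a real download: waits briefly, returns fake content."""
    time.sleep(0.05)  # simulate network latency
    return f"<html>contents of {url}</html>"

# Placeholder URLs for illustration only.
urls = [f"http://example.com/page{i}" for i in range(20)]

# Plain map: one page at a time.
start = time.perf_counter()
sequential = list(map(download_page, urls))
sequential_time = time.perf_counter() - start

# pool.map has the same shape as map, but runs up to 10 calls
# concurrently and still returns results in input order.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    parallel = list(pool.map(download_page, urls))
parallel_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, parallel: {parallel_time:.2f}s")
```

Both versions produce the same list; only the elapsed time differs. Threads suit this example because downloading is mostly waiting, whereas a distributed system applies the same pattern across separate machines.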
