CPSC 461: Copyright © 2002 Katrin Becker 1998-2002 Last Modified July 31, 2000 04:16 PM

Polyphase Sort-Merge

Polyphase sort-merge makes the most of the merge phase by reducing the copying of runs, which in turn reduces the number of merge cycles.
 
FIBONACCI MERGE
 
Based on the pth-order Fibonacci series, where p is one less than the number of devices used in the merge.
 
$F^{(p)}_n = F^{(p)}_{n-1} + F^{(p)}_{n-2} + \dots + F^{(p)}_{n-p}$ for $n \ge p$
$F^{(p)}_n = 0$ for $0 \le n \le p-2$
$F^{(p)}_{p-1} = 1$
When $p = 2$: $F^{(2)}_0 = 0$, $F^{(2)}_1 = 1$, and $F^{(2)}_{n+1} = F^{(2)}_n + F^{(2)}_{n-1}$ (sum of the previous 2)
e.g. 0, 1, 1, 2, 3, 5, 8, 13, 21, ...
 
The 3rd-order Fibonacci series sums the previous 3 terms
e.g. 0, 0, 1, 1, 2, 4, 7, 13, 24, 44, ...
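
The definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes; the function name `fibonacci_series` and its parameters are assumptions chosen for this example.

```python
def fibonacci_series(p, count):
    """Return the first `count` terms of the pth-order Fibonacci series.

    The series starts with p-1 zeros followed by a 1; every later term
    is the sum of the previous p terms.
    """
    if p < 1:
        raise ValueError("order p must be at least 1")
    series = [0] * (p - 1) + [1]
    while len(series) < count:
        series.append(sum(series[-p:]))
    return series[:count]


print(fibonacci_series(2, 9))   # [0, 1, 1, 2, 3, 5, 8, 13, 21]
print(fibonacci_series(3, 10))  # [0, 0, 1, 1, 2, 4, 7, 13, 24, 44]
```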
 
n = number of merge passes required
Suppose p = 3: we do a three-way merge, which requires 4 devices.
If we have 17 runs, then the perfect Fibonacci distribution is 7 runs on device 1, 6 on device 2 and 4 on device 3.
Polyphase          No. of runs on
                   Device 1  Device 2  Device 3  Device 4
Sort Phase             7         6         4         0
Merge Pass 1           3         2         0         4
Merge Pass 2           1         0         2         2
Merge Pass 3           0         1         1         1
Merge Pass 4           1         0         0         0
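Pass 1 merges 4 runs from each of three devices onto device 4;
pass 2 merges 2 runs from each of three devices onto device 3;
pass 3 merges 1 run from each of three devices onto device 2;
pass 4 merges the last run from each of three devices onto device 1.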
 
A file of 17 runs requires 4 passes and never merges fewer than 3 runs at a time. Compare this with a balanced three-way sort-merge:
Three-Way Sort-Merge   No. of runs on
                       Device 1  Device 2  Device 3  Device 4  Device 5  Device 6
Sort Phase                 6         6         5         0         0         0
Merge Pass 1               0         0         0         2         2         2
Merge Pass 2               1         1         0         0         0         0
Merge Pass 3               0         0         0         1         0         0
 
With polyphase merge, the distribution of runs among the non-empty devices is always a perfect nth-level distribution. On every pass exactly one device runs out before the others, and the last merge pass merges one run from each of the non-empty devices into the empty device.
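
To see how the run counts evolve, the sketch below simulates the merge passes on run counts only (no data is read or written). It is an illustrative assumption, not code from the notes: `polyphase_passes` is a hypothetical helper, and it expects exactly one empty device before each pass, which holds for a perfect distribution.

```python
def polyphase_passes(runs):
    """Simulate polyphase merge passes on a list of per-device run counts.

    Each pass merges runs from all non-empty devices onto the single
    empty device until the smallest input device is exhausted.
    Returns the run counts after each pass.
    """
    runs = list(runs)
    history = []
    while sum(runs) > 1:
        empty = [i for i, r in enumerate(runs) if r == 0]
        if len(empty) != 1:
            raise ValueError("need exactly one empty device before each pass")
        out = empty[0]
        # The pass ends when the input device with the fewest runs is exhausted.
        merged = min(r for r in runs if r > 0)
        runs = [merged if i == out else r - merged for i, r in enumerate(runs)]
        history.append(tuple(runs))
    return history


for n, state in enumerate(polyphase_passes([7, 6, 4, 0]), start=1):
    print("Merge Pass", n, state)
```

Starting from 7, 6, 4, 0 this reproduces the four merge passes of the polyphase table; a distribution that is not perfect raises an error instead, because a pass would leave two devices empty.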
 
nth-level perfect Fibonacci numbers for p = 3

Level n    Device 1    Device 2    Device 3    Total no. of runs
   0          1           0           0               1
   1          1           1           1               3
   2          2           2           1               5
   3          4           3           2               9
   4          7           6           4              17
   5         13          11           7              31
   6         24          20          13              57
   7         44          37          24             105
   8         81          68          44             193

   n         a_n         b_n         c_n             t_n
  n+1      a_n+b_n     a_n+c_n       a_n          t_n+2a_n
 
In general:
Level
   n         a_n         b_n         c_n         d_n        e_n       t_n        T(k)
  n+1      a_n+b_n     a_n+c_n     a_n+d_n     a_n+e_n      a_n     t_n+4a_n     T(k-1)
 
$e_n = a_{n-1}$
$d_n = a_{n-1} + e_{n-1}$
$c_n = a_{n-1} + d_{n-1}$
$b_n = a_{n-1} + c_{n-1}$
$a_n = a_{n-1} + b_{n-1} = a_{n-1} + a_{n-2} + a_{n-3} + a_{n-4} + a_{n-5}$
where $a_0 = 1$ and $a_n = 0$ for $n = -1, -2, -3, -4$
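These are the pth-order Fibonacci numbers: $F^{(p)}_n = F^{(p)}_{n-1} + F^{(p)}_{n-2} + \dots + F^{(p)}_{n-p}$ for $n \ge p$,
with $F^{(p)}_n = 0$ for $0 \le n \le p-2$ and $F^{(p)}_{p-1} = 1$.
 
That is, start with p-1 zeros followed by a 1, then continue the series from there.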
 
EXAMPLE:
5-FILE SORT-MERGE USING 4th-ORDER FIBONACCI SERIES
$F^{(4)}$ = 0, 0, 0, 1, 1, 2, 4, 8, 15, 29, 56, 108, 208, 401, 773, ...
 
Level       a           b           c           d         Total
         Device 1    Device 2    Device 3    Device 4
   0         1           0           0           0            1
   1         1           1           1           1            4
   2         2           2           2           1            7
   3         4           4           3           2           13
   4         8           7           6           4           25
   5        15          14          12           8           49
   6        29          27          23          15           94
   7        56          52          44          29          181
   8       108         100          85          56          349
   9       208         193         164         108          673
  10       401         372         316         208         1297
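
The table can be regenerated from the level-to-level rule (each device's new count is the old count on device 1 plus the old count on the next device, and the last device takes the old count on device 1). The following is a minimal sketch under that rule; the names `next_level` and `perfect_distributions` are assumptions made for this example.

```python
def next_level(dist):
    """Compute the (n+1)st-level perfect distribution from the nth-level one."""
    a = dist[0]
    return [a + x for x in dist[1:]] + [a]


def perfect_distributions(p, levels):
    """Return the perfect distributions for levels 0..levels on p input devices."""
    dist = [1] + [0] * (p - 1)
    table = [tuple(dist)]
    for _ in range(levels):
        dist = next_level(dist)
        table.append(tuple(dist))
    return table


for level, dist in enumerate(perfect_distributions(4, 10)):
    print(level, dist, sum(dist))
```

With p = 4 the first column of the output follows the 4th-order Fibonacci series from its first 1 onward, and each total exceeds the previous one by (p-1) times the previous count on device 1.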
 

Fibonacci Distribution Sort

number_of_devices := 3
For i := 1 To number_of_devices
    fib[i] := 0
    runs[i] := 0
fib[1] := 1      { level 0 distribution: 1, 0, 0 }
level := 0
 
While more runs
    { fill each device up to the current level's target }
    For device_k := 1 To number_of_devices
        While more runs And runs[device_k] < fib[device_k]
            write a sorted run on device_k
            runs[device_k] := runs[device_k] + 1
    { compute (n+1)st-level distribution only if input remains }
    If more runs
        an := fib[1]
        For device_k := 1 To number_of_devices - 1
            fib[device_k] := an + fib[device_k + 1]
        fib[number_of_devices] := an
        level := level + 1
 
Polyphase merge performs most efficiently when the number of runs coincides exactly with a perfect Fibonacci distribution. When the number of runs does not match, polyphase merge can still be used, but it is less efficient. By adding dummy runs (empty or null runs), we can ‘fudge’ the required number of runs on each device.
 

Modified Fibonacci Distribution Sort

number_of_devices := 3
For i := 1 To number_of_devices
    fib[i] := 0
    runs[i] := 0
fib[1] := 1      { level 0 distribution: 1, 0, 0 }
level := 0
While more runs
    { fill each device up to the current level's target }
    For device_k := 1 To number_of_devices
        While more runs And runs[device_k] < fib[device_k]
            write a sorted run on device_k
            runs[device_k] := runs[device_k] + 1
 
    { compute (n+1)st-level distribution only if input remains }
    If more runs
        an := fib[1]
        For device_k := 1 To number_of_devices - 1
            fib[device_k] := an + fib[device_k + 1]
        fib[number_of_devices] := an
        level := level + 1
 
{ add dummy runs if necessary }
For device_k := 1 To number_of_devices
    While runs[device_k] < fib[device_k]
        output dummy run {end_of_run_marker} to device_k
        runs[device_k] := runs[device_k] + 1
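
The same idea can be sketched in Python, counting runs instead of writing them. This is an illustration under stated assumptions, not the course's implementation: `distribute_runs` is a hypothetical name, the input is just a total run count, and the level advances only once the current level is full and runs remain, so the padding targets the smallest perfect distribution that holds all the runs.

```python
def distribute_runs(total_runs, number_of_devices=3):
    """Distribute `total_runs` real runs over the input devices.

    Returns (runs, dummies): the real runs placed on each device and the
    dummy runs each device needs to reach a perfect Fibonacci distribution.
    """
    fib = [1] + [0] * (number_of_devices - 1)   # level 0 target
    runs = [0] * number_of_devices
    remaining = total_runs
    while remaining > 0:
        # Fill each device up to the current level's target.
        for k in range(number_of_devices):
            take = min(remaining, fib[k] - runs[k])
            runs[k] += take
            remaining -= take
        # Advance to the next level only if input remains.
        if remaining > 0:
            a = fib[0]
            fib = [a + x for x in fib[1:]] + [a]
    dummies = [target - have for target, have in zip(fib, runs)]
    return runs, dummies


print(distribute_runs(17))  # ([7, 6, 4], [0, 0, 0])
print(distribute_runs(20))  # ([10, 6, 4], [3, 5, 3])
```

With 20 runs the sketch pads up to the level-5 distribution 13, 11, 7 (31 runs in total, 11 of them dummies). It fills device 1 first, so the dummy runs are not spread evenly; more refined distribution schemes exist, but they are outside the scope of these notes.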
