- 2.1. Model parameters associated with the remote data
transfer
- 2.2. Point-to-point communications with
- 2.3. Tree-based broadcast operation expressed in our
model terminology with ...; however, the full-duplex constraint
cannot be satisfied.
- 2.4. The same broadcast schedule on another cluster configuration
with g reduced from 2 to 1
- 3.1. Performance breakdown of two DP implementations,
Fast Ethernet (FEDP) and Gigabit Ethernet (GEDP), expressed in the
form of our model parameters.
- 3.2. Single-trip latency performance with back-to-back
(BTB) connection
- 3.3. The measured
and
values on the GEDP platform under various PCI settings. LTXX
stands for setting the PCI latency to XX bus cycles, and BYY
stands for the PCI burst size (YY d-words).
- 3.4. Uni-directional bandwidth performance of DP
- 3.5. Bi-directional bandwidth performance
- 4.1. Go-Back-N protocol with window flow control - (a) state
transition diagram of sender, (b) logic flow of receiver
- 4.2. Evolution of the queue size
over time on an input FIFO queue with congestion loss problem
- 4.3. The measured throughput efficiency of our GBN
reliable transmission protocol as compared to the measured throughput
efficiency of the simple GBN scheme on the IBM 8275-326 switch.
- 4.4. Comparisons of the measured and predicted performance
on the congestion loss problem under input-buffered architecture with
our GBN reliable transmission protocol.
- 4.5. The sample trace of a sender's activities
- 4.6. A typical activity cycle of a sender which is composed
of a sequence of recurrent patterns
- 4.7. 3-state Markov chain model
- 4.8. Events occurring in a typical
sequence
- 4.9. Comparing the performance of our GBN scheme
with the simple GBN scheme on the IBM 8275-416 switch
- 4.10. Comparison of the measured and predicted performance
of the IBM 8275-416 switch under heavy congestion loss problem with
our GBN reliable transmission protocol subjected to different timeout
settings
- 4.11. Comparisons of the measured and predicted performance
of the IBM 8275-416 switch with our GBN reliable transmission protocol.
The main focus is on revealing the effects of the P and
parameters on the final performance.
- 4.12. Comparisons of the measured and predicted performance
of the Cisco Catalyst 2980G switch under heavy congestion loss problem
with our GBN reliable transmission protocol
- 4.13. Effect of the timeout parameter on the throughput
efficiency when the network is under heavy congestion loss
problem. The data are collected on the IBM 8275-416
platform with ...
- 4.14. A hierarchical network composed of Gigabit
Ethernet and Fast Ethernet switches
- 4.15. The measured and predicted results of the many-to-one
congestion loss problem on the uplink port under our GBN reliable
transmission protocol
- 4.16. The congestion dynamic of the uplink port under
multiple one-way bulk transfers
- 4.17. The congestion dynamic of the uplink port under
multiple bi-directional data transfers
- 5.1. Shift Exchange pattern
- 5.2. Performance of different complete exchange algorithms
with k=64 on a 16-node cluster
- 5.3. Achievable bandwidth per node for different
algorithms as compared to the optimal prediction
- 5.4. Achieved bandwidth on IBM 8275-326 with store-and-forward
switching
- 5.5. The efficiency of the group shuffle exchange algorithm
as a function of group size for various message sizes k
on 16 nodes over the IBM 8275-326.
- 5.6. Comparison of the problem size scalability on
the input-buffered switch (IBM 8275-326) for p=16
- 5.7. Comparison of performance between a virtual
cut-through switch (IBM 8275-326) and a store-and-forward switch (IBM
8275-416) on the synchronous shuffle and group shuffle algorithms
- 5.8. Comparing the performance of the synchronous shuffle
complete exchange algorithm with two MPICH implementations
- 6.1. Poll results on the subject: "What
interconnect do you use or would use in your cluster?" (Source:
Clusters@TOP500 [28])
- 6.2. Interconnection topology of the two-level switch
hierarchy
- 6.3. An example permutation in which global windowing alone
fails to regulate the traffic.
- 6.4. The resulting communication pattern after applying
the contention-aware permutation scheme.
- 6.5. Performance of modified synchronous shuffle exchange
on a single input-buffered switch. (Legends: sync - synchronous shuffle;
pair - pairwise; GW - global windowing)
- 6.6. The achieved performance of the 10X2 hierarchical
network under multiple bidirectional message exchanges. (Legends:
cross-switch - measured aggregated bandwidth over the hierarchical
network; local - measured aggregated bandwidth on the single switch)
- 6.7. The performance of modified synchronous shuffle exchange
on the 8X2 configuration - 8 nodes connect to each FE switch,
which is connected to the PR2200. (Legends: GWCA - global windowing
plus contention-aware permutation scheme; CA - contention-aware permutation
scheme only)
- 6.8. Performance of different complete exchange implementations
on the 8X3 hierarchical configuration
- 6.9. Performance of different complete exchange implementations
on the 6X4 hierarchical configuration
- 6.10. Performance of different complete exchange implementations
on the 8X4 hierarchical configuration
- A.1. Bottleneck stage phenomenon
- A.2. Microbenchmark signatures for (a)(b) IBM 8275-326,
(c)(d) Cisco 2980G, and (e)(f) Intel 510T.
- A.3. A saturated inflow pipe
- A.4. The L parametric functions of three
different network configurations for a 32-node cluster - (a) a single
switch (Cisco Catalyst 2980G), (b) a hierarchical network (8x4) and
(c) another hierarchical network (16x2)