
Bibliography

1
IEEE 802.3x.
Annex 31B MAC Control PAUSE operation, IEEE Std 802.3, 2000 edition.

2
Alhussein A. Abouzeid, Sumit Roy, and Murat Azizoglu.
Stochastic modeling of TCP over lossy links.
In Proceedings of INFOCOM 2000, pages 1724-1733. IEEE, 2000.

3
Vikram S. Adve.
Analyzing the Behavior and Performance of Parallel Programs.
PhD thesis, Department of Computer Sciences, University of Wisconsin-Madison, December 1993.

4
A. Alexandrov, M. Ionescu, K. Schauser, and C. Scheiman.
LogGP: Incorporating Long Messages into the LogP Model - One Step Closer Towards a Realistic Model for Parallel Computation.
In Proceedings of Symposium on Parallel Algorithms and Architectures (SPAA), pages 95-105, July 1995.

5
10 Gigabit Ethernet Alliance.
http://www.10gea.org/index.htm.

6
InfiniBand Trade Association.
http://www.infinibandta.org/home.php3.

7
Mark Baker and Rajkumar Buyya.
Cluster Computing at a Glance, volume I of High Performance Cluster Computing, chapter 1, pages 246-269.
Prentice Hall PTR, 1999.

8
A. Bar-Noy and S. Kipnis.
Designing broadcasting algorithms in the postal model for message passing systems.
In Proceedings of the ACM Symposium on Parallel Algorithms and Architectures, pages 13-22, June 1992.

9
Amnon Barak, Ilia Gilderman, and Igor Metrik.
Performance of the Communication Layers of TCP/IP with the Myrinet Gigabit LAN.
Computer Communications, 22(11), Jul 1999.

10
D. Bertsekas, C. Ozveren, G. Stamoulis, P. Tseng, and J. Tsitsiklis.
Optimal communication algorithms for hypercubes.
Journal of Parallel and Distributed Computing, 11:263-275, 1991.

11
Dimitri Bertsekas and Robert Gallager.
Data Networks.
Prentice-Hall International, Inc., second edition, 1992.

12
Raoul A. F. Bhoedjang, Tim Rühl, and Henri E. Bal.
User-Level Network Interface Protocols.
IEEE Computer, 31(11):53-60, 1998.

13
Raoul A.F. Bhoedjang.
Communication Architectures for Parallel-Programming Systems.
PhD thesis, Department of Computer Science, Vrije Universiteit, June 2000.

14
G. Bilardi, K.T. Herley, A. Pietracaprina, G. Pucci, and P. Spirakis.
BSP vs LogP.
In 8th ACM Symposium on Parallel Algorithms and Architectures, pages 25-32, June 1996.

15
N.J. Boden, D. Cohen, R.E. Felderman, A.E. Kulawik, C.L. Seitz, J.N. Seizovic, and Wen-King Su.
Myrinet: A Gigabit-per-Second Local Area Network.
IEEE Micro, 15(1):29-36, 1995.

16
S. Bokhari.
Multiphase complete exchange: a theoretical analysis.
IEEE Transactions on Computers, 45(2):220-229, February 1996.

17
S. Bokhari.
Multiphase complete exchange on Paragon, SP2, and CS-2.
IEEE Parallel and Distributed Technology, 4(3):45-59, Fall 1996.

18
S.H. Bokhari and D.M. Nicol.
Balancing contention and synchronization on the Intel Paragon.
IEEE Concurrency, 5(2):74-83, 1997.

19
J. Bruck, Ching-Tien Ho, S. Kipnis, E. Upfal, and D. Weathersby.
Efficient Algorithms for All-to-All Communications in Multiport Message-Passing Systems.
IEEE Transactions on Parallel and Distributed Systems, 8(11):1143-1156, 1997.

20
G. Chiola and G. Ciaccio.
Lightweight Messaging Systems, volume I of High Performance Cluster Computing, chapter 10, pages 246-269.
Prentice Hall PTR, 1999.

21
G. Chiola and G. Ciaccio.
GAMMA: a low-cost network of workstations based on active messages.
In Proceedings of the 5th EUROMICRO workshop on Parallel and Distributed Processing PDP'97, January 1997.

22
G. Chiola, G. Ciaccio, L. V. Mancini, and P. Rotondo.
GAMMA on DEC 2114x with efficient flow control.
In Proceedings of the 1999 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA'99), June 1999.

23
Brent N. Chun, Alan M. Mainwaring, and David E. Culler.
Virtual Network Transport Protocols for Myrinet.
IEEE Micro, 18(1):53-63, 1998.

24
David D. Clark, John Romkey, and Howard Salwen.
An analysis of TCP processing overhead.
IEEE Communications Magazine, 27(6):23-29, June 1989.

25
The Asgard Cluster.
http://www.asgard.ethz.ch/.

26
The Biopendium Cluster.
http://www.inpharmatica.co.uk/biopdetail.htm.

27
The CLiC Cluster.
http://www.tu-chemnitz.de/urz/anwendungen/CLIC/.

28
Cluster@TOP500.
http://clusters.top500.org/.

29
Mark Edward Crovella.
Performance Prediction and Tuning of Parallel Programs.
PhD thesis, Department of Computer Science, The College of Arts and Sciences, University of Rochester, 1994.

30
D. E. Culler, R. M. Karp, D. A. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken.
LogP: Towards a realistic model of parallel computation.
In Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993.

31
David E. Culler, Lok Tin Liu, Richard P. Martin, and Chad O. Yoshikawa.
Assessing Fast Network Interfaces.
IEEE Micro, 16(1):35-43, February 1996.

32
David E. Culler, Jaswinder Pal Singh, and Anoop Gupta.
Parallel Computer Architecture: A Hardware/Software Approach.
Morgan Kaufmann, 1999.

33
V. Dimakopoulos and N. Dimopoulos.
A theory for total exchange in multidimensional interconnection networks.
IEEE Transactions on Parallel and Distributed Systems, 9(7):639-649, July 1998.

34
S. Donaldson, J. Hill, and D. Skillicorn.
Exploiting Global Structure for Performance on Clusters.
In Proceedings of IPPS/SPDP'99, pages 176-182, 1999.

35
Jose Duato, Sudhakar Yalamanchili, and Lionel Ni.
Interconnection Networks: An Engineering Approach.
IEEE Computer Society, 1997.

36
Sally Floyd and Van Jacobson.
Traffic phase effects in packet-switched gateways.
Journal of Internetworking: Practice and Experience, 3(3):115-156, Sept. 1992.

37
Steven Fortune and James Wyllie.
Parallelism in Random Access Machines.
In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, pages 114-118, 1978.

38
Message Passing Interface Forum.
MPI: A Message-Passing Interface Standard.
International Journal of Supercomputing Applications, 8(3/4), 1994.

39
Giganet.
http://wwwip.emulex.com/ip/index.html/.

40
J. Gross and J. Yellen.
Graph Theory and its Applications.
CRC Press, 1999.

41
S. Hambrusch and A. Khokhar.
An architecture-independent model for coarse-grained parallel machines.
In Proceedings of the 6th IEEE Symposium on Parallel and Distributed Processing, pages 544-551, Oct 1994.

42
Mathias Hein and David Griffiths.
Switching technology in the local network: from LAN to switched LAN to virtual LAN.
International Thomson Computer Press, 1997.

43
T. Heywood and C. Leopold.
Models of parallelism.
In J.R. Davy and P.M. Dew, editors, Abstract Machine Models for Highly Parallel Computers. Oxford Univ. Press, 1995.

44
Michael G. Hluchyj and Mark J. Karol.
Queueing in High-Performance Packet Switching.
IEEE Journal on Selected Areas in Communications, 6(9):1587-1597, December 1988.

45
R. W. Hockney.
The Science of Computer Benchmarking.
SIAM Publications, 1996.

46
Kai Hwang and Zhiwei Xu.
Scalable Parallel Computing.
McGraw-Hill, 1998.

47
G. Iannello.
Efficient Algorithms for the Reduce-Scatter Operation in LogGP.
IEEE Transactions on Parallel and Distributed Systems, 8(9):970-982, 1997.

48
Intel.
PCI - Efficient Use.
Web document, Apr 1997.
http://support.intel.com/support/chipsets/pc1001.htm.

49
Intel.
Balanced Server Platform Design.
Presentation material, 1998.
http://developer.intel.com/software/asc/documents/19.IDF-ESGS4.pdf.

50
Dean L. Isaacson and Richard W. Madsen.
Markov Chains: Theory and Applications.
John Wiley & Sons, 1976.

51
V. Jacobson.
Congestion avoidance and control.
In Proceedings of ACM SIGCOMM 88, pages 314-329, 1988.

52
Wagner Meira Junior.
Understanding Parallel Program Performance Using Cause-Effect Analysis.
PhD thesis, Department of Computer Science, The College of Arts and Sciences, University of Rochester, 1997.

53
R. Karp, A. Sahay, E. Santos, and K. Schauser.
Optimal broadcast and summation in the LogP model.
In Proceedings of Symposium on Parallel Algorithms and Architectures (SPAA), pages 142-153, June 1993.

54
Jonathan Kay and Joseph Pasquale.
The importance of non-data touching processing overheads in TCP/IP.
In Proceedings of ACM SIGCOMM 93, pages 259-268, 1993.

55
K. Keeton, T. Anderson, and D. Patterson.
LogP quantified: The case for low-overhead local area networks.
In Hot Interconnects III: A Symposium on High Performance Interconnects, Aug. 1995.

56
T. V. Lakshman and Upamanyu Madhow.
The performance of TCP/IP for networks with high bandwidth-delay products and random loss.
IEEE/ACM Transactions on Networking, 5(3):336-350, June 1997.

57
C. Lam, C. Huang, and P. Sadayappan.
Optimal Algorithms for All-to-all Personalized Communication on Rings and Two Dimensional Tori.
Journal of Parallel and Distributed Computing, 43:3-13, 1997.

58
M. Lauria, S. Pakin, and A. Chien.
Efficient layering for high speed communication: Fast messages 2.x.
In Proceedings of the 7th High Performance Distributed Computing Conference (HPDC7), July 1998.

59
C.M. Lee, A.T.C. Tam, and C.L. Wang.
Directed point: An efficient communication subsystem for cluster computing.
In International Conference on Parallel and Distributed Computing Systems (IASTED), Oct. 1998.

60
Tsern-Huei Lee.
The throughput efficiency of Go-Back-N ARQ scheme for burst-error channels.
In Proceedings of INFOCOM'91, pages 773-780. IEEE, 1991.

61
C. H. C. Leung, Y. Kikumoto, and S. A. Sorensen.
The throughput efficiency of the Go-Back-N ARQ scheme under Markov and related error structures.
IEEE Transactions on Communications, 36(2):231-234, February 1988.

62
P. Marenzoni, Giovanni Rimassa, Massimo Bertozzi, Gianni Conte, and P. Rossi.
An operating system support to low-overhead communications in NOW clusters.
In Proceedings of Communication, Architecture, and Applications for Network-Based Parallel Computing (CANPC97), pages 130-143, February 1997.

63
M. Mathis, J. Semke, J. Mahdavi, and T. Ott.
The macroscopic behavior of the TCP congestion avoidance algorithm.
Computer Communication Review, 27(3), July 1997.

64
R. McClellan.
Evaluating expected network performance based on multilayer switch performance data.
Technical report, McClellan Consulting, August 2000.

65
W. McColl.
BSP programming.
In Proceedings of DIMACS Workshop on Specification of Parallel Algorithms, pages 25-35, May 1994.

66
W. McColl.
The BSP approach to architecture independent parallel programming.
Technical report, Oxford University Computing Laboratory, Dec. 1995.

67
Csaba Andras Moritz and Matthew Frank.
LoGPC: Modeling Network Contention in Message-Passing Programs.
In Measurement and Modeling of Computer Systems, pages 254-263, 1998.

68
MPICH.
MPICH-A Portable Implementation of MPI.
http://www-unix.mcs.anl.gov/mpi/mpich/.

69
Myricom.
http://www.myri.com/.

70
J. Nash, P. Dew, and M. Dyer.
Scalable and Portable Computing Using the WPRAM Model.
In M. Kara, J. Davy, D. Goodeve, and J. Nash, editors, Abstract Machine Models for Parallel and Distributed Computing, pages 47-62. IOS Press, April 1996.

71
Extreme Networks.
http://www.extremenetworks.com/products/.

72
W. Noureddine and F. Tobagi.
Selective back-pressure in switched Ethernet LANs.
In Proceedings of Globecom'99, 1999.

73
Nupairoj, Ni, Park, and Choi.
Architecture-Dependent Tuning of the Parameterized Communication Model for Optimal Multicasting.
In IPPS: 11th International Parallel Processing Symposium, pages 578-582. IEEE Computer Society Press, 1997.

74
J. Padhye, V. Firoiu, D. Towsley, and J. Kurose.
Modeling TCP throughput: A simple model and its empirical validation.
In Proceedings of ACM SIGCOMM, Sept. 1998.

75
Scott Pakin, Vijay Karamcheti, and Andrew A. Chien.
Fast Messages: Efficient, Portable Communication for Workstation Clusters and MPPs.
IEEE Concurrency, 5(2):60-73, 1997.

76
J.-Y.L. Park, H.-A. Choi, N. Nupairoj, and L.M. Ni.
Construction of Optimal Multicast Trees Based on the Parameterized Communication Model.
In Proceedings of the 1996 International Conference on Parallel Processing, pages 180-187, Aug 1996.

77
K. Park.
Warp Control: a Dynamically Stable Congestion Protocol and Its Analysis.
Journal of High Speed Networks, 2(4):373-404, 1993.

78
PCI-X.
The PCI Special Interest Group (SIG).
http://www.pcisig.com/.

79
Stefan Petri, Gunther Lustig, Andreas Döring, and Rainer Hagenau.
The impact of switch technologies on high performance cluster communication.
Technical report, Institut für Technische Informatik, Medizinische Universität zu Lübeck, Germany, 2000.

80
Gregory F. Pfister.
In Search Of Clusters.
Prentice Hall, 1995.

81
R. Ponnusamy, R. Thakur, A. Choudhary, and G. Fox.
Scheduling regular and irregular patterns on the CM-5.
In Proceedings of Supercomputing '92, pages 394-402, November 1992.

82
L. Prylli and B. Tourancheau.
BIP: a new protocol designed for high performance networking on Myrinet.
In PC-NOW Workshop, IPPS/SPDP98, March 1998.

83
Loic Prylli and Bernard Tourancheau.
BIP: A New Protocol Designed for High Performance Networking on Myrinet.
In IPPS/SPDP Workshops, pages 472-485, 1998.

84
L. Qiu, Y. Zhang, and S. Keshav.
On individual and aggregate TCP performance.
In Proceedings of the Seventh Annual International Conference on Network Protocols, pages 203-212. IEEE, Nov. 1999.

85
RFC793.
Transmission control protocol - protocol specification.
Internet Archives, Sept 1981.

86
J.L. Roda, F. Sande, C. León, J.A. González, and C. Rodriguez.
The Collective Computing Model.
In Proceedings of the Seventh Euromicro Workshop on Parallel and Distributed Processing PDP'99, pages 19-26, Feb 1999.

87
A. Roy, I. Foster, W. Gropp, N. Karonis, V. Sander, and B. Toonen.
MPICH-GQ: Quality-of-Service for Message Passing Programs.
In Proceedings of the IEEE/ACM SC2000 Conference, 2000.

88
S. Shibusawa, H. Makino, S. Nimiya, and J. Hatta.
Scatter and Gather Operations on an Asynchronous Communication Model.
In ACM Symposium on Applied Computing, Mar 2000.

89
M. Sidi, W. Z. Liu, I. Cidon, and I. Gopal.
Congestion control through input rate regulation.
IEEE Transactions on Communications, 41(3):471-477, 1993.

90
D. Skillicorn and D. Talia.
Models and Languages for Parallel Computation.
ACM Computing Surveys, 30(2):123-169, Jun 1998.

91
John D. Spragins, Joseph L. Hammond, and Krzysztof Pawlikowski.
Telecommunications - Protocols and Design.
Addison Wesley, 1992.

92
William Stallings.
Data & Computer Communications.
Prentice Hall, sixth edition, 2000.

93
W. Stevens.
TCP slow start, congestion avoidance, fast retransmit, and fast recovery algorithms.
Internet Archives RFC 2001, January 1997.

94
Y.J. Suh and S. Yalamanchili.
All-to-all communication with minimum start-up costs in 2D/3D tori and meshes.
IEEE Transactions on Parallel and Distributed Systems, 9(5):442-458, 1998.

95
S. Sumimoto, A. Hori, H. Tezuka, H. Harada, T. Takahashi, and Y. Ishikawa.
GigaE PM II: Design of High Performance Communication Library using Gigabit Ethernet, 1999.
http://pdswww.rwcp.or.jp/db/paper-J/1999/swopp99/sumi/sumi.html.

96
Shinji Sumimoto, Hiroshi Tezuka, Atsushi Hori, Hiroshi Harada, Toshiyuki Takahashi, and Yutaka Ishikawa.
The design and evaluation of high performance communication using a gigabit Ethernet.
In International Conference on Supercomputing '99, pages 243-250. ACM SIGARCH, June 1999.

97
V. S. Sunderam.
PVM: A Framework for Parallel Distributed Computing.
Concurrency, Practice and Experience, 2:315-340, 1990.

98
Cisco Systems.
http://www.cisco.com/warp/public/44/jump/switches.shtml.

99
Cisco Systems.
Catalyst 2980G/2980G-A Enterprise Desktop Switches.
Data Sheet, 2001.

100
Anthony T.C. Tam and Cho-Li Wang.
Realistic Communication Model for Parallel Computing on Cluster.
In Proceedings of the 1st IEEE International Workshop on Cluster Computing (IWCC'99), December 1999.

101
H. Tezuka, A. Hori, Y. Ishikawa, and M. Sato.
PM: An operating system coordinated high performance communication library.
In High-Performance Computing and Networking '97, 1997.

102
H. Tezuka, F. O'Carroll, A. Hori, and Y. Ishikawa.
Pin-down Cache: A Virtual Memory Management Technique for Zero-copy Communication.
In IPPS/SPDP'98, pages 308-315, 1998.

103
Takayoshi Touyama and Susumu Horiguchi.
Performance Evaluation of Practical Parallel Computation Model LogPQ.
In Proceedings of the 4th International Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN'99), pages 216-221, 1999.

104
Yu-Chee Tseng and Sandeep K. S. Gupta.
All-to-all personalized communication in a wormhole-routed torus.
IEEE Transactions on Parallel and Distributed Systems, 7(5):498-505, 1996.

105
L. Valiant.
A bridging model for parallel computation.
Communications of the ACM, 33(8):103-111, Aug. 1990.

106
M. Verma and T. Chiueh.
Pupa: A low-latency communication system for fast Ethernet.
Technical report, Computer Science Department, State University of New York at Stony Brook, Apr. 1998.

107
VIA.
http://www.viarch.org/.

108
T. von Eicken, A. Basu, V. Buch, and W. Vogels.
U-Net: A user-level network interface for parallel and distributed computing.
In Proceedings of the 15th ACM Symposium on Operating Systems Principles, December 1995.

109
T. von Eicken, D. Culler, S. Goldstein, and K. Schauser.
Active messages: A mechanism for integrated communication and computation.
In Proceedings of the Nineteenth International Symposium on Computer Architecture. ACM Press, 1992.

110
J. Walrand and P. Varaiya.
High-Performance Communication Networks.
Morgan Kaufmann Publishers, 1st edition, 1996.

111
C.L. Wang, A. Tam, B. Cheung, W.Z. Zhu, and C.M. Lee.
Directed point: High performance communication subsystem for gigabit networking in clusters.
Journal of Future Generation Computer Systems, 18(4), 2001.

112
Jonathan L. Wang.
Impact of self-similarity on the Go-Back-N ARQ protocols.
In Proceedings of Fourth International Conference on Computer Communications and Networks, pages 250-257, Sept 1995.

113
T.M. Warschko, J.M. Blum, and W.F. Tichy.
A reliable transmission protocol for Myrinet.
In Proceedings of the 2nd Workshop on Cluster-Computing, March 1999.

114
M. Welsh, A. Basu, and T. von Eicken.
Low-latency communication over Fast Ethernet.
In Lecture Notes in Computer Science, volume 1123, 1996.

115
Matt Welsh, Anindya Basu, and Thorsten von Eicken.
Incorporating Memory Management into User-Level Network Interfaces.
In Hot Interconnects V, Aug 1997.

116
Xingfu Wu.
Performance Evaluation, Prediction and Visualization of Parallel Systems.
Kluwer Academic Publishers, 1999.

117
Zhiwei Xu and Kai Hwang.
Modeling Communication Overhead: MPI and MPL Performance on the IBM SP2.
IEEE Parallel and Distributed Technology, 4(1):9-23, 1996.

118
Zhiwei Xu and Kai Hwang.
MPPs and Clusters for Scalable Computing.
In Proceedings of 2nd International Symposium on Parallel Architectures, Algorithms, and Networks, pages 117-123, 1996.

119
M. Yajnik, S. Moon, J. Kurose, and D. Towsley.
Measurement and modeling of the temporal dependence in packet loss.
In Proceedings of IEEE INFOCOM (1999), pages 345-352, 1999.

120
C-Q Yang and A.V.S. Reddy.
A Taxonomy for Congestion Control Algorithms in Packet Switching Networks.
IEEE Network, 9(4):34-45, 1995.

121
Y. Yang and J. Wang.
Optimal all-to-all personalized exchange in self-routable multistage networks.
IEEE Transactions on Parallel and Distributed Systems, 11(3):261-274, 2000.

122
Lixia Zhang, Scott Shenker, and David Clark.
Observations on the Dynamics of a Congestion Control Algorithm: The Effects of Two Way Traffic.
ACM Computer Communications Review, 1991.