Group Communication for Highly Reliable Distributed Systems and Associated Failure Detection Methods

Research report code: R070000228
Reference number: R070000228
Date posted: April 11, 2008
Researcher
  • Xavier Defago
Researcher affiliation
Report title: Group Communication for Highly Reliable Distributed Systems and Associated Failure Detection Methods
Report summary: The objective of the research is to better understand the performance tradeoffs associated with fault-tolerant mechanisms for distributed systems. In particular, group communication protocols, such as Total Order Broadcast, are key factors in determining the performance of the system in the absence of failures. While failure-free executions constitute the common case, the occurrence of failures should not affect system performance too drastically, or else failures risk being perceived by the users, thus defeating the objective of masking them. The performance in the face of failures depends mostly on the ability of the system to detect failures promptly and accurately, but this is made difficult by an inherent tradeoff between these two measures. Thus, the second objective is to provide a generic failure detection service whose speed and accuracy can be tuned to the specific needs of each part of the entire distributed system.
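The tradeoff between detection speed and accuracy mentioned in the summary can be illustrated with a small sketch. It is not taken from the report: it assumes a simple heartbeat-based detector with a fixed timeout, and the function name `evaluate_timeout`, the heartbeat trace, and the crash time are all hypothetical values chosen for illustration.

```python
def evaluate_timeout(arrivals_ms, crash_ms, timeout_ms):
    """Replay a heartbeat trace against a fixed-timeout failure detector.

    arrivals_ms: sorted arrival times of heartbeats received before the crash.
    crash_ms:    time at which the monitored process actually crashed
                 (assumed to be no earlier than the last arrival).
    timeout_ms:  the detector suspects the process once no heartbeat has
                 arrived for longer than this.

    Returns (wrong_suspicions, detection_delay_ms).
    """
    # A gap longer than the timeout between two heartbeats means the
    # detector suspected a process that was in fact still alive.
    wrong_suspicions = sum(
        1
        for prev, nxt in zip(arrivals_ms, arrivals_ms[1:])
        if nxt - prev > timeout_ms
    )
    # After the crash no heartbeat arrives, so the final suspicion is
    # raised timeout_ms after the last received heartbeat.
    suspected_at = arrivals_ms[-1] + timeout_ms
    detection_delay_ms = max(0, suspected_at - crash_ms)
    return wrong_suspicions, detection_delay_ms


# Hypothetical trace: heartbeats roughly every second, one delayed by the network.
trace = [0, 1000, 2100, 3000, 5200, 6000]
crash = 6500

for timeout in (1500, 3000):
    wrong, delay = evaluate_timeout(trace, crash, timeout)
    print(f"timeout={timeout} ms: wrong suspicions={wrong}, detection delay={delay} ms")
```

On this trace, the 1500 ms timeout detects the crash after 1000 ms but wrongly suspects the process once, while the 3000 ms timeout makes no mistake but takes 2500 ms to detect the crash. A tunable detection service, as described in the summary, would let each application choose its own point on this curve.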
Research fields
  • Computer networks
  • Control methods
Related publications
(1) December 5, 2005. “Failure Detection in Distributed Systems: Retrospective and recent advances.” Tutorial. 6th Intl. Conf. on Parallel and Distributed Computing, Applications and Technologies.
(2) July 2, 2005. “Revisiting Failure Detection for Grid Systems.” Invited talk. 48th meeting IFIP working group 10.4 (dependable computing & fault-tolerance).
(3) October 17, 2004. “Panel: Dependable replicated data: Strategies, drawbacks and benchmarking.” Panelist. Workshop on Dependable Distributed Data Management.
(4) July 3, 2003. “Fault-Tolerant Group Communication and the Many Faces of Scalability.” Invited lecture. Information and Communications University (ICU).
(5) July 3, 2003. “Fault-Tolerant Group Communication and the Many Faces of Scalability.” Invited lecture. Electronics and Telecommunications Research Institute (ETRI).
(6) X. Defago, A. Schiper, and P. Urban. Total order broadcast and multicast algorithms: Taxonomy and survey. ACM Computing Surveys, 36(4):372-421, December 2004. ACM Press.
(7) X. Defago and A. Schiper. Semi-passive replication and Lazy Consensus. Journal of Parallel and Distributed Computing, 64(12):1380-1398, December 2004. Elsevier.
(8) X. Defago, A. Schiper, and P. Urban. Comparative performance analysis of ordering strategies in atomic broadcast algorithms. IEICE Trans. on Information and Systems, Vol.E86-D, No.12, pp.2698-2709, December 2003.
(9) X. Defago, P. Urban, N. Hayashibara, T. Katayama. Definition and specification of accrual failure detectors. In Proc. IEEE/IFIP Intl. Conf. on Dependable Systems and Networks, pp. 206-215, June 2005. IEEE CS Press.
(10) N. Hayashibara, X. Defago, M. Takizawa, and T. Katayama. Information propagation on the φ failure detector. In Proc. 16th Intl. Workshop on Database and Expert Systems Applications, pp.72-76, August 2005. IEEE CS Press.
(11) R. Yared, X. Defago, and T. Katayama. Fault-tolerant group membership protocols using physical robot messengers. In Proc. 19th IEEE Intl. Conf. on Advanced Information Networking and Applications, Vol.1, pp.921-926, March 2005. IEEE CS Press.
(12) N. Hayashibara, X. Defago, R. Yared, and T. Katayama. The φ accrual failure detector. In Proc. 23rd IEEE Intl. Symp. on Reliable Distributed Systems, pp. 66-78, October 2004. IEEE CS Press.
(13) A. Ben Hassine, X. Defago, and T. B. Ho. Agent-based approach to dynamic meeting scheduling problems. In Proc. 3rd Intl. Joint Conf. on Autonomous Agents and Multi Agent Systems, Vol.3, pp.1130-1137, July 2004. IEEE CS Press.
(14) K. Satou, Y. Nakajima, S. Tsuji, X. Defago, and A. Konagaya. An integrated system for distributed bioinformatics environment on grids. In Grid Computing in Life Science: First Intl. Life Science Grid Workshop, May 2004. LNCS 3370/2005, Springer.
(15) S. Souissi, X. Defago, and T. Katayama. Decomposition of fundamental problems for cooperative autonomous mobile systems. In Proc. 24th IEEE Intl. Conf. on Distributed Computing Systems Workshops, pp.554-560, March 2004. IEEE CS Press.
(16) J. C. Clemente Litran, X. Defago, and K. Satou. Asynchronous peer-to-peer communication for failure resilient distributed genetic algorithms. In Proc. 15th IASTED Intl. Conf. on Parallel and Distributed Computing and Systems, Vol.II, pp.769-773, November 2003.
(17) X. Defago, N. Hayashibara, and T. Katayama. On the design of a failure detection service for large scale distributed systems. In Proc. Intl. Symp. Towards Peta-Bit Ultra-Networks, pp.88-95, September 2003.
(18) M. Wiesmann, X. Defago, and A. Schiper. Group communication based on standard interfaces. In Proc. 2nd IEEE Intl. Symp. on Network Computing and Applications, pp. 140-147, April 2003. ...and a few other papers.
(19) E. Anceaume, X. Defago, M. Gradinariu, and M. Roy. Brief Announcement: Towards a theory of self-organization. In Proc. 19th Intl. Symp. on Distributed Computing, LNCS 3724, pp. 505-506, September 2005. Springer-Verlag.
(20) X. Defago. Semi-passive replication and the eventual leadership (invited paper). In Proc. Workshop on Dependable Distributed Data Management, pp.13-18, October 2004.
(21) N. Hayashibara, X. Defago, T. Katayama. Two-ways adaptive failure detection with the φ-failure detector. In Proc. Intl. Workshop on Adaptive Distributed Systems, pp.22-27, October 2003.
(22) S. Souissi, X. Defago, and T. Katayama. Convergence of a uniform circle formation algorithm for distributed autonomous mobile robots. In Proc. Japan-Tunisia Workshop on Computer Systems and Information Technology, July 2004.
(23) X. Defago, N. Hayashibara, and T. Katayama. An adaptive failure detection service for large-scale distributed systems. In Actes Journees Scientifiques Francophones, November 2003.
(24) A. Ben Hassine, X. Defago, and T. B. Ho. Novel approach for the dynamic resolution of meeting scheduling problem. In Actes Journees Scientifiques Francophones, November 2003.
(25) S. Souissi, X. Defago, and T. Katayama. Specification of recurrent problems in distributed cooperative mobile robotics. In Actes Journees Scientifiques Francophones, November 2003.
(26) P. Urban, X. Defago, and T. Katayama. NekoLS: prototyping and simulation of large-scale distributed systems. In Actes Journees Scientifiques Francophones, November 2003.
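Research program
  • Strategic Basic Research Programs, PRESTO (Sakigake) type (including the former individual research promotion program for young researchers) / Functions and Configurations (機能と構成)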
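Research report materials
  • Xavier Defago. Group Communication for Highly Reliable Distributed Systems and Associated Failure Detection Methods (高信頼分散システムのためのグループ通信とその故障発見方法). PRESTO-Type Individual Research Report Meeting, “Toward Advanced Information Systems and Their Configuration,” “Functions and Configurations” research area, Collected Lecture Abstracts, Phase III researchers (research period 2002-2005), 2005, pp. 43-50.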