Achievements

Conferences

  1. K. Osawa, S. Swaroop, A. Jain, R. Eschenhagen, R. E. Turner, R. Yokota, and M. E. Khan, Practical Deep Learning with Bayesian Principles, The 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada, Dec. 8-14, 2019.
  2. Q. Ma, R. Yokota, Runtime System for GPU-based Hierarchical LU Factorization, SC19 research poster, Denver, Colorado, 17-22 November, 2019.
  3. H. Ootomo, R. Yokota, TSQR on TensorCores, SC19 research poster, Denver, Colorado, 17-22 November, 2019.
  4. H. Ootomo, R. Yokota, GPU Implementation of TSQR Using Tensor Cores, The 170th Workshop on High Performance Computing (SWoPP2019), Kitami, Japan, July 24, 2019.
  5. P. Spalthoff, R. Yokota, Flexible and Simplistic Hierarchical Matrix-Based Fast Direct Solver, The 170th Workshop on High Performance Computing (SWoPP2019), Kitami, Japan, July 24, 2019.
  6. Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka. Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs, IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun. 2019.
  7. Yuichiro Ueno, Rio Yokota. Exhaustive Study of Hierarchical AllReduce Patterns for Large Messages Between GPUs, 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), May 2019.
  8. Hiroki Naganuma, Rio Yokota. A Performance Improvement Approach for Second-Order Optimization in Large Mini-batch Training, 2nd High Performance Machine Learning Workshop CCGrid2019 (HPML2019), May 2019.
  9. Hiroki Naganuma, Rio Yokota. Effectiveness of Smoothing for Large-batch Training Using Natural Gradient Descent, The 3rd Cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG), May 2019.
  10. Rio Yokota, Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse. Second Order Optimization for Large Scale Parallel Deep Learning, IEICE General Conference, Mar. 2019.
  11. Hiroki Naganuma, Rio Yokota. Smoothing of the Objective Function for Large Scale Parallel Deep Learning, The 81st National Convention of IPSJ, Mar. 2019.
  12. Hiroki Naganuma, Rio Yokota. Improving the Generalization Gap in Large-batch Training Using Noise Injection, IEICE General Conference, Mar. 2019.
  13. Hiroyuki Ootomo, Rio Yokota. Batched QR Decomposition Using TensorCores, The 81st National Convention of IPSJ, Mar. 2019.
  14. Hikaru Nakata, Kazuki Osawa, Rio Yokota. Variational Inference in Deep Learning Using Natural Gradient Descent, The 81st National Convention of IPSJ, Mar. 2019.
  15. Kazuki Osawa, Rio Yokota, Chuan-Sheng Foo, Vijay Chandrasekhar. Second Order Optimization for Large Scale Parallel Deep Learning Through Analysis of the Fisher Information Matrix, The 81st National Convention of IPSJ, Mar. 2019.
  16. R. Yokota, Optimization Methods for Large Scale Distributed Deep Learning, IPAM Workshop I: Big Data Meets Large-Scale Computing, Los Angeles, USA, September 24-28, 2018.
  17. H. Naganuma, S. Iwase, L. Kaku, H. Nakata, and R. Yokota, Hyper-parameter Tuning of Approximate Natural Gradient Methods for Highly Parallel Distributed Deep Learning, Forum on Information Technology, Fukuoka, Japan, September 19, 2018.
  18. R. Yokota, Early Application Results on TSUBAME 3, Smoky Mountains Computational Sciences and Engineering Conference, Gatlinburg, USA, August 30, 2018.
  19. R. Yokota, Scaling Deep Learning to Thousands of GPUs, HPC 2018, Cetraro, Italy, July 3, 2018.
  20. R. Yokota, Energy Conserving Fast Multipole Methods for the Calculation of Long-range Interactions, Mathematics in Action: Modeling and analysis in molecular biology and electro-physiology, Suzhou, China, June 18, 2018.
  21. I. Yamazaki, A. Abdelfattah, A. Ida, S. Ohshima, S. Tomov, R. Yokota, and J. Dongarra, Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU clusters, 32nd IEEE International Parallel & Distributed Processing Symposium, Vancouver, Canada, May 21, 2018.
  22. S. Ohshima, I. Yamazaki, A. Ida, R. Yokota, Optimization of Hierarchical Matrix Computation on GPU, SC Asia, Singapore, March 26, 2018.
  23. R. Yokota, Can we use Hierarchical Low-Rank Approximation for Deep Learning?, HPC Saudi 2018, Jeddah, Saudi Arabia, March 13, 2018.
  24. H. Ohtomo, K. Osawa, R. Yokota, Deep Learning Using Hierarchical Low-Rank Approximation of the Fisher Information Matrix, The 80th National Convention of IPSJ, Tokyo, Japan, March 13, 2018.
  25. Y. Kuwamura, K. Osawa, R. Yokota, Hyper-parameter Tuning for Approximate Natural Gradient Methods, The 80th National Convention of IPSJ, Tokyo, Japan, March 13, 2018.
  26. H. Otomo, K. Osawa, R. Yokota, Distributed Learning of Deep Neural Networks Using the Kronecker Factorization of the Fisher Information Matrix, The 163rd Workshop on High Performance Computing, Ehime, Japan, March 1, 2018.
  27. H. Naganuma and R. Yokota, Accelerating Convolutional Neural Networks Using Low Precision Arithmetic, HPC Asia, Tokyo, Japan, January 28, 2018.
  28. H. Naganuma and R. Yokota, Verification of Low-precision Arithmetic for the Acceleration of Convolutional Neural Networks, GTC Japan, Tokyo, Japan, December 12, 2017.
  29. K. Osawa, A. Sekiya, H. Naganuma, R. Yokota, Acceleration of Convolutional Neural Networks Using Low-Rank Tensor Decomposition, Pattern Recognition and Media Understanding, Kumamoto, Japan, October 12 - 13, 2017.
  30. H. Naganuma, A. Sekiya, K. Osawa, H. Otomo, Y. Kuwamura, R. Yokota, Evaluating the Performance of Deep Learning with Low Precision Arithmetic, Pattern Recognition and Media Understanding, Kumamoto, Japan, October 12 - 13, 2017.
  31. H. Naganuma, K. Osawa, A. Sekiya, R. Yokota, Acceleration of Compressed Models in Deep Learning Using Half Precision Arithmetic, Japan Society for Industrial and Applied Mathematics Annual Meeting, Tokyo, Japan, September 6 - 8, 2017.
  32. K. Osawa, R. Yokota, Evaluating the Compression Efficiency of the Filters in Convolutional Neural Networks, The 26th International Conference on Artificial Neural Networks, Sardinia, Italy, 11 - 14, September, 2017.
  33. M. Abduljabbar, M. Al Farhan, R. Yokota, and D. Keyes, Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture, 23rd International European Conference on Parallel and Distributed Computing, Galicia, Spain, 30 August - 1 September, 2017.
  34. S. Ohshima, I. Yamazaki, A. Ida, R. Yokota, Optimization of Hierarchical Matrix Computations on a Cluster of GPUs, Summer United Workshops on Parallel, Distributed and Cooperative Processing, Akita, Japan, 26 - 28 July, 2017.
  35. K. Osawa, A. Sekiya, H. Naganuma, R. Yokota, Accelerating Matrix Multiplication in Deep Learning by Using Low-Rank Approximation, The 2017 International Conference on High Performance Computing & Simulation, Genoa, Italy, 17 - 20, July, 2017.
  36. R. Yokota, Hierarchical Low-Rank Approximations at Extreme Scale, ISC High Performance, Frankfurt, Germany, 18 - 22, June, 2017.
  37. M. Abduljabbar, G. Markomanolis, H. Ibeid, R. Yokota, and D. Keyes, Communication Reducing Algorithms for Distributed Hierarchical N-Body Methods, ISC High Performance, Frankfurt, Germany, 18 - 22, June, 2017.
  38. K. Osawa, A. Sekiya, H. Naganuma, R. Yokota, Accelerating Convolutional Neural Networks Using Low-Rank Approximation, 22nd Conference of Japan Computational Engineering Society, Omiya, Japan, 31 May - 2 June, 2017.
  39. Y. Motoyama, T. Endo, S. Matsuoka, R. Yokota, K. Fukuda, I. Sato, Optimization of Convolutions in CNN Using Low-Rank Approximation Matrices, 158th Research Presentation Seminar in High Performance Computing, Atami, Japan, 8 - 10 March, 2017.
  40. A. Sekiya, K. Oosawa, H. Naganuma, R. Yokota, Acceleration of Matrix Multiplication in Deep Learning Using Low-Rank Approximation, 158th Research Presentation Seminar in High Performance Computing, Atami, Japan, 8 - 10 March, 2017.
  41. R. Yokota, Compute-Memory Tradeoff in Hierarchical Low-Rank Approximation Methods, SIAM Conference on Computational Science and Engineering, Atlanta, USA, 27 February - 3 March, 2017.
  42. R. Yokota, Energy Conservation of Fast Multipole Methods in Classical Molecular Dynamics Simulations, 7th AICS International Symposium, Kobe, Japan, 24-25 February, 2017.
  43. R. Yokota, Improving Data Locality of Fast Multipole Methods, Third Workshop on Programming Abstractions for Data Locality, Kobe, Japan, 24-26 October, 2016.
  44. R. Yokota, Fast Multipole Method Library for Multiple Architectures and its Application to Molecular and Fluid Simulations, 8th Symposium of the Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures, Tokyo, Japan, 14-15 July, 2016.
  45. R. Yokota, Performance Portability of FMM, 21st Conference of Japan Computational Engineering Society, Niigata, Japan, 31 May - 2 June, 2016.
  46. R. Yokota, A Common API for Fast Multipole Methods, Accelerate Data Analytics and Computing Workshop, Houston, USA, 14-15 January, 2016.
  47. H. Ibeid, R. Yokota, D. Keyes, A Matrix-Free Preconditioner for Elliptic Solvers Based on the Fast Multipole Method, SIAM Conference on Parallel Processing for Scientific Computing, Paris, France, 12-15 April, 2016.
  48. R. Yokota, Tuning parameters in FMM, Seventh symposium on Automatic Tuning Technology and its Application, Tokyo, Japan, 25, December, 2015.
  49. R. Yokota, F.-H. Rouet, X. S. Li, Comparison of FMM and HSS at Large Scale, SIAM Conference on Applied Linear Algebra, Atlanta, USA, 26-30 October, 2015.
  50. R. Yokota, H. Ibeid, D. E. Keyes, Preconditioning Sparse Matrices Using a Highly Scalable Fast Multipole Method, 3rd International Workshops on Advances in Computational Mechanics, Tokyo, Japan, 12-14, October, 2015.
  51. R. Yokota, Fast Multipole Method as a Matrix-free Hierarchical Low-rank Approximation, International Workshop on Eigenvalue Problems, Tsukuba, Japan, 14-16 September, 2015.
  52. R. Yokota, Various implementations of FMM and their performance on future architectures, Multi-resolution Interactions Workshop, Durham, USA, 28-29 August, 2015.
  53. R. Yokota, ExaFMM – a Testbed for Comparing Various Implementations of the FMM, SIAM Conference on Computational Science and Engineering, Salt Lake City, USA, 14-18 March, 2015.
  54. H. Ibeid, R. Yokota, J. Pestana, D. Keyes, Fast Multipole Preconditioners for Sparse Linear Solvers. 11th World Congress on Computational Mechanics, Barcelona, Spain, 20-25 July, 2014.
  55. R. Yokota, D. Keyes, Communication Complexity of the Fast Multipole Method and its Algebraic Variants, CBMS-NSF Conference: Fast Direct Solvers for Elliptic PDEs, Hanover, New Hampshire, 23-29 June, 2014.
  56. H. Ltaief, R. Yokota, High Performance Numerical Algorithms for Seismic and Reservoir Simulations. GPU Technology Conference, San Jose, California, 24-27 March, 2014.
  57. R. Yokota, Fast N-body Methods as a Compute-Bound Preconditioner for Sparse Solvers on GPUs. GPU Technology Conference, San Jose, California, 24-27 March, 2014.
  58. J. Pestana, R. Yokota, H. Ibeid, D. Keyes, Fast Multipole Method Preconditioning. International Conference On Preconditioning Techniques For Scientific And Industrial Applications, Oxford, UK, 19-21 June, 2013.
  59. R. Yokota, Advances in Fast Multipole Methods for Scalable Electrostatics Calculations. Workshop: Electrostatics methods in Molecular Simulation, Stockholm, Sweden, 13-15 May, 2013.
  60. A. Abdelfettah, H. Ltaief, R. Yokota, Investigating New Numerical Techniques for Reservoir Simulations on GPUs. GPU Technology Conference, San Jose, California, 24-27 March, 2013.
  61. H. Ibeid, R. Yokota, D. Keyes, Fast Multipole Method as a Preconditioner. SIAM Conference on Computational Science and Engineering, Boston, Massachusetts, 25 February - 1 March, 2013.
  62. K. Taura, J. Nakashima, R. Yokota, N. Maruyama, A Task Parallelism Meets Fast Multipole Methods. Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), Salt Lake City, Utah, 11 November, 2012.
  63. R. Yokota, Petascale Fast Multipole Methods on GPUs. GPU Technology Conference Japan, Tokyo, Japan, 26 July, 2012.
  64. R. Yokota, Petascale Fast Multipole Methods on GPUs. (invited talk) The 11th International Symposium on Parallel and Distributed Computing, Munich, Germany, 25-29 June, 2012.
  65. E. Yunis, R. Yokota, and A. Ahmadia, Scalable Force Directed Graph Layout Algorithms Using Fast Multipole Methods. The 11th International Symposium on Parallel and Distributed Computing, Munich, Germany, 25-29 June, 2012.
  66. H. Ltaief, R. Yokota, Data-Driven Fast Multipole Method on Distributed Memory Systems with Hardware Accelerators. 21st International Conference on Domain Decomposition Methods, INRIA, Rennes-Bretagne-Atlantique, 25-29 June, 2012.
  67. R. Yokota, L. A. Barba, Recent Trends in Hierarchical N-body Methods on GPUs. GPU Technology Conference, San Jose, California, 14-17 May, 2012.
  68. R. Yokota, T. Narumi, L. Barba, K. Yasuoka, Scaling Fast Multipole Methods up to 4000 GPUs. ATIP/A*CRC Workshop on Accelerator Technologies for High Performance Computing, Biopolis, Singapore, 7-10 May, 2012.
  69. R. Yokota, Running Fast Multipole Method on the Full Node of TSUBAME and K computer. Scalable Hierarchical Algorithms for Extreme Computing, Thuwal, Saudi Arabia, 28-30 April, 2012.
  70. H. V. Nguyen, R. Yokota, G. Stenchikov, A Parallel Numerical Simulation of Dust Particles Using Direct Numerical Simulation. European Geosciences Union General Assembly, Vienna, Austria, 22-27 April, 2012.
  71. R. Yokota, (invited talk) Fast N-body Methods on Many-core and Heterogeneous Systems. International Workshop on Computational Science and Numerical Analysis, Tokyo, Japan, 24-26 March, 2012.
  72. R. Yokota, Petaflops Scale Turbulence Simulation on TSUBAME 2.0, GPU@BU Workshop, Boston, Massachusetts, 8-9 November, 2011.
  73. T. Narumi, R. Yokota, L. A. Barba, and K. Yasuoka, Petascale Turbulence Simulation Using FMM. HOKKE-19, Hokkaido, Japan, 28-29 November, 2011.
  74. R. Yokota, L. A. Barba, Fast multipole method vs. spectral methods for the simulation of isotropic turbulence on GPUs. 23rd International Conference on Parallel Computational Fluid Dynamics, Barcelona, Spain, 16-20 May, 2011.
  75. R. Yokota, L. A. Barba, Parameter Tuning of a Hybrid Treecode-FMM on GPUs, The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems, Tucson, Arizona, June 4, 2011.
  76. R. Yokota, L. A. Barba, Large Scale Multi-GPU FMM for Bioelectrostatics. SIAM Conference on Computational Science and Engineering, Reno, Nevada, 28 February-4 March, 2011.
  77. R. Yokota, 12 Steps to a Fast Multipole Method on GPUs, Pan-American Advanced Studies Institute, Valparaiso, Chile, 3-14 January, 2011.
  78. R. Yokota, J. P. Bardhan, M. G. Knepley, and L. A. Barba, (Really) Fast macromolecular electrostatics – fast algorithms, open software and accelerated computing, ACS Division of Physical Chemistry 240th National Meeting, Boston, Massachusetts, 22-26 August, 2010.
  79. R. Yokota and L. A. Barba, RBF interpolation using Gaussians with domain decomposition on GPUs, SIAM annual meeting, Philadelphia, Pennsylvania, 12-16 July, 2010.
  80. R. Yokota and L. A. Barba, Performance of the fast multipole method on GPUs using various kernels, 9th World Congress on Computational Mechanics, Sydney, Australia, 19-23 July, 2010.
  81. R. Yokota and L. A. Barba, Comparing the treecode with FMM on GPUs for vortex particle simulations of a leapfrogging vortex ring, 22nd International Conference on Parallel Computational Fluid Dynamics, Kaohsiung, Taiwan, 17-21 May, 2010.
  82. R. Yokota and S. Obi, Lagrangian simulation of turbulence using vortex methods, 2nd International Workshops on Advances in Computational Mechanics, Yokohama, Japan, 29-31 March, 2010.
  83. R. Yokota, Range of Applications for the Fast Multipole Method on GPUs, Accelerated Computing, Tokyo, Japan, 28-29 January, 2010.
  84. T. Hamada, R. Yokota, K. Nitadori, T. Narumi, K. Yasuoka, M. Taiji, and K. Oguri, 42 TFlops hierarchical N-body simulation on GPUs with applications in both astrophysics and turbulence, SC09, Portland, Oregon, 14-20 November, 2009.
  85. R. Yokota, K. Fukagata, and S. Obi, Lagrangian Vortex Methods in Turbulent Channel Flows, 12th EUROMECH European Turbulence Conference, Marburg, Germany, 7-10 September, 2009.
  86. R. Yokota and S. Obi, Validation of Vortex Methods in a Turbulent Channel Flow, Annual meeting of the JSFM, Tokyo, Japan, 2-4 September, 2009.
  87. R. Yokota, T. Narumi, R. Sakamaki, K. Yasuoka, and S. Obi, Fast Multipole Methods on GPUs for the Meshfree Simulation of Turbulence, 10th US National Congress on Computational Mechanics, Columbus, Ohio, 16-19 July, 2009.
  88. R. Yokota, T. Narumi, R. Sakamaki, S. Kameoka, K. Yasuoka, and S. Obi, DNS of Homogeneous Turbulence Using Vortex Methods Accelerated by the FMM on a Cluster of GPUs, 21st International Conference on Parallel Computational Fluid Dynamics, Moffett Field, California, 18-22 May, 2009.
  89. R. Yokota, T. Narumi, R. Sakamaki, S. Kameoka, K. Yasuoka, and S. Obi, Meshfree Simulation of Turbulence Using the Fast Multipole Methods on GPUs, 22nd Symposium on Computational Fluid Dynamics, Tokyo, Japan, 17-19 December, 2008.
  90. R. Yokota and S. Obi, Vortex Method Simulation of Turbulent Channel Flow, Annual meeting of the JSFM, Kobe, Japan, 4-7 September, 2008.
  91. R. Yokota and S. Obi, Direct numerical simulation of homogeneous shear flow using vortex methods, 4th International Conference on Vortex Flows and Vortex Models, Seoul, Korea, 21-23 April, 2008.
  92. R. Yokota and S. Obi, Mesh-free simulation of the homogeneous shear flow using vortex methods, 23rd IIS Turbulence and Shear Flow Dynamics Symposium, Tokyo, Japan, 7 March, 2008.
  93. A. Sato, R. Yokota, and S. Obi, Computation of wing-tip vortex by a three-dimensional vortex method, 21st Symposium on Computational Fluid Dynamics, Tokyo, Japan, 19-21 December, 2007.
  94. R. Yokota, T. Narumi, K. Yasuoka, T. Ebisuzaki, and S. Obi, Mesh-free direct numerical simulation of turbulence using the vortex method on parallel MDGRAPE-3 boards along with the fast multipole method, Next-Generation Supercomputing Symposium 2007, Tokyo, Japan, 3-4 October, 2007.
  95. R. Yokota and S. Obi, Pure Lagrangian vortex methods for the simulation of decaying isotropic turbulence, 5th International Symposium on Turbulence and Shear Flow Phenomena, Munich, Germany, 27-29 August, 2007.
  96. R. Yokota and S. Obi, Vortex methods for the calculation of homogeneous shear flows, Annual meeting of the JSFM, Tokyo, Japan, 6-8 August, 2007.
  97. R. Yokota and S. Obi, Mesh-free turbulence simulation using vortex methods, 56th Conference of the JSME Tokai Branch, Hamamatsu, Japan, 7-8 March, 2007.
  98. R. Yokota and S. Obi, Simulation of homogeneous isotropic turbulence using the vortex method, 20th Symposium on Computational Fluid Dynamics, Nagoya, Japan, 18-20 December, 2006.
  99. R. Yokota and S. Obi, Calculation of fluid structure interaction using VEM and BEM, Conference of the JSME Fluid Engineering Division, Kawagoe, Japan, 28-29 October, 2006.
  100. R. Yokota and S. Obi, Simulation of a wake using a 3-D vortex element method, Annual Meeting of the JSME, Kumamoto, Japan, 18-22 September, 2006.
  101. R. Yokota and S. Obi, Vortex flow simulation between multipole bridge decks, Whither Turbulence Prediction and Control, Seoul, Korea, 26-29 March, 2006.
  102. R. Yokota and S. Obi, Vortex flow simulation of multipole bluff bodies, 3rd International Conference on Vortex Flows and Vortex Models, Yokohama, Japan, 21-23 November, 2005.
  103. R. Yokota and S. Obi, Vortex flow simulation of multipole bluff bodies, 19th Symposium on Computational Fluid Dynamics, Tokyo, Japan, 13-15 December, 2005.
  104. R. Yokota, N. Tokai, and S. Obi, Vortex flow simulation of multipole bluff bodies, 17th Conference on Computational Mechanics, Sendai, Japan, 18-20 November, 2004.