Research Achievements

Journal Papers

  1. Hiroki Naganuma, Kartik Ahuja, Shiro Takagi, Tetsuya Motokawa, Rio Yokota, Kohta Ishikawa, Ikuro Sato, Ioannis Mitliagkas, Empirical Study on Optimizer Selection for Out-of-Distribution Generalization, Transactions on Machine Learning Research, 2023.
  2. Sameer Deshmukh, Rio Yokota, George Bosilca, Cache Optimization and Performance Modeling of Batched, Small, and Rectangular Matrix Multiplication on Intel, AMD, and Fujitsu Processors, ACM Transactions on Mathematical Software, 2023.
  3. Muhammad Ridwan Apriansyah, Rio Yokota, Parallel QR Factorization of Block Low-Rank Matrices, ACM Transactions on Mathematical Software, 2022. https://doi.org/10.1145/3538647
  4. Hiroyuki Ootomo, Rio Yokota, Recovering Single Precision Accuracy from Tensor Cores While Surpassing the FP32 Theoretical Peak Performance, The International Journal of High Performance Computing Applications, 2022.
  5. Tingyu Wang, Rio Yokota, Lorena A. Barba, ExaFMM: a high-performance fast multipole method library with C++ and Python interfaces, The Journal of Open Source Software, 6(61):3145 (2021). 10.21105/joss.03145
  6. Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Chuan-Sheng Foo, Rio Yokota, Scalable and Practical Natural Gradient for Large-Scale Deep Learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020. https://doi.org/10.1109/TPAMI.2020.3004354
  7. Davoud S. Shamshirgar, Rio Yokota, Anna-Karin Tornberg, and Berk Hess, Regularizing the Fast Multipole Method for use in Molecular Simulation, The Journal of Chemical Physics, 151, 234113, 2019.
  8. Akihiro Ida, Hiroshi Nakashima, Tasuku Hiraishi, Ichitaro Yamazaki, Rio Yokota, Takeshi Iwashita, QR Factorization of Block Low-rank Matrices with Weak Admissibility Condition, Journal of Information Processing, Vol. 12, No. 4, Nov. 2019
  9. Ichitaro Yamazaki, Akihiro Ida, Rio Yokota, Jack Dongarra, Distributed Memory Lattice H-matrix Factorization, The International Journal of High Performance Computing Applications, Aug. 2019.
  10. Mustafa AbdulJabbar, Mohammed Al Farhan, Noha Al-Harthi, Rui Chen, Rio Yokota, Hakan Bagci, David Keyes, Extreme Scale FMM-Accelerated Boundary Integral Equation Solver for Wave Scattering, SIAM Journal on Scientific Computing, Vol. 41, No. 3, pp. C245--C268, Jun. 2019.
  11. Naoya Maruyama, Takayuki Aoki, Kenjiro Taura, Rio Yokota, Mohamed Wahib, Motohiko Matsuda, Keisuke Fukuda, Takashi Shimokawabe, Naoyuki Onodera, Michel Müller, Shintaro Iwasaki, Highly Productive, High-Performance Application Frameworks for Post-Petascale Computing, Advanced Software Technologies for Post-Peta Scale Computing, pp. 77--98, Dec. 2018.
  12. Huda Ibeid, Rio Yokota, Jennifer Pestana, David Keyes, Fast Multipole Preconditioners for Sparse Matrices Arising from Elliptic Equations, Computing and Visualization in Science, Vol. 18, No. 6, pp. 213--229, Nov. 2017.
  13. Rio Yokota. On the Trade-off Between the FMM and H^2 (HSS) Matrices, 計算工学 (Journal of the Japan Society for Computational Engineering and Science), Vol. 21, No. 4, pp. 3498--3501, Oct. 2016.
  14. Rio Yokota. Communication Optimization of Distributed-Parallel FMM for Large-Scale Boundary Element Analysis, シミュレーション (Journal of the Japan Society for Simulation Technology), Vol. 35, No. 3, pp. 147--153, Sep. 2016.
  15. Huda Ibeid, Rio Yokota, David Keyes, A performance model for the communication in fast multipole methods on high-performance computing platforms, International Journal of High Performance Computing Applications, Sage Journals, Vol. 30, No. 4, pp. 423--437, Mar. 2016.
  16. Abdelhalim Amer, Satoshi Matsuoka, Miquel Pericàs, Naoya Maruyama, Kenjiro Taura, Rio Yokota, Pavan Balaji, Scaling FMM with data-driven OpenMP tasks on multicore architectures, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 9903 LNCS, pp. 156-170, 2016.
  17. Julio Castrillon-Candas, Marc Genton, Rio Yokota, Multi-level restricted maximum likelihood covariance estimation and kriging for large non-gridded spatial datasets, Spatial Statistics, Elsevier, Vol. 18, pp. 105--124, Nov. 2015.
  18. Rio Yokota, George Turkiyyah, David E. Keyes, Communication complexity of the fast multipole method and its algebraic variants, Supercomputing Frontiers and Innovations, Vol. 1, No. 1, pp. 63–84, Jun. 2014.
  19. Yousuke Ohno, Rio Yokota, Hiroshi Koyama, Gentaro Morimoto, Aki Hasegawa, Gen Masumoto, Noriaki Okimoto, Yoshinori Hirano, Huda Ibeid, Tetsu Narumi, Makoto Taiji, Petascale molecular dynamics simulation using the fast multipole method on K computer, Computer Physics Communications, Vol. 185, No. 10, pp. 2575–2585, Jun. 2014.
  20. Hatem Ltaief, Rio Yokota, Data-driven execution of fast multipole methods, Concurrency and Computation: Practice and Experience, Vol. 26, No. 11, pp. 1935–1946, Sep. 2013.
  21. Rio Yokota, An FMM based on dual tree traversal for many-core architectures, Journal of Algorithms and Computational Technology, Vol. 7, No. 3, pp. 301–324, Sep. 2013.
  22. Rio Yokota, Lorena Barba, Tetsu Narumi, Kenji Yasuoka, Petascale turbulence simulation using a highly parallel fast multipole method, Computer Physics Communications, Vol. 184, No. 3, pp. 445–455, Sep. 2012.
  23. Rio Yokota, Lorena Barba, FMM-based vortex method for simulation of isotropic turbulence on GPUs, compared with a spectral method, Computers and Fluids, Vol. 80, pp. 17–27, Aug. 2012.
  24. Rio Yokota, Lorena Barba, Hierarchical N-body simulations with auto-tuning for heterogeneous systems, Computing in Science and Engineering, Vol. 14, No. 3, pp. 30–39, Jan. 2012.
  25. Rio Yokota, Lorena Barba, A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems, International Journal of High Performance Computing Applications, Vol. 26, No. 4, pp. 337-346, Jan. 2012.
  26. Jaydeep Bardhan, Rio Yokota, Matthew Knepley, Lorena Barba, Tsuyoshi Hamada, Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns, Computer Physics Communications, Vol. 182, No. 6, pp. 1272–1283, Mar. 2011.
  27. Rio Yokota, Shinnosuke Obi, Vortex methods for the simulation of turbulent flows, Journal of Fluid Science and Technology, Vol. 6, No. 1, pp. 14–29, Jan. 2011.
  28. Rio Yokota, Lorena Barba, Comparing the treecode with FMM on GPUs for vortex particle simulations of a leapfrogging vortex ring, Computers and Fluids, Vol. 45, No. 1, pp. 155–161, Dec. 2010.
  29. Rio Yokota, Lorena Barba, Matthew Knepley, PetRBF–A parallel O(N) algorithm for radial basis function interpolation with Gaussians, Computer Methods in Applied Mechanics and Engineering, Vol. 199, No. 25-28, pp. 1793–1804, Mar. 2010.
  30. Rio Yokota, Shinnosuke Obi, Comparing vortex methods and finite difference methods in a homogeneous turbulent shear flow, International Journal for Numerical Methods in Fluids, Vol. 63, No. 7, pp. 828–846, Jul. 2009.
  31. Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Shinnosuke Obi, Kenji Yasuoka, Fast multipole methods on a cluster of GPUs for the meshless simulation of turbulence, Computer Physics Communications, Vol. 180, No. 11, pp. 2066–2078, Jun. 2009.
  32. Rio Yokota, Tarun Kumar Sheel, Shinnosuke Obi, Calculation of isotropic turbulence using a pure Lagrangian vortex method, Journal of Computational Physics, Vol. 226, pp. 1589–1606, Jun. 2007.

Expository Articles

  1. Rio Yokota. Giant Matrices and AI, 数学セミナー (Sugaku Seminar), Vol. 59, No. 2, pp. 29--33, Feb. 2020.
  2. Rio Yokota. Supercomputing Contest 2019, 数学セミナー (Sugaku Seminar), Vol. 59, No. 1, pp. 44--49, Jan. 2020.

Books

  1. Mustafa AbdulJabbar, Rio Yokota, N-body methods, High Performance Parallelism Pearls, Morgan Kaufmann, Nov. 2014.
  2. Rio Yokota, Lorena Barba, Treecode and fast multipole method for N-body simulation with CUDA, GPU Computing Gems Emerald Edition, Morgan Kaufmann, Feb. 2011.

International Conferences (Refereed)

  1. Muhammad Ridwan Apriansyah, Rio Yokota, Computing Eigenvalue of Symmetric H^2-Matrices in Linear Time with Slicing the Spectrum, International Conference on Parallel Processing (ICPP), Aug. 2023.
  2. Sameer Deshmukh, Rio Yokota, George Bosilca, O(N) Distributed Direct Factorization of Structured Dense Matrices Using Runtime Systems, International Conference on Parallel Processing (ICPP), Aug. 2023.
  3. Hiroyuki Ootomo, Rio Yokota, Mixed-Precision Random Projection for RandNLA on Tensor Cores, Platform for Advanced Scientific Computing (PASC), Jun. 2023.
  4. Sora Takashima, Ryo Hayamizu, Nakamasa Inoue, Hirokatsu Kataoka, Rio Yokota, Visual Atoms: Pre-training Vision Transformers with Sinusoidal Waves, IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun. 2023.
  5. Hiroyuki Ootomo, Hidetaka Manabe, Kenji Harada, Rio Yokota, Quantum Circuit Simulation by SGEMM Emulation on Tensor Cores and Automatic Precision Selection, ISC High Performance, May 2023.
  6. Hiroyuki Ootomo, Rio Yokota, Reducing Shared Memory Footprint to Leverage High Throughput on Tensor Cores and its Flexible API Extension Library, HPC Asia, Feb. 2023.
  7. Satoshi Ohshima, Akihiro Ida, Rio Yokota and Ichitaro Yamazaki, QR Factorization of Block Low-Rank Matrices on Multi-Instance GPU, The 23rd International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT’22), Dec. 2022.
  8. Hiroki Naganuma, Kartik Ahuja, Ioannis Mitliagkas, Shiro Takagi, Tetsuya Motokawa, Rio Yokota, Kohta Ishikawa, Ikuro Sato, Empirical Study on Optimizer Selection for Out-of-Distribution Generalization, NeurIPS Workshop Distshift, Dec. 2022.
  9. Kazuki Osawa, Satoki Ishikawa, Rio Yokota, Shigang Li, Torsten Hoefler, ASDL: A Unified Interface for Gradient Preconditioning in PyTorch, NeurIPS Workshop Order up! The Benefits of Higher-Order Optimization in Machine Learning, Dec. 2022.
  10. Qianxiang Ma, Sameer Deshmukh, Rio Yokota, Scalable Linear Time Dense Direct Solver for 3-D Problems Without Trailing Sub-Matrix Dependencies, The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22), Nov. 2022.
  11. Hirokatsu Kataoka, Ryo Hayamizu, Ryosuke Yamada, Kodai Nakashima, Sora Takashima, Xinyu Zhang, Edgar Josafat Martinez-Noriega, Nakamasa Inoue, Rio Yokota, Replacing Labeled Real-image Datasets with Auto-generated Contours, IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 2022.
  12. Hana Hoshino, Kei Ota, Asako Kanezaki, Rio Yokota, OPIRL: Sample Efficient Off-Policy Inverse Reinforcement Learning via Distribution Matching, IEEE International Conference on Robotics and Automation, May 2022.
  13. Shun Iwase, Xingyu Liu, Rawal Khirodkar, Rio Yokota, Kris M. Kitani, RePOSE: Real-Time Iterative Rendering and Refinement for 6D Object Pose Estimation, International Conference on Computer Vision, Oct 2021.
  14. Hikaru Nakata, Nakamasa Inoue, Rio Yokota, Self-supervised Continual Pretraining for Class Incremental Image Classification, Proc. CVPR CLVISION Workshop (Findings), Jun 2021.
  15. Ryo Karakida, Kazuki Osawa, Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks, Advances in Neural Information Processing Systems (NeurIPS 2020), oral presentation, Dec. 2020.
  16. Yuichiro Ueno, Kazuki Osawa, Yohei Tsuji, Akira Naruse, Rio Yokota. Rich Information is Affordable: A Systematic Performance Analysis of Second-order Optimization Using K-FAC. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. August 2020.
  17. Mikiya Shibuya, Shinya Sumikura, Ken Sakurada, Privacy Preserving Visual SLAM, Proceedings of the European Conference on Computer Vision (ECCV), Aug. 2020.
  18. Sameer Deshmukh and Rio Yokota. "Distributed Memory Task-Based Block Low Rank Direct Solver". ISC High Performance 2020, Germany (Research Poster), June 2020
  19. Hiroyuki Ootomo, Rio Yokota, Randomized SVD on TensorCores, ISC High Performance 2020, Germany (Research Poster), June 2020.
  20. Sameer Deshmukh, Rio Yokota, Distributed Memory Task-Based Block Low Rank Direct Solver, HPC Asia 2020 (poster), Jan. 2020.
  21. Muhammad Ridwan Apriansyah, Rio Yokota, QR Decomposition of Block Low-Rank Matrices, HPC Asia 2020 (poster), Jan. 2020.
  22. Kazuki Osawa, Siddarth Swaroop, Anirudh Jain, Runa Eschenhagen, Richard E. Turner, Rio Yokota, Mohammad Emtiyaz Khan. Practical Deep Learning with Bayesian Principles, The 33rd Conference on Neural Information Processing Systems, Dec. 2019.
  23. Qianxiang Ma, Rio Yokota, Runtime System for GPU-based Hierarchical LU Factorization, The International Conference for High Performance Computing, Networking, Storage, and Analysis (poster), Nov. 2019.
  24. Hiroyuki Ootomo, Rio Yokota, TSQR on TensorCores, The International Conference for High Performance Computing, Networking, Storage, and Analysis (poster), Nov. 2019.
  25. Hiroki Naganuma, Rio Yokota, On Empirical Analysis of Layer-wised Learning Rate Schedule, ACML 2019 Workshop on Statistics & Machine Learning Researchers (poster), Nov. 2019.
  26. Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota. Optimization of Numerous Small Dense-Matrix–Vector Multiplications in H-matrix Arithmetic on GPU, Auto-Tuning for Multicore and GPU (ATMG) In conjunction with the IEEE MCSoC-19, Oct. 2019.
  27. Yohei Tsuji, Kazuki Osawa, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka. Performance Optimizations and Analysis of Distributed Deep Learning with Approximated Second-Order Optimization Method, International Conference on Parallel Processing: The 1st Workshop on Parallel and Distributed Machine Learning, Proceedings of the 48th International Conference on Parallel Processing: Workshops, No. 21, Aug. 2019.
  28. Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse, Rio Yokota, Satoshi Matsuoka. Second-order Optimization Method for Large Mini-batch: Training ResNet-50 on ImageNet in 35 Epochs, IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun. 2019.
  29. Yuichiro Ueno, Rio Yokota. Exhaustive Study of Hierarchical AllReduce Patterns for Large Messages Between GPUs, 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), May. 2019.
  30. Hiroki Naganuma, Rio Yokota. A Performance Improvement Approach for Second-Order Optimization in Large Mini-batch Training, 2nd High Performance Machine Learning Workshop CCGrid2019 (HPML2019), May. 2019.
  31. Ichitaro Yamazaki, Ahmad Abdelfattah, Akihiro Ida, Satoshi Ohshima, Stanimire Tomov, Rio Yokota, Jack Dongarra. Analyzing Performance of BiCGStab with Hierarchical Matrix on GPU clusters, 32nd IEEE International Parallel & Distributed Processing Symposium, May. 2018.
  32. Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota. Optimization of Hierarchical Matrix Computation on GPU, SC Asia, Mar. 2018.
  33. Hiroki Naganuma, Rio Yokota. Accelerating Convolutional Neural Networks Using Low Precision Arithmetic, HPC Asia, Jan. 2018.
  34. Kazuki Osawa, Rio Yokota. Evaluating the Compression Efficiency of the Filters in Convolutional Neural Networks, The 26th International Conference on Artificial Neural Networks, Sep. 2017.
  35. Mustafa AbdulJabbar, Mohammed Al Farhan, Rio Yokota, David Keyes. Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture, 3rd International European Conference on Parallel and Distributed Computing, Aug. 2017.
  36. Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota. Accelerating Matrix Multiplication in Deep Learning by Using Low-Rank Approximation, The 2017 International Conference on High Performance Computing & Simulation, Jul. 2017.
  37. Mustafa AbdulJabbar, George Markomanolis, Huda Ibeid, Rio Yokota, David Keyes. Communication Reducing Algorithms for Distributed Hierarchical N-Body Methods, 32nd International Conference, ISC High Performance, Lecture Notes in Computer Science, Vol. 10266, pp. 79--96, Jun. 2017.
  38. Keisuke Fukuda, Motohiko Matsuda, Naoya Maruyama, Rio Yokota, Kenjiro Taura, Satoshi Matsuoka. Tapas: An Implicitly Parallel Programming Framework for Hierarchical N-body Algorithms, The 22nd IEEE International Conference on Parallel and Distributed Systems, pp. 1100--1109, Dec. 2016.
  39. Rio Yokota. Fast Multipole Method as a Matrix-free Hierarchical Low-rank Approximation, International Workshop on Eigenvalue Problems, Sep. 2016.
  40. Rio Yokota, Huda Ibeid, David Keyes. Preconditioning Sparse Matrices Using a Highly Scalable Fast Multipole Method, 3rd International Workshops on Advances in Computational Mechanics, Oct. 2015.
  41. Huda Ibeid, Rio Yokota, Jennifer Pestana, David Keyes. Fast Multipole Preconditioners for Sparse Linear Solvers, 11th World Congress on Computational Mechanics, Jul. 2014.
  42. Hatem Ltaief, Rio Yokota. High Performance Numerical Algorithms for Seismic and Reservoir Simulations, GPU Technology Conference, Mar. 2014.
  43. Rio Yokota. Fast N-body Methods as a Compute-Bound Preconditioner for Sparse Solvers on GPUs, GPU Technology Conference, Mar. 2014.
  44. Abdelhalim Amer, Naoya Maruyama, Miquel Pericas, Kenjiro Taura, Rio Yokota, Satoshi Matsuoka. Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM, International Supercomputing Conference, Lecture notes in computer science, LNCS, Vol. 7905, pp. 255-266, Jun. 2013.
  45. Jennifer Pestana, Rio Yokota, Huda Ibeid, David Keyes. Fast Multipole Method Preconditioning, International Conference On Preconditioning Techniques For Scientific And Industrial Applications, Jun. 2013.
  46. Abdul Abdelfatteh, Hatem Ltaief, Rio Yokota. Investigating New Numerical Techniques for Reservoir Simulations on GPUs, GPU Technology Conference, Mar. 2013.
  47. Kenjiro Taura, Jun Nakashima, Rio Yokota, Naoya Maruyama. A Task Parallelism Meets Fast Multipole Methods, Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA), Nov. 2012.
  48. Rio Yokota. Petascale Fast Multipole Methods on GPUs, GPU Technology Conference Japan, Jul. 2012.
  49. Hatem Ltaief, Rio Yokota. Data-Driven Fast Multipole Method on Distributed Memory Systems with Hardware Accelerators, 21st International Conference on Domain Decomposition Methods, Jun. 2012.
  50. Enas Yunis, Rio Yokota, Aron Ahmadia. Scalable Force Directed Graph Layout Algorithms Using Fast Multipole Methods, The 11th International Symposium on Parallel and Distributed Computing, Jun. 2012.
  51. Rio Yokota, Lorena Barba. Recent Trends in Hierarchical N-body Methods on GPUs, GPU Technology Conference, May. 2012.
  52. Hoang Vu Nguyen, Rio Yokota, Georgiy Stenchikov. A Parallel Numerical Simulation of Dust Particles Using Direct Numerical Simulation, European Geosciences Union General Assembly, Apr. 2012.
  53. Tetsu Narumi, Rio Yokota, Lorena Barba, Kenji Yasuoka. Petascale Turbulence Simulation Using FMM, HOKKE-19, Nov. 2011.
  54. Rio Yokota, Lorena Barba. Parameter Tuning of a Hybrid Treecode-FMM on GPUs, The First International Workshop on Characterizing Applications for Heterogeneous Exascale Systems, Jun. 2011.
  55. Rio Yokota, Lorena Barba. Fast Multipole Method vs. Spectral Methods for the Simulation of Isotropic Turbulence on GPUs, 23rd International Conference on Parallel Computational Fluid Dynamics, May. 2011.
  56. Rio Yokota, Jaydeep Bardhan, Matthew Knepley, Lorena Barba. (Really) Fast Macromolecular Electrostatics -- Fast Algorithms, Open Software and Accelerated Computing, ACS Division of Physical Chemistry 240th National Meeting, Aug. 2010.
  57. Rio Yokota, Lorena Barba. Performance of the Fast Multipole Method on GPUs Using Various Kernels, 9th World Congress on Computational Mechanics, Jul. 2010.
  58. Rio Yokota, Lorena Barba. Comparing the Treecode with FMM on GPUs for Vortex Particle Simulations of a Leapfrogging Vortex Ring, 22nd International Conference on Parallel Computational Fluid Dynamics, May. 2010.
  59. Rio Yokota, Shinnosuke Obi. Lagrangian Simulation of Turbulence Using Vortex Methods, 2nd International Workshops on Advances in Computational Mechanics, Mar. 2010.
  60. Tsuyoshi Hamada, Rio Yokota, Keigo Nitadori, Tetsu Narumi, Kenji Yasuoka, Makoto Taiji, Kiyoshi Oguri. 42 TFlops Hierarchical N-Body Simulation on GPUs with Applications in Both Astrophysics and Turbulence, Supercomputing, Nov. 2009.
  61. Rio Yokota, Koji Fukagata, Shinnosuke Obi. Lagrangian Vortex Methods in Turbulent Channel Flows, 12th EUROMECH European Turbulence Conference, Sep. 2009.
  62. Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Kenji Yasuoka, Shinnosuke Obi. Fast Multipole Methods on GPUs for the Meshfree Simulation of Turbulence, 10th US National Congress on Computational Mechanics, Jul. 2009.
  63. Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Kenji Yasuoka, Shinnosuke Obi. DNS of Homogeneous Turbulence Using Vortex Methods Accelerated by the FMM on a Cluster of GPUs, 21st International Conference on Parallel Computational Fluid Dynamics, May. 2009.
  64. Rio Yokota, Tetsu Narumi, Ryuji Sakamaki, Shun Kameoka, Kenji Yasuoka, Shinnosuke Obi. Meshfree Simulation of Turbulence Using the Fast Multipole Methods on GPUs, 22nd Symposium on Computational Fluid Dynamics, Dec. 2008.
  65. Rio Yokota, Shinnosuke Obi. Direct Numerical Simulation of Homogeneous Shear Flow Using Vortex Methods, 4th International Conference on Vortex Flows and Vortex Models, Apr. 2008.
  66. Rio Yokota, Shinnosuke Obi. Mesh-Free Simulation of the Homogeneous Shear Flow Using Vortex Methods, 23rd IIS Turbulence and Shear Flow Dynamics Symposium, Mar. 2008.
  67. Rio Yokota, Shinnosuke Obi. Pure Lagrangian Vortex Methods for the Simulation of Decaying Isotropic Turbulence, 5th International Symposium on Turbulence and Shear Flow Phenomena, Aug. 2007.
  68. Rio Yokota, Shinnosuke Obi. Vortex Flow Simulation Between Multiple Bridge Decks, Whither Turbulence Prediction and Control, Mar. 2006.
  69. Rio Yokota, Shinnosuke Obi. Vortex Flow Simulation of Multiple Bluff Bodies, 3rd International Conference on Vortex Flows and Vortex Models, Nov. 2005.

Domestic Conferences (Refereed)

  1. 高橋 秀弥, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Eisaku Maeda, On the Evolution of Shape/Texture Bias During Training and Its Relation to the Pre-training Dataset, 26th Meeting on Image Recognition and Understanding (MIRU), poster, Jul. 2023.
  2. 大川 快, Ken Sakurada, Rio Yokota, Dense Bundle Adjustment for Real-Time 3D Map Change Detection with a Monocular Camera, 26th Meeting on Image Recognition and Understanding (MIRU), poster, Jul. 2023.
  3. Ryosuke Yamada, Kensho Hara, Hirokatsu Kataoka, 牧原 昴志, Nakamasa Inoue, Rio Yokota, Yutaka Satoh, Formula-Supervised Visual-Geometric Pre-training, 26th Meeting on Image Recognition and Understanding (MIRU), long oral, Jul. 2023.
  4. Risa Shinoda, Ryo Hayamizu, Kodai Nakashima, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Semantic Segmentation by Formula-Driven Supervised Learning, 26th Meeting on Image Recognition and Understanding (MIRU), long oral, Jul. 2023.
  5. 浅倉 拓也, Nakamasa Inoue, Rio Yokota, Koichi Shinoda, Developing Mode-Adaptive Transformers Through Automatic Optimization of Receptive Fields, Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), Jun. 2023.
  6. 中村 祥大, Rio Yokota, A Regularization Method That Suppresses the Loss of Generalization in Large-Batch Training, Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), Jun. 2023.
  7. 高橋 秀弥, Nakamasa Inoue, Rio Yokota, Hirokatsu Kataoka, Eisaku Maeda, On the Relationship Between Shape/Texture Bias and Double Descent in Image Classification, IEICE Technical Meeting on Pattern Recognition and Media Understanding (PRMU), Mar. 2023.
  8. 杉山 佳史, Hirokatsu Kataoka, Rio Yokota, Nakamasa Inoue, Contrastive Learning with an Adversarial Metric-Learning Module for Image Recognition Robust to Feature Variation, IEICE Technical Meeting on Pattern Recognition and Media Understanding (PRMU), Dec. 2022.
  9. Ryu Tadokoro, Hirokatsu Kataoka, Rei Kawakami, Rio Yokota, Nakamasa Inoue, A Study of the Pre-training Effect of Distilled Images, Vision Engineering Workshop (ViEW), Dec. 2022.
  10. Hiroyuki Ootomo, Rio Yokota, TSQR on Tensor Cores, JSIAM Annual Meeting, Sep. 2019.
  11. Shun Iwase, Ken Sakurada, Object-Level Scene Change Detection Considering Change Attributes, 22nd Meeting on Image Recognition and Understanding (MIRU2019), Oct. 2019.
  12. Hiroki Naganuma, Rio Yokota. Effectiveness of Smoothing in Natural Gradient Methods for Large-Batch Training, The 3rd Cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSIG), May. 2019.
  13. Hiroki Naganuma, Shun Iwase, 郭 林昇, Hikaru Nakata, Rio Yokota. Hyperparameter Optimization in Large-Scale Parallel Deep Learning Using Approximate Natural Gradient Methods, 17th Forum on Information Technology (FIT 2018), Sep. 2018.
  14. Hiroki Naganuma, Rio Yokota. Evaluating the Acceleration of Convolutional Neural Networks Using Low-Precision Arithmetic, GTC Japan, Dec. 2017.
  15. Hiroki Naganuma, Akira Sekiya, Kazuki Osawa, Hiroyuki Ootomo, 桑村 裕二, Rio Yokota. Accelerating Deep Learning with Low-Precision Arithmetic and Performance Evaluation of Accelerators, IEICE Technical Meeting on Pattern Recognition and Media Understanding (PRMU), Oct. 2017.
  16. Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota. Accelerating Convolutional Neural Networks Using Low-Rank Tensor Decomposition, IEICE Technical Meeting on Pattern Recognition and Media Understanding (PRMU), Oct. 2017.
  17. Hiroki Naganuma, Kazuki Osawa, Akira Sekiya, Rio Yokota. Accelerating Compressed Models with Half-Precision Arithmetic in Deep Learning, JSIAM Annual Meeting, Sep. 2017.
  18. Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota. Optimization of Hierarchical Matrix Computation on GPU Clusters, Summer United Workshops on Parallel, Distributed and Cooperative Processing, Jul. 2017.
  19. Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota. Accelerating Convolutional Neural Networks Using Low-Rank Approximation, 22nd JSCES Annual Conference on Computational Engineering and Science, Proceedings Vol. 22, May. 2017.
  20. Rio Yokota, Shinnosuke Obi. Validation of Vortex Methods in Turbulent Channel Flow, Annual Meeting of the Japan Society of Fluid Mechanics, Sep. 2009.
  21. Rio Yokota, Shinnosuke Obi. Analysis of Turbulent Channel Flow Using Vortex Methods, Annual Meeting of the Japan Society of Fluid Mechanics, Sep. 2008.
  22. 佐藤 彰, Rio Yokota, Shinnosuke Obi. Numerical Analysis of Wingtip Vortices Using a Three-Dimensional Vortex Method, 21st Symposium on Computational Fluid Dynamics, Dec. 2007.
  23. Rio Yokota, Shinnosuke Obi. Analysis of Homogeneous Shear Flow Using Vortex Methods, Annual Meeting of the Japan Society of Fluid Mechanics, Aug. 2007.
  24. Rio Yokota, Shinnosuke Obi. Meshfree Turbulence Analysis Using Vortex Methods, 56th JSME Tokai Branch General Meeting and Conference, Mar. 2007.
  25. Rio Yokota, Shinnosuke Obi. Fluid-Solid Interaction Analysis Using a Three-Dimensional Vortex Method and the Boundary Element Method, JSME Fluids Engineering Division Conference, Oct. 2006.
  26. Rio Yokota, Shinnosuke Obi. Three-Dimensional Analysis of Bluff-Body Wakes Using Vortex Methods, JSME Annual Meeting, Sep. 2006.
  27. Rio Yokota, Shinnosuke Obi. Vortex Flow Simulation Around Multiple Bluff Bodies, 19th Symposium on Computational Fluid Dynamics, Dec. 2005.

International Conferences (Non-refereed)

  1. Satoshi Yui, Hiromichi Kobayashi, Makoto Tsubota, Tomokazu Saito, Rio Yokota, Vortex-Filament Bundle Induced by Normal-Fluid Turbulence in Turbulent Superfluid Helium-4, International Symposium on Quantum Fluids and Solids (QFS), Aug. 2023.
  2. Qianxiang Ma, Rio Yokota, O(N) Factorization of Dense Matrices on GPUs Without Trailing Submatrix Dependencies, SIAM Conference on Computational Science and Engineering (CSE), Feb. 2023.
  3. Muhammad Ridwan Apriansyah, Rio Yokota, Parallel QR Factorization of Block Low-Rank Matrices, SIAM Conference on Computational Science and Engineering (CSE), Feb. 2023.
  4. Rio Yokota, Matrices in Deep Neural Networks and How to Compute Them in Parallel, IEEE CLUSTER 2022 Keynote, Heidelberg, Germany, Sep. 2022.
  5. Sameer Deshmukh, Acceleration of O(N) Solvers for Large Dense Matrices, Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT2022), Mar. 2022.
  6. Muhammad Ridwan Apriansyah, Parallel QR Factorization of Block Low-Rank Matrices, Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT2022), Mar. 2022.
  7. Thomas Spendlhofer, Iterative Refinement with Hierarchical Low-Rank Preconditioners Using Mixed Precision, Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT2022), Mar. 2022.
  8. Rio Yokota, Approximations of Natural Gradient Descent in Distributed Training, INFORMS Annual Meeting Session: Beyond first order methods in machine learning systems I, Oct. 2021.
  9. Rio Yokota, Overview of structured low-rank approximation methods, IUTAM Symposium on Computational Methods for Large-Scale and Complex Wave Problems, Jun. 2021.
  10. Rio Yokota, Overview of Distributed Memory Parallelism in Deep Learning, DD26 MS01, Learning, Algorithms, Domain Decomposition Methods, and Applications, Dec. 2020.
  11. Rio Yokota, Distributed Deep Learning with Second Order Information, SPCL_Bcast(COMM_WORLD), Oct. 2020.
  12. Rio Yokota, Degree of Approximation and Overhead of Computing Curvature, Information, and Noise Matrices, ICML Workshop “Beyond first order methods in machine learning systems”, July, 2020.
  13. Rio Yokota, Recent Trends in Hierarchical Low-Rank Approximation Methods, Tokyo Institute of Technology and Stony Brook University Joint Science and Technology Meeting, May. 2019.
  14. Rio Yokota, Yohei Tsuji, Kazuki Osawa, Second Order Optimization for Distributed Data-parallel Deep Learning on 4000 GPUs, I2R-TokyoTech Co-workshop on DL 2.0, Mar. 2019.
  15. Rio Yokota: Kronecker Factorization for Second Order Optimization in Deep Learning, SIAM CSE, Feb. 2019.
  16. Rio Yokota. Optimization Methods for Large Scale Distributed Deep Learning, IPAM Workshop I: Big Data Meets Large-Scale Computing, Sep. 2018.
  17. Rio Yokota. Early Application Results on TSUBAME 3, Smoky Mountains Computational Sciences and Engineering Conference, Aug. 2018.
  18. Rio Yokota. Scaling Deep Learning to Thousands of GPUs, HPC 2018, Jul. 2018.
  19. Rio Yokota. Energy Conserving Fast Multipole Methods for the Calculation of Long-range Interactions, Mathematics in Action: Modeling and analysis in molecular biology and electrophysiology, Jun. 2018.
  20. Rio Yokota. Can we use Hierarchical Low-Rank Approximation for Deep Learning?, HPC Saudi 2018, Mar. 2018.
  21. Rio Yokota. Hierarchical Low-Rank Approximations at Extreme Scale, 32nd International Conference, ISC High Performance, Jun. 2017.
  22. Rio Yokota. Compute-Memory Tradeoff in Hierarchical Low-Rank Approximation Methods, SIAM Conference on Computational Science and Engineering, Feb. 2017.
  23. Rio Yokota. Energy Conservation of Fast Multipole Methods in Classical Molecular Dynamics Simulations, 7th AICS International Symposium, Feb. 2017.
  24. Rio Yokota. Improving Data Locality of Fast Multipole Methods, Third Workshop on Programming Abstractions for Data Locality, Kobe, Oct. 2016.
  25. Huda Ibeid, Rio Yokota, David Keyes. A Matrix-Free Preconditioner for Elliptic Solvers Based on the Fast Multipole Method, SIAM Conference on Parallel Processing for Scientific Computing, Apr. 2016.
  26. Rio Yokota. A Common API for Fast Multipole Methods, Accelerate Data Analytics and Computing Workshop, Jan. 2016.
  27. Rio Yokota, Francois-Henri Rouet, Xiaoye Sherry Li. Comparison of FMM and HSS at Large Scale, SIAM Conference on Applied Linear Algebra, Oct. 2015.
  28. Rio Yokota. Various Implementations of FMM and Their Performance on Future Architectures, Multi-resolution Interactions Workshop, Aug. 2015.
  29. Rio Yokota. ExaFMM -- a Testbed for Comparing Various Implementations of the FMM, SIAM Conference on Computational Science and Engineering, Mar. 2015.
  30. Huda Ibeid, Jennifer Pestana, Rio Yokota, David Keyes. Fast Multipole Method as Preconditioner, SIAM Conference on Computational Science and Engineering, Mar. 2015.
  31. Rio Yokota, David Keyes. Communication Complexity of the Fast Multipole Method and its Algebraic Variants, CBMS-NSF Conference: Fast Direct Solvers for Elliptic PDEs, Jun. 2014.
  32. Rio Yokota. Advances in Fast Multipole Methods for Scalable Electrostatics Calculations, Workshop: Electrostatics methods in Molecular Simulation, May. 2013.
  33. Huda Ibeid, Rio Yokota, David Keyes. Fast Multipole Method as a Preconditioner, SIAM Conference on Computational Science and Engineering, Feb. 2013.
  34. Rio Yokota. Petascale Fast Multipole Methods on GPUs, The 11th International Symposium on Parallel and Distributed Computing, Jun. 2012.
  35. Rio Yokota, Tetsu Narumi, Lorena Barba, Kenji Yasuoka. Scaling Fast Multipole Methods up to 4000 GPUs, ATIP/A*CRC Workshop on Accelerator Technologies for High Performance Computing, May. 2012.
  36. Rio Yokota. Running Fast Multipole Method on the Full Node of TSUBAME and K computer, Scalable Hierarchical Algorithms for Extreme Computing, Apr. 2012.
  37. Rio Yokota. Fast N-body Methods on Many-core and Heterogeneous Systems, International Workshop on Computational Science and Numerical Analysis, Mar. 2012.
  38. Rio Yokota. Petaflops Scale Turbulence Simulation on TSUBAME 2.0, GPU@BU Workshop, Nov. 2011.
  39. Rio Yokota, Lorena Barba. Large Scale Multi-GPU FMM for Bioelectrostatics, SIAM Conference on Computational Science and Engineering, Feb. 2011.
  40. Rio Yokota. 12 Steps to a Fast Multipole Method on GPUs, Pan-American Advanced Studies Institute, Jan. 2011.
  41. Rio Yokota, Lorena Barba. RBF Interpolation using Gaussians with Domain Decomposition on GPUs, SIAM Annual Meeting, Jul. 2010.
  42. Rio Yokota. Range of Applications for the Fast Multipole Method on GPUs, Accelerated Computing, Jan. 2010.

Domestic Conferences (Non-refereed)

  1. 中村 秋海, Rio Yokota, Performance Comparison of Transformers on GPUs and the A64FX, 188th IPSJ SIG-HPC Meeting, Mar. 2023.
  2. Hiroyuki Ootomo, Hidetaka Manabe, Kenji Harada, Rio Yokota, Quantum Circuit Simulation Using Automatic Precision Selection for Single-Precision Matrix Multiplication Emulation on Tensor Cores, 188th IPSJ SIG-HPC Meeting, Mar. 2023.
  3. Tomokazu Saito, Rio Yokota, Accelerating Quantum Vortex Computations with the Fast Multipole Method, IPSJ National Convention, Mar. 2023.
  4. 石川 智貴, Rio Yokota, A Study of Gradient Preconditioning Methods in Deep Learning, IPSJ National Convention, Mar. 2023.
  5. 近江 俊樹, Ryo Nakamura, Hirokatsu Kataoka, Nakamasa Inoue, Rio Yokota, The Pre-training Effect of Newton Fractal Images, IPSJ National Convention, Mar. 2023.
  6. Hiroyuki Ootomo, Rio Yokota, Application-Level Evaluation of Single-Precision Matrix Multiplication Emulation on Tensor Cores, 185th IPSJ SIG-HPC Meeting (SWoPP2022), Jul. 2022.
  7. Hiroyuki Ootomo, 坂本 亮, Full-State Quantum Circuit Simulation on the PEZY-SC3s Processor, 187th IPSJ SIG-HPC Meeting, Dec. 2022.
  8. Rio Yokota, O(N) LU Factorization of Dense Matrices with High Parallelism, RIMS Workshop "Numerical Analysis Opening Up the Next-Generation Information Society: From the Edge to Fugaku", Oct. 2022.
  9. Rio Yokota, Large-Scale Pre-training of Vision Transformers Using Synthetic Images, Symposium on Data Science and AI for Solving Societal Issues, Sep. 2022.
  10. Rio Yokota, Large-Scale Pre-training of Vision Transformers Using Synthetic Images, DENSO IT LAB x TOKYO TECH Discussion Night in MIRU, Jul. 2022.
  11. 高橋 那弥, 八嶋 晋吾, Kohta Ishikawa, Ikuro Sato, Rio Yokota, Study and Planning of Large-Scale Self-Supervised Learning on Driving Videos, 25th Meeting on Image Recognition and Understanding (MIRU), Jul. 2022.
  12. Hiroyuki Ootomo, Rio Yokota, Accuracy-Corrected Single-Precision Matrix Multiplication Using Tensor Cores, 180th IPSJ SIG-HPC Meeting, Jun. 2021.
  13. Akihiro Ida, Takeshi Ogita, Rio Yokota, An Eigenvalue Solver with Verified Accuracy for Symmetric Block Low-Rank Matrices, 2022 JSIAM Annual Meeting, Sep. 2022.
  14. 石井 央, Rio Yokota, Evaluating the Generalization Performance of Second-Order Optimization in Deep Learning, 84th IPSJ National Convention, Mar. 2022.
  15. 中村 秋海, Rio Yokota, The Effect of Batch Size on Generalization Performance in Vision Transformers, 84th IPSJ National Convention, Mar. 2022 (Student Encouragement Award).
  16. Rio Yokota, Fundamentals of Large-Scale Parallel Deep Learning, NVIDIA HPC Weeks (keynote), Oct. 2021.
  17. Rio Yokota, A Review of Hierarchical Low-Rank Approximation Methods, 40th Computational Mathematics and Engineering Forum, Sep. 2021.
  18. Rio Yokota, Large-Scale Parallel and Distributed Deep Learning on Supercomputers, IEICE Technical Meeting on Information-Based Induction Sciences and Machine Learning (IBISML), Mar. 2021.
  19. Rio Yokota, Fast Approximate Methods for Hessian, Fisher, and Covariance Matrices in Deep Learning, Auto-Tuning Research Group Micro Workshop, Oct. 2020.
  20. Hikaru Nakata, Rio Yokota, Verifying the Robustness of Unsupervised Representation Learning for Continual Pre-training in Image Classification, 34th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), Jun. 2020.
  21. Hiroyuki Ootomo, Rio Yokota, Development of an Extension Library Based on Structural Analysis of the Tensor Core API, 173rd IPSJ SIG-HPC Meeting (published online only in the IPSJ Digital Library due to COVID-19), Mar. 2020.
  22. 所畑 貴大, Hiroki Naganuma, Rio Yokota, Verifying the Effectiveness of Stochastic Weight Averaging for Large-Batch Training, IPSJ National Convention, Mar. 2020.
  23. Rio Yokota, Training Giant Language Models with Second-Order Optimization, and Predicting Plasma Behavior with FRNN, ABCI Grand Challenge 2019 Results Meeting, Feb. 2020.
  24. Rio Yokota, Large-Scale Distributed Deep Learning on ImageNet Using Second-Order Optimization, JSPE Technical Committee on Industrial Application of Image Processing, 5th Regular Meeting: Artificial Intelligence and Data Science, Jan. 2020.
  25. Rio Yokota, Approximate Matrix Factorization and Distributed Deep Learning, RIMS Workshop on Numerical Analysis as a Fundamental Discipline Connecting the Sciences, Nov. 2019.
  26. 八島 慶汰, Kohta Ishikawa, Ikuro Sato, 野村 哲弘, Rio Yokota, Satoshi Matsuoka. Predicting Early-Stopping Timing: Change-Point Detection in the Distribution of Stochastic Gradients in Deep Learning, 22nd Workshop on Information-Based Induction Sciences (IBIS 2019), Nov. 2019.
  27. Rio Yokota, The Structure of Dense Matrices Arising in Deep Learning, Auto-Tuning Research Group Micro Workshop, Oct. 2019.
  28. Rio Yokota, Acceleration and Large-Scale Parallelization of Deep Learning, IPSJ Seminar Series, Part 3: A Future with AI (2): The Frontier of Image and Video Processing, Sep. 2019.
  29. Peter Spalthoff, Rio Yokota. Flexible and Simplistic Hierarchical Matrix-Based Fast Direct Solver, 170th IPSJ SIG-HPC Meeting, Jul. 2019.
  30. Hiroyuki Ootomo, Rio Yokota. A GPU Implementation of TSQR Using Tensor Cores, 170th IPSJ SIG-HPC Meeting, Jul. 2019.
  31. Rio Yokota, Large-Scale Parallel Deep Learning on the ImageNet Benchmark, 14th Nagoya Institute of Technology / National Institute for Fusion Science Joint Seminar: Fluid Simulation and Deep Learning, Jul. 2019.
  32. Rio Yokota, Recent Research Trends and Applications of Hierarchical Low-Rank Approximation Methods, 22nd Auto-Tuning Research Group Open Academic Session, May. 2019.
  33. Rio Yokota, Kazuki Osawa, Yohei Tsuji, Yuichiro Ueno, Akira Naruse. Effects of Second-Order Optimization Methods in Large-Scale Parallel Deep Learning, IEICE General Conference, Mar. 2019.
  34. Hiroki Naganuma, Rio Yokota. Smoothing the Objective Function for Large-Scale Parallel Deep Learning, 81st IPSJ National Convention, Mar. 2019.
  35. Hiroyuki Ootomo, Rio Yokota. Batched QR Factorization Using Tensor Cores, 81st IPSJ National Convention, Mar. 2019.
  36. Hikaru Nakata, Kazuki Osawa, Rio Yokota. Variational Deep Learning Based on the Natural Gradient Method, 81st IPSJ National Convention, Mar. 2019.
  37. Hiroki Naganuma, Rio Yokota. A Method for Improving the Generalization of Large-Batch Training via Averaging with Noise Injection, IEICE General Conference, Mar. 2019.
  38. Kazuki Osawa, Rio Yokota, Chuan-Sheng Foo, Vijay Chandrasekhar. A Second-Order Optimization Method for Large-Scale Deep Learning Based on Analysis of the Fisher Information Matrix, 81st IPSJ National Convention, Mar. 2019.
  39. 桑村 祐二, Kazuki Osawa, Rio Yokota. Tuning the Learning Parameters of Approximate Natural Gradient Methods, IPSJ National Convention, Mar. 2018.
  40. Hiroyuki Ootomo, Kazuki Osawa, Rio Yokota. Deep Learning Using Kronecker Factorization of the Fisher Information Matrix, IPSJ National Convention, Mar. 2018.
  41. Hiroyuki Ootomo, Kazuki Osawa, Rio Yokota. Distributed Training of Deep Neural Networks Using Kronecker Factorization of the Fisher Information Matrix, 163rd IPSJ SIG-HPC Meeting, Mar. 2018.
  42. 本山 義史, Toshio Endo, Satoshi Matsuoka, Rio Yokota, Keisuke Fukuda, Ikuro Sato. Optimizing Convolution Operations in CNNs with Low-Rank Approximation Matrices, 158th IPSJ SIG-HPC Meeting, 2017-HPC-158, No. 25, Mar. 2017.
  43. Akira Sekiya, Kazuki Osawa, Hiroki Naganuma, Rio Yokota. Accelerating Matrix Multiplication in Deep Learning Using Low-Rank Approximation, 158th IPSJ SIG-HPC Meeting, Mar. 2017.
  44. Rio Yokota. Development of a Fast Multipole Method Library for Supercomputers with Diverse Architectures and Its Evaluation in Molecular and Fluid Simulations, 8th JHPCN Symposium, Jul. 2016.
  45. Rio Yokota. Performance Portability of the FMM, 21st JSCES Annual Conference on Computational Engineering and Science, May. 2016.
  46. Rio Yokota. On the Auto-Tunable Parameters of the FMM, 7th Auto-Tuning Research Group Meeting, Dec. 2015.
  47. Rio Yokota, Tetsu Narumi, Kenji Yasuoka, Toshikazu Ebisuzaki, Shinnosuke Obi. Direct Numerical Simulation of Turbulence with a Vortex Method on MDGRAPE-3, Next-Generation Supercomputing Symposium, Oct. 2007.
  48. Rio Yokota, Shinnosuke Obi. Analysis of Homogeneous Isotropic Turbulence Using Vortex Methods, 20th Symposium on Computational Fluid Dynamics, Dec. 2006.