Media Information Laboratory Past news | NTT Communication Science Laboratories

2025

July 2025

The paper entitled “Word Error Rate Definitions and Algorithms for Long-Form Multi-Talker Speech Recognition” has been accepted to IEEE Transactions on Audio, Speech and Language Processing.
https://ieeexplore.ieee.org/abstract/document/11082427
July 2025

The paper entitled “Generic speech enhancement with self-supervised representation space loss” has been accepted to Frontiers in Signal Processing.
https://doaj.org/article/421169964ec845128bfd1141dbb4a46e
July 2025

The paper entitled “Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge” has been accepted to Computer Speech and Language.
https://www.sciencedirect.com/science/article/pii/S0885230825000452
July 2025

The paper entitled “Maximizing Predicted Signal-to-Distortion Ratio: A New Microphone Selection Criterion for Beamforming in Acoustic Sensor Networks” has been accepted to IEEE Transactions on Audio, Speech and Language Processing.
https://ieeexplore.ieee.org/document/11017629
June 2025

The paper entitled “Acousto-Optic Reconstruction of Exterior Sound Field Based on Concentric Circle Sampling with Circular Harmonic Expansion” has been accepted to IEEE Transactions on Instrumentation and Measurement.
https://ieeexplore.ieee.org/document/11028067
June 2025

The report “Probin M2D: Technical Report for the ICME 2025 Audio Encoder Capability Challenge” has been accepted to the Audio Encoder Capability Challenge held at the IEEE International Conference on Multimedia & Expo (ICME 2025).
June 2025

The paper titled "Why Shape Matters: Experimental Evidence behind Sound of Musical Triangle," published in The Journal of the Acoustical Society of America Express Letters (JASA-EL), was selected as the cover image for the May 2025 issue of JASA-EL.
https://pubs.aip.org/asa/jel/issue/5/5
May 2025

The paper entitled “Three-Dimensional Sound Field Reconstruction from Optical Projections Using Physics-Informed Neural Networks” has been accepted to The Journal of the Acoustical Society of America Express Letters (JASA-EL).
May 2025

The following 14 papers from the Media Information Laboratory have been accepted to Interspeech2025:
https://group.ntt/en/topics/2025/08/07/interspeech2025.html
・Daisuke Niizumi, Daiki Takeuchi, Binh Thien Nguyen, Masahiro Yasuda, Yasunori Ohishi, Noboru Harada, "Towards Pre-training an Effective Respiratory Audio Foundation Model"
・Daiki Takeuchi, Binh Thien Nguyen, Masahiro Yasuda, Daisuke Niizumi, Yasunori Ohishi, Noboru Harada, "CLAP-ART: Automated Audio Captioning with Semantic-rich Audio Representation Tokenizer"
・Naoyuki Kamo, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, "MOVER: Combining Multiple Meeting Recognition Systems"
・Koharu Horii, Naohiro Tawara, Atsunori Ogawa, Shoko Araki, "Why is children's ASR so difficult? Analyzing children's phonological error patterns using SSL-based phoneme recognizers"
・Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Ryo Fukuda, William Chen (CMU), Shinji Watanabe (CMU), "Pick and Summarize: Integrating Extractive and Abstractive Speech Summarization"
・Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo, "FasterVoiceGrad: Faster One-step Diffusion-Based Voice Conversion with Adversarial Diffusion Conversion Distillation"
・Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo, "Vocoder-Projected Feature Discriminator"
・Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, "JIS: A Speech Corpus of Japanese Idol Speakers with Various Speaking Styles"
・Takafumi Moriya, Masato Mimura, Kiyoaki Matsui, Hiroshi Sato, Kohei Matsuura, "Attention-Free Dual-Mode ASR with Latency-Controlled Selective State Spaces"
・Takanori Ashihara, Marc Delcroix, Tsubasa Ochiai, Kohei Matsuura, Shota Horiguchi, "Analysis of Semantic and Acoustic Token Variability Across Speech, Music, and Audio Domains"
・Shota Horiguchi, Atsushi Ando, Marc Delcroix, Naohiro Tawara, "Pretraining Multi-Speaker Identification for Neural Speaker Diarization"
・Shota Horiguchi, Takanori Ashihara, Marc Delcroix, Atsushi Ando, Naohiro Tawara, "Mitigating Non-Target Speaker Bias in Guided Speaker Embedding"
・Marc Delcroix, "Advances in Conversational Speech Recognition" (Survey talk)
・Keigo Wakayama, Tomoko Kawase, Takafumi Moriya, Marc Delcroix, Hiroshi Sato, Tsubasa Ochiai, Masahiro Yasuda, Shoko Araki, "Real-time TSE demonstration via SoundBeam with KD" (Show and Tell)
May 2025

The paper entitled “Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes” has been accepted to European Signal Processing Conference (EUSIPCO2025).
May 2025

The paper entitled “Extension of Deep Sound-Field Denoiser to High-Frequency Sound Fields Considering Wavenumber Spectral Loss” has been accepted to the IEEE International Conference on Image Processing (ICIP2025).
May 2025

The paper titled "Why Shape Matters: Experimental Evidence behind Sound of Musical Triangle," which was accepted for publication in The Journal of the Acoustical Society of America Express Letters (JASA-EL), was selected for a feature article on the AIP Publishing website, and an article based on an interview has been published.
https://publishing.aip.org/publications/latest-content/would-a-musical-triangle-of-any-other-shape-sound-as-sweet/
April 2025

The paper entitled “Flexible Source-Free Domain Generalization via Domain Prompt-Discriminator Collaborative Learning” has been accepted to the International Joint Conference on Neural Networks (IJCNN2025).
April 2025

The review paper entitled “Target Sound Information Extraction: Speech and Audio Processing With Neural Networks Conditioned on Target Clues” has been published in Acoustical Science and Technology.
https://www.jstage.jst.go.jp/article/ast/46/3/46_e24.124/_article/-char/ja
April 2025

The paper entitled “Assessing the Utility of Audio Foundation Models for Heart and Respiratory Sound Analysis” has been accepted to the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC2025).
https://arxiv.org/pdf/2504.18004
March 2025

The paper entitled “Structure From Collision” has been accepted to IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR2025).
https://cvpr.thecvf.com/virtual/2025/poster/34297
March 2025

The paper entitled “Module-Based End-to-End Distant Speech Processing: A case study of far-field automatic speech recognition” has been published in IEEE Signal Processing Magazine.
https://ieeexplore.ieee.org/document/10819672
March 2025

The paper entitled “Why Shape Matters: Experimental Evidence behind Sound of Musical Triangle” has been accepted to The Journal of the Acoustical Society of America Express Letters (JASA-EL).
March 2025

The paper entitled “Hybrid Dynamics of Henon Maps” has been accepted to Mathematische Zeitschrift.
https://arxiv.org/pdf/2212.10851
March 2025

The paper entitled “Equivalence Between Non-Commutative Harmonic Oscillators and Two-Photon Quantum Rabi Models” has been accepted to International Mathematics Research Notices.
https://arxiv.org/pdf/2405.19814
January 2025

[Award] The paper entitled “Cross-Action Cross-Subject Skeleton Action Recognition Via Simultaneous Action-Subject Learning with Two-Step Feature Removal” won the Best Paper Award 1st Runner-up at the IEEE International Conference on Image Processing (ICIP2024).

2024

December 2024

[Award] The paper entitled “Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings” won the Best paper honorable mention at the IEEE Spoken Language Technology Workshop (SLT2024).
December 2024

The following 20 papers from the Media Information Laboratory have been accepted to the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2025):
https://group.ntt/en/topics/2025/03/31/icassp2025.html
・Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, "Core Aggregation via Quantized Distribution Fitting and Its Application to Predictor Learning"
・Binh Thien Nguyen, Daiki Takeuchi, Masahiro Yasuda, Daisuke Niizumi, Noboru Harada, "Negative and Balanced Sampling for Language-query Audio Source Separation"
・Stefan Bruhn, Tomas Toftgår, Stefan Döhla, Huan-yu Su, Lasse Laaksonen, Takehiro Moriya, Stéphane Ragot, Hiroyuki Ehara, Marek Szczerba, Imre Varga, Andrey Schevciw, Milan Jerinec, "3GPP IVAS Codec – Perspectives on Development, Testing and Standardization"
・Takehiro Moriya, Stephane Ragot, Arnaud Lefort, Alexandre Guerin, Noboru Harada, Ryosuke Sugiura, Yutaka Kamamoto, "EVS-Compatible Downmix in 3GPP IVAS"
・Masahiro Nakano, Hiroki Sakuma, Ryo Nishikimi, Kenji Komiya, Tomoharu Iwata, Kunio Kashino, "Hyperbolic PHATE: Visualizing Continuous Hierarchy of Latent Differentiation Structures"
・Nao Sato, Masahiro Yasuda, Shoichiro Saito, Noboru Harada, "Sound Source Distance Estimation Utilizing Physics-informed Prior for Sound Event Localization and Detection"
・Masahiro Yasuda, Shoichiro Saito, Nao Sato, Noboru Harada, "Spatial Annotation-free Training for Sound Event Localization and Detection"
・Junpei Honma, Akisato Kimura, Go Irie, "Multi-Task Learning for Ultrasonic Echo-based Depth Estimation with Audible Frequency Recovery"
・Tomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki, "A Hybrid Probabilistic-Deterministic Model Recursively Enhancing Speech"
・Naohiro Tawara, Atsushi Ando, Shota Horiguchi, Marc Delcroix, "Multi-channel Speaker Counting for EEND-VC-based Speaker Diarization on Multi-domain Conversation"
・Takatomo Kano, Atsunori Ogawa, Marc Delcroix, William Chen, Ryo Fukuda, Kohei Matsuura, Takanori Ashihara, Shinji Watanabe, "Bridging Speech and Text Foundation Models with ReShape Attention"
・Ryo Fukuda, Takatomo Kano, Atsushi Ando, Atsunori Ogawa, "Whisper-ER: Speech Emotion Recognition Based on Large-Scale Automatic Speech Recognizer"
・Shoko Araki, Nobutaka Ito, Reinhold Haeb-Umbach, Gordon Wichern, Zhong-Qiu Wang, Yuki Mitsufuji, "30+ Years of Source Separation Research: Achievements and Future Challenges"
・Takafumi Moriya, Shota Horiguchi, Marc Delcroix, Ryo Masumura, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Masato Mimura, "Alignment-Free Training for Transducer-based Multi-Talker ASR"
・Carlos Hernandez-Olivan, Marc Delcroix, Tsubasa Ochiai, Daisuke Niizumi, Naohiro Tawara, Tomohiro Nakatani, Shoko Araki, "SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model"
・Alexis Plaquet, Naohiro Tawara, Marc Delcroix, Atsushi Ando, Shota Horiguchi, Shoko Araki, "Mamba-based Segmentation Model for Speaker Diarization"
・Junyi Peng, Takanori Ashihara, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocky, "TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models"
・Shota Horiguchi, Takafumi Moriya, Atsushi Ando, Takanori Ashihara, Hiroshi Sato, Naohiro Tawara, Marc Delcroix, "Guided Speaker Embedding"
・Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, "Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance" (Journal Paper Presentation)
・Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino, "Masked Modeling Duo: Towards Universal Audio Pre-training Framework" (Journal Paper Presentation)
November 2024

The following two papers have been accepted to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV2025):
・Shogo Sato, Takuhiro Kaneko, Kazuhiko Murasaki, Taiga Yoshida, Ryuichi Tanida, Akisato Kimura, "Unsupervised Single-Image Intrinsic Image Decomposition With LiDAR Intensity Enhanced Training"
・Risako Tanigawa, Kenji Ishikawa, Noboru Harada, Yasuhiro Oikawa, "SoundSil-DS: Deep Denoising and Segmentation of Sound-Field Images with Silhouettes"
November 2024

The paper entitled “Sound Field Reconstruction Using Optical Sound Measurement and Neural Fields” has been accepted to the IEEE International Workshop on Machine Learning for Signal Processing (MLSP2024).
November 2024

The paper entitled “Direct Moment Estimation of Intensity Distribution of Magnetic Fields with Quantum Sensing Network” has been accepted to New Journal of Physics.
https://iopscience.iop.org/article/10.1088/1367-2630/ad93f4
November 2024

The paper entitled “Wolstenholme Primes and Group Determinants of Cyclic Groups” has been accepted to Proceedings of the Japan Academy, Ser. A.
https://projecteuclid.org/journals/proceedings-of-the-japan-academy-series-a-mathematical-sciences/volume-100/issue-9/Wolstenholme-primes-and-group-determinants-of-cyclic-groups/10.3792/pjaa.100.011.full?tab=ArticleLink
November 2024

[Award] The paper entitled “Cross-Action Cross-Subject Skeleton Action Recognition Via Simultaneous Action-Subject Learning with Two-Step Feature Removal” won the Best Paper Award 1st Runner-up at the IEEE International Conference on Image Processing (ICIP2024).
https://ieeexplore.ieee.org/document/10647253
October 2024

The following two papers have been accepted to Asia Pacific Signal and Information Processing Association (APSIPA) Annual Summit and Conference (APSIPA ASC2024):
・Chihiro Watanabe, Hirokazu Kameoka, "GE2E-AC: Generalized End-to-End Loss Training for Accent Classification"
・Xiao Zhang, Haoran Xing, Mingxue Song, Daiki Takeuchi, Noboru Harada, Shoji Makino, "Prediction-Error-Based Adaptive SpecAugment for Fine-Tuning the Masked Model on Audio Classification Tasks"
October 2024

The paper entitled “Rewindable Quantum Computation and Its Equivalence to Cloning and Adaptive Postselection” has been accepted to Theory of Computing Systems.
https://link.springer.com/article/10.1007/s00224-024-10208-5
September 2024

The following paper has been accepted to IEEE Signal Processing Magazine:
Reinhold Haeb-Umbach, Tomohiro Nakatani, Marc Delcroix, Christoph Boeddeker, Tsubasa Ochiai, "Microphone Array Signal Processing and Deep Learning for Speech Enhancement"
https://ieeexplore.ieee.org/document/10819706
September 2024

The following paper has been accepted to EURASIP Journal on Audio, Speech, and Music Processing:
Tetsuya Ueda, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki & Shoji Makino, "DOA-Informed Switching Independent Vector Extraction and Beamforming for Speech Enhancement in Underdetermined Situations"
https://asmp-eurasipjournals.springeropen.com/articles/10.1186/s13636-024-00373-3
September 2024

The paper entitled “Masked Modeling Duo: Towards Universal Audio Pre-Training Framework” has been accepted to IEEE Transactions on Audio, Speech and Language Processing (TASLP).
https://ieeexplore.ieee.org/document/10502167
September 2024

The paper entitled “Exploring Pre-Trained General-Purpose Audio Representations for Heart Murmur Detection” has been accepted to IEEE Engineering in Medicine and Biology Society (EMBC2024).
https://arxiv.org/pdf/2404.17107
September 2024

The following three papers have been accepted to the Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE2024):
・Daiki Takeuchi, Masahiro Yasuda, Daisuke Niizumi, Noboru Harada, "Towards Learning a Difference-Aware General-Purpose Audio Representation"
・Tomoya Nishida, Noboru Harada, Daisuke Niizumi, Davide Albertini, Roberto Sannino, Simone Pradolini, Filippo Augusti, Keisuke Imoto, Kota Dohi, Harsh Purohit, Takashi Endo, Yohei Kawaguchi, "Description and Discussion on DCASE 2024 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring"
・Daisuke Niizumi, Noboru Harada, Yasunori Ohishi, Daiki Takeuchi, Masahiro Yasuda, "ToyADMOS2#: Yet Another Data for the DCASE2024 Challenge Task 2 First-Shot Anomalous Sound Detection"
September 2024

The paper entitled “Probabilistic Unitary and State Synthesis with Optimal Accuracy” has been accepted to the 6th International Workshop on Quantum Compilation (IWQC 2024).
https://dl.acm.org/doi/pdf/10.1145/3663576
September 2024

The paper entitled “Zeta Limits for The Spectrum of Quantum Rabi Models” has been accepted to Journal of Mathematical Physics.
https://arxiv.org/pdf/2304.08943
July 2024

The following paper has been accepted to the IEEE/ACM Transactions on Audio, Speech, and Language Processing:
Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri, "Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance"
https://ieeexplore.ieee.org/document/10606400
July 2024

The paper entitled “Acoustic-Based 3D Human Pose Estimation Robust to Human Position” has been accepted to the British Machine Vision Conference (BMVC2024).
https://bmva-archive.org.uk/bmvc/2024/papers/Paper_135/paper.pdf
July 2024

The following paper has been accepted to IEEE Access:
Takanori Ashihara, Marc Delcroix, Yusuke Ijima, Makio Kashino, "Unveiling the Linguistic Capabilities of a Self-Supervised Speech Model Through Cross-Lingual Benchmark and Layer-Wise Similarity Analysis"
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10597571
July 2024

The following paper has been accepted to the EURASIP Journal on Audio, Speech, and Music Processing:
Daiki Mori, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa, Norihide Kitaoka, "Recognition of Target Domain Japanese Speech Using Language Model Replacement"
https://asmp-eurasipjournals.springeropen.com/articles/10.1186/s13636-024-00360-8
July 2024

The paper entitled “Northcott Numbers for Generalized Weighted Weil Heights” has been accepted to Acta Arithmetica.
https://arxiv.org/pdf/2308.03981
July 2024

The paper entitled “Finite-Key Security of Differential-Phase-Shift QKD” has been accepted to the Asian Quantum Information Science Conference (AQIS2024).
July 2024

The paper entitled “Spacing Distribution for Quantum Rabi Models” has been accepted to Journal of Physics A: Mathematical and Theoretical.
https://arxiv.org/pdf/2310.09811
July 2024

The paper entitled “Activity Measures of Dynamical Systems Over Non-Archimedean Fields” has been accepted to Discrete and Continuous Dynamical Systems.
https://arxiv.org/pdf/1901.01075
June 2024

The following 7 papers have been accepted to Interspeech2024.
・Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Masato Mimura, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Taichi Asami, "Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation"
・Hiroshi Sato, Takafumi Moriya, Masato Mimura, Shota Horiguchi, Tsubasa Ochiai, Takanori Ashihara, Atsushi Ando, Kentaro Shinayama, Marc Delcroix, "SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling"
・Kenichi Fujita, Takanori Ashihara, Marc Delcroix, Yusuke Ijima, "Lightweight Zero-shot Text-to-Speech with Mixture of Adapters"
・Marvin Tammen, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, Simon Doclo, "Array Geometry-Robust Attention-Based Neural Beamformer for Moving Speakers"
・Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Yuto Kondo, "FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation"
・Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Yuto Kondo, “PRVAE-VC2: Non-Parallel Voice Conversion by Distillation of Speech Representations”
・Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Masahiro Yasuda, Shunsuke Tsubaki, Keisuke Imoto, "M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation"
June 2024

The following two papers have been accepted to the European Signal Processing Conference (EUSIPCO2024).
・Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, Noboru Harada, “Learning to Assess Subjective Impressions Conveyed Through Speech”
・Shunsuke Tsubaki, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Keisuke Imoto, “Refining Knowledge Transfer on Audio-Image Temporal Agreement for Audio-Text Cross Retrieval”
May 2024

The paper entitled “Detection of Acute Myeloid Leukemia without Labeling Individual Blood Cells” has been accepted to the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC2024).
May 2024

The paper entitled “Probabilistic Unitary Synthesis with Optimal Accuracy” has been accepted to ACM Transactions on Quantum Computing.
https://arxiv.org/html/2301.06307v2
May 2024

The paper “Non-Locality of Conjugation Symmetry: Characterization and Examples in Quantum Network Sensing” has been accepted to New Journal of Physics.
https://arxiv.org/html/2309.12523v2
April 2024

The following two papers have been accepted to IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP).
・Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Shogo Seki, "VoiceGrad: Non-Parallel Any-to-Many Voice Conversion with Annealed Langevin Dynamics"
・Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino, "Masked Modeling Duo: Towards a Universal Audio Pre-training Framework"
April 2024

The paper entitled “Partition Functions for Non-Commutative Harmonic Oscillators and Related Divergent Series” has been accepted to Indagationes Mathematicae.
https://www.sciencedirect.com/science/article/abs/pii/S0019357724000612?via%3Dihub
April 2024

The following two papers have been accepted to Mathematical Foundations for Post-Quantum Cryptography.
・Ryosuke Nakahama, “Representation Theory of sl(2,R)=su(1,1) and a Generalization of Non-commutative Harmonic Oscillators”
・Cid Reyes-Bustos, “Towards Hash Functions Based on Group-subgroup Pair Graphs”
March 2024

The following two papers have been accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2024).
・Yu Mitsuzumi, Akisato Kimura, Hisashi Kashima, "Understanding and Improving Source-free Domain Adaptation from a Theoretical Perspective"
・Takuhiro Kaneko, "Improving Physics Augmented Continuum Neural Radiance Fields-Based Geometry-Agnostic System Identification with Lagrangian Particle Optimization"
March 2024

Our paper “Geometrically-regularized fast independent vector extraction by pure majorization-minimization” has been accepted to IEEE Transactions on Signal Processing.
https://ieeexplore.ieee.org/document/10466407
February 2024

The following 5 papers have been accepted to satellite workshops in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2024).
・Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Takanori Ashihara, Shoko Araki, Jan Cernocky, "Probing Self-supervised Learning Models with Target Speech Extraction"
・Thilo von Neumann, Christoph Boeddeker, Tobias Cord-Landwehr, Marc Delcroix, Reinhold Haeb-Umbach, "Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization"
・Rino Kimura, Tomohiro Nakatani, Naoyuki Kamo, Marc Delcroix, Shoko Araki, Tetsuya Ueda, Shoji Makino, "Diffusion model-based MIMO speech denoising and dereverberation"
・Hao Shi, Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, "Ensemble Inference for Diffusion Model-based Speech Enhancement"
・Bo He, Shiqi Zhang, Xianrui Wang, Zheng Qiu, Daiki Takeuchi, Daisuke Niizumi, Noboru Harada, Shoji Makino, “Light Gated Multi Mini-patch Extractor for Audio Classification”
Also, the following 2 papers have been accepted to Show and Tell Demos in ICASSP2024.
・Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino, “Target Speech Spotting and Extraction Based on ConceptBeam”
・Thilo von Neumann, Christoph Boeddeker, Marc Delcroix, Reinhold Haeb-Umbach, "MeetEval, Show Me the Errors! Interactive Visualization of Transcript Alignments for the Analysis of Conversational ASR"
February 2024

Our paper “Warped diffusion for latent differentiation inference” has been accepted to International Conference on Artificial Intelligence and Statistics (AISTATS2024).
https://proceedings.mlr.press/v238/nakano24a.html
January 2024

Our paper “A motivic construction of the de Rham-Witt complex” has been accepted to Journal of Pure and Applied Algebra. This is a joint work with the University of Tokyo.
https://www.sciencedirect.com/science/article/pii/S0022404923002840

2023

December 2023

Our paper “Efficient algorithm for K-multiple-means” has been accepted to ACM SIGMOD International Conference on Management of Data (SIGMOD2024). This is a joint work with NTT Computer and Data Science Laboratories and NTT Human Informatics Laboratories.
https://dl.acm.org/doi/10.1145/3639273
December 2023

The following 13 papers have been accepted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2024).
・Naohiro Tawara, Marc Delcroix, Atsushi Ando, Atsunori Ogawa, “NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization”
・Dominik Klement, Mireia Diez, Federico Landini, Lukas Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara, “Discriminative Training of VBx Diarization”
・Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocky, “Target Speech Extraction with Pre-Trained Self-Supervised Learning Models”
・William Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe, “Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing”
・Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino, “Neural Network-Based Virtual Microphone Estimation with Virtual Microphone and Beamformer-Level Multi-Task Loss”
・Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri, “How Does End-To-End Speech Recognition Training Impact Speech Enhancement Artifacts?”
・Keigo Wakayama, Tsubasa Ochiai, Marc Delcroix, Masahiro Yasuda, Shoichiro Saito, Shoko Araki, Akira Nakayama, “Online Target Sound Extraction with Knowledge Distillation from Partially Non-Causal Teacher”
・Takanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami, Yusuke Ijima, “What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis”
・Kenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya, Yusuke Ijima, “Noise-Robust Zero-Shot Text-to-Speech Synthesis Conditioned on Self-Supervised Speech-Representation Model with Adapters”
・Shiqi Zhang, Daiki Takeuchi, Noboru Harada, Shoji Makino, “Unrestricted Global-Phase-Bias Aware Single-channel Speech Enhancement with Conformer-based Metric GAN”
・Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, “Selecting N-Lowest Scores for Training MOS Prediction Models”
・Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, “Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator”
・Masahiro Nakano, Ryohei Shibue, Kunio Kashino, “Sunflower Strategy for Bayesian Relational Data Analysis”
December 2023

Our paper “Blind and spatially-regularized online joint optimization of source separation, dereverberation, and noise reduction” has been accepted to IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP).
https://ieeexplore.ieee.org/document/10384838
December 2023

Our paper “Variational autoencoder-based neural electrocardiogram synthesis trained by FEM-based heart simulator” has been accepted to Cardiovascular Digital Health Journal.
https://www.cvdigitalhealthjournal.com/article/S2666-6936(23)00110-X/fulltext
December 2023

Our paper “Gene correction and overexpression of TNNI3 improve impaired relaxation in engineered heart tissue model of pediatric restrictive cardiomyopathy” has been accepted to Development, Growth & Differentiation. This is a joint work with Osaka University.
https://onlinelibrary.wiley.com/doi/10.1111/dgd.12909
December 2023

Our paper “Probabilistic state synthesis based on optimal convex approximation” has been accepted to npj Quantum Information.
https://www.nature.com/articles/s41534-023-00793-7
December 2023

Our paper “Fidelity-estimation method for graph states with depolarizing noise” has been accepted to Physical Review Research. This is a joint work with Chuo University.
https://journals.aps.org/prresearch/abstract/10.1103/PhysRevResearch.5.043260
November 2023

Our paper “Effective detection of variable celestial objects using machine learning-based periodic analysis” has been accepted to Astronomy and Computing.
N. Chihara, T. Takata, Y. Fujiwara, K. Noda, K. Toyoda, K. Higuchi, M. Onizuka, “Effective detection of variable celestial objects using machine learning-based periodic analysis,” Astronomy and Computing, 2023.
https://www.sciencedirect.com/science/article/pii/S221313372300080X
November 2023

Our paper “Comprehensive noise analysis for acousto-optic measurement of airborne sound” has been accepted to IEEE Transactions on Instrumentation and Measurement.
Kenji Ishikawa, Yoshifumi Shiraki, Takehiro Moriya, Atsushi Ishizawa, Kenichi Hitachi, Katsuya Oguri, “Comprehensive noise analysis for acousto-optic measurement of airborne sound,” IEEE Transactions on Instrumentation and Measurement, 2023.
Comprehensive Noise Analysis for Acousto-Optic Measurement of Airborne Sound | IEEE Journals & Magazine | IEEE Xplore
November 2023

Our paper “Physical-model-based reconstruction of three-dimensional sound field from multi-directional measurement by parallel phase-shift interferometry” has been accepted to The Journal of the Acoustical Society of America (JASA).
Haruka Nozawa, Mayuko Imanishi, Yasuhiro Oikawa, Kenji Ishikawa, “Physical-model-based reconstruction of three-dimensional sound field from multi-directional measurement by parallel phase-shift interferometry,” The Journal of the Acoustical Society of America (JASA), 2023.
Physical-model-based reconstruction of three-dimensional sound field from multi-directional measurement by parallel phase-shift interferometry | The Journal of the Acoustical Society of America | AIP Publishing
November 2023

Ryosuke Nakahama presented his work “Holographic and symmetry breaking operators of holomorphic discrete series representations for (SU(3,3),SO*(6))” at the Tunisian-Japanese Conference: Geometric and Harmonic Analysis on Homogeneous Spaces and Applications as an invited speaker.
Tunisian-Japanese Conference - 2023 (7th)
November 2023

Seiseki Akibue presented his work “Optimal convex approximation of quantum superposition and its application in reshaping compilation errors” at Quantum Innovation 2023 as an invited speaker.
quantum innovation 2023
November 2023

Yuki Takeuchi presented his work “Quantum Computation And Sensing On Network” at the International Symposium on Wireless Personal Multimedia Communications (WPMC2023).
Tutorials wpmc2023 – WPMC-Home
October 2023

Our paper “Phase randomization: A data augmentation for domain adaptation in human action recognition” has been accepted to Pattern Recognition.
Yu Mitsuzumi, Go Irie, Akisato Kimura, Atsushi Nakazawa, “Phase randomization: A data augmentation for domain adaptation in human action recognition,” Pattern Recognition, 2023.
https://doi.org/10.1016/j.patcog.2023.110051
October 2023

Our paper “General form of almost instantaneous fixed-to-variable-length codes and optimal code tree construction” has been accepted to IEEE Transactions on Information Theory.
Ryosuke Sugiura, Yutaka Kamamoto, Takehiro Moriya, “General form of almost instantaneous fixed-to-variable-length codes and optimal code tree construction,” IEEE Transactions on Information Theory, Volume 69, Issue 12, December 2023.
DOI: 10.1109/TIT.2023.3314812
October 2023

Our paper “Hodge cohomology with a ramification filtration, I” has been accepted to Mathematische Zeitschrift.
Shane Kelly, Hiroyasu Miyazaki, “Hodge cohomology with a ramification filtration, I,” Mathematische Zeitschrift, 12 June 2023.
October 2023

Our paper “Cuspidal components of Siegel modular forms for large discrete series representations of Sp_4(R)” has been accepted to Manuscripta Mathematica.
Shuji Horinaga, Hiroaki Narita, “Cuspidal components of Siegel modular forms for large discrete series representations of Sp4(R),” Manuscripta Mathematica, 2023.
https://arxiv.org/abs/2301.11552v1
October 2023

Our paper “Anonymous quantum sensing” has been accepted to International Conference on Quantum, Nano/Bio, and Micro Technologies (ICQNM).
Hiroto Kasai, Yuki Takeuchi, Hideaki Hakoshima, Yuichiro Matsuzaki, Yasuhiro Tokura, “Anonymous quantum sensing,” International Conference on Quantum, Nano/Bio, and Micro Technologies (ICQNM), 2023.
Journal of the Physical Society of Japan 91, 074005 (2022)
September 2023

Our paper “Non-parallel whisper-to-normal speaking style conversion using auxiliary classifier variational autoencoder” has been accepted to IEEE Access.
Shogo Seki, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, “Non-parallel whisper-to-normal speaking style conversion using auxiliary classifier variational autoencoder,” IEEE Access, Vol.11, pp. 44590 - 44599, 2023.
https://ieeexplore.ieee.org/document/10109017
September 2023

Our paper “A century of acousto-optics: From early discoveries to modern sensing of sound with light” has been accepted to Acoustics Today.
Acoustics Today, Vol. 19, Iss. 3, pgs. 54-62
September 2023

Our paper “Heat kernel for the quantum Rabi model: II. Propagators and spectral determinants” has been accepted to Journal of Physics A: Mathematical and Theoretical.
Cid Reyes-Bustos, “Heat kernel for the quantum Rabi model: II. Propagators and spectral determinants,” Journal of Physics A: Mathematical and Theoretical, 56 (2023) 425302.
August 2023

Our paper “Towards defensive letter design” has been accepted to IAPR Asian Conference on Pattern Recognition (ACPR2023).
Rentaro Kataoka, Akisato Kimura, Seiichi Uchida, “Towards defensive letter design,” IAPR Asian Conference on Pattern Recognition (ACPR), 2023.
https://link.springer.com/chapter/10.1007/978-3-031-47634-1_9
August 2023

Our paper “Selective scene text removal” has been accepted to British Machine Vision Conference (BMVC2023).
Hayato Mitani, Akisato Kimura, Seiichi Uchida, “Selective scene text removal,” British Machine Vision Conference (BMVC), 2023.
https://proceedings.bmvc2023.org/521/
August 2023

Our paper “Efficient network representation learning via cluster similarity” has been accepted to Data Science and Engineering.
Yasuhiro Fujiwara, Yasutoshi Ida, Atsutoshi Kumagai, Masahiro Nakano, Akisato Kimura, Naonori Ueda, "Efficient network representation learning via cluster similarity," Data Science and Engineering, 2023.
https://link.springer.com/article/10.1007/s41019-023-00222-x
August 2023

The following 6 papers have been accepted to APSIPA Annual Summit and Conference (APSIPA-ASC).
・Yuki Kitagishi, Hosana Kamiyama, Naohiro Tawara, Atsunori Ogawa, Noboru Miyazaki, and Taichi Asami, “Coarse-age loss: A new training method using coarse-age labeled data for speaker age estimation.”
・Koharu Horii, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa, and Norihide Kitaoka, “Language modeling for spontaneous speech recognition based on disfluency labeling and generation of disfluent text.”
・Keigo Hojo, Daiki Mori, Yukoh Wakabayashi, Kengo Ohta, Atsunori Ogawa, and Norihide Kitaoka, “Combining multiple end-to-end speech recognition models based on density ratio approach.”
・Tatsunari Takagi, Atsunori Ogawa, Norihide Kitaoka, and Yukoh Wakabayashi, “Streaming end-to-end speech recognition using a CTC decoder with substituted linguistic information.”
・Chihiro Watanabe, Hirokazu Kameoka, “DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion.”
・Keisuke Takazawa, Hirokazu Kameoka, Masahiro Yukawa, “Multiple sound source tracking based on generative modeling and recursive Bayesian filtering of spatial gradient spectra.”
August 2023

The following 4 papers have been accepted to Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE2023).
・Boxin Liu, Shiqi Zhang, Daiki Takeuchi, Daisuke Niizumi, Noboru Harada, Shoji Makino, “Masked modeling duo vision transformer with multi-layer feature fusion on respiratory sound classification”
・Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi, Noboru Harada, Kunio Kashino, “Similarity-discrepancy disentanglement for audio difference captioning”
・Kota Dohi, Keisuke Imoto, Noboru Harada, Daisuke Niizumi, Yuma Koizumi, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Yohei Kawaguchi, “Description and Discussion on DCASE 2023 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring”
・Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, “ToyADMOS2+: New Toyadmos Data and Benchmark Results of the First-Shot Anomalous Sound Event Detection Baseline”
August 2023

Our paper “Covering Families of the Asymmetric Quantum Rabi Model: η-Shifted Non-commutative Harmonic Oscillators” has been accepted to Communications in Mathematical Physics.
Cid Reyes-Bustos, Masato Wakayama, “Covering Families of the Asymmetric Quantum Rabi Model: η-Shifted Non-commutative Harmonic Oscillators,” Communications in Mathematical Physics, volume 403, pages 1429–1476 (2023).
https://link.springer.com/article/10.1007/s00220-023-04825-3
July 2023

Our paper “MIMO-NeRF: Fast neural rendering with multi-input multi-output neural radiance fields” has been accepted to IEEE/CVF International Conference on Computer Vision (ICCV2023).
Takuhiro Kaneko, “MIMO-NeRF: Fast Neural Rendering with Multi-Input Multi-Output Neural Radiance Fields,” IEEE/CVF International Conference on Computer Vision (ICCV2023), 2023.
https://openaccess.thecvf.com/content/ICCV2023/html/Kaneko_MIMO-NeRF_Fast_Neural_Rendering_with_Multi-input_Multi-output_Neural_Radiance_Fields_ICCV_2023_paper.html
July 2023

Our paper “Frame-level event representation learning for semantic-level generation and editing of avatar motion” has been accepted to ACM International Conference on Multimodal Interaction (ICMI2023).
Ayaka Ideno, Takuhiro Kaneko, Tatsuya Harada, “Frame-Level Event Representation Learning for Semantic-Level Generation and Editing of Avatar Motion,” ACM International Conference on Multimodal Interaction (ICMI), 2023.
https://dl.acm.org/doi/abs/10.1145/3577190.3614175
July 2023

Our paper “Divide-and-conquer verification method for noisy intermediate-scale quantum computation” has been accepted to Asian Quantum Information Science Conference (AQIS2023).
Yuki Takeuchi, Yasuhiro Takahashi, Tomoyuki Morimae, and Seiichiro Tani, “Divide-and-conquer verification method for noisy intermediate-scale quantum computation,” Asian Quantum Information Science Conference (AQIS), 2023.
https://doi.org/10.22331/q-2022-07-07-758
June 2023

Our paper “First-shot anomaly sound detection for machine condition monitoring: A domain generalization baseline” has been accepted to European Signal Processing Conference (EUSIPCO2023).
Noboru Harada, Daisuke Niizumi, Yasunori Ohishi, Daiki Takeuchi, Masahiro Yasuda, “First-Shot Anomaly Sound Detection for Machine Condition Monitoring: A Domain Generalization Baseline,” European Signal Processing Conference (EUSIPCO), 2023.
DOI: 10.23919/EUSIPCO58844.2023.10289721
https://ieeexplore.ieee.org/document/10289721
June 2023

Our paper “W2N-AVSC: Audiovisual Extension For Whisper-To-Normal Speech Conversion” has been accepted to European Signal Processing Conference (EUSIPCO2023).
Shogo Seki, Kanami Imamura, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Noboru Harada, “W2N-AVSC: Audiovisual Extension For Whisper-To-Normal Speech Conversion,” European Signal Processing Conference (EUSIPCO), 2023.
DOI: 10.23919/EUSIPCO58844.2023.10289823
https://ieeexplore.ieee.org/document/10289823
June 2023

Our paper “PRVAE-VC: Non-parallel many-to-many voice conversion with perturbation-resistant variational autoencoder” has been accepted to ISCA Speech Synthesis Workshop (SSW2023).
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, “PRVAE-VC: Non-parallel many-to-many voice conversion with perturbation-resistant variational autoencoder,” ISCA Speech Synthesis Workshop (SSW), 2023.
https://www.isca-archive.org/ssw_2023/tanaka23_ssw.html
DOI:10.21437/SSW.2023-14
May 2023

The following 12 papers have been accepted to Interspeech 2023.
・Marc Delcroix, Naohiro Tawara, Mireia Diez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukas Burget, Shoko Araki, “Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization”
・Naoyuki Kamo, Marc Delcroix, Tomohiro Nakatani, “Target Speaker Extraction with Conditional Diffusion Model”
・Shoko Araki, Ayako Yamamoto, Tsubasa Ochiai, Kenichi Arai, Atsunori Ogawa, Tomohiro Nakatani, Toshio Irino, “Impact of Residual Noise and Artifacts in Speech Enhancement Errors on Intelligibility of Human and Machine”
・Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo, “Downstream Task Agnostic Speech Enhancement Conditioned on Self-Supervised Representation Loss”
・Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami, “Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data”
・Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka, Yusuke Ijima, Taichi Asami, Marc Delcroix, Yukinori Honma, “SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?”
・Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, “Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization”
・Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki, “iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN”
・Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino, “Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation”
・Kou Tanaka, Takuhiro Kaneko, Hirokazu Kameoka, Shogo Seki, “CFVC: Conditional Filtering for Controllable Voice Conversion”
・Hikaru Yanagida, Yusuke Ijima, Naohiro Tawara, "Influence of Personal Traits on Impressions of One's Own Voice"
・Yuki Kitagishi, Naohiro Tawara, Atsunori Ogawa, Ryo Masumura, Taichi Asami, "What are differences? Comparing DNN and human by their performance and characteristics in speaker age estimation"
May 2023

Our paper “Finite-key security analysis of differential-phase-shift quantum key distribution” has been accepted to Physical Review Research.
Akihiro Mizutani, Yuki Takeuchi, Kiyoshi Tamaki, “Finite-key security analysis of differential-phase-shift quantum key distribution,” Physical Review Research, 5, 023132, published 30 May 2023.
Phys. Rev. Research 5, 023132 (2023) - Finite-key security analysis of differential-phase-shift quantum key distribution (aps.org)
April 2023

Our paper “Uncovering the largest community in social networks at scale” has been accepted to International Joint Conference on Artificial Intelligence (IJCAI2023).
Shohei Matsugu, Yasuhiro Fujiwara, Hiroaki Shiokawa, “Uncovering the Largest Community in Social Networks at Scale,” International Joint Conference on Artificial Intelligence (IJCAI2023), 2023.
https://www.ijcai.org/proceedings/2023/0250
April 2023

Our paper “Rewindable Quantum Computation and Its Equivalence to Cloning and Adaptive Postselection” has been accepted to Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2023).
Ryo Hiromasa, Akihiro Mizutani, Yuki Takeuchi, Seiichiro Tani, “Rewindable Quantum Computation and Its Equivalence to Cloning and Adaptive Postselection”
https://doi.org/10.48550/arXiv.2206.05434
March 2023

Our paper “Listening human behavior: 3D human pose estimation with acoustic signals” has been accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2023).
Yuto Shibata, Yutaka Kawashima, Mariko Isogawa, Go Irie, Akisato Kimura, Yoshimitsu Aoki, “Listening human behavior: 3D human pose estimation with acoustic signals,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
https://openaccess.thecvf.com/content/CVPR2023/html/Shibata_Listening_Human_Behavior_3D_Human_Pose_Estimation_With_Acoustic_Signals_CVPR_2023_paper.html
March 2023

Our paper “Unsupervised intrinsic image decomposition with LiDAR intensity” has been accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2023).
Shogo Sato, Yasuhiro Yao, Taiga Yoshida, Takuhiro Kaneko, Shingo Ando, Jun Shimamura, “Unsupervised intrinsic image decomposition with LiDAR intensity,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
https://openaccess.thecvf.com/content/CVPR2023/html/Sato_Unsupervised_Intrinsic_Image_Decomposition_With_LiDAR_Intensity_CVPR_2023_paper.html
February 2023

The following 9 papers have been accepted to IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2023).
・Xiaomeng Wu, Yongqing Sun, Akisato Kimura, “Deep quantigraphic image enhancement via comparametric equations.”
・Atsunori Ogawa, Takafumi Moriya, Naoyuki Kamo, Naohiro Tawara, Marc Delcroix, “Iterative shallow fusion of backward language model for end-to-end speech recognition”
・Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Roshan Sharma, Kohei Matsuura, Shinji Watanabe, “Speech summarization of long spoken document: Improving memory efficiency of speech/text encoders”
・Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix, Ryo Masumura, “Leveraging Large Text Corpora for End-to-End Speech Summarization”
・Thilo von Neumann, Christoph Boeddeker, Keisuke Kinoshita, Marc Delcroix, Reinhold Haeb-Umbach, “On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems”
・Taishi Nakashima, Rintaro Ikeshita, Nobutaka Ono, Shoko Araki, Tomohiro Nakatani, “Fast Online Source Steering Algorithm for Tracking Single Moving Source Using Online Independent Vector Analysis”
・Shogo Seki, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko, “JSV-VC: Jointly Trained Speaker Verification and Voice Conversion Models”
・Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino, “Masked modeling duo: Learning Representations by Encouraging Both Networks to Model the Input”
・Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki, “Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis”
February 2023

Our paper “Deep attentive time warping” has been accepted to Pattern Recognition.
Shinnosuke Matsuo, Xiaomeng Wu, Guntag Atarsaikhan, Akisato Kimura, Kunio Kashino, Brian Kenji Iwana, Seiichi Uchida, “Deep attentive time warping,” Pattern Recognition, 2023.
https://doi.org/10.1016/j.patcog.2022.109201
February 2023

Our paper “Streaming end-to-end target speaker automatic speech recognition and activity detection” has been accepted to IEEE Access.
T. Moriya, H. Sato, T. Ochiai, M. Delcroix and T. Shinozaki, "Streaming End-to-End Target-Speaker Automatic Speech Recognition and Activity Detection," in IEEE Access, 2023. doi: 10.1109/ACCESS.2023.3243690.
https://ieeexplore.ieee.org/document/10041133
February 2023

Our paper “Determination of microphone acoustic center from sound field projection measured by optical interferometry” has been accepted to The Journal of the Acoustical Society of America (JASA).
Denny Hermawanto, Kenji Ishikawa, Kohei Yatabe, Yasuhiro Oikawa, “Determination of microphone acoustic center from sound field projection measured by optical interferometry,” The Journal of the Acoustical Society of America, 2023.
https://doi.org/10.1121/10.0017246 J. Acoust. Soc. Am. 153, 1138–1146 (2023)
February 2023

Our paper “I/Q demodulator based optical camera communications” has been accepted to IEEE Photonics Journal.
Hiroaki Matsunaga, Tomohiro Yendo, Wataru Kihara, Yoshifumi Shiraki, Takashi G. Sato, Takehiro Moriya, “I/Q Demodulator Based Optical Camera Communications,” IEEE Photonics Journal, 2023.
June 2022 IEEE Photonics Journal 14(3):1-1
DOI:10.1109/JPHOT.2022.3166283
February 2023

Our paper “Decoding selective attention from EEG during simultaneous presentation of two melodies” has been accepted to Neuroscience2021.
January 2023

Our paper “Efficient network representation learning via cluster similarity” has been accepted to International Conference on Database Systems for Advanced Applications (DASFAA).
Yasuhiro Fujiwara, Yasutoshi Ida, Atsutoshi Kumagai, Masahiro Nakano, Akisato Kimura, Naonori Ueda, “Efficient Network Representation Learning via Cluster Similarity,” in Proc. International Conference on Database Systems for Advanced Applications (DASFAA), 2023.
January 2023

Our paper “Segment-less continuous speech separation of meetings: Training and evaluation criteria” has been accepted to IEEE/ACM Transactions on Audio, Speech and Language Processing.
T. von Neumann, K. Kinoshita, C. Boeddeker, M. Delcroix and R. Haeb-Umbach, "Segment-less Continuous Speech Separation of Meetings: Training and Evaluation Criteria," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, doi: 10.1109/TASLP.2022.3228629.
https://ieeexplore.ieee.org/abstract/document/9982413
January 2023

Our paper “Neural target speech extraction: An overview” has been accepted to IEEE Signal Processing Magazine.
Katerina Zmolikova, Marc Delcroix, Tsubasa Ochiai, Keisuke Kinoshita, Jan Cernocky, Dong Yu, "Neural target speech extraction: An overview," IEEE Signal Processing Magazine, 2023. DOI: 10.1109/MSP.2023.3240008.
https://ieeexplore.ieee.org/abstract/document/10113382
January 2023

Our paper “Mask-based neural beamforming for moving speakers with self-attention-based tracking” has been accepted to IEEE/ACM Transactions on Audio, Speech and Language Processing.
Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Shoko Araki, “Mask-based Neural Beamforming for Moving Speakers with Self-Attention-based Tracking,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023, DOI: 10.1109/TASLP.2023.3237172.
https://ieeexplore.ieee.org/document/10017367
January 2023

Our paper “Distribution matching for dimming control in visible-light region-of-interest signaling” has been accepted to IEEE Photonics Journal.
Phuc Duc Nguyen, Yoshifumi Shiraki, Kenji Ishikawa, Jun Muramatsu, Noboru Harada, Takehiro Moriya, “Distribution matching for dimming control in visible-light region-of-interest signaling,” IEEE Photonics Journal, 2023. DOI: 10.1109/JPHOT.2022.3233092
January 2023

Naohiro Tawara has received the Best Reviewer Award in IEEE Spoken Language Technology Workshop (SLT 2022). https://www.slt2022.org/best-papers.php

2021

03/11/2021

[Award] Tsubasa Ochiai has received The 16th Itakura Prize Innovative Young Researcher Award from the Acoustical Society of Japan.
"Joint Optimization of Microphone Array Signal Processing and Speech Recognition"

https://acoustics.jp/awards/itakura/
02/18/2021

[Award] Hirokazu Kameoka has received The 10th RIEC Award from Research Institute of Electrical Communication Tohoku University.
"Audio Signal Decomposition and Scene Analysis"

https://www.riec.tohoku.ac.jp/ja/info/riec-award/r2/
01/28/2021

Rintaro Ikeshita has received the 49th Awaya Kiyoshi Science Promotion Award from the Acoustical Society of Japan.
Rintaro Ikeshita and Tomohiro Nakatani, "Multiplicative update algorithms for independent vector analysis," 2020 Autumn meeting of Acoustical Society of Japan, 1-1-13, 2020.
01/21/2021

Onkar Krishna, Go Irie, Xiaomeng Wu, Takahito Kawanishi and Kunio Kashino have received a "Best Research Paper Award Honorable Mention" at the 26th Symposium on Sensing via Image Information.
Onkar Krishna, Go Irie, Xiaomeng Wu, Takahito Kawanishi and Kunio Kashino (2020). "Adaptive Spotting: 3D Point Cloud Object Search Based on Deep Reinforcement Learning," The 26th Symposium on Sensing via Image Information.