Full Program
Asia Pacific Signal and Information Processing Association Annual Summit and Conference 2022
The conference program runs over three days, November 8-10, 2022, at the Empress Convention Center.
Session | Room | Chair | |
TuAM1-1 (SS13:Advanced Topics on Sound Event and Scene Analysis) | Chiang Mai 1 | Nobutaka Ono, Keisuke Imoto, Tatsuya Komatsu | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | On Sorting and Padding Multiple Targets for Sound Event Localization and Detection With Permutation Invariant and Location-Based Training | Robin Scheibler; Tatsuya Komatsu; Yusuke Fujita; Michael Hentschel |
10.55-11.15 | How Information on Acoustic Scenes and Sound Events Mutually Benefits Event Detection and Scene Classification Tasks | Ami Igarashi; Keisuke Imoto; Yuka Komatsu; Shunsuke Tsubaki; Shuto Hario; Tatsuya Komatsu | |
11.15-11.35 | Compressed Sensing of Sparse Spectrum Using Distributed Sound-To-Light Conversion Device Blinkies | Satoshi Motoyama; Natsuki Ueno; Yuma Kinoshita; Nobutaka Ono | |
11.35-11.55 | CochlScene: Acquisition of Acoustic Scene Data Using Crowdsourcing | Il-Young Jeong; Jeongsoo Park | |
11.55-12.15 | Vision Transformer Based Audio Classification Using Patch-Level Feature Fusion | Juan Luo; Jielong Yang; Eng Siong Chng; Xionghu Zhong | |
12.15-12.35 | Self-Consistency Training With Hierarchical Temporal Aggregation for Sound Event Detection | Yunlong Li; Xiujuan Zhu; Mingyu Wang; Ying Hu | |
Session | Room | Chair | |
TuAM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Tomoki Toda | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Music Similarity Calculation of Individual Instrumental Sounds Using Metric Learning | Yuka Hashizume; Li Li; Tomoki Toda |
10.55-11.15 | Investigation of Noise-Reverberation-Robustness of Modulation Spectral Features for Speech-Emotion Recognition | Taiyang Guo; Sixia Li; Masashi Unoki; Shogo Okada | |
11.15-11.35 | Combine Waveform and Spectral Methods for Single-Channel Speech Enhancement | Miao Li; Hui Zhang; Xueliang Zhang | |
11.35-11.55 | Perceptual Loss Function for Speech Enhancement Based on Generative Adversarial Learning | Xin Bai; Xueliang Zhang; Hui Zhang; Haifeng Huang | |
11.55-12.15 | Joint Speech Activity and Overlap Detection With Multi-Exit Architecture | Ziqing Du; Kai Liu; Xucheng Wan; Huan Zhou | |
Session | Room | Chair | |
TuAM1-3 (Human Biometrics and Security Systems) | Chiang Mai 3 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | On Wrist Vein Recognition for Human Biometrics | Felix Marattukalam; David Cole; Pranav Gulati; Waleed H. Abdulla |
10.55-11.15 | Continuous Authentication on Unconstrained Activities Using Window and Cycle Based Segmentation | Lina Septiana; Narishige Abe; Tomoaki Matsunami; Hidetsugu Uchida; Kazuki Osamura; Shigefumi Yamada | |
11.15-11.35 | Smoothed Teager Energy Cepstral Feature for Replay Attack Detection on Voice Assistants | Madhu R Kamble; Anand Therattil; Hemant A. Patil; M. Ali Basha Shaik; Vikram Vij | |
11.35-11.55 | Disentangled Speaker Representation Learning via Mutual Information Minimization | Sung Hwan Mun; Min Hyun Han; Minchan Kim; Dongjune Lee; Nam Soo Kim | |
11.55-12.15 | Contribution of Timbre and Shimmer Features to Deepfake Speech Detection | Anuwat Chaiwongyen; Norranat Songsriboonsit; Suradej Duangpummet; Jessada Karnjana; Waree Kongprawechnon; Masashi Unoki | |
12.15-12.35 | Combined 2D and 3D Convolution Residual Attention Network for Hand Gesture Recognition | Chang-Ting Tsai; Jian-Jiun Ding | |
Session | Room | Chair | |
TuAM1-4 (Signal Image and Information Processing Theory and Methods) | Board Room 2 | Daranee Hormdee | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Investigate Bidirectional Functional Brain Networks Using Directed Information | Qiang Li |
10.55-11.15 | Effective ASR Error Correction Leveraging Phonetic, Semantic Information and N-Best Hypotheses | Hsin-Wei Wang; Bi-Cheng Yan; Yi-Cheng Wang; Berlin Chen | |
11.15-11.35 | A Lossless Audio Codec Based on Hierarchical Residual Prediction | Taiyo Mineo; Hayaru Shouno | |
11.35-11.55 | Investigating Low-Distortion Speech Enhancement With Discrete Cosine Transform Features for Robust Speech Recognition | Yu-Sheng Tsao; Jeih-weih Hung; Kuan-Hsun Ho; Berlin Chen | |
11.55-12.15 | Consistent MDT-Tucker: A Hankel Structure Constrained Tucker Decomposition in Delay Embedded Space | Ryuki Yamamoto; Hidekata Hontani; Akira Imakura; Tatsuya Yokota | |
12.15-12.35 | Sound Reproduction With a Circular Loudspeaker Array Using Differential Beamforming Method | Yankai Zhang; Jiayi Mao; Yefeng Cai; Chao Ye | |
Session | Room | Chair | |
TuAM1-5 (SS01: Reconfigurable Computing and Performance Evaluation) | Board Room 3 | Ukrit Mankong | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Design and System Implementation of a Configurable Optical Interconnection Network | Bowen Yang; Junyong Deng; Jiaying Luo; Yu Feng |
10.55-11.15 | 2S-AGCN Human Behavior Recognition Based on New Partition Strategy | Jin Wu; Lei Wang; Gege Chong; Haoran Feng | |
11.15-11.35 | Design of Optimal FIR Digital Filter by Swarm Optimization Technique | Jin Wu; Yaqiong Gao; Ling Yang; Zhengdong Su | |
11.35-11.55 | Design and Implementation of Reconfigurable Array Structure for Convolutional Neural Network Supporting Data Reuse | Rui Shan; Ziqing Huo; Xiaoshuo Li; Huan Chang; Rui Qin | |
11.55-12.15 | DBR: A Depth-Branch-Resorting Algorithm for Locality Exploration in Graph Processing | Lin Jiang; Ru Feng; Junjie Wang; Junyong Deng | |
12.15-12.35 | Performance Evaluation of Popularity-Aware Dynamic Clustering Scheme for Distributed Caching in ICN | Mikiya Yoshida; Yusuke Ito; Yurino Sato; Hiroyuki Koga | |
Session | Room | Chair | |
TuAM1-6 (SS03: Security Techniques of Speaker Recognition) | Chiang Mai 4 | Xiao-Lei Zhang (online)/Navadon Khunlertgit | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Masking Speech Feature to Detect Adversarial Examples for Speaker Verification | Xing Chen; Jiadi Yao; Xiao-Lei Zhang |
10.55-11.15 | F0 Modification via PV-TSM Algorithm for Speaker Anonymization Across Gender | Candy Olivia Mawalim; Shogo Okada; Masashi Unoki | |
11.15-11.35 | Pay Attention to Hard Trials | Lantian Li; Di Wang; Dong Wang | |
11.35-11.55 | A Multi-Task Framework of Speaker Recognition With TTS Data Augmentation | Xingjia Xie; Yiming Zhi; Beibei Ouyang; Qingyang Hong; Lin Li | |
11.55-12.15 | Source Tracing: Detecting Voice Spoofing | Tinglong Zhu; Xingming Wang; Xiaoyi Qin; Ming Li | |
12.15-12.35 | Replay Attack Detection Based on Voice and Non-Voice Sections for Speaker Verification | Ananda Garin Mills; Patthranit Kaewcharuay; Pannathorn Sathirasattayanon; Suradej Duangpummet; Kasorn Galajit; Jessada Karnjana; Pakinee Aimmanee | |
Session | Room | Chair | |
TuAM1-7 (Speech, Language, and Audio 2) | Chiang Mai 5 | Natthanan Promsuk | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Learning Emotion Information for Expressive Speech Synthesis Using Multi-Resolution Modulation-Filtered Cochleagram | Kaili Zhang; Masashi Unoki |
10.55-11.15 | VocEmb4SVS: Improving Singing Voice Separation With Vocal Embeddings | Chenyi Li; Yi Li; Xuhao Du; Yaolong Ju; Shichao Hu; Zhiyong Wu | |
11.15-11.35 | Dialect-Aware Semi-Supervised Learning for End-To-End Multi-Dialect Speech Recognition | Sayaka Shiota; Ryo Imaizumi; Ryo Masumura; Hitoshi Kiya | |
11.35-11.55 | Design and Construction of Japanese Multimodal Utterance Corpus With Improved Emotion Balance and Naturalness | Daisuke Horii; Akinori Ito; Takashi Nose | |
11.55-12.15 | Non-Parallel Voice Conversion Based on Free-Energy Minimization of Speaker-Conditional Restricted Boltzmann Machine | Takuya Kishida; Toru Nakashika | |
12.15-12.35 | The TNT Team System Descriptions of Cantonese, Mongolian and Kazakh for IARPA OpenASR21 Challenge | Kai Tang; Jing Zhao; Jinghao Yan; Jian Kang; Haoyu Wang; Jinpeng Li; Shuzhou Chai; Guan-Bo Wang; Shen Huang; Guoguo Chen; Pengfei Hu; Wei-Qiang Zhang | |
Session | Room | Chair | |
TuAM1-8 (SS10: Real-world sensing technologies of human function) | Board Room 4 | Yumie Ono/Toshihisa Tanaka | |
Date | Time | Title | Authors |
8 November 2022 | 10.35-10.55 | Evaluation of Cognitive Test Results Using Concentration Estimation From Facial Videos | Terumi Umematsu; Masanori Tsujikawa; Hideyuki Sawada |
10.55-11.15 | Clustering of Advertising Images Using Electroencephalogram | Ingon Chanpornpakdi; Motoi Noda; Toshihisa Tanaka; Yuval Harpaz; Amir B. Geva | |
11.15-11.35 | Evaluation of Influence of Positions and Numbers of EEG Electrodes on Quantification of Independent Component Matrix | Ingon Chanpornpakdi; Ryohei Mizuochi; Maro G Machizawa | |
11.35-11.55 | Wearable Microfluidic Biosensor for Real-Time Sweat Content Monitoring | Hiroyuki Kudo; Yuto Goto | |
11.55-12.15 | Ear-EEG Based Eye State Classification Using Convolutional Neural Network | Chang-Hee Han; Han-Jeong Hwang | |
12.15-12.35 | Development of Virtual-Reality-Based Exergame for Lower-Extremity Rehabilitation of Stroke Patients | Mamiko Sasakawa; Daigo Ito; Ryo Ogura; Takanori Tominaga; Yumie Ono | |
Session | Room | Chair | |
TuPM1-1 (Speech, Language, and Audio 1) | Chiang Mai 1 | Rohan Kumar Das | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Is Your Baby Fine at Home? Baby Cry Sound Detection in Domestic Environments | Tanmay Khandelwal; Rohan Kumar Das; Eng-Siong Chng |
15.40-16.00 | Acoustic Echo and Noise Canceller Using Shared-Error Normalized Least Mean Square Algorithm | Kenta Iwai; Takanobu Nishiura | |
16.00-16.20 | Subband-Based Spectrogram Fusion for Speech Enhancement by Combining Mapping and Masking Approaches | Hao Shi; Longbiao Wang; Sheng Li; Jianwu Dang; Tatsuya Kawahara | |
16.20-16.40 | Neural Virtual Microphone Estimator: Application to Multi-Talker Reverberant Mixtures | Hanako Segawa; Tsubasa Ochiai; Marc Delcroix; Tomohiro Nakatani; Rintaro Ikeshita; Shoko Araki; Takeshi Yamada; Shoji Makino | |
16.40-17.00 | SE-Mixer: Towards an Efficient Attention-Free Neural Network for Speech Enhancement | Kai Wang; Bengbeng He; Wei-Ping Zhu | |
17.00-17.20 | How Should We Evaluate Synthesized Environmental Sounds | Yuki Okamoto; Keisuke Imoto; Shinnosuke Takamichi; Takahiro Fukumori; Yoichi Yamashita | |
17.20-17.40 | FeatureCut: An Adaptive Data Augmentation for Automated Audio Captioning | Zhongjie Ye; Yuqing Wang; Helin Wang; Dongchao Yang; Yuexian Zou | |
Session | Room | Chair | |
TuPM1-2 (Signal Processing Systems: Design and Implementation) | Chiang Mai 2 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Robust Steerable Differential Beamformer for Concentric Circular Array With Directional Microphones | Weilong Huang; Jinwei Feng |
15.40-16.00 | A Deep Proximal-Unfolding Method for Monaural Speech Dereverberation | Meihuang Wang; Minmin Yuan; Andong Li; Chengshi Zheng; Xiaodong Li | |
16.00-16.20 | Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization | Xiao-Ying Zhao; Qiu-Shi Zhu; Jie Zhang | |
16.20-16.40 | HouseX: A Fine-Grained House Music Dataset and Its Potential in the Music Industry | Xinyu Li | |
16.40-17.00 | Interpretable Control for Emotional Text-To-Speech System Toward Development of Sympathetic Educational-Support Robots | Jingyi Feng; Tomohiro Yoshikawa; Tomoki Toda | |
17.00-17.20 | Direction-Aware Target Speaker Extraction With a Dual-Channel System Based on Conditional Variational Autoencoders Under Underdetermined Conditions | Rui Wang; Li Li; Tomoki Toda | |
17.20-17.40 | LCN: Label Correction Based on Network Prediction for Cross-Modal Retrieval With Noisy Labels | Daiki Okamura; Ryosuke Harakawa; Masahiro Iwahashi | |
Session | Room | Chair | |
TuPM1-3 (Signal Image and Information Processing Theory and Methods) | Chiang Mai 3 | Tatsuya Yokota | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Using Self-Learning Representations for Objective Assessment of Patient Voice in Dysphonia | Shaoxiang Dang; Tetsuya Matsumoto; Yoshinori Takeuchi; Hiroaki Kudo; Takashi Tsuboi; Yasuhiro Tanaka; Masahisa Katsuno |
15.40-16.00 | Fast Signal Completion Algorithm With Cyclic Convolutional Smoothing | Hiromu Takayama; Tatsuya Yokota | |
16.00-16.20 | Single-Channel Speech Enhancement Student Under Multi-Channel Speech Enhancement Teacher | Yuzhu Zhang; Hui Zhang; Xueliang Zhang | |
16.20-16.40 | Distance-Based Dynamic Weight: A Novel Framework for Multi-Source Information Fusion | Cuiping Cheng; Xiaoning Zhang; Taihao Li | |
16.40-17.00 | Improvement of the Direction-Of-Arrival Estimation Method Using a Single Channel Microphone by Correcting a Spectral Slope of Speech | Masaki Ikeuchi; Hiroki Tanji; Takahiro Murakami | |
17.00-17.20 | Studying Human-Based Speaker Diarization and Comparing to State-Of-The-Art Systems | Simon W. McKnight; Aidan O. T. Hogg; Vincent W. Neo; Patrick A. Naylor | |
17.20-17.40 | Optimization of CU Partition Based on Texture Degree in H.266/VVC | Jingyuan Tang; Songlin Sun | |
Session | Room | Chair | |
TuPM1-4 (SS02: Deep Learning Systems and Applications for Cloud, Fog, and Edge) | Board Room 2 | Jia-Ching Wang | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Selection of Supplementary Acoustic Data for Meta-Learning in Under-Resourced Speech Recognition | I-Ting Hsieh; Chung-Hsien Wu; Zhe-Hong Zhao |
15.40-16.00 | Using Prosodic Phrase-Based VQVAE on Audio ALBERT for Speech Emotion Recognition | Jia-Hao Hsu; Chung-Hsien Wu; Tsung-Hsien Yang | |
16.00-16.20 | ESPnet-ONNX: Bridging a Gap Between Research and Production | Masao Someki; Yosuke Higuchi; Tomoki Hayashi; Shinji Watanabe | |
16.20-16.40 | Multi-Loss Function in Robust Convolutional Autoencoder for Reconstruction Low-Quality Fingerprint Image | Farchan Hakim Raswa; Franki Halberd; Agus Harjoko; Wahyono; Chung-Ting Lee; Yung-Hui Li; Jia-Ching Wang | |
Session | Room | Chair | |
TuPM1-5 (Research Review) | Board Room 3 | Jesin James | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | EmotionGUI: Visualisation and Annotation of Emotions in a 2D Space for Multi-Modal Signals | Jesin James; Felix Marattukalam; Owen Eng; Aron Jeremiah |
15.40-16.00 | Enhancing the Performance of Automatic Speech Recognition With Optical Microphone Technology Through Data Augmentation Approach: A Pilot Study | Ruei-Ci Shen; Ji-Yan Han; Ying-Hui Lai | |
16.00-16.20 | Process Monitoring Based on Nearest Correlation and Variational Graph Auto-Encoder and Its Application to Tennessee Eastman Process | Yoshiaki Uchida; Koichi Fujiwara | |
16.20-16.40 | Decoding of Individual Emotions Induced During Interaction With Voice-User Interface Using Electroencephalography | Jun-Seok Lee; Ga-Young Choi; Ji-Yoon Lee; Jong-Gyu Shin; Sang-Ho Kim; Han-Jeong Hwang | |
16.40-17.00 | Leverage Limited Features of Partial Fingerprint Recognition Using Improved Siamese Network With Self-Spatial Attention | Farchan Hakim Raswa; Franki Halberd; Agus Harjoko; Chung-Ting Lee; Yung-Hui Li; Pao-Chi Chang; Jia-Ching Wang | |
17.00-17.20 | Design and Signal Analysis of a Compact Antenna for UWB MIMO Systems | Long Jin; Yangmiao Lin; Iickho Song; Ruohan Zhang | |
17.20-17.40 | A Filtered-x Active Noise Control Algorithm Robust to Impulsive Noise Using Novel Subband Adaptive Filter Algorithm | Chan Park; Minho Lee; PooGyeon Park | |
Session | Room | Chair | |
TuPM1-6 (Speech, Language, and Audio 2) | Chiang Mai 4 | Christian H Ritz | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Neural Conversational Speech Synthesis With Flexible Control of Emotion Dimensions | Hiroki Mori; Hironao Nishino |
15.40-16.00 | Temporal Feedback Convolutional Recurrent Neural Networks for Speech Command Recognition | Taejun Kim; Juhan Nam | |
16.20-16.40 | Impact of Compression on the Performance of the Room Impulse Response Interpolation Approach to Spatial Audio Synthesis | Hualin Ren; Christian Ritz; Jiahong Zhao; Daeyoung Jang | |
16.40-17.00 | Machine Anomalous Sound Detection Based on Self-Supervised Classification | Shuxian Wang; Jun Du; Yajian Wang | |
17.00-17.20 | A Study on Low-Latency Recognition-Synthesis-Based Any-To-One Voice Conversion | Yi-Yang Ding; Li-Juan Liu; Yu Hu; Zhen-Hua Ling | |
17.20-17.40 | Speech Enhancement With Perceptually-Motivated Optimization and Dual Transformations | Xucheng Wan; Kai Liu; Ziqing Du; Huan Zhou | |
Session | Room | Chair | |
TuPM1-7 (SS12: Advanced signal detection and inspection technology) | Chiang Mai 5 | Settha Tangkawanit | |
Date | Time | Title | Authors |
8 November 2022 | 15.20-15.40 | Automatic Sound Detection and Notification System Using MFCC | Jaruwat Patmanee; Prapatson Kotipang; Pawarisorn Sinpeang; Surachet Kanprachar; Settha Tangkawanit |
15.40-16.00 | Sound Identification Using MFCC With Machine Learning | Pattarapong Kammee; Chairat Pinthong; Surachet Kanprachar; Settha Tangkawanit | |
16.20-16.40 | Direct-Lattice Adaptive Notch Filter for Frequency Estimation and Tracking | Prayuth Inban; Rachu Punchalard; Chawalit Benjangkaprasert | |
16.40-17.00 | Distance Estimation Between Camera and Vehicles From an Image Using YOLO and Machine Learning | Rattapoom Waranusast; Panomkhawn Riyamongkol; Pattanawadee Pattanathaburt | |
17.00-17.20 | OCR Application for Cancer Care | Settha Tangkawanit; Jiraporn Pooksook; Jirarat Ieamsaard; Panupong Sornkhom | |
17.20-17.40 | The Development of Mobile Application for Assisting COVID-19 Antigen Test Kit Results Reading | Rattapoom Waranusast; Pattanawadee Pattanathaburt | |
17.40-18.00 | Matched Filter Detector for Textile Fiber Classification of Signals With Near-Infrared Spectrum | Suchart Yammen; Wachira Limsripraphan | |
Session | Room | Chair | |
WedAM1-1 (SS11: Transfer Learning for Real World) | Chiang Mai 1 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Semantics-Guided Knowledge Integration for Domain Adaptation Few-Shot Relation Extraction | Zeyuan Wang; Yifan Du; Guangwei Zhang; Ruifan Li; Yongping Xiong; Chuang Zhang |
9.20-9.40 | PVGCRA: Prediction Variance Guided Cross Region Domain Adaptation | Ran Xu; Yixiang Huang; Chuang Zhang | |
9.40-10.00 | Multi-Branch Network for Few-Shot Learning | Kai Ren; Zijie Guo; Zhimin Zhang; Rui Zhu; Xiaoxu Li | |
10.00-10.20 | Few-Shot Classification With Feature Reconstruction Bias | Zhen Li; Lang Wang; Shuo Ding; Xiaochen Yang; Xiaoxu Li | |
10.20-10.40 | Dual Prototypical Network for Robust Few-Shot Image Classification | Qi Song; Zebin Peng; Luchen Ji; Xiaochen Yang; Xiaoxu Li | |
10.40-11.00 | Graph Evolving and Embedding in Transformer | Jen-Tzung Chien; Chia-Wei Tsao | |
Session | Room | Chair | |
WedAM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Yuthapong Somchit | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Punctuation Restoration for Singaporean Spoken Languages: English, Malay, and Mandarin | Abhinav Rao; Ho Thi-Nga; Chng Eng Siong |
9.20-9.40 | C-CycleTransGAN: A Non-Parallel Controllable Cross-Gender Voice Conversion Model With CycleGAN and Transformer | Changzeng Fu; Chaoran Liu; Carlos Toshinori Ishi; Hiroshi Ishiguro | |
9.40-10.00 | The Realization and Perception of Narrow Focus in English Sentences by Cantonese EFL Learners | Chong Cao; Aijun Li | |
10.00-10.20 | Cross-Lingual Dysarthria Severity Classification for English, Korean, and Tamil | Eun Jung Yeo; Kwanghee Choi; Sunhee Kim; Minhwa Chung | |
10.20-10.40 | 3M: An Effective Multi-View, Multi-Granularity, and Multi-Aspect Modeling Approach to English Pronunciation Assessment | Fu-An Chao; Tien-Hong Lo; Tzu-I Wu; Yao-Ting Sung; Berlin Chen | |
10.40-11.00 | I Feel Stressed Out: A Mandarin Speech Stress Dataset With New Paradigm | Shuaiqi Chen; Xiaofen Xing; Guodong Liang; Xiangmin Xu | |
Session | Room | Chair | |
WedAM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Hiroyoshi Ito | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | End-To-End Reinforcement Learning of Robotic Manipulation With Robust Keypoints Representation | Tianying Wang; En Yen Puang; Marcus Lee; Wei Jing; Yan Wu |
9.20-9.40 | BEAM - an Algorithm for Detecting Phishing Link | Sea Ran Cleon Liew; Ngai Fong Law | |
9.40-10.00 | I2CR: Improving Noise Robustness on Keyword Spotting Using Inter-Intra Contrastive Regularization | Dianwen Ng; Jia Qi Yip; Tanmay Surana; Zhao Yang; Chong Zhang; Yukun Ma; Chongjia Ni; Eng Siong Chng; Bin Ma | |
10.00-10.20 | Human-In-The-Loop Chord Progression Generator With Generative Adversarial Network | Yoshiteru Matsumoto; Hiroyoshi Ito; Hiroko Terasawa; Yuya Yamamoto; Yuzuru Hiraga; Masaki Matsubara | |
10.20-10.40 | A Resource-Limited FPGA-Based MobileNetV3 Accelerator | Yutana Jewajinda; Thanapol Thongkum | |
10.40-11.00 | CG-Net: A Compound Gaussian Prior Based Unrolled Imaging Network | Carter A Lyons; Raghu G. Raj; Margaret Cheney | |
Session | Room | Chair | |
WedAM1-4 (Signal Image and Information Processing Theory and Methods) | Board Room 2 | Sakgasit Ramingwong | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | A Policy-Based Approach to the SpecAugment Method for Low Resource E2E ASR | Rui Li; Guodong Ma; Dexin Zhao; Ranran Zeng; Xiaoyu Li; Hao Huang |
9.20-9.40 | Manifold Rewiring for Unlabeled Imaging | Valentin Debarnot; Vinith Kishore; Cheng Shi; Ivan Dokmanic | |
9.40-10.00 | CRDet: An Object-Context-Aware Detection Network for Oriented Object in Aerial Images | Lele Liang; Linghan Li; Qi Liu; Yuchao Dai; Mingyi He | |
10.00-10.20 | Effects of Incorporating a Deep-Unfolding Framework Into a Deep Neural Network: Implications for Image Restoration | Tatsuki Itasaka; Masahiro Okuda | |
10.20-10.40 | Cross-Modal Knowledge Distillation With Dropout-Based Confidence | Won Ik Cho; Jeunghun Kim; Nam Soo Kim | |
10.40-11.00 | A Multi-Objective Perceptual Aware Loss Function for End-To-End Target Speaker Separation | Zhan Jin; Bang Zeng; Fan Zhang | |
Session | Room | Chair | |
WedAM1-5 (Research Review) | Board Room 3 | Ying-Hui Lai | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | EEG-Based Anomaly Detection Model by One-Class Support Vector Machine for Dream Enactment Behavior in REM Sleep Behavior Disorder | Shumpei Date; Koichi Fujiwara; Yukiyoshi Sumi; Hiroshi Kadotani; Makoto Imai; Keiko Ogawa |
9.20-9.40 | Development of Heat Stroke Detection Model Based on Heart Rate Variability Using LSTM-AutoEncoder | Shota Saeda; Koshi Ota; Koichi Fujiwara; Takatomi Kubo; Toshitaka Yamakawa; Aozora Yamamoto; Yuki Maruno; Manabu Kano | |
9.40-10.00 | Driving Fitness Evaluation Model for Patients With Schizophrenia Based on Driving Data of Healthy Participants and Random Forest | Shuji Tsunoda; Koichi Fujiwara; Seiko Miyata; Akiko Yamaguchi; Shogo Kitagawa; Yuki Konishi; Reiji Yoshimura; Isao Taguchi; Yutaka Sawa; Kunihiro Iwamoto; Norio Ozaki | |
10.00-10.20 | Method for Estimating Test Contrast Peak Time in Computed Tomography Angiography | Toshihide Otsuki; Kazuto Sakamoto; Homare Saisho; Hiroyoshi Yokoi; Toshitaka Yamakawa | |
10.20-10.40 | Development of an Epileptic Seizure Prediction Algorithm Based on R-R Intervals With Temporal Convolutional Networks | Rikumo Ode; Koichi Fujiwara; Miho Miyajima; Toshitaka Yamakawa; Manabu Kano; Taketoshi Maehara | |
Session | Room | Chair | |
WedAM1-6 (SS17: Emerging Diseases and Smart Image Processing) | Chiang Mai 4 | Krisana Chinnasarn | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Pre-Processing SARS-CoV-2 Sequence Data for Application of Machine Learning Techniques for Visualization and Clustering of Virus Characteristics | Juhyeon Kim; Insung Ahn |
9.20-9.40 | Educational Multi-Purpose Kit for Coding and Robotic Design | Atikhun Thongpool; Daranee Hormdee; Raksit Chutipakdeevong; Wasan Tansakul | |
9.40-10.00 | Forecasting Dengue Fever in France and Thailand Using XGBoost | Thanin Methiyothin; Insung Ahn | |
10.00-10.20 | Fine-Tuning BERT for Question and Answering Using PubMed Abstract Dataset | Saeyeon Cheon; Insung Ahn | |
10.20-10.40 | Coarse X-Ray Lumbar Vertebrae Pose Localization Using Triangulation Correspondence | Watcharaphong Yookwan; Jiranun Sangrueng; Krisana Chinnasarn | |
10.40-11.00 | 4G Signal RSSI Recommendation System for ISP Quality of Service Improvement | Tanatpon Duangta; Watcharaphong Yookwan; Krisana Chinnasarn; Anuparp Boonsongsrikul | |
Session | Room | Chair | |
WedAM1-7 (Speech, Language, and Audio 2) | Chiang Mai 5 | Sutasinee Thovuttikul | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | SE-DPTUNet: Dual-Path Transformer Based U-Net for Speech Enhancement | Bengbeng He; Kai Wang; Wei-Ping Zhu |
9.20-9.40 | Encoder Re-Training With Mixture Signals on FastMVAE Method | Shuhei Yamaji; Taishi Nakashima; Nobutaka Ono; Li Li; Hirokazu Kameoka | |
9.40-10.00 | Unsupervised Disentanglement of Timbral, Pitch, and Variation Features From Musical Instrument Sounds With Random Perturbation | Keitaro Tanaka; Yoshiaki Bando; Kazuyoshi Yoshii; Shigeo Morishima | |
10.00-10.20 | Estimation of Transfer Coefficients and Signals of Sound-To-Light Conversion Device Blinky Under Saturation | Kosuke Nishida; Natsuki Ueno; Yuma Kinoshita; Nobutaka Ono | |
10.20-10.40 | Design and Evaluation of Instrument Sound Identification Difficulty for the Deaf and Hard-Of-Hearing | Shiho Akaki; Rumi Hiraga; Keiichi Yasu; Keiji Tabuchi; Hiroko Terasawa | |
10.40-11.00 | Correcting, Rescoring and Matching: An N-Best List Selection Framework for Speech Recognition | Chin-Hung Kuo; Kuan-Yu Chen | |
Session | Room | Chair | |
WedAM1-8 (SS04: Advanced Signal Processing and Machine Learning for Audio and Speech Applications) | Board Room 4 | Shoji Makino | |
Date | Time | Title | Authors |
9 November 2022 | 9.00-9.20 | Hyperbolic Timbre Embedding for Musical Instrument Sound Synthesis Based on Variational Autoencoders | Futa Nakashima; Tomohiko Nakamura; Norihiro Takamune; Satoru Fukayama; Hiroshi Saruwatari |
9.20-9.40 | Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-To-Speech | Yusuke Nakai; Yuki Saito; Kenta Udagawa; Hiroshi Saruwatari | |
9.40-10.00 | Inverse-Free Online Independent Vector Analysis With Flexible Iterative Source Steering | Taishi Nakashima; Nobutaka Ono | |
10.00-10.20 | Accelerating Online Algorithm Using Geometrically Constrained Independent Vector Analysis With Iterative Source Steering | Kana Goto; Tetsuya Ueda; Li Li; Takeshi Yamada; Shoji Makino | |
10.20-10.40 | A Dilated Inception Convolutional Neural Network for Gridless DOA Estimation Under Low SNR Scenarios | Zhi-Wei Tan; Yuan Liu; Andy W. H. Khong | |
10.40-11.00 | Efficient Low-Latency Convolution With Uniform Filter Partition and Its Evaluation on Real-Time Blind Source Separation | Yui Kuriki; Taishi Nakashima; Kouei Yamaoka; Natsuki Ueno; Yukoh Wakabayashi; Nobutaka Ono; Ryo Sato | |
Session | Room | Chair | |
WedPM1-1 (SS05: Advanced Image and Video Processing using Deep Learning) | Chiang Mai 1 | Chul Lee | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Object Segmentation Using Parametric Representation | Hochang Rhee; Hyung Il Koo; Nam Ik Cho |
14.20-14.40 | Deep Color Constancy Using Multi-Band NIR | Jeong-Won Ha; Dong-keun Han; Min-Je Park; Jong-Ok Kim | |
14.40-15.00 | Smooth Panoramic Walkthrough for Adjacent Panoramic Viewpoints With Dense Spherical Matching Points | Kyungjune Lee; Mingyu Jang; Sanghoon Lee; Kim Taewan | |
15.00-15.20 | Region Adaptive Self-Attention for an Accurate Facial Emotion Recognition | Seongmin Lee; Jeonghaeng Lee; Minsik Kim; Sanghoon Lee | |
15.20-15.40 | Quality Enhancement of Screen Content Video Using Dual-Input CNN | Ziyin Huang; Yue Cao; Sik-Ho Tsang; Yui-Lam Chan; Kin-Man Lam | |
15.40-16.00 | Underwater Image Enhancement Using Realistic Dataset With Turbidity and Color Distortion | Eunpil Park; Eunsung Jo; Jae-Young Sim | |
Session | Room | Chair | |
WedPM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Ashish Panda | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Neural Vocoder Feature Estimation for Dry Singing Voice Separation | Jaekwon Im; Soonbeom Choi; Sangeon Yong; Juhan Nam |
14.20-14.40 | Adapting GCC-PHAT to Co-Prime Circular Microphone Arrays for Speech Direction of Arrival Estimation Using Neural Networks | Jiahong Zhao; Christian Ritz | |
14.40-15.00 | A Novel Approach to Structured Pruning of Neural Network for Designing Compact Audio-Visual Wake Word Spotting System | Haotian Wang; Jun Du; Hengshun Zhou; Heng Lu; Yuhang Cao | |
15.00-15.20 | Hierarchic Temporal Convolutional Network With Attention Fusion for Target Speaker Extraction | Zihao Chen; Wenbo Qiu; Haitao Xu; Ying Hu | |
15.20-15.40 | Acoustic Model Adaption Using x-Vectors for Improved Automatic Speech Recognition | Meet Soni; Aditya Raikar; Ashish Panda; Sunil Kumar Kopparapu | |
15.40-16.00 | Acoustic Pornography Recognition Using Convolutional Neural Networks and Bag of Refinements | Lifeng Zhou; Kaifeng Wei; Yuke Li; Yiya Hao; Weiqiang Yang; Haoqi Zhu | |
Session | Room | Chair | |
WedPM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Jen-Tzung Chien | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | An Optimal Vehicle Counting Framework for Non-Canonical CCTV Placements | Ng Chin Hooi; Edwin Tan Chee Pin; Chiew Yeong Shiong; Lim Mei Kuan |
14.20-14.40 | Response Sentence Modification Using a Sentence Vector for a Flexible Response Generation of Retrieval-Based Dialogue Systems | Ryota Yahagi; Akinori Ito; Takashi Nose; Yuya Chiba | |
14.40-15.00 | End-To-End Stereo Audio Coding Using Deep Neural Networks | Wootaek Lim; Inseon Jang; Seungkwon Beack; Jongmo Sung; Taejin Lee | |
15.00-15.20 | Neural Beamformer With Automatic Detection of Notable Sounds for Acoustic Scene Classification | Sota Ichikawa; Takeshi Yamada; Shoji Makino | |
15.20-15.40 | DNN-Based Frequency-Domain Permutation Solver for Multichannel Audio Source Separation | Fumiya Hasuike; Daichi Kitamura; Rui Watanabe | |
15.40-16.00 | Detection Method From 4K Images Using SSD300 Without Retraining | Kei Irie; Kiyoshi Nishikawa | |
Session | Room | Chair | |
WedPM1-4 (Signal Image and Information Processing Theory and Methods) | Board Room 2 | Navadon Khunlertgit | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | PAformer: Visually Indistinguishable Bolt Defect Recognition Based on Bolt Position and Attributes | Wenshuo Lou; Ke Zhang; Yangjie Xiao; Xiwang Guo; Jiacun Wang |
14.20-14.40 | Adapted Spectrogram Transformer for Unsupervised Cross-Domain Acoustic Anomaly Detection | Gilles Van De Vyver; Zhaoyi Liu; Koustabh Dolui; Danny Hughes; Sam Michiels | |
14.40-15.00 | A Two-Stage Cascading Method Based on Finetuning in Semi-Supervised Domain Adaptation Semantic Segmentation | Huiying Chang; Kaixin Chen; Ming Wu | |
15.00-15.20 | Landmark Management in the Application of Radar SLAM | Shuai Sun; Beth Jelfs; Kamran Ghorbani; Glenn I. Matthews; Chris Gilliam | |
15.20-15.40 | Parameterization of Dominant Spectral Peak Trajectory for Whisper Speech Recognition | Chang Feng; Xiaolong Wu; Mingxing Xu; Thomas Fang Zheng | |
15.40-16.00 | Specific Emitter Identification at Different Time Based on Multi-Domain Migration | Jiaxu Liu; Jianqing Li; Jiao Wang; Hao Huang | |
Session | Room | Chair | |
WedPM1-5 (Research Review) | Board Room 3 | Koichi Fujiwara | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Long-Term Prognostic Prediction of West Syndrome Based on Scalp EEG Using Convolution Neural Network Autoencoder | Tatsuki Saito; Koichi Fujiwara; Jun Natsume; Ryosuke Suzui |
14.20-14.40 | Modification of RRI Data by NBEATS Model | Hongtao Chen; Koichi Fujiwara; Manabu Kano | |
14.40-15.00 | Transformer With Noise Divider | Mun-Hyung Lee; Seon-Woo Lee; Jung-Mu Choi; Jang-Woo Kwon | |
15.00-15.20 | Schizophrenia Classification Based on the Natural Language Processing Technology-A Pilot Study | Ying Hsuan Chen; Pei-Yun Lin; Tsung-Tse Ho; Yuh-Jer Chang; Ying-Hui Lai | |
15.20-15.40 | Signed Graph Balancing Based on Spectral Clustering | Haruki Yokota; Junya Hara; Yuichi Tanaka | |
15.40-16.00 | Graph Signal Sampling for Multiple Generator Functions | Junya Hara; Yuichi Tanaka | |
Session | Room | Chair | |
WedPM1-6 (Signal Processing for Audio and Speech Applications) | Chiang Mai 4 | Tomoyosi Akiba | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Semi-Supervised ASR Based on Iterative Joint Training With Discrete Speech Synthesis | Keiya Takagi; Tomoyosi Akiba; Hajime Tsukada |
14.20-14.40 | Analysis of Amplitude and Frequency Perturbation in the Voice for Fake Audio Detection | Kai Li; Yao Wang; Minh Le Nguyen; Masato Akagi; Masashi Unoki | |
14.40-15.00 | Deep Hashing for Speaker Identification and Retrieval Based on Auditory Sparse Representation | Dung Kim Tran; Masato Akagi; Masashi Unoki | |
15.00-15.20 | Divide and Conquer: A Low-Complexity Neural Network for Monophonic Speech Enhancement | Bingxiao Fang; Liang Liu | |
15.20-15.40 | Domain Adaptation and Language Conditioning to Improve Phonetic Posteriorgram Based Cross-Lingual Voice Conversion | Pin-Chieh Hsu; Nobuaki Minematsu; Daisuke Saito | |
15.40-16.00 | Von Mises Mixture Model-Based DNN for Sign Indetermination Problem in Phase Reconstruction | Nguyen Binh Thien; Yukoh Wakabayashi; Geng Yuting; Kenta Iwai; Takanobu Nishiura | |
Session | Room | Chair | |
WedPM1-7 (Speech, Language, and Audio 2) | Chiang Mai 5 | Daranee Hormdee | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Speaker Representation Learning via Contrastive Loss With Maximal Speaker Separability | Zhe Li; Man Wai Mak |
14.20-14.40 | Design of Discriminators in GAN-Based Unsupervised Learning of Neural Post-Processors for Suppressing Localized Spectral Distortion | Riku Ogino; Kohei Saijo; Tetsuji Ogawa | |
14.40-15.00 | Simultaneous Frequency Estimation for Three or More Sinusoids Based on Sinusoidal Constraint Differential Equation | Kenta Yamada; Yoshiki Masuyama; Yukoh Wakabayashi; Nobutaka Ono | |
15.00-15.20 | Do You Know How Humans Sound? Exploring a Qualification Test Design for Crowdsourced Evaluation of Voice Synthesis Quality | Moe Yaegashi; Susumu Saito; Teppei Nakano; Tetsuji Ogawa | |
15.20-15.40 | Exploring the Gender Difference on Mandarin Tone Realization in Lombard Speech | Weizhong Zhang; Jian Gong; Kai Sheng; Yuhong Sun; William Bellamy; Xiaoli Ji | |
Session | Room | Chair | |
WedPM1-8 (Data Analytics and Machine Learning) | Board Room 4 | Chern Hong Lim | |
Date | Time | Title | Authors |
9 November 2022 | 14.00-14.20 | Improving Co-SVD for Cold-Start Recommendations Using Sparsity Reduction | Low Jia Ming; Chern Hong Lim; Ian K. T. Tan |
14.20-14.40 | Epoch-Wise Double Descent Triggered by Learning a Single Sample | Aoshi Kawaguchi; Hiroshi Kera; Toshihiko Yamasaki | |
14.40-15.00 | Current Source Localization Using Deep Prior With Depth Weighting | Hajime Yano; Rio Yamana; Ryoichi Takashima; Tetsuya Takiguchi; Seiji Nakagawa | |
15.00-15.20 | A Proposal for Emotion-Expressive Editor: EmoEditor by Font Changing | Yuki Shimamura; Michiharu Niimi | |
15.20-15.40 | Traceback Memory Reduction for Three-Sequence Alignment Algorithm With Affine Gap Models | Rui-Ting Chien; Mao-Jan Lin; Yang-Ming Yeh; Yi-Chang Lu | |
15.40-16.00 | Acceleration of Subspace Learning Machine via Particle Swarm Optimization and Parallel Processing | Hongyu Fu; Yijing Yang; Yuhuai Liu; Joseph Lin; Ethan Harrison; Vinod K. Mishra; C.-C. Jay Kuo | |
Session | Room | Chair | |
WedPM2-1 (SS05: Advanced Image and Video Processing using Deep Learning) | Chiang Mai 1 | Chul Lee | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Enhanced Bidirectional Motion Estimation Using Feature Refinement for HDR Imaging | An Gia Vien; Truong Thanh Nhat Mai; Seonghyun Park; Gahyeon Kim; Chul Lee |
16.40-17.00 | Fast Asymmetric Bilateral Motion Estimation for Video Frame Interpolation | Jintae Kim; Junheum Park; Chang-Su Kim | |
17.00-17.20 | Future Object Localization in Autonomous Driving Using Ego-Centric Images and Motions | Seoyoung Jo; Jung-Kyung Lee; Je-won Kang | |
17.20-17.40 | Restoration of High-Frequency Components in Under Display Camera Images | Youngjin Oh; Gu Yong Park; Nam Ik Cho | |
17.40-18.00 | Non-Intrusive Speech Intelligibility Estimation Using Deep Learning With Speech Enhancement and Convolutional Layers | Kazushi Nakazawa; Kazuhiro Kondo | |
18.00-18.20 | Unified Angle Adjustment Network for Image Composition Enhancement | Jinwon Ko; Nyeong-Ho Shin; Seonho Lee; Chang-Su Kim | |
Session | Room | Chair | |
WedPM2-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Automated Audio Captioning With Epochal Difficult Captions for Curriculum Learning | Andrew Koh; Soham Tiwari; Chng Eng Siong |
16.40-17.00 | Application of Deep Learning-Based Single-Channel Speech Enhancement for Frequency-Modulation Transmitted Speech | Ying Ma; Xueliang Zhang | |
17.00-17.20 | An Empirical Study of Training Mixture Generation Strategies on Speech Separation: Dynamic Mixing and Augmentation | Shukjae Choi; Younglo Lee; Jihwan Park; Hyung Yong Kim; Byeong-Yeol Kim; Zhong-Qiu Wang; Shinji Watanabe | |
17.20-17.40 | Speech Intelligibility Prediction for Hearing Aids Using an Auditory Model and Acoustic Parameters | Benita Angela Titalim; Candy Olivia Mawalim; Shogo Okada; Masashi Unoki | |
17.40-18.00 | Predicting Speech Fluency in Children Using Automatic Acoustic Features | Lionel Fontan; Shinyoung Kim; Verdiana De Fino; Sylvain Detey | |
18.00-18.20 | TC-SKNet With GridMask for Low-Complexity Classification of Acoustic Scene | Luyuan Xie; Yan Zhong; Lin Yang; Zhaoyu Yan; Zhonghai Wu; Junjie Wang | |
Session | Room | Chair | |
WedPM2-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Masaomi Kimura | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Design and Control of a Muscle-Skeleton Robot Elbow Based on Reinforcement Learning | Jianyin Fan; Haoran Xu; Yuwei Du; Jing Jin; Qiang Wang |
16.40-17.00 | Non-Autoregressive Speech Recognition With Error Correction Module | Yukun Qian; Xuyi Zhuang; Zehua Zhang; Lianyu Zhou; Xu Lin; Mingjiang Wan | |
17.00-17.20 | A Method for Adversarial Example Generation by Perturbing Selected Pixels | Tomoki Kamegawa; Masaomi Kimura | |
17.20-17.40 | A Title Generation Method With Transformer for Journal Articles | Riku Matsumoto; Masaomi Kimura | |
17.40-18.00 | Catastrophic Forgetting Avoidance Method for a Classification Model by Model Synthesis and Introduction of Background Data | Akari Hirayama; Masaomi Kimura | |
18.00-18.20 | Consistency Regularization for GAN-Based Neural Vocoders | Kotaro Onishi; Toru Nakashika | |
18.20-18.40 | Parallel Training of TN and ITN Models Through CycleGAN for Improved Sequence to Sequence Learning Performance | Md. Mizanur Rahaman Nayan; Mohammad Ariful Haque | |
Session | Room | Chair | |
WedPM2-4 (SS14: Emerging Signal Processing Technology for Medical Applications / Biomedical Signal Processing and Systems) | Board Room 2 | Yuttapong Jiraraksopakun | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Laparoscope Manipulating Robot (LMR) Navigation Using Deep Learning-Based Surgical Instruments Detection | Nyi Nyi Myo; Apiwat Boonkong; Daranee Hormdee; Suphachoke Sonsilphong; Amornthep Sonsilphong; Kovit Khampitak |
16.40-17.00 | Human-Machine Interface Device Using Piezoelectric Sensors Based on Facial Muscle Movements for Wheelchair Control | Charoenporn Bouyam; Theerat Saichoo; Nannaphat Siribunyaphat; Yunyong Punsawad | |
17.00-17.20 | Obstructive Sleep Apnea Classification Using Snore Sounds Based on Deep Learning | Apichada Sillaparaya; Apichai Bhatranand; Chudanat Sudthongkhong; Kosin Chamnongthai; Yuttapong Jiraraksopakun | |
17.20-17.40 | Heart Rate Estimation of Car Driver Using Radar Sensors and Blind Source Separation | Keito Murata; Daichi Kitamura; Ryo Saito; Daichi Ueki | |
17.40-18.00 | Total Variation Algorithms for PAT Image Reconstruction | Mary Anjaley Josy John; Imad Barhumi | |
18.00-18.20 | Visual Function and Emotional Regulation in Achromatic Color and Chromatic Color Using Low Resolution Brain Electromagnetic Tomography Analysis (LORETA) | Watchara Sroykham; Yodchanan Wongsawat | |
18.20-18.40 | Effect of Electrooculography on Electroencephalography Classifying Accuracy in Deep Learning and Reducing Number of Channels in Motor-Imagery Brain-Computer Interface | Musashi Ino; Yoshihiro Kono; Nobuaki Kobayashi | |
Session | Room | Chair | |
WedPM2-5 (SS16: Emerging Techniques in Multimedia Data Analytics and Codings) | Board Room 3 | Patiwet Wuttisarnwattana / Kampol Woradit | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Optimal Deep Multi-Route Self-Attention for Single Image Super-Resolution | Nisawan Ngambenjavichaikul; Sovann Chen; Supavadee Aramvith |
16.40-17.00 | Object Detection in Aerial Images With Attention-Based Regression Loss | Chandler Timm C. Doloriel; Rhandley D. Cajote | |
17.00-17.20 | Performance Analysis of JPEG XR With Deep Learning-Based Image Super-Resolution | Taingliv Min; Supavadee Aramvith | |
17.20-17.40 | MCSNet: Multi-Channel Sharing Network for Single Image Super-Resolution | Wazir Muhammad; Supavadee Aramvith; Watchara Ruangsang | |
17.40-18.00 | DCAN: Deep Consecutive Attention Network for Video Super Resolution | Talha Saleem; Sovann Chen; Supavadee Aramvith | |
18.00-18.20 | Wiener Filter-Based Color Attribute Quality Enhancement for Geometry-Based Point Cloud Compression | Jinrui Xing; Hui Yuan; Chen Chen; Wei Gao | |
18.20-18.40 | Mixed Context Techniques in the Adaptive Arithmetic Coding Process for DC Term and Lossless Image Encoding | Evan Shih; Jian-Jiun Ding | |
Session | Room | Chair | |
WedPM2-6 (Signal Processing for Audio and Speech Applications) | Chiang Mai 4 | Sunao Hara / Sutasinee Thovuttikul | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Prediction Method of Soundscape Impressions Using Environmental Sounds and Aerial Photographs | Yusuke Ono; Sunao Hara; Masanobu Abe |
16.40-17.00 | Robust Speech Dereverberation Based on Adaptive Weighted Prediction Error Algorithm With Eigenvector Extraction | Yitong Chen; Wen Zhang | |
17.00-17.20 | Multi-Task Learning for Speech Emotion and Emotion Intensity Recognition | Pengcheng Yue; Leyuan Qu; Shukai Zheng; Taihao Li | |
17.20-17.40 | Karaoke Generation From Songs: Recent Trends and Opportunities | Preet Patel; Ansh Ray; Khushboo Thakkar; Kahan Sheth; Sapan H Mankad | |
17.40-18.00 | Multi-Branch Learning for Noisy and Reverberant Monaural Speech Separation | Chao Ma; Dongmei Li | |
18.00-18.20 | Significance of Quadrature and In-Phase Components for Synthetic Spoofed Speech Detection | Priyanka Gupta; Piyushkumar K. Chodingala; Hemant A. Patil | |
Session | Room | Chair | |
WedPM2-7 (SS20: High Performance Intelligent Technologies for Image and Video Applications) | Chiang Mai 5 | Sansanee Auephanwiriyakul | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Mammography Quality Evaluation and Model Interpretation Based on CNN-Based Inframammary Fold Classification | Yi-Chong Zeng; Yu-Cheng Wu; Chen-Yen Yeh; Shu-Chi Li; Tzu-Han Chou; Yi-Wen Huang; Giu-Cheng Hsu; Hsian-He Hsu |
16.40-17.00 | Hybrid Image Compression Framework Based on Single Image Training | Tien-Ying Kuo; Yu-Jen Wei; Kuan-Yu Su | |
17.00-17.20 | Highly Robust Action Retrieval Using View-Invariant Pose Feature and Simple Yet Effective Query Expansion Method | Noboru Yoshida; Jianquan Liu | |
17.20-17.40 | A Unified Compression and Watermarking Scheme for MT-BTC Images | Jing-Ming Guo; Sankarasrinivasan Seshathiri | |
17.40-18.00 | Fusion With Hierarchical Graphs for Multimodal Emotion Recognition | Shuyun Tang; Zhaojie Luo; Guoshun Nan; Jun Baba; Yuichiro Yoshikawa; Hiroshi Ishiguro | |
18.00-18.20 | Multi-Stage Superpixel-Based Segmentation Algorithm Using Fully Convolutional Networks and Discriminative Features | Pei-Chi Huang; Jian-Jiun Ding | |
18.20-18.40 | Deep Learning Acceleration Design Based on Low-Rank Approximation | Yi-Hsiang Chang; Gwo Giun (Chris) Lee; Shiu-Yu Chen | |
Session | Room | Chair | |
WedPM2-8 (Data Analytics and Machine Learning) | Board Room 4 | Wanus Srimaharaj | |
Date | Time | Title | Authors |
9 November 2022 | 16.20-16.40 | Internet of Behavior and Brain Response Identification for Cognitive Performance Analysis | Wanus Srimaharaj; Roungsan Chaisricharoen |
16.40-17.00 | Refinement of Utterance Fluency Feature Extraction and Automated Scoring of L2 Oral Fluency With Dialogic Features | Ryuki Matsuura; Shungo Suzuki; Mao Saeki; Tetsuji Ogawa; Yoichi Matsuyama | |
17.00-17.20 | A Vision Transformer-Based Approach to Bearing Fault Classification via Vibration Signals | Abid Hasan Zim; Aeyan Ashraf; Aquib Iqbal; Asad Malik; Minoru Kuribayashi | |
17.20-17.40 | Analysis Method for Motion Factors Related to Joint Contact Forces at the Knee During Walking Using Grad-CAM | Satoshi Suwa; Koh Inoue; Ryo Matsuoka | |
17.40-18.00 | A Dataset and a Lightweight Object Detection Network for Thermal Image-Based Home Surveillance | Zhengqiang Shao; Longbin Yan; Jie Chen; Jingdong Chen | |
18.00-18.20 | SCQ: Self-Supervised Cross-Modal Quantization for Unsupervised Large-Scale Retrieval | Fuga Nakamura; Ryosuke Harakawa; Masahiro Iwahashi | |
Session | Room | Chair | |
ThAM1-1 (Image Video Multimedia) | Chiang Mai 1 | Masaaki Ikehara | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Single Image Raindrop Removal Using a Non-Local Operator and Feature Maps in the Frequency Domain | Shinya Ezumi; Masaaki Ikehara |
9.20-9.40 | Dual-Teacher Distillation for Low-Light Image Enhancement | Jeong-Hyeok Park; Tae-Hyeon Kim; Jong-Ok Kim | |
9.40-10.00 | Automatic Data Augmentation Method With Improved Interpretability for Image Classification in Computer Vision Applications | Dair Ungarbayev; Osman Demirel; Muhammad Tahir Akhtar | |
10.00-10.20 | Learning to Sharpen Partially Blurred Image via Iterative Blurred Region Mining and Recovery | Jung Yeh; Wen-Li Wei; Duan-Yu Chen; Jen-Chun Lin | |
10.20-10.40 | Shape-Bias Evaluation of Pretrained Models Using Image Decomposition | Akinori Iwata; Masahiro Okuda | |
10.40-11.00 | Proposal of Associative Watermarking Method | Ryoto Kanegae; Masaki Kawamura | |
Session | Room | Chair | |
ThAM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Toshio Irino | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | DMF-Net: A Decoupling-Style Multi-Band Fusion Model for Full-Band Speech Enhancement | Guochen Yu; Yuansheng Guan; Weixin Meng; Chengshi Zheng; Hui Wang; Yutian Wang |
9.20-9.40 | Speak Like a Dog: Human to Non-Human Creature Voice Conversion | Kohei Suzuki; Shoki Sakamoto; Tadahiro Taniguchi; Hirokazu Kameoka | |
9.40-10.00 | Pre-Trained Multimodal End-To-End Network for Spoken Language Assessment Incorporating Prompts | Binghuai Lin; Liyuan Wang | |
10.00-10.20 | Gated Fusion of Handcrafted and Deep Features for Robust Automatic Pronunciation Assessment | Binghuai Lin; Liyuan Wang | |
10.20-10.40 | Effective Data Screening Technique for Crowdsourced Speech Intelligibility Experiments: Evaluation With IRM-Based Speech Enhancement | Ayako Yamamoto; Toshio Irino; Shoko Araki; Kenichi Arai; Atsunori Ogawa; Keisuke Kinoshita; Tomohiro Nakatani | |
Session | Room | Chair | |
ThAM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Leveraging Pre-Trained Acoustic Feature Extractor for Affective Vocal Bursts Tasks | Bagus Tris Atmaja; Akira Sasou |
9.20-9.40 | Flow-Based Variational Sequence Autoencoder | Jen-Tzung Chien; Tien-Ching Luo | |
9.40-10.00 | Speech Intelligibility Prediction Through Direct Estimation of Word Accuracy Using Conformer | Naoyuki Kamo; Kenichi Arai; Atsunori Ogawa; Shoko Araki; Tomohiro Nakatani; Keisuke Kinoshita; Marc Delcroix; Tsubasa Ochiai; Toshio Irino | |
10.00-10.20 | DNN-Rule Hybrid Dyna-Q for Sample-Efficient Task-Oriented Dialog Policy Learning | Mingxin Zhang; Takahiro Shinozaki | |
10.20-10.40 | MoCoVC: Non-Parallel Voice Conversion With Momentum Contrastive Representation Learning | Kotaro Onishi; Toru Nakashika | |
10.40-11.00 | Controllable Voice Conversion Based on Quantization of Voice Factor Scores | Takumi Isako; Kotaro Onishi; Takuya Kishida; Toru Nakashika | |
Session | Room | Chair | |
ThAM1-4 (Biomedical Signal Processing and Systems) | Board Room 2 | Daranee Hormdee | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Deep Adaptive Denoising Auto-Encoder Networks for ECG Noise Cancelation via Time-Frequency Domain | Amir Mohammadisarab; Poorya Aghaomidi; Jalil Mazloum; Mohammad Ali Akbarzadeh; Mahdi Orooji; Nader Mokari; Halim Yanikomeroglu |
9.20-9.40 | User-Item Recommendation Approaches to Detect Genomic Variant Interactions | Emma Andrade; Nicholas Tom; Mario Banuelos | |
9.40-10.00 | Teager Energy Cepstral Coefficients for Classification of Dysarthric Speech Severity-Level | Aastha Kachhi; Anand Therattil; Ankur T. Patil; Hardik B. Sailor; Hemant A. Patil | |
10.00-10.20 | Decoding Emotional Valence from EEG in Immersive Virtual Reality | Guanxiong Pei; Bingjie Li; Taihao Li; Ruohao Xu; Jianmin Dong; Jia Jin | |
10.20-10.40 | Design of a Wearable System for Hypoxic Training Management Using Blood Oxygenation and Heart Rate | Takuma Kitagawa; Toshitaka Yamakawa | |
10.40-11.00 | MedBERT: A Pre-Trained Language Model for Biomedical Named Entity Recognition | Charangan Vasantharajan; Kyaw Zin Tun; Ho Thi-Nga; Sparsh Jain; Tong Rong; Chng Eng Siong | |
Session | Room | Chair | |
ThAM1-5 (SS21: Recent Advances and Applications in Encrypted Domain) | Board Room 3 | Simying Ong | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Encrypted JPEG Image Retrieval via Huffman-Code Based Self-Attention Networks | Zhixun Lu; Qihua Feng; Peiya Li |
9.20-9.40 | Reversible Data Hiding in Encrypted Text Using Paillier Cryptosystem | Asad Malik; Aeyan Ashraf; Hanzhou Wu; Minoru Kuribayashi | |
9.40-10.00 | Scrambling-Embedding in Partially-Encrypted Images | Koi Yee Ng; Simying Ong | |
10.00-10.20 | Image Classification Using Vision Transformer for EtC Images | Genki Hamano; Shoko Imaizumi; Hitoshi Kiya | |
10.20-10.40 | Image Watermarking Based on Saliency Detection and Multiple Transformations | Ahmed Khan; KokSheik Wong; Vishnu Monn Baskaran | |
Session | Room | Chair | |
ThAM1-6 (SS19: Towards real-world human-centric acoustic signal processing) | Chiang Mai 4 | Sakgasit Ramingwong | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | A Fast Converge Spectral Modulation Sensitive Active Noise Control System | Kah-Meng Cheong; Yih Liang Shen; Tai-Shih Chi |
9.20-9.40 | Multimodal Forgery Detection Using Ensemble Learning | Ammarah Hashmi; Sahibzada Adil Shahzad; Wasim Ahmad; Chia Wen Lin; Yu Tsao; Hsin-Min Wang | |
9.40-10.00 | Speech Enhancement-Assisted Voice Conversion in Noisy Environments | Yun-Ju Chan; Chiang-Jen Peng; Syu-Siang Wang; Hsin-Min Wang; Yu Tsao; Tai-Shih Chi | |
10.00-10.20 | Effect of Noise on the Perceptual Contribution of Cochlea-Scaled Entropy and Speech Level in Mandarin Sentence Understanding | Weikang Wu; Shangdi Liao; Fei Chen | |
10.20-10.40 | EEG-Based Auditory Attention Detection With Estimated Speech Sources Separated From an Ideal-Binary-Masking Process | Lei Wang; Fei Chen | |
10.40-11.00 | Automatic Step Detection of Tandem Gait Test in Patients With Vestibular Hypofunction Using Wearable Sensors | Yi-Ju Huang; Chien-Pin Liu; Kuan-Chung Ting; Chia-Yeh Hsieh; Kai-Chun Liu; Chia-Tai Chan | |
Session | Room | Chair | |
ThAM1-7 (SS22: Recent Advances in Biometrics and Security) | Chiang Mai 5 | Koichi Ito | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Continuous Authentication for Smartphones Using Face Images and Touch-Screen Operation | Shuto Kinoshita; Yuka Watanabe; Yasushi Yamazaki |
9.20-9.40 | Spoofing Attack Detection in Face Recognition System Using Vision Transformer With Patch-Wise Data Augmentation | Kota Watanabe; Koichi Ito; Takafumi Aoki | |
9.40-10.00 | A Simple and Accurate CNN for Iris Recognition | Shokei Kawakami; Hiroya Kawai; Koichi Ito; Takafumi Aoki; Yoshiko Yasumura; Masakazu Fujio; Yosuke Kaga; Kenta Takahashi | |
10.00-10.20 | Eyeglass Frame Segmentation for Face Image Processing | Kanta Miura; Takamichi Miyamoto; Kazuyuki Sakurai; Koichi Ito; Takafumi Aoki | |
10.20-10.40 | A Fair Model is Not Fair in a Biased Environment | Yuya Sato; Soshi Maeda; Muku Akasaka; Masakatsu Nishigaki; Tetsushi Ohki | |
Session | Room | Chair | |
ThAM1-8 (Other related speech processing) | Board Room 4 | Sansanee Auephanwiriyakul | |
Date | Time | Title | Authors |
10 November 2022 | 9.00-9.20 | Intelligibility Prediction of Enhanced Speech Using Recognition Accuracy of End-To-End ASR System | Kenichi Arai; Atsunori Ogawa; Shoko Araki; Keisuke Kinoshita; Tomohiro Nakatani; Naoyuki Kamo; Toshio Irino |
9.20-9.40 | Hi, KIA: A Speech Emotion Recognition Dataset for Wake-Up Words | Taesu Kim; SeungHeon Doh; Gyunpyo Lee; Hyeongseok Jeon; Juhan Nam; Hyeon-Jeong Suk | |
9.40-10.00 | Improving Speech Emotion Recognition via Fine-Tuning ASR With Speaker Information | Bao Thang Ta; Tung Lam Nguyen; Dinh Son Dang; Nhat Minh Le; Van Hai Do | |
10.00-10.20 | 3CMLF: Three-Stage Curriculum-Based Mutual Learning Framework for Audio-Text Retrieval | Yi-Wen Chao; Dongchao Yang; Rongzhi Gu; Yuexian Zou | |
Session | Room | Chair | |
ThPM1-1 (Image Video Multimedia) | Chiang Mai 1 | Masaki Kawamura | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Neural Network Based Watermarking Trained With Quantized Activation Function | Shingo Yamauchi; Masaki Kawamura |
12.50-13.10 | A Multiframe Super-Resolution Pipeline for Sub-Image-Typed Light Field Data | Chien-Han Hsu; Yi-Hsien Lin; Yen-Po Lin; Yi-Chang Lu | |
13.10-13.30 | Restoring Edge and Color Using Weighted Near-Infrared Image and Color Transmission Maps for Robust Haze Removal | Onhi Kato; Akira Kubota | |
13.30-13.50 | Dense View Interpolation of 4D Light Fields for Real-Time Augmented Reality Applications | Hidemichi Yoshino; Kazuya Kodama; Takayuki Hamamoto | |
13.50-14.10 | Bolt Looseness Identification Using Faster R-CNN and Grid Mask Augmentation | Natchapon Panmatharit; Yuttapong Jiraraksopakun; Anek Siripanichgorn; Punnarai Siricharoen | |
14.10-14.30 | Large-Scale Blind Face Super-Resolution via Edge Guided Frequency Aware Generative Facial Prior Networks | Xi Cheng; Wan-Chi Siu; Jian Yang | |
Session | Room | Chair | |
ThPM1-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Takanobu Nishiura | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Language-Based Audio Retrieval With Converging Tied Layers and Contrastive Loss | Andrew Koh; Chng Eng Siong |
12.50-13.10 | D²Net: A Denoising and Dereverberation Network Based on Two-Branch Encoder and Dual-Path Transformer | Liusong Wang; Wenbing Wei; Yadong Chen; Ying Hu | |
13.10-13.30 | Direct Speech-Reply Generation From Text-Dialogue Context | Kenichi Fujita; Yusuke Ijima; Hiroaki Sugiyama | |
13.30-13.50 | Sequence-Wise Optimization for Quasi-Harmonic Speech Waveform Modeling | Shaowen Chen; Tomoki Toda | |
13.50-14.10 | Lattice-Based Data Augmentation for Code-Switching Speech Recognition | Roland Hartanto; Kuniaki Uto; Koichi Shinoda | |
14.10-14.30 | Phase-Aware Audio Super-Resolution for Music Signals Using Wasserstein Generative Adversarial Network | Yanqiao Yan; Binh Thien Nguyen; Yuting Geng; Kenta Iwai; Takanobu Nishiura | |
Session | Room | Chair | |
ThPM1-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Jen-Chun Lin | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Speech Emotion Recognition Based on the Reconstruction of Acoustic and Text Features in Latent Space | Jennifer Santoso; Rintaro Sekiguchi; Takeshi Yamada; Kenkichi Ishizuka; Taiichi Hashimoto; Shoji Makino |
12.50-13.10 | A Light CNN With Split Batch Normalization for Spoofed Speech Detection Using Data Augmentation | Haojian Lin; Yang Ai; Zhenhua Ling | |
13.10-13.30 | On the Optimal Classifier for Affective Vocal Bursts and Stuttering Predictions Based on Pre-Trained Acoustic Embedding | Bagus Tris Atmaja; Zanjabila; Akira Sasou | |
13.30-13.50 | Nonlinear Residual Echo Suppression Based on Gated Dual Signal Transformation LSTM Network | Kai Xie; Ziye Yang; Jie Chen | |
13.50-14.10 | Adaptive End-To-End Text-To-Speech Synthesis Based on Error Correction Feedback From Humans | Kazuki Fujii; Yuki Saito; Hiroshi Saruwatari | |
14.10-14.30 | Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-To-Speech | Byoung Jin Choi; Myeonghun Jeong; Minchan Kim; Sung Hwan Mun; Nam Soo Kim | |
Session | Room | Chair | |
ThPM1-4 (SS07: Latest Wireless Technologies for Sensing and Communications) | Board Room 2 | Osamu Takyu | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Performance Evaluation of FISTA With Constant Inertial Parameter | Kaito Kameda; Ryo Hayakawa; Kazunori Hayashi; Youji Iiguni |
12.50-13.10 | An Approximated ADMM Based Algorithm for \(\ell_1-\ell_2\) Optimization Problem | Rui Lin; Kazunori Hayashi | |
13.10-13.30 | Antenna Beamforming Selection With Low Complexity and High Exploitation of White Space in Frequency Spectrum Sharing | Kizuku Kawamura; Kohei Akimoto; Osamu Takyu | |
13.30-13.50 | Individual Memory Driven Transformer Deep Learning Model for Multi-Cell Massive MIMO Beam Prediction | Taisei Urakami; Haohui Jia; Na Chen; Minoru Okada | |
13.50-14.10 | Deep Unfolding-Aided Sum-Product Algorithm for Error Correction of CRC Coded Short Message | Qilin Zhang; Shinsuke Ibi; Takumi Takahashi; Hisato Iwai | |
14.10-14.30 | Successive Interference Cancellation for Signal Demodulation of Multiple LPWA Systems | Shinichiro Kakuda; Takeo Fujii; Shusuke Narieda | |
Session | Room | Chair | |
ThPM1-5 (SS08: Digital Convergence of 5G/B5G, AIoT and Security) | Board Room 3 | Kampol Woradit | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Evaluation of Voice Service in LEO Communication With 3GPP PUSCH Repetition Enhancement | Shou-Hong Liu; Chun-Tai Liu; Wei-Hung Chou; JenYi Pan |
12.50-13.10 | Modeling of Malware Diffusion With Mobile Devices in Intermittently Connected Networks | Hideyoshi Miura; Shoya Abukawa; Tomotaka Kimura; Kouji Hirata | |
13.10-13.30 | Software Defined Radio Access Network Sharing by Multi-Operator Core Networks | Wen-Ping Lai; Wen-Ru Chen; Ming-Jay Lai; Hong-Lun Lai; Chia-Ying Lin; Po-Chen Tseng | |
13.30-13.50 | Machine Learning Based End-To-End Constellation Training for Communication Systems | Po-Chiang Lin | |
13.50-14.10 | Flow-Based DDoS Detection Using Deep Neural Network With Radial Basis Function Neural Network | Ting-Chung Leung; Lee Chung-Nan | |
14.10-14.30 | Implement a Continuous Learning Model to Detect Different Types of DDoS Attacks With Hierarchical Temporal Memory | Hung Manh Nguyen; Yu-Kuen Lai | |
Session | Room | Chair | |
ThPM1-6 (SS23: Selected Papers from APSIPA Workshop in Hanoi, Vietnam) | Chiang Mai 4 | Nguyen Linh Trung | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Dynamic Hand Gesture Recognition From Egocentric Videos Based on SlowFast Architecture | Ha-Dang Ho; Hong-Quan Nguyen; Thuy-Binh Nguyen; Sinh-Thuong Vu; Thi-Lan Le |
12.50-13.10 | Deep Learning-Based Signal Detection for Dual-Mode Index Modulation 3D-OFDM | Dang-Y Hoang; Tien-Hoa Nguyen; Vu-Duc Ngo; Trung Tan Nguyen; Nguyen Cong Luong; Thien Van Luong | |
13.10-13.30 | A Comparison of Feature Selection and Feature Extraction in Network Intrusion Detection Systems | Tuan-Cuong Vuong; Hung Tran; Mai Xuan Trang; Vu-Duc Ngo; Thien Van Luong | |
13.30-13.50 | Deep Neural Network-Based Detector for Single-Carrier Index Modulation NOMA | Toan Gian; Vu-Duc Ngo; Tien-Hoa Nguyen; Trung Tan Nguyen; Thien Van Luong | |
13.50-14.10 | Vibration Measurement Using Spatial Shifting Coherent Digital Holography | Long Hai Ngo; Quang Duc Pham | |
14.10-14.30 | Robust Online Tucker Dictionary Learning From Multidimensional Data Streams | Le Trung Thanh; Tran Trong Duy; Karim Abed-Meraim; Nguyen Linh Trung; Adel Hafiane | |
Session | Room | Chair | |
ThPM1-7 (SS06: Adversarial Attacks and Defense) | Chiang Mai 5 | Minoru Kuribayashi | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-12.50 | Survey on Vision Based Fake News Detection and Its Impact Analysis | Mehul S Raval; Mohendra Roy; Minoru Kuribayashi |
12.50-13.10 | StyleGAN Encoder-Based Attack for Block Scrambled Face Images | AprilPyone MaungMaung; Hitoshi Kiya | |
13.10-13.30 | On the Adversarial Transferability of ConvMixer Models | Ryota Iijima; Miki Tanaka; Isao Echizen; Hitoshi Kiya | |
13.30-13.50 | Detection and Correction of Adversarial Examples Based on JPEG-Compression-Derived Distortion | Kenta Tsunomori; Yuma Yamasaki; Minoru Kuribayashi; Nobuo Funabiki; Isao Echizen | |
13.50-14.10 | Defense Against Adversarial Examples Using Beneficial Noise | Param Raval; Harin Khakhi; Minoru Kuribayashi; Mehul S. Raval | |
14.10-14.30 | Privacy Protection Against Automated Tracking System Using Adversarial Patch | Hiroto Takiwaki; Minoru Kuribayashi; Nobuo Funabiki; Mehul Shirishchandra Raval | |
Session | Room | Chair | |
ThPM1-8 (Industrial Forum "New era opened by AI-based image processing") | Board Room 4 | Jangwoo Kwon | |
Date | Time | Title | Authors |
10 November 2022 | 12.30-14.30 | Towards Best Possible Deep Learning Acceleration on the Edge – A Compression-Compilation Co-Design Framework | Yanzhi Wang, Northeastern University, Chairman and former CEO of CoCoPIE Inc., USA |
Empowering Future Pathology with Artificial Intelligence | Shuhao Wang, Co-founder and CTO of Thorough Future, China | |
Session | Room | Chair | |
ThPM2-1 (Image Video Multimedia) | Chiang Mai 1 | Nam Ik Cho | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Syllable Analysis Data Augmentation for Khmer Ancient Palm Leaf Recognition | Nimol Thuon; Jun Du; Jianshu Zhang |
15.20-15.40 | Multi-Class Vehicle Counting System for Multi-View Traffic Videos | Wichukorn Kuntintara; Kanokphan Lertniphonphan; Punnarai Siricharoen | |
15.40-16.00 | Table Structure Recognition Based on Grid Shape Graph | Eunji Lee; Junhyeong Kwon; Haeyoon Yang; Jaewoo Park; Soonyoung Lee; Hyung Il Koo; Nam Ik Cho | |
16.00-16.20 | Feature Distillation Network for Multi-Band NIR Colorization | Tae-Sung Park; Tae-Hyeon Kim; Jong-Ok Kim | |
16.20-16.40 | Blur Detection for Surveillance Camera System | Yikun Pan; Sik-Ho Tsang; Yui-Lam Chan; Daniel P.K. Lun | |
16.40-17.00 | Lip Sync Matters: A Novel Multimodal Forgery Detector | Sahibzada Adil Shahzad; Ammarah Hashmi; Sarwar Khan; Yan-Tsung Peng; Yu Tsao; Hsin-Min Wang | |
Session | Room | Chair | |
ThPM2-2 (Speech, Language, and Audio 1) | Chiang Mai 2 | Kittichai Wantanajittikul / Patiwet Wuttisarnwattana | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Frame-Level Matching Scheme Using Posteriorgram Probability Distance of Spoken Data to Improve Search Accuracy of Spoken Term Detection | Reo Minakawa; Kazunori Kojima; Shi-wook Lee; Yoshiaki Itoh |
15.20-15.40 | Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis | Yuta Matsunaga; Takaaki Saeki; Shinnosuke Takamichi; Hiroshi Saruwatari | |
15.40-16.00 | Using Perceptual Quality Features in the Design of the Loss Function for Speech Enhancement | Nicholas Eng; Yusuke Hioka; Catherine I Watson | |
16.00-16.20 | Correlation Loss for MOS Prediction of Synthetic Speech | Beibei Hu; Qiang Li | |
16.20-16.40 | Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation | Chunyu Qiang; Peng Yang; Hao Che; Jinba Xiao; Xiaorui Wang; Zhongyuan Wang | |
16.40-17.00 | Classification of Short Audio Acoustic Scenes Based on Data Augmentation Methods | Xuan Zhang; Yunfei Shao; Junjie Xu; Yong Ma; Wei-Qiang Zhang | |
Session | Room | Chair | |
ThPM2-3 (Deep Learning: Algorithm, Implementations, and Applications) | Chiang Mai 3 | Kasemsit Teeyapan | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Improving Unsupervised Anomalous Sound Detection Performance of Autoencoder and Its Variant With Pretrained Deep Belief Network | Yufeng Deng; Jia Liu; Wei-Qiang Zhang |
15.20-15.40 | ASGAN-VC: One-Shot Voice Conversion With Additional Style Embedding and Generative Adversarial Networks | Wei-Cheng Li; Tzer-Jen Wei | |
15.40-16.00 | Fusing Multiple Bandwidth Spectrograms for Improving Speech Enhancement | Hao Shi; Yuchun Shu; Longbiao Wang; Jianwu Dang; Tatsuya Kawahara | |
16.00-16.20 | End-To-End Two-Dimensional Sound Source Localization With Ad-Hoc Microphone Arrays | Yijun Gong; Shupei Liu; Xiao-Lei Zhang | |
16.20-16.40 | Exploring Speaker Age Estimation on Different Self-Supervised Learning Models | Tuan Duc Truong; Tran The Anh; Eng-Siong Chng | |
16.40-17.00 | Mandarin Singing Voice Synthesis With Denoising Diffusion Probabilistic Wasserstein GAN | Yin-Ping Cho; Yu Tsao; Hsin-Min Wang; Yi-Wen Liu | |
Session | Room | Chair | |
ThPM2-4 (SS18: Metaverse: Future of Internet) | Board Room 2 | Navadon Khunlertgit | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Physiological Study on the Effect of Game Events in Response to Player's Laughter | Mikito Fukuda; Yoshiko Arimoto |
15.20-15.40 | Development of a Virtual Telecommunication System Research Laboratory | Siwanart Jearavongtakul; Imran Saeed Mirza; Lunchakorn Wuttisittikulkij; Pruk Sasithong; Suebphong Noisri; Pisit Vanichchanunt | |
15.40-16.00 | Camera-Based Log System for Human Physical Distance Tracking in Classroom | Somrudee Deepaisarn; Angkoon Angkoonsawaengsuk; Charn Arunkit; Chayud Srisumarnk; Krongkan Nimmanwatthana; Nanmanas Linphrachaya; Nattapol Chiewnawintawat; Rinrada Tanthanathewin; Sivakorn Seinglek; Suphachok Buaruk; Virach Sornlertlamvanich | |
16.00-16.20 | Detecting Replay Attacks Using Single-Channel Audio: The Temporal Autocorrelation of Speech | Shih-Kuang Lee; Yu Tsao; Hsin-Min Wang | |
Session | Room | Chair | |
ThPM2-5 (Wireless Communication and Networking) | Board Room 3 | Poompat Saengudomlert | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Automatic Detection of Dimmable Pulse Position Modulation for Visible Light Communication | Poompat Saengudomlert; Karel Sterckx |
15.20-15.40 | Estimation of Angular Power Spectrum Using Multikernel Adaptive Filtering | Eiji Ninomiya; Masahiro Yukawa; Renato L. G. Cavalcante; Lorenzo Miretti | |
15.40-16.00 | Novel Smart Sectoring and Beam Designs in mmWave Broadcast Channels | Yan-Yin He; Shang-Ho (Lawrence) Tsai; Jen-Ming Wu | |
16.00-16.20 | New Methods for Fast Detection for Embedded Cognitive Radio | Grégoire de Broglie; Louis Morge-Rollet; Denis Le Jeune; Frédéric Le Roy; Christian Roland; Charles Canaff; Jean-Philippe Diguet | |
Session | Room | Chair | |
ThPM2-6 (SS23: Selected Papers from APSIPA Workshop in Hanoi, Vietnam) | Chiang Mai 4 | Nguyen Linh Trung | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Needle Localization and Segmentation for Radiofrequency Ablation of Liver Tumors Under CT Image Guidance | Le Quoc Anh; Luu Manh Ha; Theo van Walsum; Adriaan Moelker; Dao Viet Hang; Pham Cam Phuong; Vu Duy Thanh |
15.20-15.40 | End-To-End Visual-Guided Audio Source Separation With Enhanced Losses | Duc-Huy Pham; Quang-Anh Do; Thanh Thi-Hien Duong; Thi-Lan Le; Phi Le Nguyen | |
15.40-16.00 | Automated Classification of Lung Injury From X-Ray Images Using Deep Learning Network | Huy Le; Thanh-Ha Do | |
16.00-16.20 | AI-Based Video Analysis for Traffic Monitoring | Bui Son Tung; Phung The Ngoc; Do Duy Thanh; Nguyen Hong Thinh | |
16.20-16.40 | Adaptive Filtering-Based Heavy-Noise Removal in Born Iterative Method | Tran Quang-Huy; Luong Thi Theu; Nguyen Canh Minh; Duc-Nghia Tran; Duc-Tan Tran | |
16.40-17.00 | A Novel Deep Learning-Based Approach for Sleep Apnea Detection Using Single-Lead ECG Signals | Anh-Tu Nguyen; Thao Nguyen; Huy-Khiem Le; Huy-Hieu Pham; Cuong Do | |
Session | Room | Chair | |
ThPM2-7 (SS15: Advanced Sensing Technologies using Wireless Signal) | Chiang Mai 5 | Kampol Woradit | |
Date | Time | Title | Authors |
10 November 2022 | 15.00-15.20 | Multi-Resolution GPR Clutter Suppression Method Based on Low-Rank and Sparse Decomposition | Yanjie Cao; Xiaopeng Yang; Tian Lan |
15.20-15.40 | Indoor Human Motion Recognition Method Based on Kernel-Distance Doppler Velocity Estimation and Lightweight Network | Weicheng Gao; Xiaopeng Yang; Xiaodong Qu; Jiancheng Liao; Zixiang Yin; Ding Zhang | |
15.40-16.00 | Mainlobe Interference Suppression Method Based on Blocking Matrix Preprocessing With Low Sidelobe Constraint | Meng Haoyu; Qu Xiaodong; Zhang Xingyu; Li Wolin; Zhang Zhengyan; Yang Xiaopeng | |
16.00-16.20 |