Tuan Anh Nguyen Dang

I'm a Research Scientist in Computer Science specializing in Document AI and Machine Learning. My research has been presented at major conferences including EMNLP, ICDAR, BMVC, and ICPR, and was carried out in collaboration with researchers from multiple institutions to advance the field of document intelligence.

My work focuses on developing novel deep learning architectures for information extraction from business documents, with particular emphasis on handling visually-rich documents like forms, invoices, and receipts. I've published several influential papers on end-to-end information extraction methods, including the Multi-Stage Attentional U-Net architecture which achieved state-of-the-art results while using 40% fewer parameters than previous approaches.

Recently, I've been working on safety-critical AI applications, developing the IncidentAI dataset, which supports natural language processing of incident reports with the goal of preventing industrial failures. My research also explores confidence measurement in AI systems, hierarchical relation extraction, and reinforcement learning techniques for improving document understanding with limited training data. I am also interested in large language models, agent systems, and their application to real-world business problems.

Publications

Towards Safer Operations: An Expert-involved Dataset of High-Pressure Gas Incidents for Preventing Future Failures

Shumpei Inoue, Minh-Tien Nguyen, Hiroki Mizokuchi, Tuan-Anh Dang Nguyen, Huu-Hiep Nguyen, Dung Tien Le

Conference on Empirical Methods in Natural Language Processing 2023

Improving Document Image Understanding with Reinforcement Finetuning

Bao-Sinh Nguyen, Dung Tien Le, Hieu M. Vu, Tuan-Anh Dang Nguyen, Minh Le Nguyen, Hung Le

International Conference on Neural Information Processing 2022

HYCEDIS: HYbrid Confidence Engine for Deep Document Intelligence System

Bao-Sinh Nguyen, Q. Tran, Tuan-Anh Dang Nguyen, D. Nguyen, H. Le

International Conference on Neural Information Processing 2022

Jointly Learning Span Extraction and Sequence Labeling for Information Extraction from Business Documents

Nguyen Hong Son, Hieu M. Vu, Tuan-Anh Dang Nguyen, Minh Le Nguyen

IEEE International Joint Conference on Neural Networks 2022

End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional U-Net

Tuan-Anh Dang Nguyen, Dat Nguyen Thanh

British Machine Vision Conference 2021

A Span Extraction Approach for Information Extraction on Visually-Rich Documents

Tuan-Anh Dang Nguyen, Hieu M. Vu, Nguyen Hong Son, Minh-Tien Nguyen

ICDAR Workshops 2021

End-to-End Hierarchical Relation Extraction for Generic Form Understanding

Tuan-Anh Dang Nguyen, Duc Thanh Hoang, Q. Tran, Chih-Wei Pan, T. Nguyen

International Conference on Pattern Recognition 2021

PCA-based 3D Facial Reenactment From Single Image

N. T. Dat, Tuan-Anh Dang Nguyen, D. V. Sang

International Conference on Multimedia Analysis and Pattern Recognition 2020