About Me

Hi! I am currently working as an AI Research Engineer at Upstage. I am also an MS/Ph.D. student in the Department of Computer Science and Engineering at Korea University. My advisor is Prof. Heuiseok Lim, and I am part of the Natural Language Processing & Artificial Intelligence Lab. I am the founder and leader of the KU-NMT Group. I received the Naver Ph.D. Fellowship in 2021. See CV for more information.

Research Interests

Neural Machine Translation, Domain Specialized Neural Machine Translation, Ancient Korean Neural Machine Translation, Simultaneous Speech Translation, Automatic Post Editing, Quality Estimation, Korean Spelling Correction, Grammar Error Correction, Speech Recognition Post-Processor, Dialogue Summarization, AI for Accessibility, AI for Education, Language Resource and Evaluation, Semantic Search and Information Retrieval, CX (Customer Experience), UX (User Experience), Quality Assurance

Professional Experience

2022.03 - Present: Upstage AI Research Engineer
2018.09 - 2019.07: SYSTRAN Research Engineer
2018.06 - 2018.08: SYSTRAN Research Engineer Intern

Academic Services

Virtual Social Chair: COLING 2022
Program Committee (Reviewer): ACL 2022-Insights, ARR (ACL 2022, NAACL 2022, NAACL 2022-Industry Track)

Publications

Preprints

  1. Empirical study on BlenderBot 2.0’s Errors Analysis in terms of Model, Data and User-Centric Approach
    Jungseob Lee, Suhyune Son, Midan Shim, Yujin Kim, Chanjun Park, Heuiseok Lim (Equal Contribution, Co-First Author)
    arXiv, 2022

  2. A Self-Supervised Automatic Post-Editing Data Generation Tool
    Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Jaehyung Seo, SeungJun Lee, Heuiseok Lim
    arXiv, 2021

  3. PicTalky: Augmentative and Alternative Communication Software for Language Developmental Disabilities
    Chanjun Park, Yoonna Jang, Seolhwa Lee, Jaehyung Seo, Kisu Yang, Heuiseok Lim
    arXiv, 2021

Top Conference (Main / Workshop)

  1. A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
    Jaehyung Seo, Seounghoon Lee, Chanjun Park, Yoonna Jang, Hyeonseok Moon, Sugyeong Eo, Seonmin Koo, Heuiseok Lim
    NAACL 2022 - Findings, 2022

  2. Priming Ancient Korean Neural Machine Translation
    Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim
    LREC 2022, 2022 - (Oral presentation)

  3. FreeTalky: Don’t Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue
    Chanjun Park, Yoonna Jang, Seolhwa Lee, Sungjin Park, Heuiseok Lim
    LREC 2022, 2022 - (Poster)

  4. Empirical Analysis of Synthetic Data Generation Using Noising Strategies for Automatic Post-editing
    Hyeonseok Moon, Chanjun Park, Seolhwa Lee, Jaehyung Seo, Jeongsub Lee, Sugyeong Eo, Heuiseok Lim
    LREC 2022, 2022 - (Poster)

  5. FreeTalky: Don’t Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue
    Chanjun Park, Yoonna Jang, Seolhwa Lee, Sungjin Park, Heuiseok Lim
    AAAI 2022 - Artificial Intelligence for Education (AI4EDU), 2022

  6. How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus
    Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim
    NeurIPS 2021 - Data-centric AI (DCAI) workshop, 2021

  7. A New Tool for Efficiently Generating Quality Estimation Datasets
    Sugyeong Eo, Chanjun Park, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
    NeurIPS 2021 - Data-centric AI (DCAI) workshop, 2021

  8. Automatic Knowledge Augmentation for Generative Commonsense Reasoning
    Jaehyung Seo, Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
    NeurIPS 2021 - Data-centric AI (DCAI) workshop, 2021

  9. Syntax-enhanced Dialogue Summarization using Syntax-aware information
    Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
    NeurIPS 2021 - Women in Machine Learning (WiML 2021) workshop, 2021 - (Contributed Talk / Oral presentation)

  10. Towards Syntax-Aware Dialogue Summarization using Multi-task Learning
    Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
    EMNLP 2021 - Widening NLP (WiNLP 2021) workshop, 2021 - (Poster)

  11. Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation
    Chanjun Park, Sungjin Park, Seolhwa Lee, Taesun Whang, Heuiseok Lim
    EMNLP 2021 - The Second Workshop on Insights from Negative Results in NLP, 2021 - (Oral presentation)

  12. BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text
    Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Hyeonseok Moon, Sugyeong Eo, Heuiseok Lim
    ACL 2021 - Workshop on Asian Translation (WAT 2021), 2021 - (Oral presentation)

  13. Dealing with the Paradox of Quality Estimation
    Sugyeong Eo, Chanjun Park, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim (Equal Contribution, Co-First Author)
    MT Summit 2021 - LoResMT, 2021 - (Oral presentation)

  14. Should we find another model?: Improving Neural Machine Translation Performance with ONE-Piece Tokenization Method without Model Modification
    Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
    NAACL-HLT 2021 Industry Track, 2021 - (Poster/Oral presentation)

International Journal (SCI/SCIE)

  1. Empirical Analysis of Korean Public AI Hub Parallel Corpora and in-depth Analysis using LIWC
    Chanjun Park, Midan Shim, Sugyeong Eo, Seolhwa Lee, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
    Applied Sciences, 2022

  2. AI for Patents: A Novel yet Effective and Efficient Framework for Patent Analysis
    Junyoung Son, Hyeonseok Moon, Jeongwoo Lee, Seolhwa Lee, Chanjun Park, Wonkyung Jung, Heuiseok Lim
    IEEE Access, 2022

  3. AI student: A Machine Reading Comprehension System for the Korean College Scholastic Ability Test
    Gyeongmin Kim, Soomin Lee, Chanjun Park, Jaechoon Jo
    Mathematics, 2022

  4. Return on Advertising Spend Prediction with Task Decomposition based LSTM Model
    Hyeonseok Moon, Taemin Lee, Jaehyung Seo, Chanjun Park, Sugyeong Eo, Imatitikua D. Aiyanyo, Jeongbae Park, Aram So, Kyoungwha Ok, Kinam Park
    Mathematics, 2022

  5. Word-level Quality Estimation for Korean-English Neural Machine Translation
    Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim (Equal Contribution, Co-First Author)
    IEEE Access, 2022

  6. Dense-to-Question and Sparse-to-Answer: Hybrid Retriever System for Industrial Frequently Asked Questions
    Jaehyung Seo, Taemin Lee, Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Imatitikua D. Aiyanyo, Kinam Park, Aram So, Sungmin Ahn, Jeongbae Park
    Mathematics, 2022

  7. Mimicking Infants’ Bilingual Language Acquisition for Domain Specialized Neural Machine Translation
    Chanjun Park, Woo-Young Go, Sugyeong Eo, Hyeonseok Moon, Seolhwa Lee, Heuiseok Lim
    IEEE Access, 2022

  8. An Automatic Post Editing with Efficient and Simple Data Generation Method
    Hyeonseok Moon, Chanjun Park, Jaehyung Seo, Sugyeong Eo, Heuiseok Lim (Equal Contribution, Co-First Author)
    IEEE Access, 2022

  9. Who speaks like a style of Vitamin: Towards Syntax-Aware Dialogue Summarization using Multi-task Learning
    Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
    IEEE Access, 2021

  10. Grounded Vocabulary for Image Retrieval Using a Modified Multi-Generator Generative Adversarial Network
    Kuekyeng Kim, Chanjun Park, Jaehyung Seo, Heuiseok Lim
    IEEE Access, 2021

  11. An Empirical Study on Automatic Post Editing for Neural Machine Translation
    Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim (Equal Contribution, Co-First Author)
    IEEE Access, 2021

  12. Variational Reward Estimator Bottleneck: Towards Robust Reward Estimator for Multi-Domain Task-Oriented Dialogue
    Jeiyoon Park, Chanhee Lee, Chanjun Park, Kuekyeng Kim, Heuiseok Lim
    Applied Sciences, 2021

  13. Comparative Analysis of Current Approaches to Quality Estimation for Neural Machine Translation
    Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim (Equal Contribution, Co-First Author)
    Applied Sciences, 2021

  14. Exploring the Data Efficiency of Cross-Lingual Post-Training in Pretrained Language Models
    Chanhee Lee, Kisu Yang, Taesun Whang, Chanjun Park, Andrew Matteson, Heuiseok Lim
    Applied Sciences, 2021

  15. Decoding Strategies for Improving Low-Resource Machine Translation
    Chanjun Park, YeongWook Yang, Kinam Park, Heuiseok Lim
    Electronics, 2020

  16. Ancient Korean Neural Machine Translation
    Chanjun Park, Chanhee Lee, YeongWook Yang, Heuiseok Lim
    IEEE Access, 2020

  17. Comparison of the evaluation metrics for Neural Grammatical Error Correction with Overcorrection
    Chanjun Park, YeongWook Yang, Chanhee Lee, Heuiseok Lim
    IEEE Access, 2020

  18. Neural Spelling Correction: Translating Incorrect sentences to Correct sentences for Multimedia
    Chanjun Park, Kuekyeng Kim, YeongWook Yang, Minho Kang, Heuiseok Lim
    Multimedia Tools and Applications, 2020

International & Domestic Conference

Domestic Journal (KCI): 20 papers, Domestic Conference: 32 papers, International Conference: 23 papers

Book Chapters

  1. Natural Language Processing Bible
    HeuiSeok Lim, Korea University NLP&AI Lab
    Human Science

International Patents

  1. METHOD FOR GENERATING TRAINING DATA AND METHOD FOR POST-PROCESSING OF SPEECH RECOGNITION USING THE SAME
    HeuiSeok Lim, Chanjun Park
    Patent application filed

  2. METHOD OF BUILDING TRAINING DATA OF MACHINE TRANSLATION
    HeuiSeok Lim, Chanjun Park
    Patent application filed

Domestic Patents

  1. DEVICE AND METHOD FOR GENERATING OF TRAINING DATA FOR QUALITY ESTIMATION IN MACHINE TRANSLATION
    HeuiSeok Lim, Sugyeong Eo, Chanjun Park, Hyeonseok Moon
    Patent application filed

  2. APPARATUS FOR CORPUS PROCESSING, APPARATUS AND METHOD AND MACHINE TRANSLATION
    Chanjun Park, HeuiSeok Lim
    Patent application filed

  3. DEVICE AND METHOD FOR GENERATING TRAINING DATA FOR AUTOMATIC POST EDITING
    HeuiSeok Lim, Hyeonseok Moon, Chanjun Park, Sugyeong Eo
    Patent application filed

  4. DEVICE AND METHOD FOR GENERATING OPTIMAL TRANSLATION SUBTITLE USING QUALITY ESTIMATION
    HeuiSeok Lim, Chanjun Park
    Patent application filed

  5. Improving speech recognition performance using TTS in domain-specific environment
    HeuiSeok Lim, Chanjun Park
    Patent application filed

  6. Method For Generating Training Data And Method For Post-Processing Of Speech Recognition Using The Same
    HeuiSeok Lim, Chanjun Park
    Patent application filed

  7. METHOD OF BUILDING TRAINING DATA OF MACHINE TRANSLATION
    HeuiSeok Lim, Chanjun Park
    Patent application filed

  8. Correction performance evaluation metrics of neural network machine translation and method of constructing the same
    HeuiSeok Lim, Chanjun Park
    Granted Patent (10-2390154)

  9. APPARATUS AND METHOD FOR OUTPUTTING IMAGE CORRESPONDING TO LANGUAGE
    HeuiSeok Lim, Chanjun Park, Yanghee Kim
    Patent application filed

  10. METHOD OF TRANSLATING ANCIENT KOREAN USING MACHINE TRANSLATION
    HeuiSeok Lim, Chanjun Park
    Patent application filed

  11. Device and method for correcting Korean spelling
    HeuiSeok Lim, Chanjun Park
    Patent application filed

Teaching

  1. Introduction to Natural Language Processing in Big Data (BDC101), Teaching Assistant, Korea Univ. (Autumn 2021)
  2. Introduction to Natural Language Processing in Big Data (BDC101), Head Teaching Assistant, Korea Univ. (Autumn 2020)
  3. Natural Language Processing for Digital Finance Engineering (DFE610), Head Teaching Assistant, Korea Univ. (Autumn 2020)
  4. Natural Language Processing (COSE461), Teaching Assistant, Korea Univ. (Spring 2020)
  5. Artificial Intelligence and Natural Language Processing (DFC615), Teaching Assistant, Korea Univ. (Spring 2020)

Honors & Awards

Year Award
2021.12 Naver Ph.D. Fellowship 2021
2021.10 Best Paper Award, The 33rd Annual Conference on Human & Cognitive Language Technology (HCLT2021) - NLP Application 2 Section
2021.10 Best Paper Award, The 33rd Annual Conference on Human & Cognitive Language Technology (HCLT2021) - Language Resource Section
2021.10 Best Paper Award, The 33rd Annual Conference on Human & Cognitive Language Technology (HCLT2021) - QA and Speech Section
2020.11 1st Place in Flitto Hackathon (Team Lead)
2020.10 Best Paper Award, The 32nd Annual Conference on Human & Cognitive Language Technology (HCLT2020)
2020.05 Best practices for using NIA AI training data (Korean-English Neural Machine Translation model), NIA
2019.10 Best Paper Award, The 31st Annual Conference on Human & Cognitive Language Technology (HCLT2019)
2019.10 1st Place in Microsoft AI Accessibility Hackathon in Korea (Team Lead), Microsoft
2019.03 Graduate School Associate Scholarship, Sungkyunkwan University
2018.10 Next Generation Information Processing NLP Competition 2018: Participation Award, Next-generation information computing technology development business
2017.12 Scholarship for academic excellence, Sooyoungro Church
2017.06 Bit Computer Excellence Award (President Award), Bit Computer
2016.12 Scholarship for academic excellence, Sooyoungro Church
2015.03 Full Scholarship, BUFS

Invited Talk

Year Place Contents
2022.01 Dongguk University Artificial Intelligence and Machine Translation
2021.07 Busan Social Welfare Development Group Attending advisory meetings and Focus Group Interview
2020.03 LLsoLLu Latest Natural Language Processing Research
2020.02 NC SOFT Technology Transfer Seminar
2020.01 Dongguk University A.I - NLP - MT for Liberal Arts
2019.10-2019.11 SKC Text Preprocessing, Machine Translation, Language Embedding
2019.08 SK T Academy Machine Translation for everyone
2019.08 NAVER Machine Translation for everyone