Chanjun Park is a researcher in the field of Natural Language Processing (NLP), with a focus on Data-centric AI and Machine Translation. He currently works as an AI Research Engineer at Upstage and is expected to earn his Ph.D. from the Department of Computer Science and Engineering at Korea University in August 2023, under the supervision of Professor Heuiseok Lim. From 2018 to 2019, he worked at SYSTRAN as a Research Engineer. Chanjun is the founder and leader of the KU-NMT Group and received the Naver Ph.D. Fellowship in 2021. He served as the Virtual Social Chair at COLING 2022 and is currently serving as the Program Chair for the WiNLP Workshop. He has published more than 140 papers in the field of natural language processing, has registered 5 patents, and has applied for 13 more. See CV for more information.
Data-centric AI, Machine Translation, Language Resource and Evaluation, Real-World-centric AI
DMOps: Data Management Operation and Recipes
Eujeong Choi, Chanjun Park (Corresponding Author)
arxiv, 2023
Language Chameleon: Transformation analysis between languages using Cross-lingual Post-training based on Pre-trained language models
Suhyune Son, Chanjun Park, Jungseob Lee, Midan Shim, Chanhee Lee, Yoonna Jang, Jaehyung Seo, Heuiseok Lim (Equal Contribution(First Co-Author))
arxiv, 2022
There is no rose without a thorn: Finding weaknesses on BlenderBot 2.0 in terms of Model, Data and User-Centric Approach
Jungseob Lee, Suhyune Son, Midan Shim, Yujin Kim, Chanjun Park, Heuiseok Lim (Equal Contribution(First Co-Author))
arxiv, 2022
PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities
Chanjun Park, Yoonna Jang, Seolhwa Lee, Jaehyung Seo, Kisu Yang, Heuiseok Lim
AACL-IJCNLP 2022 - Demo Track, 2022
KU X Upstage’s submission for the WMT22 Quality Estimation: Critical Error Detection Shared Task
Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim
WMT 2022 - EMNLP 2022, 2022
QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation
Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Gyeongmin Kim, Jungseob Lee, Heuiseok Lim
COLING 2022, 2022
Focus on FoCus: Is FoCus focused on Context, Knowledge and Persona?
SeungYoon Lee, Jungseob Lee, Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Jaehyung Seo, Jeongbae Park, Heuiseok Lim
COLING 2022 - The 1st Workshop on Customized Chat Grounding Persona and Knowledge, 2022
A Self-Supervised Automatic Post-Editing Data Generation Tool
Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Jaehyung Seo, Seungjun Lee, Heuiseok Lim
ICML 2022 - DataPerf workshop, 2022
A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation
Jaehyung Seo, Seounghoon Lee, Chanjun Park, Yoonna Jang, Hyeonseok Moon, Sugyeong Eo, Seonmin Koo, Heuiseok Lim
NAACL 2022 - Findings, 2022
Priming Ancient Korean Neural Machine Translation
Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim
LREC 2022, 2022 - (Oral presentation)
FreeTalky: Don’t Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue
Chanjun Park, Yoonna Jang, Seolhwa Lee, Sungjin Park, Heuiseok Lim
LREC 2022, 2022 - (Poster)
Empirical Analysis of Synthetic Data Generation Using Noising Strategies for Automatic Post-editing
Hyeonseok Moon, Chanjun Park, Seolhwa Lee, Jaehyung Seo, Jeongsub Lee, Sugyeong Eo, Heuiseok Lim
LREC 2022, 2022 - (Poster)
FreeTalky: Don’t Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue
Chanjun Park, Yoonna Jang, Seolhwa Lee, Sungjin Park, Heuiseok Lim
AAAI 2022 - Artificial Intelligence for Education (AI4EDU), 2022
How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus
Chanjun Park, Seolhwa Lee, Hyeonseok Moon, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim
NeurIPS 2021 - Data-centric AI (DCAI) workshop, 2021
A New Tool for Efficiently Generating Quality Estimation Datasets
Sugyeong Eo, Chanjun Park, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
NeurIPS 2021 - Data-centric AI (DCAI) workshop, 2021
Automatic Knowledge Augmentation for Generative Commonsense Reasoning
Jaehyung Seo, Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
NeurIPS 2021 - Data-centric AI (DCAI) workshop, 2021
Syntax-enhanced Dialogue Summarization using Syntax-aware information
Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
NeurIPS 2021 - Women in Machine Learning (WiML 2021) workshop, 2021 - (Contributed Talk / Oral presentation)
Towards Syntax-Aware Dialogue Summarization using Multi-task Learning
Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
EMNLP 2021 - Widening NLP (WiNLP2021) workshop, 2021 - (Poster)
Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation
Chanjun Park, Sungjin Park, Seolhwa Lee, Taesun Whang, Heuiseok Lim
EMNLP 2021 - The Second Workshop on Insights from Negative Results in NLP, 2021 - (Oral presentation)
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text
Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Hyeonseok Moon, Sugyeong Eo, Heuiseok Lim
ACL 2021 - WAT (Workshop on Asian Translation) 2021 Workshop, 2021 - (Oral presentation)
Dealing with the Paradox of Quality Estimation
Sugyeong Eo, Chanjun Park, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim (Equal Contribution(First Co-Author))
MT Summit 2021 - LoResMT, 2021 - (Oral presentation)
Should we find another model?: Improving Neural Machine Translation Performance with ONE-Piece Tokenization Method without Model Modification
Chanjun Park, Sugyeong Eo, Hyeonseok Moon, Heuiseok Lim
NAACL-HLT 2021 Industry Track, 2021 - (Poster/Oral presentation)
A Survey on Evaluation Metrics for Machine Translation
Seungjun Lee, Jungseob Lee, Hyeonseok Moon, Chanjun Park, Jaehyung Seo, Sugyeong Eo, Seonmin Koo, Heuiseok Lim
Mathematics, 2023
K-NCT: Korean Neural Grammatical Error Correction Gold-Standard Test Set Using Novel Error Type Classification Criteria
Seonmin Koo, Chanjun Park, Jaehyung Seo, Seungjun Lee, Hyeonseok Moon, Jungseob Lee, Heuiseok Lim (Equal Contribution(First Co-Author))
IEEE Access, 2022
Plain Template Insertion: Korean-Prompt-based Engineering for Few-shot Learners
Jaehyung Seo, Hyeonseok Moon, Chanhee Lee, Sugyeong Eo, Chanjun Park, Jihoon Kim, Changwoo Chun, Heuiseok Lim
IEEE Access, 2022
The ASR post-processor performance challenges of BackTranScription (BTS) : Data-Centric and Model-Centric Approaches
Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Heuiseok Lim
Mathematics, 2022
PU-GEN: Enhancing Generative Commonsense Reasoning for Language Models with Human-Centered Knowledge
Jaehyung Seo, Dongsuk Oh, Sugyeong Eo, Chanjun Park, Kisu Yang, Hyeonseok Moon, Kinam Park, Heuiseok Lim
Knowledge-Based Systems, 2022
Utilization Strategy of User Engagements in Korean Fake News Detection
Myunghoon Kang, Jaehyung Seo, Chanjun Park, Heuiseok Lim
IEEE Access, 2022
BERTOEIC: Solving TOEIC Problems Using Simple and Efficient Data Augmentation Techniques with Pretrained Transformer Encoders
Jeongwoo Lee, Hyeonseok Moon, Chanjun Park, Jaehyung Seo, Sugyeong Eo, Heuiseok Lim
Applied Sciences, 2022
Empirical Analysis of Parallel Corpora and in-depth Analysis using LIWC
Chanjun Park, Midan Shim, Sugyeong Eo, Seolhwa Lee, Jaehyung Seo, Hyeonseok Moon, Heuiseok Lim
Applied Sciences, 2022
AI for Patents: A Novel yet Effective and Efficient Framework for Patent Analysis
Junyoung Son, Hyeonseok Moon, Jeongwoo Lee, Seolhwa Lee, Chanjun Park, Wonkyung Jung, Heuiseok Lim
IEEE Access, 2022
AI student: A Machine Reading Comprehension System for the Korean College Scholastic Ability Test
Gyeongmin Kim, Soomin Lee, Chanjun Park, Jaechoon Jo
Mathematics, 2022
Return on Advertising Spend Prediction with Task Decomposition based LSTM Model
Hyeonseok Moon, Taemin Lee, Jaehyung Seo, Chanjun Park, Sugyeong Eo, Imatitikua D. Aiyanyo, Jeongbae Park, Aram So, Kyoungwha Ok, Kinam Park
Mathematics, 2022
Word-level Quality Estimation for Korean-English Neural Machine Translation
Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim (Equal Contribution(First Co-Author))
IEEE Access, 2022
Dense-to-Question and Sparse-to-Answer: Hybrid Retriever System for Industrial Frequently Asked Questions
Jaehyung Seo, Taemin Lee, Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Imatitikua D. Aiyanyo, Kinam Park, Aram So, Sungmin Ahn, Jeongbae Park
Mathematics, 2022
Mimicking Infants’ Bilingual Language Acquisition for Domain Specialized Neural Machine Translation
Chanjun Park, Woo-Young Go, Sugyeong Eo, Hyeonseok Moon, Seolhwa Lee, Heuiseok Lim
IEEE Access, 2022
An Automatic Post Editing with Efficient and Simple Data Generation Method
Hyeonseok Moon, Chanjun Park, Jaehyung Seo, Sugyeong Eo, Heuiseok Lim (Equal Contribution(First Co-Author))
IEEE Access, 2022
Who speaks like a style of Vitamin: Towards Syntax-Aware Dialogue Summarization using Multi-task Learning
Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc, Heuiseok Lim
IEEE Access, 2021
Grounded Vocabulary for Image Retrieval Using a Modified Multi-Generator Generative Adversarial Network
Kuekyeng Kim, Chanjun Park, Jaehyung Seo, Heuiseok Lim
IEEE Access, 2021
An Empirical Study on Automatic Post Editing for Neural Machine Translation
Hyeonseok Moon, Chanjun Park, Sugyeong Eo, Jaehyung Seo, Heuiseok Lim (Equal Contribution(First Co-Author))
IEEE Access, 2021
Variational Reward Estimator Bottleneck: Towards Robust Reward Estimator for Multi-Domain Task-Oriented Dialogue
Jeiyoon Park, Chanhee Lee, Chanjun Park, Kuekyeng Kim, Heuiseok Lim
Applied Sciences, 2021
Comparative Analysis of Current Approaches to Quality Estimation for Neural Machine Translation
Sugyeong Eo, Chanjun Park, Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim (Equal Contribution(First Co-Author))
Applied Sciences, 2021
Exploring the Data Efficiency of Cross-Lingual Post-Training in Pretrained Language Models
Chanhee Lee, Kisu Yang, Taesun Whang, Chanjun Park, Andrew Matteson, Heuiseok Lim
Applied Sciences, 2021
Decoding Strategies for Improving Low-Resource Machine Translation
Chanjun Park, YeongWook Yang, Kinam Park, Heuiseok Lim
Electronics, 2020
Ancient Korean Neural Machine Translation
Chanjun Park, Chanhee Lee, YeongWook Yang, Heuiseok Lim
IEEE Access, 2020
Comparison of the evaluation metrics for Neural Grammatical Error Correction with Overcorrection
Chanjun Park, YeongWook Yang, Chanhee Lee, Heuiseok Lim
IEEE Access, 2020
Neural Spelling Correction: Translating Incorrect sentences to Correct sentences for Multimedia
Chanjun Park, Kuekyeng Kim, YeongWook Yang, Minho Kang, Heuiseok Lim
Multimedia Tools and Applications, 2020
METHOD FOR GENERATING TRAINING DATA AND METHOD FOR POST-PROCESSING OF SPEECH RECOGNITION USING THE SAME
HeuiSeok Lim, Chanjun Park
Patent application (17/739,383)
METHOD OF BUILDING TRAINING DATA OF MACHINE TRANSLATION
HeuiSeok Lim, Chanjun Park
Patent application (PCT/KR2021/012195)
DEVICE AND METHOD FOR GENERATING OF TRAINING DATA FOR QUALITY ESTIMATION IN MACHINE TRANSLATION
HeuiSeok Lim, Sugyeong Eo, Chanjun Park, Hyeonseok Moon
Patent application (10-2021-0156657)
APPARATUS FOR CORPUS PROCESSING, APPARATUS AND METHOD AND MACHINE TRANSLATION
Chanjun Park, HeuiSeok Lim
Patent application (10-2021-0127438)
DEVICE AND METHOD FOR GENERATING TRAINING DATA FOR AUTOMATIC POST EDITING
HeuiSeok Lim, Hyeonseok Moon, Chanjun Park, Sugyeong Eo
Patent application (10-2021-0118924)
DEVICE AND METHOD FOR GENERATING OPTIMAL TRANSLATION SUBTITLE USING QUALITY ESTIMATION
HeuiSeok Lim, Chanjun Park
Patent application (10-2021-0117011)
Improving speech recognition performance using TTS in domain-specific environment
HeuiSeok Lim, Chanjun Park
Patent application (10-2021-0028816)
Method For Generating Training Data And Method For Post-Processing Of Speech Recognition Using The Same
HeuiSeok Lim, Chanjun Park
Patent application (10-2021-0060914)
METHOD OF BUILDING TRAINING DATA OF MACHINE TRANSLATION
HeuiSeok Lim, Chanjun Park
Granted Patent (10-2409667)
Correction performance evaluation metrics of neural network machine translation and method of constructing the same
HeuiSeok Lim, Chanjun Park
Granted Patent (10-2390154)
APPARATUS AND METHOD FOR OUTPUTTING IMAGE CORRESPONDING TO LANGUAGE
HeuiSeok Lim, Chanjun Park, Yanghee Kim
Granted Patent (10-2476497)
METHOD OF TRANSLATING ANCIENT KOREAN USING MACHINE TRANSLATION
HeuiSeok Lim, Chanjun Park
Granted Patent (10-2425922)
Device and method for correcting Korean spelling
HeuiSeok Lim, Chanjun Park
Granted Patent (10-2430918)
Year | Award |
---|---|
2023.02 | Best Paper Award, Korea University |
2023.02 | Research Encouragement Scholarship, Korea University |
2022.12 | 1st place in Quality Estimation Shared Task 2022 - Sentence-level “Critical Error Detection”, WMT 2022 (EMNLP 2022) |
2022.10 | Best Paper Award, The 34th Annual Conference on Human & Cognitive Language Technology (HCLT2022) |
2021.12 | Naver Ph.D. Fellowship 2021 |
2021.10 | Best Paper Award, The 33rd Annual Conference on Human & Cognitive Language Technology (HCLT2021) - NLP Application 2 Section |
2021.10 | Best Paper Award, The 33rd Annual Conference on Human & Cognitive Language Technology (HCLT2021) - Language Resource Section |
2021.10 | Best Paper Award, The 33rd Annual Conference on Human & Cognitive Language Technology (HCLT2021) - QA and Speech Section |
2021.07 | Ranked 4th on the CommonGen 1.1 Leaderboard (Nov. 2022 Ranked 7th, CommonGen 1.1) |
2020.11 | 1st Place in Flitto Hackathon (Team Lead) |
2020.10 | Best Paper Award, The 32nd Annual Conference on Human & Cognitive Language Technology (HCLT2020) |
2020.05 | Best practices for using NIA AI training data (Korean-English Neural Machine Translation model), NIA |
2019.10 | Best Paper Award, The 31st Annual Conference on Human & Cognitive Language Technology (HCLT2019) |
2019.10 | 1st Place in Microsoft AI Accessibility Hackathon in Korea (Team Lead), Microsoft |
2019.03 | Graduate School Associate Scholarship, Sungkyunkwan University |
2018.10 | Next Generation Information Processing NLP Competition 2018: Participation Award, Next-generation information computing technology development business |
2017.12 | Scholarship for academic excellence, Sooyoungro Church |
2017.06 | Bit Computer Excellence Award (President Award), Bit Computer |
2016.12 | Scholarship for academic excellence, Sooyoungro Church |
2015.03 | Full Scholarship, BUFS |
Year | Place | Contents |
---|---|---|
2023.02 | AI·DATA SUMMIT 2023 | Real-World Centric AI (Video) |
2022.12 | Sunmoon University | Real-World Centric AI |
2022.08 | Kyungsung University | Language and Information Studies and the Future of Artificial Intelligence |
2022.07 | Hankuk University of Foreign Studies | Basic practice of natural language processing for everyone |
2022.01 | Dongguk University | Artificial intelligence and Machine Translation |
2021.07 | Busan Social Welfare Development Group | Advisory meetings and focus group interview |
2020.03 | LLsoLLu | Latest natural language processing Research |
2020.02 | NC SOFT | Technology Transfer Seminar |
2020.01 | Dongguk University | A.I - NLP - MT for Liberal Arts |
2019.10-2019.11 | SKC | Text Preprocessing, Machine Translation, Language Embedding |
2019.08 | SK T Academy | Machine Translation for everyone |
2019.08 | NAVER | Machine Translation for everyone |