Xuan-Hieu Phan (pxhieu at gmail dot com), Graduate School of Information Sciences, Tohoku University

JTextPro: A Java-based text processing toolkit that currently includes important processing steps for natural language/text processing.

This project aims to integrate the essential processing steps needed to develop higher-level applications in natural language processing, text/web data mining, and information extraction. The tool frees developers from time-consuming preprocessing work so that they can focus on developing their own features. In the future, we will continue to improve the performance of the toolkit and add new features while keeping the source code well organized and easy to incorporate into other applications.


Researchers who use this tool to run experiments should include the following citation:

Xuan-Hieu Phan, "JTextPro: A Java-based Text Processing Toolkit", 2006.

We would like to thank Professor Tu-Bao Ho for providing us with the Penn Treebank data used to train the POS tagging and chunking models. We would also like to thank for hosting this project.

