-
-
-
D. V. Khmelev and W. J. Teahan.
Verification of text collections for text categorization and natural language processing.
Technical Report AIIA 03.1, School of Informatics,
University of Wales, Bangor, 2003.
-
-
W. J. Teahan and D. J. Harper.
Using compression based language models for text categorization.
In J. Callan, B. Croft and J. Lafferty, editors, Workshop on Language Modeling
and Information Retrieval, pages 83-88. ARDA,
Carnegie Mellon University, 2001.
-
-
W. J. Teahan and D. J. Harper.
Combining PPM models using a text mining approach.
In J. A. Storer and M. Cohn, editors, Proceedings of the IEEE
Data Compression Conference, pages 153-162. IEEE Computer Society,
Snowbird, UT, 2001.
ISBN 0-7695-0594-5.
-
-
W. J. Teahan, Y. Wen*, R. McNab*, and I. H. Witten*.
A Compression-based Algorithm for Chinese Word
Segmentation.
Computational Linguistics, 26(3):375-393, 2000.
ISSN 0891-2017.
-
-
W. J. Teahan.
Text Classification and Segmentation Using Minimum
Cross-Entropy.
In Proceedings of the International Conference on Content-based
Multimedia Information Access (RIAO 2000), pages 943-961. C.I.D.-C.A.S.I.S,
Paris, France, 2000.
ISBN 2-905450-07-X.
-
-
W. J. Teahan.
An improved interface for probabilistic models of
text.
Technical Report, School of Computer and Mathematical Sciences,
The Robert Gordon University, 2000.
-
-
J. G. Cleary* and W. J. Teahan.
An open interface for probabilistic models of
text.
In Proceedings of the IEEE Data Compression Conference. IEEE
Computer Society Press, 1999.
-
-
I. H. Witten*, Z. Bray*, M. Mahoui*, and W. J. Teahan.
Text mining: A new frontier for lossless compression.
In IEEE Data Compression Conference. IEEE Computer Society
Press, 1999.
-
-
I. H. Witten*, Z. Bray*, M. Mahoui*, and W. J. Teahan.
Using language models for generic entity extraction.
In ICML-99 Workshop on Machine learning in text data analysis.
Stockholm, Sweden, 1999.
-
-
William John Teahan.
Modelling English Text.
PhD thesis, University of Waikato, 1998.
-
-
W. J. Teahan, S. Inglis*, J. G. Cleary*, and G. Holmes*.
Correcting English text using PPM models.
In IEEE Data Compression Conference, pages 43-52. IEEE
Computer Society, Snowbird, UT, 1998.
ISBN 0-8186-8406-2.
-
-
W. J. Teahan and J. G. Cleary*.
Tag-based models of English text.
In IEEE Data Compression Conference, pages 43-52. IEEE
Computer Society, Snowbird, UT, 1998.
ISBN 0-8186-8406-2.
-
-
J. G. Cleary* and W. J. Teahan.
Unbounded length contexts for
PPM.
Computer Journal, 40(2/3):67-75, 1997.
ISSN 1460-2067.
Invited Paper.
-
-
W. J. Teahan and J. G. Cleary*.
Models of English text.
In IEEE Data Compression Conference. IEEE Computer Society
Press, 1997.
-
-
W. J. Teahan and J. G. Cleary*.
The entropy of English using PPM-based models.
In IEEE Data Compression Conference. IEEE Computer Society
Press, 1996.
-
-
J. G. Cleary* and W. J. Teahan.
Unbounded length contexts for PPM.
In IEEE Data Compression Conference. IEEE Computer Society
Press, 1995.
-
-
J. G. Cleary* and W. J. Teahan.
Experiments on the zero frequency problem.
In IEEE Data Compression Conference. IEEE Computer Society
Press, 1995.
-
-
W. J. Teahan.
Probability estimation for PPM.
In Proceedings of the New Zealand Computer Science Research Students' Conference. University of Waikato, Hamilton, New Zealand,
1995.
Bill Teahan (wjt@informatics.bangor.ac.uk)