Master's thesis proposal: Natural Language Processing of Textual Use Cases

Advisor: Vladimir Mencl
Student: Jaroslav Dražan

The design of a software system or component starts with specifying its requirements; traditionally, use cases written in natural language (English) are used for this task. Based on the simple and uniform sentence structure used in textual use cases [10], a conversion scheme [1, 3] has been proposed in the Procasor project [12] to derive behavior specifications from textual use cases. The scheme has been implemented in a prototype tool, employing a suite of readily available natural language processing tools [7, 8, 9].

In this preliminary work, certain issues remain open, such as evaluating the quality of the parse tree provided by the linguistic tools. Recent advances in natural language processing tools [5] make it possible to obtain several candidate parse trees for a single sentence; furthermore, several different parsers are available, and they may yield different parse trees.

The goal of the thesis is to build on the conversion scheme described in [1, 3] and to propose a metric for evaluating the quality of a parse tree. The thesis should address the issue of evaluating several parse trees of a sentence specifying a use case step, and possibly also the issue of combining the information available in these parse trees. The thesis should also address the issue of constructing matching event tokens for complementary send / receive actions in use case models of communicating entities. The thesis should be supported by a proof-of-concept implementation.
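To make the idea of choosing among several parse trees concrete, the following is a minimal sketch only. It is not the metric the thesis is to propose: the tree representation (nested tuples), the names `labels`, `score` and `best_tree`, the `EXPECTED` label set, and the example sentence are all assumptions made for illustration. The toy score simply checks whether a candidate tree contains the constituents one might expect in a simple use case step.

```python
from typing import Iterator, Sequence, Tuple, Union

# Assumed representation: a tree is (label, child, ...); a leaf is a plain word.
Tree = Union[str, Tuple]

# Hypothetical expectation for a use case step: a sentence node,
# at least one noun phrase, and a verb phrase.
EXPECTED = ("S", "NP", "VP")


def labels(tree: Tree) -> Iterator[str]:
    """Yield every constituent label in the tree, depth-first."""
    if isinstance(tree, str):
        return  # a leaf word carries no constituent label
    yield tree[0]
    for child in tree[1:]:
        yield from labels(child)


def score(tree: Tree) -> float:
    """Toy metric: fraction of the expected labels present in the tree."""
    found = set(labels(tree))
    return sum(1 for label in EXPECTED if label in found) / len(EXPECTED)


def best_tree(candidates: Sequence[Tree]) -> Tree:
    """Return the highest-scoring candidate (the first one on ties)."""
    return max(candidates, key=score)


# Two made-up candidate parses of the step "Seller submits an offer".
candidates = [
    ("S",
     ("NP", ("NNP", "Seller")),
     ("VP", ("VBZ", "submits"),
            ("NP", ("DT", "an"), ("NN", "offer")))),
    # A poor parse that mistakes the verb for a plural noun.
    ("NP", ("NNP", "Seller"), ("NNS", "submits"),
           ("NP", ("DT", "an"), ("NN", "offer"))),
]

chosen = best_tree(candidates)
```

A real metric would presumably also have to weigh parser confidence and agreement between parsers; the sketch only shows the selection step once some score is available.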

References

[1] Mencl, V.: Deriving Behavior Specifications from Textual Use Cases, in Proceedings of Workshop on Intelligent Technologies for Software Engineering (WITSE04, Sep 21, 2004, part of ASE 2004), Linz, Austria, ISBN 3-85403-180-7, pp. 331-341, Oesterreichische Computer Gesellschaft, Sep 2004

[2] Plasil, F., Mencl, V.: Getting "Whole Picture" Behavior in a Use Case Model, in Transactions of the SDPS: Journal of Integrated Design and Process Science, vol. 7, no. 4, pp. 63-79, Dec 2003, ISSN-1092-0617, publisher: Society for Design and Process Science, Grandview, Texas, slightly modified version of paper published in Proceedings of IDPT 2003, Dec 2003

[3] Mencl, V.: Converting Textual Use Cases into Behavior Specifications, Tech. Report No. 2004/5, Dept. of SW Engineering, Charles University, Prague, Aug 2004

[4] Plasil, F., Mencl, V.: Use Cases: Assembling "Whole Picture" Behavior, Technical Report 02/11, Department of Computer Science, University of New Hampshire, NH, U.S.A., Nov 2002

[5] Bikel, D. M.: Design of a Multi-lingual, Parallel-processing Statistical Parsing Engine, in Proceedings of HLT 2002, http://www.cis.upenn.edu/~dbikel/software.html#stat-parser

[6] Charniak, E.: Statistical Techniques for Natural Language Parsing, AI Magazine, 18(4), pp. 33-44, 1997

[7] Collins, M.: A New Statistical Parser Based on Bigram Lexical Dependencies, in Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics (ACL 1996, Jun 24-27, 1996), University of California, Santa Cruz, California, USA, Morgan Kaufmann Publishers, 1996, http://www.cis.upenn.edu/~mcollins/

[8] Ratnaparkhi, A.: A Maximum Entropy Part-Of-Speech Tagger, in Proceedings of the Empirical Methods in Natural Language Processing Conference (May 17-18, 1996), University of Pennsylvania, 1996, http://www.cis.upenn.edu/~adwait/statnlp.html

[9] Minnen, G., Carroll, J., Pearce, D.: Applied morphological processing of English, Natural Language Engineering, 7(3), pp. 207-223, 2001, http://www.informatics.susx.ac.uk/research/nlp/carroll/abs/01mcp.html

[10] Cockburn, A.: Writing Effective Use Cases, Addison-Wesley Pub Co, ISBN: 0201702258, 1st edition, Jan 2000

[11] List of related work on requirement specifications and use cases, http://nenya.ms.mff.cuni.cz/related.phtml?p=reqspecuc

[12] Procasor project, http://nenya.ms.mff.cuni.cz/procasor