In this unit, you will be introduced to the foundational concepts of NLP, such as its history, its goals and its intersection with other fields of inquiry.
This unit covers speech technology: its current status, state-of-the-art developments and recent trends, along with a brief look at speech synthesis.
This unit covers some aspects of using NLP in morphology and morphological analysis.
This unit covers topics related to text corpora and the linguistic principles that must be observed when designing and manipulating such corpora.
This course introduces you to the foundations of Natural Language Processing (NLP). It presents theoretical and practical issues in NLP and suggests mechanisms for addressing them.
This course is about Natural Language Processing, the study of using computers in generating and understanding natural languages.
Some background in computer science or linguistics is preferred. This course is suitable for linguistics students and computer science students who want to specialize in natural language processing.
Study the materials, pass the exams and submit your assignments. Submitting assignments is optional. Consider submitting them if you want to improve your research skills and receive feedback on your writing.
In order to obtain a certificate, you should score at least 50 out of 100.
Yes, you can retake as many times as you need until you get a certificate.
This course carries 3 full-time credits. It is equivalent to a paper in a Master of Arts degree specialization.
The marks are calculated as follows:
1. First Internal Assessment: 20 Marks.
2. Second Internal Assessment: 20 Marks.
3. Final Exam: 60 Marks.
Definition of NLP. History of NLP. Goals of NLP. NLP from an artificial intelligence perspective. NLP as an interdisciplinary field. Open problems in NLP: speech processing, morphological processing, semantics and pragmatics.
Speech technology. State-of-the-art speech technology. Current trends in speech technology: memory cost and parallel processing, intelligent processing. Speech synthesis: frontend and backend components.
Morphological processing. Definition of morphology, root, stem and base. Types of morphology: inflectional morphology, derivational morphology and non-concatenative morphology. Tokenization. Part-of-speech tagging: POS taggers, tagset, rule-based taggers, statistical taggers, Hidden Markov Models, maximum entropy models. Stemming and lemmatization.
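As a rough illustration of two topics in this unit, tokenization and stemming, the following minimal Python sketch splits a sentence into tokens and strips a few common English suffixes. The function names and the suffix list are hypothetical choices made for this example, not part of the course materials; practical stemmers apply far more rules.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens and standalone punctuation marks."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?|[^\sa-z0-9]", text.lower())

# Hypothetical suffix list, for illustration only; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The cats walked; dogs are barking.")
stems = [stem(t) for t in tokens]
print(tokens)  # ['the', 'cats', 'walked', ';', 'dogs', 'are', 'barking', '.']
print(stems)   # ['the', 'cat', 'walk', ';', 'dog', 'are', 'bark', '.']
```

Note that a stemmer of this kind only trims strings; lemmatization, by contrast, maps a word to its dictionary form and generally needs a lexicon or morphological analysis.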
Text corpora. Definition of text corpus. Chomskyan linguistics vs. corpus linguistics. Corpus design: proportional sampling and stratified sampling. Diversity and size in corpus design: number of texts, number of samples and number of words. Methods of corpus linguistics: annotation, abstraction and analysis. Parallel corpora, uses and examples.
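To make the sampling ideas in this unit more concrete, here is a small Python sketch that draws texts from a population grouped by genre, either with an equal quota per genre or with quotas proportional to each genre's share of the population. The genre names, population sizes and the function `sample_by_stratum` are assumptions invented for this example, and the precise definitions of proportional and stratified sampling used in the course may differ.

```python
import random

def sample_by_stratum(texts_by_genre, sample_size, proportional, seed=0):
    """Draw texts from each genre (stratum).

    proportional=True  -> each genre's quota follows its share of the population.
    proportional=False -> each genre gets the same quota.
    Rounding means the quotas may not add up exactly to sample_size.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = sum(len(texts) for texts in texts_by_genre.values())
    sample = {}
    for genre, texts in texts_by_genre.items():
        if proportional:
            quota = round(sample_size * len(texts) / total)
        else:
            quota = sample_size // len(texts_by_genre)
        sample[genre] = rng.sample(texts, min(quota, len(texts)))
    return sample

# Hypothetical population: 60 news texts, 30 fiction texts, 10 academic texts.
population = {
    "news": [f"news_{i}" for i in range(60)],
    "fiction": [f"fiction_{i}" for i in range(30)],
    "academic": [f"academic_{i}" for i in range(10)],
}

proportional_sample = sample_by_stratum(population, 10, proportional=True)
equal_sample = sample_by_stratum(population, 10, proportional=False)
print({g: len(t) for g, t in proportional_sample.items()})  # {'news': 6, 'fiction': 3, 'academic': 1}
print({g: len(t) for g, t in equal_sample.items()})         # {'news': 3, 'fiction': 3, 'academic': 3}
```

With proportional quotas, small genres contribute very few texts, which is one reason corpus designers sometimes prefer fixed quotas per genre to keep the corpus diverse.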
Steven Bird et al. (2009). Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. Bangalore: Shroff.