Overview
This week focuses on Natural Language Processing (NLP), a crucial field in AI that deals with the interaction between computers and humans using natural language. We’ll explore fundamental concepts, modern architectures, and practical applications of NLP.
Instructor
Zilong Zheng, BIGAI
Topics Covered
- From PyTorch to Transformers
- Tokenization, Embedding, and Attention mechanisms (see the attention sketch after this list)
- Encoder-Decoder and Transformer architectures
- Introduction to LangChain and LlamaIndex (Optional)
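The attention topic above centers on scaled dot-product attention, the core operation inside every Transformer layer. The snippet below is a minimal, self-contained PyTorch sketch for intuition, not the course's reference implementation; the tensor shapes and the optional boolean mask are illustrative assumptions.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Minimal scaled dot-product attention (illustrative sketch).

    q, k, v: (batch, seq_len, d_k) tensors.
    mask: optional boolean tensor broadcastable to
          (batch, seq_len, seq_len); True = attend, False = block.
    """
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) to keep softmax gradients stable.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ v                       # weighted sum of values

# Toy usage: batch of 2 sequences, length 4, dimension 8.
q = k = v = torch.randn(2, 4, 8)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 4, 8])
```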
Assignments
Practice Assignment:
- Implement a Transformer-based model for the MultiNLI task using the Hugging Face Transformers library and PyTorch (a starter sketch follows this list).
- Fine-tune the model on the MultiNLI dataset and include the loss curve in your report.
- Evaluate the model's performance using appropriate metrics (accuracy, F1 score) and include these scores in your report.
- Provide an analysis of the results, including examples of correct and incorrect predictions.
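If it helps to get started, here is a minimal fine-tuning sketch using the Hugging Face `datasets` and `transformers` libraries. The `bert-base-uncased` checkpoint, the hyperparameters, and the 20k-example training subset are placeholder assumptions, not required choices; MultiNLI has three labels (entailment, neutral, contradiction).

```python
# Sketch of MultiNLI fine-tuning; hyperparameters below are placeholders.
import numpy as np
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-uncased"  # assumed baseline; any encoder works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=3)  # entailment / neutral / contradiction

dataset = load_dataset("multi_nli")

def tokenize(batch):
    # Each MultiNLI example pairs a premise with a hypothesis.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

encoded = dataset.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    preds = np.argmax(eval_pred.predictions, axis=-1)
    labels = eval_pred.label_ids
    return {"accuracy": accuracy_score(labels, preds),
            "f1_macro": f1_score(labels, preds, average="macro")}

args = TrainingArguments(
    output_dir="mnli-finetune",
    per_device_train_batch_size=32,
    num_train_epochs=1,   # placeholder; tune for your report
    logging_steps=100,    # logged losses feed your loss curve
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"].shuffle(seed=42).select(range(20000)),
    eval_dataset=encoded["validation_matched"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())
```

Training on the full MultiNLI train split (≈393k examples) is fine too; the subset above just keeps a first run short while you verify the pipeline end to end.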
Written Assignment: For the Written Assignment, submit only a PDF report (written in LaTeX) to your GitHub Classroom repository. The report should describe your implementation approach for the Practice Assignment (excluding the Bonus part).
Assignment: Natural Language Processing
Additional Resources
- Hugging Face Transformers Documentation
- Attention Is All You Need (Transformer paper)
- LangChain Documentation
- LlamaIndex Documentation
Notes
- This module builds upon your PyTorch knowledge and introduces NLP-specific concepts and libraries.
- Pay special attention to the attention mechanism and its implementation in Transformers.
- For the practice assignment, consider experimenting with pre-trained models and fine-tuning them for your specific task.
- As always, document your code thoroughly and use version control (Git) for your project.
- Submit your assignments on GitHub Classroom.