Overview
This week focuses on Natural Language Processing (NLP), a crucial field in AI that deals with the interaction between computers and humans using natural language. We’ll explore fundamental concepts, modern architectures, and practical applications of NLP.
Instructor
Zilong Zheng, BIGAI
Topics Covered
- From PyTorch to Transformers
- Tokenization, Embedding, and Attention mechanisms (see the attention sketch after this list)
- Encoder-Decoder and Transformer architectures
- Introduction to LangChain and LlamaIndex (Optional)
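The attention topic above centers on scaled dot-product attention, the core operation inside every Transformer layer. The snippet below is a minimal, self-contained PyTorch sketch for intuition, not the course's reference implementation; the tensor shapes and the optional boolean mask are illustrative assumptions.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Minimal scaled dot-product attention (illustrative sketch).

    q, k, v: (batch, seq_len, d_k) tensors.
    mask: optional boolean tensor broadcastable to
          (batch, seq_len, seq_len); True = attend, False = block.
    """
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) to keep softmax gradients stable.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # attention distribution over keys
    return weights @ v                       # weighted sum of values

# Toy usage: batch of 2 sequences, length 4, dimension 8.
q = k = v = torch.randn(2, 4, 8)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 4, 8])
```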
Assignments
Practice Assignment:
- Implement a Transformer-based model for the MultiNLI task using the Hugging Face Transformers library and PyTorch (a starter sketch follows this list).
- Fine-tune the model on the MultiNLI dataset and include the loss curve in your report.
- Evaluate the model's performance using appropriate metrics (accuracy, F1 score) and include these scores in your report.
- Provide an analysis of the results, including examples of correct and incorrect predictions.
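If it helps to get started, here is a minimal fine-tuning sketch using the Hugging Face `datasets` and `transformers` libraries. The `bert-base-uncased` checkpoint, the hyperparameters, and the 20k-example training subset are placeholder assumptions, not required choices; MultiNLI has three labels (entailment, neutral, contradiction).

```python
# Sketch of MultiNLI fine-tuning; hyperparameters below are placeholders.
import numpy as np
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "bert-base-uncased"  # assumed baseline; any encoder works
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=3)  # entailment / neutral / contradiction

dataset = load_dataset("multi_nli")

def tokenize(batch):
    # Each MultiNLI example pairs a premise with a hypothesis.
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

encoded = dataset.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    preds = np.argmax(eval_pred.predictions, axis=-1)
    labels = eval_pred.label_ids
    return {"accuracy": accuracy_score(labels, preds),
            "f1_macro": f1_score(labels, preds, average="macro")}

args = TrainingArguments(
    output_dir="mnli-finetune",
    per_device_train_batch_size=32,
    num_train_epochs=1,   # placeholder; tune for your report
    logging_steps=100,    # logged losses feed your loss curve
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"].shuffle(seed=42).select(range(20000)),
    eval_dataset=encoded["validation_matched"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())
```

Training on the full MultiNLI train split (≈393k examples) is fine too; the subset above just keeps a first run short while you verify the pipeline end to end.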
Written Assignment: For the Written Assignment, submit only a PDF report (written in LaTeX) to your GitHub Classroom repository. The report should describe your implementation approach for the Practice Assignment (excluding the Bonus part).
Assignment: Natural Language Processing
Additional Resources
- Hugging Face Transformers Documentation
- Attention Is All You Need (Transformer paper)
- LangChain Documentation
- LlamaIndex Documentation
Notes
- This module builds upon your PyTorch knowledge and introduces NLP-specific concepts and libraries.
- Pay special attention to the attention mechanism and its implementation in Transformers.
- For the practice assignment, consider experimenting with pre-trained models and fine-tuning them for your specific task.
- As always, document your code thoroughly and use version control (Git) for your project.
- Submit your assignments on GitHub Classroom.