MSU AI Club
Transformers
In this series of workshops, we learned about a relatively new kind of neural network called the Transformer. Like RNNs, transformers are great for working with sequences of data, such as text, time series, audio, and even video and images.
In recent months, transformers have been at the core of some of the greatest breakthroughs in language models. If you've used ChatGPT or Bing Chat, you've seen what transformers are capable of.
This workshop series followed our usual two-week format.
Part A: Intro to Transformers
In the first week, members learned how transformers use the attention mechanism to model complex relationships between tokens in a sequence, and how they process an entire input in one forward pass, allowing them to train faster than recurrent networks.
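To make the attention idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain NumPy. The shapes are toy values for illustration; real transformers add learned query/key/value projections and multiple attention heads.

```python
# A minimal sketch of scaled dot-product self-attention (NumPy only).
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Every query is compared against every key, so each token can
    # relate to every other token in the sequence at once.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) pairwise relationships
    weights = softmax(scores, axis=-1)   # how much each token attends to the others
    return weights @ V                   # weighted mix of the value vectors

# Toy example: a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = attention(x, x, x)  # self-attention: Q, K, and V come from the same sequence
print(out.shape)          # (4, 8): one updated vector per token, computed in parallel
```

Because the whole sequence is handled with a few matrix multiplications rather than a token-by-token loop, the computation parallelizes well, which is where the training speedup over RNNs comes from.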
In particular, we explored how transformers are used in language modeling to translate, generate, and classify text. Members trained their own transformer to classify product reviews, tweets, and other sentences as positive or negative.
Members learned that the process they used is called fine-tuning. Many publicly available models have been pretrained on massive datasets and perform well on general language tasks, so a common approach to building a model for a specific task is to start from one of these pretrained models as a base and train it further on a smaller, task-specific dataset.
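As a rough sketch of what that looks like in practice, here is an example using the Hugging Face transformers and datasets libraries. The base model (distilbert-base-uncased) and the IMDB review dataset are stand-ins for illustration, not necessarily what the workshop used:

```python
# Sketch: fine-tuning a pretrained transformer for sentiment classification.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # illustrative base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
# num_labels=2: positive vs. negative
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize a small slice of a labeled review dataset.
dataset = load_dataset("imdb", split="train[:2000]")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sentiment-model", num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()  # updates the pretrained weights on the task-specific data
```

Because the base model already understands general language, only a small labeled dataset and a short training run are needed to adapt it to the sentiment task.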
Slides
Workshop
Part B: Fine-tuning a Text Generator
In the second week, members fine-tuned GPT-2 from Hugging Face to build their own text generator that writes like Shakespeare.
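For a sense of what such a fine-tune can look like, here is a sketch using the Hugging Face Trainer. Here shakespeare.txt is a hypothetical local text file of plays and sonnets, and the code members actually wrote in the workshop may have differed:

```python
# Sketch: fine-tuning GPT-2 on a Shakespeare corpus, then sampling from it.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, TextDataset,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Chunk the corpus into fixed-length blocks of token ids.
# "shakespeare.txt" is a placeholder; any plain-text corpus works the same way.
train_data = TextDataset(tokenizer=tokenizer,
                         file_path="shakespeare.txt",
                         block_size=128)
# mlm=False: GPT-2 is a causal (left-to-right) language model.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-shakespeare", num_train_epochs=1),
    data_collator=collator,
    train_dataset=train_data,
)
trainer.train()

# Generate a sample from the fine-tuned model.
prompt = tokenizer("Shall I", return_tensors="pt")
output = model.generate(**prompt, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(output[0]))
```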
Slides
In this workshop, instead of following a notebook, members worked from scratch, with the help of ChatGPT! If you would like to try it yourself, take a look at the slideshow:
Join the Conversation
If you want to learn more about transformers, see what students made in these workshops, or stay up to date with the AI Club community, join our Discord server.