MSU AI Club

Transformers

In this series of workshops, we learned about a relatively new kind of neural network called the Transformer. Like RNNs, transformers are great for working with sequences of data, like text, time series, audio, and even video and images.


In recent months, transformers have been at the core of some of the greatest breakthroughs in language models. If you've used ChatGPT or Bing Chat, you've seen what transformers are capable of.


This workshop series followed our usual two-week format.


Part A: Intro to Transformers

In the first week, members learned how transformers use the attention mechanism to model complex relationships between tokens in a sequence, and how they process an entire input in one forward pass, letting them train faster than sequential models like RNNs.
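
To make that concrete, here is a minimal sketch of scaled dot-product self-attention in plain NumPy. This is an illustration of the mechanism, not the workshop's code: each output token is a weighted mix of every value vector, and the whole sequence is handled in a couple of matrix multiplies.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention for one head.

    Q, K, V: arrays of shape (seq_len, d_k) holding the query,
    key, and value vectors for every token in the sequence.
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled to keep
    # the softmax in a well-behaved range.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the key dimension: each row becomes a set of
    # attention weights that sums to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output token is a weighted mix of all value vectors,
    # computed for the whole sequence at once.
    return weights @ V

# Toy example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)  # (4, 8)
```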


In particular, we explored how transformers are used in language modelling to translate, generate, and classify text. Members trained their own transformer to classify product reviews, tweets, and other sentences as positive or negative.
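
If you want a quick taste of this kind of classification without training anything, Hugging Face's pipeline API can run a ready-made sentiment model. This is an illustrative snippet, not the workshop notebook:

```python
# Transformer-based sentiment analysis with a pretrained model.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default model

reviews = [
    "This product exceeded my expectations!",
    "Broke after two days. Total waste of money.",
]
for result in classifier(reviews):
    # Each result is a dict with a predicted label and a confidence score.
    print(result["label"], round(result["score"], 3))
```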


Members learned that the process they used is called fine-tuning.

There are many publicly available machine learning models that were trained on massive datasets and perform well on general language tasks. A common approach is to start with one of these existing models as a base and train it on a smaller dataset to adapt it to a specific task.
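
Below is a minimal sketch of that fine-tuning workflow with the Hugging Face Trainer. The base model (distilbert-base-uncased) and dataset (imdb) are illustrative stand-ins, not necessarily what we used in the workshop:

```python
# Minimal fine-tuning sketch: start from a pretrained base model and
# train it briefly on a labeled sentiment dataset.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

base = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

dataset = load_dataset("imdb")  # movie reviews labeled positive/negative

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)
small_train = dataset["train"].shuffle(seed=42).select(range(2000))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sentiment-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=small_train,
)
trainer.train()  # the general-purpose base model is now a sentiment classifier
```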


Slides

Workshop





Part B: Fine-tuning a Text Generator

In the second week, members used GPT-2 from Hugging Face to build their own text generator, fine-tuned to write like Shakespeare.
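
For reference, here is a rough sketch of how such a fine-tune can be set up with the Hugging Face Trainer. The file name shakespeare.txt and the hyperparameters are placeholders; since members built their solutions from scratch, your version may look quite different:

```python
# Sketch: continue training pretrained GPT-2 on Shakespeare's text,
# then sample from the fine-tuned model.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Load a plain-text corpus of Shakespeare (the path is an assumption).
raw = load_dataset("text", data_files={"train": "shakespeare.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = raw["train"].map(tokenize, batched=True, remove_columns=["text"])
tokenized = tokenized.filter(lambda ex: len(ex["input_ids"]) > 0)  # drop blanks

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-shakespeare",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=tokenized,
    # mlm=False means standard next-token (causal) language modelling.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Generate in the fine-tuned style.
prompt = tokenizer("Shall I", return_tensors="pt")
out = model.generate(**prompt, max_new_tokens=40, do_sample=True)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```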


Slides

In this workshop, instead of following a notebook, members worked from scratch, with the help of ChatGPT! If you would like to try it yourself, take a look at the slideshow:



Join the Conversation

If you want to learn more about transformers, see what students made in these workshops, or stay up to date with the AI Club community, join our Discord server.

