Bidirectional Encoder Representations from Transformers (BERT)

BERT is based on the Transformer, a deep learning architecture in which every output element is connected to every input element and the weightings between them are computed dynamically through attention.

  • Introduced in the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" (Devlin et al., 2018)

  • Model size and performance depend on which configuration we choose (the parameter counts below can be verified as in the first sketch after this list):

    • BERT Base

      12 encoder layers, ~110M parameters

    • BERT Large

      24 encoder layers, ~340M parameters

  • A major application is feature extraction: producing dynamic / contextual embeddings (see the embedding sketch after this list)

    Dynamic (contextual) embeddings: the embedding of a word changes depending on the context surrounding it

  • Conditions on context in both directions, left and right, thanks to self-attention (see the masked-LM sketch after this list)

    This allows BERT to capture truly bidirectional semantics of language and understand it at a depth that even LSTMs and BiLSTMs could not reach

  • BERT explained: Training, Inference, BERT vs GPT/LLaMA, Fine-tuning, [CLS] token (see the fine-tuning sketch below)
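
The configuration sizes quoted above can be sanity-checked directly. A minimal sketch using the Hugging Face transformers library; the checkpoint names bert-base-uncased and bert-large-uncased are illustrative choices, not from the notes:

```python
from transformers import AutoModel

# Load each checkpoint and count its trainable parameters.
for name in ("bert-base-uncased", "bert-large-uncased"):
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {model.config.num_hidden_layers} encoder layers, "
          f"{n_params / 1e6:.0f}M parameters")
```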
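
To illustrate contextual embeddings, the sketch below extracts the vector for the word "bank" in two different sentences and compares them. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint; the sentences are invented for illustration:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def word_embedding(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of the first occurrence of `word`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)          # last_hidden_state: (1, seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    idx = tokens.index(word)               # position of the word piece
    return outputs.last_hidden_state[0, idx]

# The same surface form "bank" gets different vectors in different contexts.
v_river = word_embedding("he sat on the bank of the river .", "bank")
v_money = word_embedding("she deposited cash at the bank .", "bank")
cos = torch.nn.functional.cosine_similarity(v_river, v_money, dim=0)
print(f"cosine similarity between the two 'bank' embeddings: {cos.item():.3f}")
```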
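
The bidirectional behaviour comes from the masked-language-modelling pre-training objective: a token is hidden and must be predicted from both its left and right context, which self-attention makes visible at once. A minimal sketch, again assuming transformers and bert-base-uncased:

```python
import torch
from transformers import AutoTokenizer, BertForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# The hidden word can only be recovered by using context on BOTH sides:
# the left context "The" and the right context "of France is Paris."
text = "The [MASK] of France is Paris."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits        # (1, seq_len, vocab_size)

mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
predicted_id = logits[0, mask_pos].argmax().item()
print(tokenizer.decode([predicted_id]))    # expected: "capital"
```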
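
For fine-tuning, a classification head is typically placed on top of the [CLS] token's final representation; BertForSequenceClassification follows this pattern. The sketch below runs a few toy training steps; the data, labels, and hyperparameters are placeholders chosen only to show the mechanics:

```python
import torch
from transformers import AutoTokenizer, BertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Toy, invented data: two sentences with binary sentiment labels.
texts = ["great movie, loved it", "terrible, a waste of time"]
labels = torch.tensor([1, 0])

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for step in range(3):                      # a few illustrative steps
    optimizer.zero_grad()
    outputs = model(**inputs, labels=labels)   # logits are built on the [CLS] representation
    outputs.loss.backward()
    optimizer.step()
    print(f"step {step}: loss = {outputs.loss.item():.3f}")
```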