Step through Word2Vec skip-gram — watch a center word predict context words through embedding lookup, dot products, and softmax.
Shared Embedding Matrix
Word2Vec trains a shallow neural network to predict context words from a center word. After training, the rows of the input weight matrix serve as the word embeddings, so words that appear in similar contexts end up geometrically close.
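The forward pass described above (embedding lookup, dot products, softmax) can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the toy vocabulary, the embedding dimension, the random initialization, and the names `W_in`, `W_out`, and `skipgram_probs` are all assumptions made for the example. It uses a full softmax over the vocabulary; practical implementations often replace this with negative sampling or hierarchical softmax.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and sizes (assumed for illustration only).
vocab = ["the", "cat", "sat", "on", "mat"]
V, D = len(vocab), 3  # vocabulary size, embedding dimension

# Input matrix: row i is the embedding of word i.
W_in = rng.normal(scale=0.1, size=(V, D))
# Output matrix: row j is the context vector used to score word j.
W_out = rng.normal(scale=0.1, size=(V, D))


def skipgram_probs(center_idx):
    """Return P(context word | center word) over the whole vocabulary."""
    h = W_in[center_idx]           # embedding lookup: select one row
    scores = W_out @ h             # one dot product per vocabulary word
    scores = scores - scores.max() # shift for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()  # softmax


center = vocab.index("cat")
context = vocab.index("sat")

probs = skipgram_probs(center)
loss = -np.log(probs[context])  # cross-entropy for one (center, context) pair

print(dict(zip(vocab, probs.round(3))))
print("loss:", loss)
```

With untrained random weights the distribution is close to uniform. Training would adjust `W_in` and `W_out` by gradient descent on this loss, which pulls together the embeddings of words that share contexts.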