Mapping Layer Similarity in LLMs with RSA & CKA

Ritik Bompilwar

2025 · Representational similarity analysis

Overview

This project applies Representational Similarity Analysis (RSA) and Centered Kernel Alignment (CKA) to study layer-wise similarity patterns in decoder-only LLMs across parallel English–Hindi prompts. For each layer, activations are standardized, a Gram matrix is built and double-centered, then flattened and L2-normalized into a CKA signature; cosine similarity between signatures recovers linear CKA. Stacking signatures and eigendecomposing their similarity matrix yields a 3D joint projection that makes cross-lingual alignment directly visible.
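The per-layer pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's actual code: the function names `cka_signature` and `joint_projection` are hypothetical, per-feature z-scoring is assumed for "standardized", and scaling the eigenvectors by the square root of their eigenvalues is one common convention for turning an eigendecomposition into coordinates.

```python
import numpy as np


def cka_signature(acts: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Turn one layer's activations (n_prompts, hidden_dim) into a unit-norm vector.

    Hypothetical helper: the name and the z-scoring choice are assumptions.
    """
    # Standardize each feature across prompts (assumed meaning of "standardized").
    z = (acts - acts.mean(axis=0)) / (acts.std(axis=0) + eps)
    # Linear-kernel Gram matrix over prompts, shape (n_prompts, n_prompts).
    gram = z @ z.T
    # Double-center: H K H with H = I - (1/n) 1 1^T.
    n = gram.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    centered = h @ gram @ h
    # Flatten and L2-normalize so a dot product of two signatures is linear CKA.
    flat = centered.ravel()
    return flat / (np.linalg.norm(flat) + eps)


def joint_projection(signatures: np.ndarray, k: int = 3) -> np.ndarray:
    """Embed stacked signatures (n_layers_total, n_prompts**2) in k dimensions.

    Assumption: coordinates are eigenvectors of the signature similarity
    matrix scaled by the square root of their eigenvalues.
    """
    sim = signatures @ signatures.T          # pairwise CKA between all layers
    evals, evecs = np.linalg.eigh(sim)       # eigh returns ascending eigenvalues
    top = np.argsort(evals)[::-1][:k]        # indices of the k largest
    return evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))


# Usage with random stand-in activations (real inputs would be hidden states
# collected on the same prompts in English and Hindi).
rng = np.random.default_rng(0)
en_layers = [rng.normal(size=(32, 64)) for _ in range(4)]
hi_layers = [rng.normal(size=(32, 64)) for _ in range(4)]
sigs = np.stack([cka_signature(a) for a in en_layers + hi_layers])
layerwise_cka = np.sum(sigs[:4] * sigs[4:], axis=1)  # EN layer i vs HI layer i
coords = joint_projection(sigs, k=3)                 # shape (8, 3)
```

Both languages must use the same prompts in the same order, since the Gram matrices are compared entry by entry. Stacking the English and Hindi signatures before the eigendecomposition is what places both languages in one shared coordinate system.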

Cross-lingual alignment (EN↔HI)

Model	CKA	3D distance	Layers
GPT-OSS-20B	0.698	0.407	24
Llama-3.2-1B	0.684	0.519	16
Llama-3.2-3B	0.695	0.556	28
Llama-3.1-8B	0.653	0.603	32

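A single language axis emerges in the joint projection (corr(PC2, language) ≈ 0.93). GPT-OSS-20B shows the strongest cross-lingual alignment, with the highest CKA (0.698) and the lowest 3D distance (0.407). Among the Llama models, 3D distance grows with model size (0.519, 0.556, and 0.603 for 1B, 3B, and 8B), consistent with increased language-specific drift in deeper layers, although CKA does not fall monotonically: the 3B model (0.695) scores above the 1B model (0.684).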
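The two summary statistics above can be computed from the joint projection roughly as follows. This is a sketch under assumptions, not the project's definition: the helper names are hypothetical, and "3D distance" is assumed here to be the mean Euclidean distance between the English and Hindi points of the same layer, which the text does not spell out.

```python
import numpy as np


def axis_language_correlation(coords: np.ndarray, is_hindi: np.ndarray) -> np.ndarray:
    """Absolute Pearson correlation of each projection axis with a 0/1 language label.

    The absolute value is taken because the sign of an eigenvector is arbitrary.
    """
    labels = np.asarray(is_hindi, dtype=float)
    return np.array([
        abs(np.corrcoef(coords[:, j], labels)[0, 1])
        for j in range(coords.shape[1])
    ])


def mean_layer_distance(coords_en: np.ndarray, coords_hi: np.ndarray) -> float:
    """Assumed definition of "3D distance": mean distance between matched layers."""
    return float(np.linalg.norm(coords_en - coords_hi, axis=1).mean())


# Usage on synthetic coordinates where the second axis separates the languages.
rng = np.random.default_rng(0)
n_layers = 16
is_hindi = np.repeat([0, 1], n_layers)
coords = rng.normal(scale=0.05, size=(2 * n_layers, 3))
coords[:, 1] += np.where(is_hindi == 1, 0.3, -0.3)

corr = axis_language_correlation(coords, is_hindi)
dist = mean_layer_distance(coords[:n_layers], coords[n_layers:])
```

Under this reading, a lower distance means the two languages trace closer paths through the shared projection, which is how the table's 3D distance column is interpreted in the paragraph above.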

Citation

@misc{bompilwar2025mapping,
  title  = {Mapping Layer Similarity in Large Language Models with RSA \& CKA},
  author = {Bompilwar, Ritik},
  year   = {2025},
  howpublished = {\url{https://github.com/RITIK-12/Llama_RSA}}
}