Abstract
It has been argued that BERT “rediscovers the traditional NLP pipeline”, with lower layers extracting morphosyntactic features and higher layers creating holistic sentence-level representations. In this paper, we critically examine this assumption through a principal-component-guided analysis, extracting sets of inputs that correspond to specific activation patterns in BERT sentence representations. We find that even in higher layers, the model mostly picks up on a variegated set of low-level features, many related to sentence complexity, that presumably arise from its specific pre-training objectives.
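The kind of analysis the abstract describes, projecting layer-wise BERT sentence representations onto principal components and reading off which inputs sit at the extremes of a component, can be illustrated roughly as follows. This is a minimal sketch assuming the HuggingFace `transformers` and `scikit-learn` libraries; the model, layer index, mean-pooling strategy, and example sentences are illustrative choices, not the authors' exact setup.

```python
# Hedged sketch of a principal-component-guided analysis of BERT
# sentence representations; layer, pooling, and sentences are
# illustrative assumptions, not taken from the paper.
import numpy as np
import torch
from sklearn.decomposition import PCA
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

sentences = [
    "The cat sat on the mat.",
    "Colorless green ideas sleep furiously.",
    "Despite the rain, the parade, which had been postponed twice, went ahead.",
    "Go.",
]

layer = 10  # a higher BERT layer (hypothetical choice)
reps = []
with torch.no_grad():
    for s in sentences:
        inputs = tokenizer(s, return_tensors="pt")
        hidden = model(**inputs).hidden_states[layer]  # (1, seq_len, 768)
        reps.append(hidden.mean(dim=1).squeeze(0).numpy())  # mean-pool over tokens
reps = np.stack(reps)

# Project the sentence representations onto their principal components ...
pca = PCA(n_components=2)
scores = pca.fit_transform(reps)

# ... then inspect which inputs drive a given component, i.e. which
# sentences correspond to a specific activation pattern.
order = np.argsort(scores[:, 0])
print("Lowest on PC1: ", sentences[order[0]])
print("Highest on PC1:", sentences[order[-1]])
```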
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | Proceedings of the 15th International Conference on Computational Semantics |
| Place of Publication | Nancy, France |
| Publisher | Association for Computational Linguistics |
| Pages | 99-105 |
| Publication status | Published - Jun 2023 |
| Event | The 15th International Conference on Computational Semantics, Nancy, France |
| Duration | 21 Jun 2023 → 23 Jun 2023 |
Conference
| Field | Value |
|---|---|
| Conference | The 15th International Conference on Computational Semantics |
| Abbreviated title | IWCS |
| Country/Territory | France |
| City | Nancy |
| Period | 21/06/23 → 23/06/23 |