Filipe Azevedo is a contributor at Blindsight, writing on AI misalignment, reward hacking, and interpretability.