The challenges of measuring the interpretability of