600 million to 13 billion parameters? Those are very small models... Most frontier LLMs are in the hundreds of billions of parameters, if not getting into trillion-parameter territory.
Not particularly surprising, given that you don't need a huge amount of data to fine-tune models of that size anyway.
Still cool research, and poisoning is a real problem, especially with deceptive alignment being possible. It would be cool to see it tested on a larger model, but I guess it would be super expensive to train one only for it to be shit because you deliberately poisoned it. Safety research isn't going to get the same kind of budget as development. :(