Jailbreak T2T Benchmark v0.5
The AILuminate Jailbreak benchmark evaluates an AI system's resistance to jailbreak attempts across different attack scenarios. The results report two ratings for each system: the AILuminate Safety benchmark rating, which serves as the foundational "Naive Safety" reference, and the safety rating measured after prompt injection attacks, which is reported as the Jailbreak score.
MLCommons applied the AILuminate v0.5 Jailbreak benchmark to a variety of publicly available AI systems from leading vendors. Results have been de-identified for the v0.5 release.
Benchmark Highlights
- As with the AILuminate v1.0 benchmark, no system under test (SUT) received a safety grade of Excellent. Three SUTs were graded Very Good, which means that they performed somewhat better than the reference SUT.
- No SUT received a security grade better than Good.
- Of the 39 SUTs tested, none scored better for jailbreak resistance than for safety.
- Of the 39 SUTs tested, only four did not receive a lower grade for jailbreak resistance than for safety.
- Of the 35 SUTs that were graded lower for jailbreak resistance than for safety, 29 dropped by one grade level and 6 dropped by two grade levels (five from Good to Poor and one from Very Good to Fair).
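The counts in the highlights above can be cross-checked with a short script. This is a minimal sketch and not part of the benchmark tooling: the ordering of `GRADES` is assumed from the grade names used above (it is consistent with the two-level drops described), and `grade_drop` is a hypothetical helper introduced only for illustration.

```python
# Assumed five-point grade scale, lowest to highest (names taken from the highlights).
GRADES = ["Poor", "Fair", "Good", "Very Good", "Excellent"]


def grade_drop(safety: str, jailbreak: str) -> int:
    """Hypothetical helper: grade levels lost from the safety grade to the jailbreak grade."""
    return GRADES.index(safety) - GRADES.index(jailbreak)


# Counts reported in the highlights.
total_suts = 39
not_lower = 4    # SUTs whose jailbreak grade was not lower than their safety grade
one_level = 29   # SUTs reduced by one grade level
two_levels = {("Good", "Poor"): 5, ("Very Good", "Fair"): 1}  # reduced by two levels

lowered = one_level + sum(two_levels.values())

# The reported figures should add up.
assert lowered == 35
assert lowered + not_lower == total_suts
# Both reported transitions are two-level drops on the assumed scale.
assert all(grade_drop(s, j) == 2 for s, j in two_levels)

print(f"{lowered / total_suts:.0%} of SUTs lost at least one grade level")
# prints: 90% of SUTs lost at least one grade level
```

Put differently, roughly nine in ten tested systems were graded lower once jailbreak attacks were applied.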
[Figure: Naive Safety and Jailbreak scores (Score %) for De-Identified Systems 1 through 16.]
For support questions, contact: ailuminate-support@mlcommons.org