top of page

AIX

Public·7 members

INVISIBLE PROMPT INJECTIONS



WHAT IS IT?

  • Prompt manipulation that uses Unicode characters that are NOT visible on a user interface but still interpretable by an LLM

  • Once interpreted, the LLM can respond to them


HOW DOES IT HAPPEN?

  • Unicode tag set ranges from E0000 to E007F

  • English letters, digits, and common punctuation marks can correspond to a “tagged” version by adding E0000 to an original Unicode point, making it to generate a malicious prompt - this can happen via a direct prompt or even if the LLM absorbs knowledge base files with invisible characters

  • Some LLMs split tag Unicode characters into identifiable tokens and if they interpret the original message prior to interpreting the tagged prompt, the LLM becomes susceptible to the invisible prompt injection


WHAT TO DO ABOUT IT?

  • Check if an LLM can respond to invisible Unicode characters

  • Check for invisible characters in documents before pasting them into a prompt or absorbing them into the knowledge base

  • Use AI protection solutions and run risk assessments for quality checks, eg. Saif Check's Data and LLM risk assessments

31 Views
bottom of page