TokChat: Tokenization of Text for Secured Peer-to-Peer Communication
Praneeth Vadlapati
VIT-AP University, India
praneeth.18bce7147@vitap.ac.in
ORCID: 0009-0006-2592-2564
Abstract: This paper presents TokChat, a system that adds a layer of security to peer-to-peer text messaging. Before encryption, the system converts each message into tokens using existing pre-trained tokenizers that are widely used in natural language processing (NLP) tasks. The tokens are a compact numerical representation of the original message. A new private key is generated for every conversation to keep its messages confidential. The tokens are encrypted using an existing AES-based encryption scheme so that they can be transmitted securely to the recipient. The recipient decrypts the received ciphertext with the conversation-specific key and converts the decrypted tokens back into readable text using the same tokenizer. As a result, an attacker who obtains the conversation-specific key still cannot convert the messages into readable text without knowing the tokenization process. The system offers a higher level of security for use cases in which message confidentiality is crucial. The system was tested on multiple text inputs: it successfully tokenized the text, encrypted the tokens, decrypted the transmitted ciphertext, and reproduced the original text. The code is available at github.com/Pro-GenAI/TokChat.
Keywords: natural language processing (NLP), tokenization, cryptographic messaging, secure communication
DOI: 10.55041/IJSREM10800
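
The round trip described in the abstract (tokenize, encrypt, transmit, decrypt, detokenize) can be illustrated with a minimal sketch. It is not the TokChat implementation. It assumes the third-party `cryptography` package, whose Fernet recipe is AES-based (AES-128 in CBC mode with an HMAC). It also replaces the pre-trained tokenizer with a hypothetical character-level vocabulary so that the example runs without downloading a model. The names `tokenize`, `detokenize`, `pack_tokens`, `unpack_tokens`, `send`, and `receive` are illustrative assumptions.

```python
import string
import struct

from cryptography.fernet import Fernet  # AES-based authenticated encryption

# Hypothetical stand-in for a pre-trained tokenizer: a character-level
# vocabulary over printable ASCII. A real deployment would load a
# pre-trained subword tokenizer instead.
VOCAB = sorted(set(string.printable))
TOKEN_ID = {ch: i for i, ch in enumerate(VOCAB)}


def tokenize(text: str) -> list[int]:
    """Map each character to its integer token ID (KeyError if unknown)."""
    return [TOKEN_ID[ch] for ch in text]


def detokenize(ids: list[int]) -> str:
    """Map token IDs back to text using the same vocabulary."""
    return "".join(VOCAB[i] for i in ids)


def pack_tokens(ids: list[int]) -> bytes:
    """Serialize token IDs as little-endian unsigned 32-bit integers."""
    return struct.pack(f"<{len(ids)}I", *ids)


def unpack_tokens(data: bytes) -> list[int]:
    """Inverse of pack_tokens."""
    return list(struct.unpack(f"<{len(data) // 4}I", data))


def send(text: str, key: bytes) -> bytes:
    """Sender side: tokenize, serialize, then encrypt the token bytes."""
    return Fernet(key).encrypt(pack_tokens(tokenize(text)))


def receive(ciphertext: bytes, key: bytes) -> str:
    """Recipient side: decrypt, deserialize, then detokenize."""
    return detokenize(unpack_tokens(Fernet(key).decrypt(ciphertext)))


# A fresh key is generated for each conversation (assumed to be shared
# with the peer over a separate secure channel).
conversation_key = Fernet.generate_key()

message = "Meet at 10:30, gate B."
ciphertext = send(message, conversation_key)
recovered = receive(ciphertext, conversation_key)
print(recovered == message)
```

In this sketch, decrypting with the key alone yields packed integer IDs instead of text. Recovering the message also requires the vocabulary that produced those IDs, which is the property the abstract relies on.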