

My understanding is that tokens are roughly word-sized pieces of text (often whole words, sometimes fragments of longer ones), and that when you ask a question you are charged for all the tokens the model reads as input plus all the tokens it produces as output. The input is more than your latest message: earlier parts of the conversation are typically sent along again with each request, so it’s not as straightforward as “word count of what you say plus what it says”.
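To make that arithmetic concrete, here is a rough sketch in Python. It is not any provider's actual billing formula. The names `estimate_tokens` and `billed_tokens` are made up for illustration, the 4-characters-per-token figure is only a common rule of thumb for English text, and the sketch assumes every request resends the whole prior conversation as input.

```python
import math

# Assumption: ~4 characters per token is a rough rule of thumb for English.
# Real tokenizers split text into sub-word pieces and give different counts.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count; not a real tokenizer."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def billed_tokens(turns):
    """Estimate billed (input, output) tokens for a conversation.

    turns is a list of (user_message, model_reply) pairs.
    Assumption: each request resends all earlier messages and replies,
    so the input for a turn is the history so far plus the new message.
    """
    history = 0      # tokens accumulated from earlier turns
    total_in = 0     # input tokens summed over all requests
    total_out = 0    # output tokens summed over all requests
    for user_msg, reply in turns:
        prompt = history + estimate_tokens(user_msg)
        out = estimate_tokens(reply)
        total_in += prompt
        total_out += out
        history = prompt + out  # the next request carries this turn too
    return total_in, total_out
```

Under those assumptions, two turns of a 40-character question and an 80-character answer each look like 20 tokens in and 40 out if you just count the text once, but `billed_tokens` gives 50 input tokens and 40 output tokens, because the second request pays again for the first exchange.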


Polymorphic malware is probably one of the easier things to do with LLMs, since a model can rewrite the same code in endless variations, so signature-based static scanners seem of limited use.