{ "info": { "author": "AI21 Labs", "author_email": null, "bugtrack_url": null, "classifiers": [ "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9" ], "description": "
\n A SentencePiece-based tokenizer for production use with AI21's models\n
\n\n\n\n---\n\n## Installation\n\n### pip\n\n```bash\npip install ai21-tokenizer\n```\n\n### poetry\n\n```bash\npoetry add ai21-tokenizer\n```\n\n## Usage\n\n### Tokenizer Creation\n\n### Jamba Tokenizer\n\n```python\nfrom ai21_tokenizer import Tokenizer, PreTrainedTokenizers\n\ntokenizer = Tokenizer.get_tokenizer(PreTrainedTokenizers.JAMBA_INSTRUCT_TOKENIZER)\n# Your code here\n```\n\nAnother way would be to use our Jamba tokenizer directly:\n\n```python\nfrom ai21_tokenizer import JambaInstructTokenizer\n\nmodel_path = \"