Kong AI Gateway Demo: Streamlining Multi-Provider LLM Usage, Security & Monitoring
Try Kong AI Gateway: https://bit.ly/3xhKw49
Timestamps:
0:00 Introduction
0:25 Recap of Kong 3.6 features
1:35 Unified inference format across multiple providers (OpenAI, Anthropic, etc.)
3:51 Plugins for enhancing and restricting usage of language models
6:02 Detailed contextual usage statistics
7:02 Augmenting existing APIs with LLM features
10:26 Response streaming and token streaming support
11:40 OpenAI SDK support
12:52 Token-based rate limiting (Enterprise feature)
15:12 Azure Content Safety introspection (Enterprise feature)
16:54 Upcoming features and ideas
This demo showcases the latest features and enhancements in Kong's AI Gateway as of the June 2024 release. It covers:
- Unified inference format across multiple providers (OpenAI, Anthropic, etc.), so the same request shape works against any backend without client changes (see the request sketch after this list).
- Plugins for enhancing and restricting usage of language models: Decorator for appending or prepending context to chats, Templator for exposing APIs with fixed parameters for controlled usage, and Prompt Guardian for allowing or blocking prompts via regex patterns (see the plugin-config sketch after this list).
- Detailed contextual usage statistics, including token counts, latency information, and logging of input/output chats.
- Augmenting existing APIs with LLM features using request and response transformers.
- Response streaming and token streaming support for various providers.
- OpenAI SDK support: the stock SDK points at the gateway, which transparently handles model and provider selection (see the SDK sketch after this list).
- Token-based rate limiting (Enterprise feature) for restricting access to specific models or providers by token consumption (see the rate-limiting sketch after this list).
- Azure Content Safety introspection for moderation and safety checks across providers.
- Upcoming features and ideas, such as cooldown periods for users breaching safety limits and attaching specific roles to consumers for granular access control.
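
A minimal sketch of the unified inference format, assuming a local Kong proxy on port 8000 and a route at /anthropic-chat backed by the ai-proxy plugin (both names are illustrative, not from the video):

import requests

# One OpenAI-style chat request; Kong's ai-proxy plugin translates it
# for whichever provider the route is configured with.
resp = requests.post(
    "http://localhost:8000/anthropic-chat",  # hypothetical route backed by Anthropic
    json={"messages": [{"role": "user", "content": "What is Kong Gateway?"}]},
)
print(resp.json())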
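A sketch of enabling Prompt Guardian (the ai-prompt-guard plugin) on a route through Kong's Admin API; the route name chat-route and the deny pattern are illustrative:

import requests

# Attach ai-prompt-guard so prompts matching deny_patterns are rejected
# before they ever reach the model.
resp = requests.post(
    "http://localhost:8001/routes/chat-route/plugins",  # default Admin API port
    json={
        "name": "ai-prompt-guard",
        "config": {"deny_patterns": [".*password.*"]},
    },
)
print(resp.json())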
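A sketch of the OpenAI SDK pointed at the gateway; since Kong owns provider selection and upstream credentials, the base_url and api_key shown here are placeholders:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/openai-chat",  # Kong route, not api.openai.com
    api_key="ignored-by-gateway",  # real credentials live in the gateway config
)

# stream=True exercises the token-streaming support shown in the demo.
stream = client.chat.completions.create(
    model="gpt-4o",  # the gateway may override this with its configured model
    messages=[{"role": "user", "content": "Summarize Kong AI Gateway."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")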
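A sketch of token-based rate limiting with the Enterprise ai-rate-limiting-advanced plugin; the config schema shown is an assumption from the Enterprise docs, so verify it against your Kong version:

import requests

# Limit OpenAI usage on this route to ~1000 tokens per 60-second window
# (the llm_providers/limit/window_size field names are assumptions).
resp = requests.post(
    "http://localhost:8001/routes/chat-route/plugins",
    json={
        "name": "ai-rate-limiting-advanced",
        "config": {
            "llm_providers": [
                {"name": "openai", "limit": [1000], "window_size": [60]}
            ]
        },
    },
)
print(resp.json())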