Kong AI Gateway Demo: Streamlining Multi-Provider LLM Usage, Security & Monitoring
Try Kong AI Gateway: https://bit.ly/3xhKw49
Timestamps:
0:00 Introduction
0:25 Recap of Kong 3.6 features
1:35 Unified inference format across multiple providers (OpenAI, Anthropic, etc.)
3:51 Plugins for enhancing and restricting usage of language models
6:02 Detailed contextual usage statistics
7:02 Augmenting existing APIs with LLM features
10:26 Response streaming and token streaming support
11:40 OpenAI SDK support
12:52 Token-based rate limiting (Enterprise feature)
15:12 Azure Content Safety introspection (Enterprise feature)
16:54 Upcoming features and ideas
This demo showcases the latest features and enhancements in Kong's AI Gateway as of the June 2024 release. It covers:
- Unified inference format across multiple providers (OpenAI, Anthropic, etc.), so the same request shape works against any backend without client changes (see the request sketch after this list).
- Plugins for enhancing and restricting usage of language models: Decorator for appending or prepending context to chats, Templator for exposing APIs with fixed parameters for controlled usage, and Prompt Guardian for allowing or blocking prompts via regex patterns (see the plugin-config sketch after this list).
- Detailed contextual usage statistics, including token counts, latency information, and logging of input/output chats.
- Augmenting existing APIs with LLM features using request and response transformers.
- Response streaming and token streaming support for various providers.
- OpenAI SDK support: the stock SDK points at the gateway, which transparently handles model and provider selection (see the SDK sketch after this list).
- Token-based rate limiting (Enterprise feature) for restricting access to specific models or providers by token consumption (see the rate-limiting sketch after this list).
- Azure Content Safety introspection for moderation and safety checks across providers.
- Upcoming features and ideas, such as cooldown periods for users breaching safety limits and attaching specific roles to consumers for granular access control.
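
A minimal sketch of the unified inference format, assuming a local Kong proxy on port 8000 and a route at /anthropic-chat backed by the ai-proxy plugin (both names are illustrative, not from the video):

import requests

# One OpenAI-style chat request; Kong's ai-proxy plugin translates it
# for whichever provider the route is configured with.
resp = requests.post(
    "http://localhost:8000/anthropic-chat",  # hypothetical route backed by Anthropic
    json={"messages": [{"role": "user", "content": "What is Kong Gateway?"}]},
)
print(resp.json())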
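A sketch of enabling Prompt Guardian (the ai-prompt-guard plugin) on a route through Kong's Admin API; the route name chat-route and the deny pattern are illustrative:

import requests

# Attach ai-prompt-guard so prompts matching deny_patterns are rejected
# before they ever reach the model.
resp = requests.post(
    "http://localhost:8001/routes/chat-route/plugins",  # default Admin API port
    json={
        "name": "ai-prompt-guard",
        "config": {"deny_patterns": [".*password.*"]},
    },
)
print(resp.json())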
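A sketch of the OpenAI SDK pointed at the gateway; since Kong owns provider selection and upstream credentials, the base_url and api_key shown here are placeholders:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/openai-chat",  # Kong route, not api.openai.com
    api_key="ignored-by-gateway",  # real credentials live in the gateway config
)

# stream=True exercises the token-streaming support shown in the demo.
stream = client.chat.completions.create(
    model="gpt-4o",  # the gateway may override this with its configured model
    messages=[{"role": "user", "content": "Summarize Kong AI Gateway."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")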
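A sketch of token-based rate limiting with the Enterprise ai-rate-limiting-advanced plugin; the config schema shown is an assumption from the Enterprise docs, so verify it against your Kong version:

import requests

# Limit OpenAI usage on this route to ~1000 tokens per 60-second window
# (the llm_providers/limit/window_size field names are assumptions).
resp = requests.post(
    "http://localhost:8001/routes/chat-route/plugins",
    json={
        "name": "ai-rate-limiting-advanced",
        "config": {
            "llm_providers": [
                {"name": "openai", "limit": [1000], "window_size": [60]}
            ]
        },
    },
)
print(resp.json())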